feat: db compaction cmd #4329

Merged
merged 21 commits into from
Sep 23, 2023

Conversation

istae
Member

@istae istae commented Sep 20, 2023

Checklist

  • I have read the coding guide.
  • My change requires a documentation update, and I have done it.
  • I have added tests to cover my changes.
  • I have filled out the description and linked the related issues.

Description

Adds a compaction command that shrinks sharky to the smallest possible size.
Also includes a chunk validation step, run before and after compaction, to verify that chunks are moved correctly.

  • turn validation off by default

closes #3844

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

Screenshots (if appropriate):

@istae istae marked this pull request as draft September 20, 2023 18:00
@istae istae marked this pull request as ready for review September 21, 2023 21:51
Contributor

@mrekucci mrekucci left a comment

Apart from the comments, LGTM!

cmd/bee/cmd/db.go (outdated, resolved)
return nil
})

sort.Slice(items, func(i, j int) bool {
Contributor

You can use the new slices.SortFunc() instead.
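
For comparison, here is a minimal sketch of the two sorting styles side by side. The `item` type and the choice of `slot` as the sort key are assumptions made for illustration; the element type actually sorted in this PR is not shown in the thread. `slices.SortFunc` and `cmp.Compare` are in the standard library from Go 1.21.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
	"sort"
)

// item is a hypothetical stand-in for the records being sorted in the PR.
type item struct {
	shard uint8
	slot  uint32
}

// sortWithSortSlice orders items by slot using the older, index-based API.
func sortWithSortSlice(items []item) {
	sort.Slice(items, func(i, j int) bool {
		return items[i].slot < items[j].slot
	})
}

// sortWithSlicesSortFunc does the same with the generic API. The comparator
// receives the elements themselves and returns a negative, zero, or positive
// int instead of a bool.
func sortWithSlicesSortFunc(items []item) {
	slices.SortFunc(items, func(a, b item) int {
		return cmp.Compare(a.slot, b.slot)
	})
}

func main() {
	a := []item{{0, 30}, {1, 10}, {2, 20}}
	b := slices.Clone(a)
	sortWithSortSlice(a)
	sortWithSlicesSortFunc(b)
	fmt.Println(a)
	fmt.Println(b)
}
```

The generic version avoids capturing the slice in the closure and is type-checked at compile time, which is the usual reason to prefer it.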

pkg/storer/internal/chunkstore/helpers.go (outdated, resolved)
pkg/storer/compact_test.go (outdated, resolved)
return nil
}

eg, ctx := errgroup.WithContext(ctx)
Contributor

We still need to listen for this context's cancellation if you want this command to cancel cleanly.

If we don't, the Wait call will just block until all the goroutines are done. Not sure if this is expected. If it is, we should clearly state in the docs that users should not cancel the command.

Instead of this, I think it might be better to have validation as a separate command. Users can run it before or after compaction, or any time they suspect there is an issue. We can also add good messages about the inconsistencies so that we can maybe take some action.

Member Author

The main purpose of the validation is to test the compaction feature, and it will be off by default.
I will make the fixes for better ctx handling.

Contributor

What I meant is that validation could be useful as a command in itself, to identify whether there are inconsistencies in the localstore. In that case users can run it separately whenever needed.

pkg/storer/compact.go (outdated, resolved)
@istae
Member Author

istae commented Sep 23, 2023

A bee node (that has suffered a sharky leak) with a localstore size of ~66GB dropped to ~22GB after compaction.
Notice that the number of invalid chunks has NOT increased and the localstore is stable post compaction.

"time"="2023-09-23 02:26:20.200724" "level"="info" "logger"="node" "msg"="performing chunk validation before compaction"
"time"="2023-09-23 02:26:42.403098" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=100000
"time"="2023-09-23 02:27:02.872280" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=200000
"time"="2023-09-23 02:27:20.200688" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=300000
"time"="2023-09-23 02:27:36.217710" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=400000
"time"="2023-09-23 02:27:51.042027" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=500000
"time"="2023-09-23 02:28:11.737064" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=600000
"time"="2023-09-23 02:28:24.289508" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="85048f042d469913b1e9c52b951234c06b7033910006a877fb05ca2fb23f1387" "error"="invalid chunk"
"time"="2023-09-23 02:28:30.985229" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8505a2d4db170e27120d2b64c0b357ef0ea1e4ccdf7dd74a99ac526272ff7741" "error"="invalid chunk"
"time"="2023-09-23 02:28:31.754522" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=700000
"time"="2023-09-23 02:28:49.807665" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=800000
"time"="2023-09-23 02:29:06.289304" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=900000
"time"="2023-09-23 02:29:21.509837" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1000000
"time"="2023-09-23 02:29:35.428648" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8511a2d9e4cc8ae9d9a4bd579b4122d0a426fba4ae5e7443937da1be47560937" "error"="invalid chunk"
"time"="2023-09-23 02:29:39.179024" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1100000
"time"="2023-09-23 02:29:56.955386" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1200000
"time"="2023-09-23 02:30:13.522047" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1300000
"time"="2023-09-23 02:30:31.203372" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1400000
"time"="2023-09-23 02:30:47.490728" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1500000
"time"="2023-09-23 02:30:58.774330" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8520da049c445aec206346482d7423c8ff3f7684767f3751d4dbe5f50534475f" "error"="invalid chunk"
"time"="2023-09-23 02:31:05.720674" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1600000
"time"="2023-09-23 02:31:23.365979" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1700000
"time"="2023-09-23 02:31:38.719460" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1800000
"time"="2023-09-23 02:31:38.901151" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8528361ca3105df7a06d9abf3f97f5268936fad60793a866781aeaf4379a7c4a" "error"="invalid chunk"
"time"="2023-09-23 02:31:54.862914" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1900000
"time"="2023-09-23 02:32:09.958068" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2000000
"time"="2023-09-23 02:32:27.310772" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2100000
"time"="2023-09-23 02:32:47.128251" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2200000
"time"="2023-09-23 02:33:04.680943" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2300000
"time"="2023-09-23 02:33:20.407936" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2400000
"time"="2023-09-23 02:33:35.047811" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2500000
"time"="2023-09-23 02:33:51.511406" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2600000
"time"="2023-09-23 02:34:12.349854" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2700000
"time"="2023-09-23 02:34:32.818353" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2800000
"time"="2023-09-23 02:34:52.999160" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2900000
"time"="2023-09-23 02:34:56.525976" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="854ba42497f9879a0042e6001514a3353dd58519779083ceaae45e0ac8e7773b" "error"="invalid chunk"
"time"="2023-09-23 02:35:11.954824" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3000000
"time"="2023-09-23 02:35:30.898367" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3100000
"time"="2023-09-23 02:35:50.245677" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3200000
"time"="2023-09-23 02:36:09.179196" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3300000
"time"="2023-09-23 02:36:27.803434" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3400000
"time"="2023-09-23 02:36:45.995618" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3500000
"time"="2023-09-23 02:37:05.079217" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3600000
"time"="2023-09-23 02:37:24.639192" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3700000
"time"="2023-09-23 02:37:43.028797" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3800000
"time"="2023-09-23 02:38:00.444619" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3900000
"time"="2023-09-23 02:38:17.208223" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4000000
"time"="2023-09-23 02:38:17.787232" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="856ef5cf30c24399e369610671052f9db1df0941885cd4717534a33264bd8ca2" "error"="invalid chunk"
"time"="2023-09-23 02:38:34.673343" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4100000
"time"="2023-09-23 02:38:52.382613" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4200000
"time"="2023-09-23 02:39:09.260097" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4300000
"time"="2023-09-23 02:39:27.139983" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4400000
"time"="2023-09-23 02:39:44.307579" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4500000
"time"="2023-09-23 02:40:04.331218" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4600000
"time"="2023-09-23 02:40:25.628782" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4700000
"time"="2023-09-23 02:40:44.557506" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4800000
"time"="2023-09-23 02:41:00.104167" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4900000
"time"="2023-09-23 02:41:14.523536" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=5000000
"time"="2023-09-23 02:41:15.856616" "level"="info" "logger"="node" "msg"="validation finished" "duration"="14m55.655916536s"
"time"="2023-09-23 02:41:15.859522" "level"="info" "logger"="node" "msg"="starting compaction"
"time"="2023-09-23 02:41:38.058518" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=0 "slot"=156572
"time"="2023-09-23 02:42:00.471406" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=1 "slot"=156264
"time"="2023-09-23 02:42:23.262206" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=2 "slot"=156666
"time"="2023-09-23 02:42:46.313548" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=3 "slot"=156459
"time"="2023-09-23 02:43:09.076357" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=4 "slot"=156320
"time"="2023-09-23 02:43:33.086359" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=5 "slot"=156543
"time"="2023-09-23 02:43:57.174295" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=6 "slot"=156710
"time"="2023-09-23 02:44:21.897143" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=7 "slot"=156565
"time"="2023-09-23 02:44:45.914152" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=8 "slot"=156771
"time"="2023-09-23 02:45:10.254976" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=9 "slot"=156528
"time"="2023-09-23 02:45:35.030026" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=10 "slot"=156267
"time"="2023-09-23 02:45:59.565924" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=11 "slot"=156582
"time"="2023-09-23 02:46:25.294683" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=12 "slot"=156364
"time"="2023-09-23 02:46:51.408964" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=13 "slot"=156642
"time"="2023-09-23 02:47:17.648952" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=14 "slot"=156686
"time"="2023-09-23 02:47:42.111917" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=15 "slot"=156525
"time"="2023-09-23 02:48:06.714285" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=16 "slot"=156408
"time"="2023-09-23 02:48:33.031526" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=17 "slot"=156401
"time"="2023-09-23 02:48:57.416129" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=18 "slot"=156514
"time"="2023-09-23 02:49:23.349749" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=19 "slot"=156647
"time"="2023-09-23 02:49:47.971505" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=20 "slot"=156933
"time"="2023-09-23 02:50:13.486480" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=21 "slot"=156569
"time"="2023-09-23 02:50:39.957041" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=22 "slot"=156557
"time"="2023-09-23 02:51:04.770892" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=23 "slot"=156648
"time"="2023-09-23 02:51:31.161118" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=24 "slot"=156664
"time"="2023-09-23 02:51:57.409147" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=25 "slot"=156319
"time"="2023-09-23 02:52:23.624589" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=26 "slot"=156606
"time"="2023-09-23 02:52:49.669821" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=27 "slot"=156446
"time"="2023-09-23 02:53:15.003591" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=28 "slot"=155920
"time"="2023-09-23 02:53:42.515976" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=29 "slot"=156636
"time"="2023-09-23 02:54:08.243407" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=30 "slot"=156906
"time"="2023-09-23 02:54:35.572576" "level"="info" "logger"="node" "msg"="shard truncated" "shard"=31 "slot"=156636
"time"="2023-09-23 02:54:35.660689" "level"="info" "logger"="node" "msg"="compaction finished" "duration"="13m19.801288401s"
"time"="2023-09-23 02:54:35.661028" "level"="info" "logger"="node" "msg"="performing chunk validation after compaction"
"time"="2023-09-23 02:54:52.656536" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=100000
"time"="2023-09-23 02:55:09.262727" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=200000
"time"="2023-09-23 02:55:25.411944" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=300000
"time"="2023-09-23 02:55:40.674761" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=400000
"time"="2023-09-23 02:55:54.923233" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=500000
"time"="2023-09-23 02:56:13.990012" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=600000
"time"="2023-09-23 02:56:26.111880" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="85048f042d469913b1e9c52b951234c06b7033910006a877fb05ca2fb23f1387" "error"="invalid chunk"
"time"="2023-09-23 02:56:32.421749" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8505a2d4db170e27120d2b64c0b357ef0ea1e4ccdf7dd74a99ac526272ff7741" "error"="invalid chunk"
"time"="2023-09-23 02:56:33.125982" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=700000
"time"="2023-09-23 02:56:49.966943" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=800000
"time"="2023-09-23 02:57:05.270539" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=900000
"time"="2023-09-23 02:57:19.350117" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1000000
"time"="2023-09-23 02:57:32.008056" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8511a2d9e4cc8ae9d9a4bd579b4122d0a426fba4ae5e7443937da1be47560937" "error"="invalid chunk"
"time"="2023-09-23 02:57:35.253085" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1100000
"time"="2023-09-23 02:57:51.046171" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1200000
"time"="2023-09-23 02:58:05.691354" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1300000
"time"="2023-09-23 02:58:21.372901" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1400000
"time"="2023-09-23 02:58:35.987555" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1500000
"time"="2023-09-23 02:58:46.438938" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8520da049c445aec206346482d7423c8ff3f7684767f3751d4dbe5f50534475f" "error"="invalid chunk"
"time"="2023-09-23 02:58:52.950346" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1600000
"time"="2023-09-23 02:59:09.346532" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1700000
"time"="2023-09-23 02:59:23.560762" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1800000
"time"="2023-09-23 02:59:23.731780" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="8528361ca3105df7a06d9abf3f97f5268936fad60793a866781aeaf4379a7c4a" "error"="invalid chunk"
"time"="2023-09-23 02:59:38.398292" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=1900000
"time"="2023-09-23 02:59:52.263419" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2000000
"time"="2023-09-23 03:00:08.569685" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2100000
"time"="2023-09-23 03:00:27.235958" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2200000
"time"="2023-09-23 03:00:43.488942" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2300000
"time"="2023-09-23 03:00:57.749113" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2400000
"time"="2023-09-23 03:01:10.890950" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2500000
"time"="2023-09-23 03:01:26.193947" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2600000
"time"="2023-09-23 03:01:46.052818" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2700000
"time"="2023-09-23 03:02:05.690999" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2800000
"time"="2023-09-23 03:02:24.436162" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=2900000
"time"="2023-09-23 03:02:27.726892" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="854ba42497f9879a0042e6001514a3353dd58519779083ceaae45e0ac8e7773b" "error"="invalid chunk"
"time"="2023-09-23 03:02:42.390285" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3000000
"time"="2023-09-23 03:02:59.666113" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3100000
"time"="2023-09-23 03:03:16.689446" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3200000
"time"="2023-09-23 03:03:33.599671" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3300000
"time"="2023-09-23 03:03:50.738107" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3400000
"time"="2023-09-23 03:04:07.661766" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3500000
"time"="2023-09-23 03:04:25.235829" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3600000
"time"="2023-09-23 03:04:42.989893" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3700000
"time"="2023-09-23 03:05:00.177758" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3800000
"time"="2023-09-23 03:05:16.775401" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=3900000
"time"="2023-09-23 03:05:32.359279" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4000000
"time"="2023-09-23 03:05:32.903403" "level"="info" "logger"="node" "msg"="invalid chunk" "address"="856ef5cf30c24399e369610671052f9db1df0941885cd4717534a33264bd8ca2" "error"="invalid chunk"
"time"="2023-09-23 03:05:48.334333" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4100000
"time"="2023-09-23 03:06:04.411790" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4200000
"time"="2023-09-23 03:06:19.875481" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4300000
"time"="2023-09-23 03:06:36.393436" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4400000
"time"="2023-09-23 03:06:51.886748" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4500000
"time"="2023-09-23 03:07:10.765926" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4600000
"time"="2023-09-23 03:07:30.759797" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4700000
"time"="2023-09-23 03:07:49.185008" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4800000
"time"="2023-09-23 03:08:05.297317" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=4900000
"time"="2023-09-23 03:08:19.486048" "level"="info" "logger"="node" "msg"="..still validating chunks" "count"=5000000
"time"="2023-09-23 03:08:20.745375" "level"="info" "logger"="node" "msg"="validation finished" "duration"="13m45.084338602s"

@istae istae merged commit 9ce6f3c into master Sep 23, 2023
10 checks passed
@istae istae deleted the compaction branch September 23, 2023 09:20
Member

@zelig zelig left a comment

a few minor issues.
Plus I wonder if this should be a manually invoked process at all.
Should we not consider running compaction regularly?

if err != nil {
return err
}
defer func() {
Member

should this defer not come before the NewRecovery call?

end := lastUsedSlot

for start < end {
if slots[start] == nil { // free
Member

you need to make sure end is always set to a used slot, otherwise you end up with suboptimal truncation:

imagine

0 1 2 3 4 5 6 7 8 9 ........
+ + + + + + + - - + - - - - - 

after the first move at start = 7; end = 9, the loop terminates at start = 8; end = 8 instead of end = 7

i suggest:

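end := lastUsedSlot
for start := uint(32); start < end; {
	if slots[end] == nil { // free: shrink from the tail
		end--
		continue
	}
	if slots[start] != nil { // used: nothing to fill here
		start++
		continue
	}
	// MOVE: relocate the chunk at slots[end] into the free slot at start
	start++
	end--
}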
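
The suggestion above can be tried out on an in-memory model. The following is a sketch under stated assumptions, not the PR's implementation: slots are modelled as a `[]*int` where nil means free, `start` begins at 0 rather than at the offset used in the suggestion, and a "move" is just a pointer swap (the real code would also have to copy chunk data and update the retrieval index). `compact` and `fromPattern` are hypothetical helpers.

```go
package main

import "fmt"

// fromPattern builds a slot table from a string such as "++--+", where '+'
// marks a used slot and '-' a free one. Each used slot stores its original
// index so that moves can be observed afterwards.
func fromPattern(p string) []*int {
	slots := make([]*int, len(p))
	for i, c := range p {
		if c == '+' {
			id := i
			slots[i] = &id
		}
	}
	return slots
}

// compact fills free slots near the front with used slots taken from the
// tail and returns how many leading slots must be kept after truncation.
func compact(slots []*int) int {
	start, end := 0, len(slots)-1
	for start < end {
		if slots[end] == nil { // free: shrink from the tail
			end--
			continue
		}
		if slots[start] != nil { // used: nothing to fill here
			start++
			continue
		}
		// Move the last used slot into the first free one.
		slots[start], slots[end] = slots[end], nil
		start++
		end--
	}
	// The loop can stop with end resting on a free slot, so step back
	// until end points at a used slot (or runs off the front).
	for end >= 0 && slots[end] == nil {
		end--
	}
	return end + 1
}

func main() {
	// The layout from the comment above: 0-6 used, 7-8 free, 9 used, rest free.
	slots := fromPattern("+++++++--+-----")
	keep := compact(slots)
	fmt.Println("slots to keep:", keep)
	fmt.Println("slot 7 now holds the chunk originally at index:", *slots[7])
}
```

For the layout in the comment, the loop exits with `start` and `end` both at 8, and the final back-step brings `end` down to 7, the last used slot, which is the truncation point the reviewer is asking for.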

return nil
}

eg := errgroup.Group{}
Member

why use an errgroup if you don't care about errors?
use a WaitGroup instead
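
As an illustration of the WaitGroup variant, here is a small sketch in which failures are counted rather than returned, which is roughly what a validation pass that only logs invalid chunks needs. `countInvalid` and the `check` callback are hypothetical names; `atomic.Int64` requires Go 1.19 or later.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countInvalid runs check on every item concurrently and returns how many
// items failed. Failures are only counted, never propagated, so a plain
// sync.WaitGroup is enough and no error plumbing is needed.
func countInvalid(items []int, check func(int) bool) int64 {
	var (
		wg      sync.WaitGroup
		invalid atomic.Int64
	)
	for _, it := range items {
		wg.Add(1)
		go func(it int) {
			defer wg.Done()
			if !check(it) {
				invalid.Add(1)
			}
		}(it)
	}
	wg.Wait()
	return invalid.Load()
}

func main() {
	isOdd := func(n int) bool { return n%2 == 1 }
	fmt.Println(countInvalid([]int{1, 2, 3, 4, 5}, isOdd))
}
```

If a later change needs the first error to abort the remaining work, that would be the point at which an errgroup earns its place.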

}

count := 0
_ = chunkstore.Iterate(store, func(item *chunkstore.RetrievalIndexItem) error {
Member

if you are ignoring the errors, then you should use a WaitGroup, not an errgroup

// TestCompact creates two batches and puts chunks belonging to both batches.
// The first batch is then expired, causing free slots to accumulate in sharky.
// Next, sharky is compacted, after which, it is tested that valid chunks can still be retrieved.
func TestCompact(t *testing.T) {
Member

this is quite a strange test for compaction. I mean, it is totally unnecessary to have such a complex test. Why should compaction know about batches, eviction, etc.?

Development

Successfully merging this pull request may close these issues.

a new compaction/defragment cmd for sharky
4 participants