feat: db compaction cmd #4329
Conversation
Apart from the comments, LGTM!
	return nil
})

sort.Slice(items, func(i, j int) bool {
You can use the new slices.SortFunc() instead.
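For reference, a minimal runnable sketch of the slices.SortFunc variant (Go 1.21+); the item type and its Slot field are hypothetical stand-ins, since the actual fields of the sorted elements aren't visible in this diff:

package main

import (
	"cmp"
	"fmt"
	"slices"
)

// item is a hypothetical stand-in for the PR's sorted element type;
// the Slot field exists only to make the example self-contained.
type item struct{ Slot uint32 }

func main() {
	items := []*item{{Slot: 9}, {Slot: 2}, {Slot: 5}}

	// slices.SortFunc replaces sort.Slice: the callback compares two
	// elements directly and returns negative/zero/positive, instead of
	// a less(i, j) bool over indices.
	slices.SortFunc(items, func(a, b *item) int {
		return cmp.Compare(a.Slot, b.Slot)
	})

	for _, it := range items {
		fmt.Println(it.Slot) // 2, 5, 9
	}
}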
pkg/storer/compact.go
	return nil
}

eg, ctx := errgroup.WithContext(ctx)
We still need to listen for this context cancellation if you are thinking of doing a clean cancellation of this command.
If we don't, the wait will just block until all the routines are done. Not sure if this is expected; if it is, we should clearly mention in the doc that users should not cancel the command.
Instead of this, I think it might be better to have the validation as a separate command. Users can run it before/after or anytime they feel there is some issue. We could also add some good messages describing the inconsistencies, so that actions can be taken based on them.
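For illustration, a runnable sketch of the kind of cancellation handling meant here; process and the loop bounds are hypothetical stand-ins for the PR's per-item validation work:

package main

import (
	"context"
	"errors"
	"fmt"
	"time"

	"golang.org/x/sync/errgroup"
)

// process stands in for the PR's per-item work; it honors cancellation
// instead of always running to completion.
func process(ctx context.Context) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-time.After(10 * time.Millisecond):
		return nil
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	eg, ctx := errgroup.WithContext(ctx)

	for i := 0; i < 100; i++ {
		eg.Go(func() error { return process(ctx) })
	}

	cancel() // simulate the user cancelling the command

	// Wait returns promptly because every routine listens on ctx.Done();
	// without that, Wait would block until all routines finish their work.
	if err := eg.Wait(); err != nil && errors.Is(err, context.Canceled) {
		fmt.Println("cancelled cleanly:", err)
	}
}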
The main purpose of the validation is to test the compaction feature, and it will be off by default.
I will make the fixes for better ctx handling.
What I meant is that validation could be useful as a command in itself, to identify whether there are inconsistencies in the localstore. In that case users can run it separately whenever needed.
A bee node (that has suffered a sharky leak) with a localstore size of ~66GB dropped to ~22GB after compaction.
A few minor issues.
Plus I wonder if this should be a manually invoked process at all. Shall we not consider doing compaction regularly?
if err != nil {
	return err
}
defer func() {
Should this defer not come before the NewRecovery call?
end := lastUsedSlot

for start < end {
	if slots[start] == nil { // free
You need to make sure end is always set to a used slot, otherwise you end up with suboptimal truncation. Imagine:

0 1 2 3 4 5 6 7 8 9 ........
+ + + + + + + - - + - - - - -

After the first move at start = 7; end = 9, the loop terminates at start = 8; end = 8 instead of end = 7.
I suggest:
end := lastUsedSlot
for start := uint(32); start < end; {
	if slots[end] == nil {
		end--
		continue
	}
	if slots[start] != nil {
		start++
		continue
	}
	MOVE
	start++
	end--
}
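A runnable toy model of the suggested loop, using a bool slice in place of sharky slots (true = used); this only illustrates the pointer movement, not the actual chunk relocation:

package main

import "fmt"

func main() {
	// Mirrors the example above:
	// 0 1 2 3 4 5 6 7 8 9
	// + + + + + + + - - +
	used := []bool{true, true, true, true, true, true, true, false, false, true}

	start, end := 0, len(used)-1 // end begins at the last used slot
	for start < end {
		if !used[end] { // keep end pinned to a used slot
			end--
			continue
		}
		if used[start] { // start skips over already-used slots
			start++
			continue
		}
		// MOVE: relocate the chunk from slot end into free slot start
		used[start], used[end] = true, false
		start++
		end--
	}

	// The used region is now contiguous, so everything after it can be
	// truncated: [true true true true true true true true false false]
	fmt.Println(used)
}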
	return nil
}

eg := errgroup.Group{}
Why use an errgroup if you don't care about errors?
Use a waitgroup instead.
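A minimal sketch of the waitgroup variant, assuming (as in the diff) that the per-routine errors are deliberately discarded:

package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup

	for shard := 0; shard < 4; shard++ {
		wg.Add(1)
		go func(shard int) {
			defer wg.Done()
			// per-shard work whose error would be ignored anyway,
			// so errgroup's error collection adds nothing here
			fmt.Println("processing shard", shard)
		}(shard)
	}

	wg.Wait()
}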
}

count := 0
_ = chunkstore.Iterate(store, func(item *chunkstore.RetrievalIndexItem) error {
If you are ignoring the errors, you should not use an errgroup but a waitgroup.
// TestCompact creates two batches and puts chunks belonging to both batches.
// The first batch is then expired, causing free slots to accumulate in sharky.
// Next, sharky is compacted, after which it is tested that valid chunks can still be retrieved.
func TestCompact(t *testing.T) {
This is quite a strange test for compaction. I mean, it is totally unnecessary to have such a complex test. Why should compaction know about batches or eviction, etc.?
Checklist
Description
Adds a compaction cmd to resize sharky to the smallest size possible.
Also includes a before-and-after chunk validation step to verify that chunks are moved around properly.
closes #3844
Open API Spec Version Changes (if applicable)
Motivation and Context (Optional)
Related Issue (Optional)
Screenshots (if appropriate):