-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: file trimming improvement #7852
Merged
edsiper
merged 6 commits into
tiger-1.8.15
from
leonardo-tiger-file-trimming-improvement
Sep 12, 2023
Merged
Changes from all commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
8f9c00d
lib: chunkio: made file trimming optional
leonardo-albertovich 161a9b0
storage: added an option to control the file trimming behavior
leonardo-albertovich 7504c24
input_chunk: removed unnecessary chunk size limit
leonardo-albertovich c2e514b
lib: chunkio: added content length validation
leonardo-albertovich ac9dd07
lib: chunkio: fixed checksum overwrite issue
leonardo-albertovich 7bb1746
lib: chunkio: fixed file mapping size
leonardo-albertovich File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically here, it's because the behavior of the cio code was telling the file system very conflicting things. It was saying that a file was going to be X MB size, and then saying it was going to be Y bytes (log message length) in size... which leads the file system to make allocation decisions based on those bits of information. Decisions like "oh, this file isn't going to grow, I can put this other thing here", which is an assumption that's invalidated as soon as the log file grows. Given an appropriate loop over an appropriate amount of time, the fragmentation starts to hurt.
I'm not sure what file system would be have well with the previous behavior of "allocate to X MB, truncate to Y bytes, keep appending with an explicit truncate until you get to X MB", but I doubt it's behaving well on purpose. i.e. I don't know why you would keep around this behavior even as an option.
I'm not super familiar with fluent-bit and cio, but if this code is now ending up going "oh, the config says this file will be 2MB, so just allocate 2MB and don't truncate it down unless we are 100% sure it will not grow", then I think that will give the file system allocator the best opportunity to make good decisions.
If it is doing that, and you reserve enough space at the end of a log file to be able to write a log message of "oh no, out of disk space", then you probably have a nice "graceful" way to deal with ENOSPC conditions. (well, copy-on-write and thin provisioning is a whole other thing, but from what I understand here, that will be pretty unlikely for any individual file).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fluent-bit doesn't pre-allocate the expected file size yet (that's one opt-in improvement we're considering for the current version). Instead, when cio detects that the file is not large enough to fit the contents it wants to append it grows the file in
8 * page_size
(wihch in most cases means 32kb) increments until it reaches the required size.With the current chunk file size limit implementation in fluent-bit pre-allocating files doesn't guarantee that we won't have to increase their size because the limit is imposed after the contents are appended but that could be easily fixed.
Thanks for taking a look at this, if you have any questions or remarks please let me know, I'd be glad to go over it with you to be sure that there are no corner cases that weren't addressed,