Bump floor segment size to 16MB. #14189

Open
wants to merge 3 commits into
base: main

jpountz
Contributor

@jpountz jpountz commented Feb 1, 2025

This bumps the floor segment size from 2MB (`TieredMergePolicy`) / 1.6MB (`LogByteSizeMergePolicy`) to 16MB in Lucene 11.

My motivation is that at such small segment sizes, index structures offer little benefit over linear scans, so we should avoid them. Furthermore, the merging rules for segments below the floor size have improved: in particular, merge policies no longer perform quadratic merging (#900), so this change will not make indexing/merging absurdly slow if an application flushes tiny segments.

@jpountz jpountz added this to the 11.0.0 milestone Feb 1, 2025
Member

@mikemccand mikemccand left a comment

Maybe we could make this change in 10.x too? It is not really a backward-compatibility break, just an improvement to merge selection...

@jpountz
Contributor Author

jpountz commented Feb 3, 2025

Thanks for the feedback, I was hesitating about that. Let's pull this into 10.2 then.

@jpountz
Contributor Author

jpountz commented Feb 3, 2025

For reference, this is roughly a 10x increase of the floor segment size. Given that `TieredMergePolicy` defaults to 10 segments per tier, the increase removes roughly one tier, so indexes should have about 10 fewer segments after this change.
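
The arithmetic behind that estimate can be sketched with a deliberately simplified model. This is not Lucene code: the class `TierEstimate`, its methods, and the 5120MB maximum merged segment size are assumptions made for illustration, and the model simply treats every tier as full, with each tier 10x larger than the one before it.

```java
// Hypothetical, simplified model of tiered merging; not part of Lucene.
public class TierEstimate {

    /** Counts tiers whose segment size lies between the floor and the max merged segment size. */
    public static int tiers(double floorMB, double maxSegmentMB, int segmentsPerTier) {
        int tiers = 0;
        double size = floorMB;
        while (size < maxSegmentMB) {
            tiers++;
            // Merging a full tier yields a segment roughly segmentsPerTier times larger.
            size *= segmentsPerTier;
        }
        return tiers;
    }

    /** Rough upper estimate of the segment count, assuming every tier is full. */
    public static int estimateSegments(double floorMB, double maxSegmentMB, int segmentsPerTier) {
        return tiers(floorMB, maxSegmentMB, segmentsPerTier) * segmentsPerTier;
    }

    public static void main(String[] args) {
        double maxSegmentMB = 5120.0; // assumed maximum merged segment size (5GB)
        int segmentsPerTier = 10;     // default cited in the comment above

        int before = estimateSegments(2.0, maxSegmentMB, segmentsPerTier);
        int after = estimateSegments(16.0, maxSegmentMB, segmentsPerTier);

        System.out.println("2MB floor:  ~" + before + " segments"); // prints ~40
        System.out.println("16MB floor: ~" + after + " segments");  // prints ~30
    }
}
```

Under these assumptions the 2MB floor gives four tiers (2, 20, 200, 2000MB) and the 16MB floor gives three (16, 160, 1600MB), which is where the difference of about 10 segments comes from. Real indexes will differ, since tiers are rarely full and actual merge selection is more involved.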

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Feb 18, 2025