Add Operator Statistics Storage Using Iceberg #3222

Merged: 19 commits into master, Feb 4, 2025

Conversation

@yunyad (Collaborator) commented Jan 21, 2025

This PR migrates the storage of execution result statistics from MongoDB to Iceberg. The migration is motivated by the limitations and constraints of MongoDB, and this PR adds the functionality needed to store runtime statistics in Iceberg.

We store three statistics when they are available: the min value, the max value, and the non-null value count. Min and max are omitted for string and boolean types.
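The rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's implementation: the function name `summarize_column` and the dictionary layout are hypothetical, and a plain Python list stands in for a column of result data.

```python
from typing import Any, Dict, List, Optional

# Hypothetical helper, not part of the PR: computes the three statistics
# named in the description for one column of values.
def summarize_column(values: List[Optional[Any]]) -> Dict[str, Any]:
    non_null = [v for v in values if v is not None]
    stats: Dict[str, Any] = {"non_null_count": len(non_null)}
    # Min and max are skipped for string and boolean values, and when
    # the column has no non-null values at all.
    if non_null and not any(isinstance(v, (str, bool)) for v in non_null):
        stats["min"] = min(non_null)
        stats["max"] = max(non_null)
    return stats
```

For example, `summarize_column([3, None, 1])` reports a non-null count of 2 with min 1 and max 3, while `summarize_column(["a", None])` reports only a non-null count of 1.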

Related issue: apache/iceberg#12112

@bobbai00 (Collaborator) left a comment:

Left some comments

@bobbai00 (Collaborator) left a comment:

Left some comments

@chenlica (Collaborator) commented:

@yunyad Add a detailed description for this PR.

@yunyad yunyad self-assigned this Jan 27, 2025
@yunyad yunyad requested a review from bobbai00 January 30, 2025 19:08
@bobbai00 (Collaborator) left a comment:

Left some comments

@bobbai00 (Collaborator) left a comment:

Left some comments

@yunyad yunyad requested a review from bobbai00 February 4, 2025 08:08
@bobbai00 (Collaborator) left a comment:

Added some minor comments. LGTM!

Please update the PR description to explain the general approach of this implementation and the plans for future PRs.

@yunyad yunyad merged commit 0784207 into master Feb 4, 2025
8 checks passed
@yunyad yunyad deleted the yunyad-iceberg branch February 4, 2025 20:37