-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Always add round robin repartitioning to leaves (data sources), benefitting unbalanced / small datasets #13707
base: main
Are you sure you want to change the base?
Conversation
1b8e770
to
5fa27b0
Compare
5fa27b0
to
8ca56a2
Compare
Co-authored-by: Berkay Şahin <[email protected]>
Would this also result in parallel execution of plans based on VALUES? |
WDYM, when Maybe it would help for e.g. parallelizing joins or similar queries, but it would be nice to show this with some benchmarks :). |
@Dandandan that was a question, not a suggestion. VALUES can turn into larger data sets -- for example with sequence + unnest -- but i don't see a need to optimize around this at this point. Just being curious, whether this round robin above leaves gets also inserted above values. |
Clear! I think it might be the case already, as |
ValuesExec is single partition currently, but I would like to extend it to multi partitiones #12905 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The performance improvement is impressive 🚀
I feel this change benefits some workload, but might harm locality in others:
for example input is almost sorted with few inversions, if it's simply split evenly, locality can be exploited (when doing aggregation each partition can have lower cardinality), but round-robin shuffling would do the opposite
However it's fine to proceed since we can add a configuration for it in the future
@@ -836,7 +836,10 @@ fn add_roundrobin_on_top( | |||
n_target: usize, | |||
) -> Result<DistributionContext> { | |||
// Adding repartition is helpful: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Adding repartition is helpful: | |
// Always perform repartitioning on data sources, as they may be imbalanced or consist of too many small partitions. |
A 'why' comment here can help understanding
Yes I agree it might hurt performance in those cases (lower cardinality / data correlation). Even without it, we might hurt performance a bit by repartitioning (memory / cpu locality), but couldn't find yet a benchmark showing this. Round robin can be disabled (e.g. before executing a query) already if someone would like to tweak it, and we can avoid it if sorting/partitioning is known. I think it might be interesting as well to look in improving round-robin repartitionexec in the future: if all partitions still are yielding batches, we can consider keeping them to the current partition, changing to round-robin (or similar) only when one of the partitions is exhausted, so we can benefit from both better execution as benefitting from lower cardinality in case of correlated data. |
Which issue does this PR close?
Closes #.
Rationale for this change
We have some logic to not introduce RoundRobin repartitioning whenever number of child partitions is already greater than input partitions.
This is however not optimal when input data is not perfectly balanced, when the number of batches in some partitions is lower than in other partitions. Adding an extra round robin
RepartitionExec
helps to get more performance out of these cases at a negligible cost.The performance improvement for some cases can be quite high:
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?