-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve SortPreservingMerge::enable_round_robin_repartition docs #13826
Merged
Merged
Changes from 3 commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
55cc4d8
Clarify SortPreservingMerge::enable_round_robin_repartition docs
alamb 16e315a
tweaks
alamb 57fddea
Improve comments more
alamb 65a64f2
Merge remote-tracking branch 'apache/main' into alamb/refine_spm-docs
alamb de10c08
clippy
alamb 19fa523
fix doc link
alamb 1cc4fa9
Merge remote-tracking branch 'apache/main' into alamb/refine_spm-docs
alamb File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,7 +15,7 @@ | |
// specific language governing permissions and limitations | ||
// under the License. | ||
|
||
//! Defines the sort preserving merge plan | ||
//! [`SortPreservingMergeExec`] merges multiple sorted streams into one sorted stream. | ||
|
||
use std::any::Any; | ||
use std::sync::Arc; | ||
|
@@ -38,10 +38,22 @@ use log::{debug, trace}; | |
|
||
/// Sort preserving merge execution plan | ||
/// | ||
/// This takes an input execution plan and a list of sort expressions, and | ||
/// provided each partition of the input plan is sorted with respect to | ||
/// these sort expressions, this operator will yield a single partition | ||
/// that is also sorted with respect to them | ||
/// # Overview | ||
/// | ||
/// This operator implements a K-way merge. It is used to merge multiple sorted | ||
/// streams into a single sorted stream and is highly optimized. | ||
/// | ||
/// ## Inputs: | ||
/// | ||
/// 1. A list of sort expressions | ||
/// 2. An input plan, where each partition is sorted with respect to | ||
/// these sort expressions. | ||
/// | ||
/// ## Output: | ||
/// | ||
/// 1. A single partition that is also sorted with respect to the expressions | ||
/// | ||
/// ## Diagram | ||
/// | ||
/// ```text | ||
/// ┌─────────────────────────┐ | ||
|
@@ -55,12 +67,12 @@ use log::{debug, trace}; | |
/// ┌─────────────────────────┐ │ └───────────────────┘ └─┬─────┴───────────────────────┘ | ||
/// │ ╔═══╦═══╗ │ │ | ||
/// │ ║ B ║ E ║ ... │──┘ │ | ||
/// │ ╚═══╩═══╝ │ Note Stable Sort: the merged stream | ||
/// └─────────────────────────┘ places equal rows from stream 1 | ||
/// │ ╚═══╩═══╝ │ Stable sort if `enable_round_robin_repartition=false`: | ||
/// └─────────────────────────┘ the merged stream places equal rows from stream 1 | ||
/// Stream 2 | ||
/// | ||
/// | ||
/// Input Streams Output stream | ||
/// Input Partitions Output Partition | ||
/// (sorted) (sorted) | ||
/// ``` | ||
/// | ||
|
@@ -70,7 +82,7 @@ use log::{debug, trace}; | |
/// the output and inputs are not polled again. | ||
#[derive(Debug, Clone)] | ||
pub struct SortPreservingMergeExec { | ||
/// Input plan | ||
/// Input plan with sorted partitions | ||
input: Arc<dyn ExecutionPlan>, | ||
/// Sort expressions | ||
expr: LexOrdering, | ||
|
@@ -80,7 +92,9 @@ pub struct SortPreservingMergeExec { | |
fetch: Option<usize>, | ||
/// Cache holding plan properties like equivalences, output partitioning etc. | ||
cache: PlanProperties, | ||
/// Configuration parameter to enable round-robin selection of tied winners of loser tree. | ||
/// Use round-robin selection of tied winners of loser tree | ||
/// | ||
/// See [`Self::with_round_robin_repartition`] for more information. | ||
enable_round_robin_repartition: bool, | ||
} | ||
|
||
|
@@ -105,6 +119,14 @@ impl SortPreservingMergeExec { | |
} | ||
|
||
/// Sets the selection strategy of tied winners of the loser tree algorithm | ||
/// | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is the actual public facing API descriptin change |
||
/// If true (the default) equal output rows are placed in the merged stream | ||
/// in round robin fashion. This approach consumes input streams at more | ||
/// even rates when there are many rows with the same sort key. | ||
/// | ||
/// If false, equal output rows are always placed in the merged stream in | ||
/// the order of the inputs, resulting in potentially slower execution but a | ||
/// stable output order. | ||
pub fn with_round_robin_repartition( | ||
mut self, | ||
enable_round_robin_repartition: bool, | ||
|
@@ -128,7 +150,8 @@ impl SortPreservingMergeExec { | |
self.fetch | ||
} | ||
|
||
/// This function creates the cache object that stores the plan properties such as schema, equivalence properties, ordering, partitioning, etc. | ||
/// Creates the cache object that stores the plan properties | ||
/// such as schema, equivalence properties, ordering, partitioning, etc. | ||
fn compute_properties( | ||
input: &Arc<dyn ExecutionPlan>, | ||
ordering: LexOrdering, | ||
|
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found this related description and updated it to describe the current state of the code, rather than the changes that were made previously