Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not add redundant subquery ordering into plan #30

Closed
wants to merge 47 commits into from
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
47 commits
Select commit Hold shift + click to select a range
d5d6cda
do not add redundant subquery ordering into plan
mertak-synnada Aug 6, 2024
eeabaf1
format code
mertak-synnada Aug 6, 2024
85e23e7
add license
mertak-synnada Aug 6, 2024
bd63098
fix test cases with sort plan removing
mertak-synnada Aug 6, 2024
930c204
fix comment
mertak-synnada Aug 6, 2024
9d5f875
keep sorting on ordering mode test cases
mertak-synnada Aug 6, 2024
2d40a8d
protect test intentions with order + limit
mertak-synnada Aug 6, 2024
b6bc6d4
protect test intentions with order + limit
mertak-synnada Aug 6, 2024
510b16c
Tmp
mustafasrepo Aug 7, 2024
6ef4369
Minor changes
mustafasrepo Aug 7, 2024
c3efafc
Minor changes
mustafasrepo Aug 7, 2024
2d1b48f
Merge remote-tracking branch 'refs/remotes/origin/bug_fix/enforce_sor…
mertak-synnada Aug 7, 2024
2bf220d
Minor changes
mustafasrepo Aug 7, 2024
eb83917
Implement top down recursion with delete check
mustafasrepo Aug 7, 2024
0b66b15
Minor changes
mustafasrepo Aug 7, 2024
9d3a972
Merge remote-tracking branch 'refs/remotes/origin/bug_fix/enforce_sor…
mertak-synnada Aug 7, 2024
c769f9f
Minor changes
mustafasrepo Aug 7, 2024
07dca3a
Merge remote-tracking branch 'refs/remotes/origin/bug_fix/enforce_sor…
mertak-synnada Aug 7, 2024
9192ca9
initialize fetch() api for execution plan
mertak-synnada Aug 7, 2024
0ad7063
Address reviews
mustafasrepo Aug 7, 2024
3661f06
Update comments
mustafasrepo Aug 7, 2024
60967c1
Minor changes
mustafasrepo Aug 7, 2024
6b87c4c
Make test deterministic
mustafasrepo Aug 7, 2024
a029d6f
add supports limit push down to union exec
mertak-synnada Aug 8, 2024
74041e7
support limit push down with multi children cases
mertak-synnada Aug 8, 2024
1d73ddb
fix typos
mertak-synnada Aug 8, 2024
8dd7e0a
Add fetch info to the statistics
mustafasrepo Aug 8, 2024
23a33df
optimize tpch test plans
mertak-synnada Aug 8, 2024
15423ae
Enforce distribution use inexact count estimate also.
mustafasrepo Aug 8, 2024
94fb83d
Minor changes
mustafasrepo Aug 8, 2024
9053b9f
Minor changes
mustafasrepo Aug 8, 2024
54fc4b2
Merge remote-tracking branch 'refs/remotes/origin/bug_fix/enforce_sor…
mertak-synnada Aug 8, 2024
501f403
Merge branch 'refs/heads/bug_fix/enforce_sorting' into feature/Sort-F…
mertak-synnada Aug 8, 2024
ebda00a
Merge remote-tracking branch 'refs/remotes/origin/apache_main' into f…
mertak-synnada Aug 14, 2024
27342ff
merge with apache main
mertak-synnada Aug 14, 2024
1d04db8
format code
mertak-synnada Aug 14, 2024
ec67b36
fix doc paths
mertak-synnada Aug 14, 2024
4564f4b
fix doc paths
mertak-synnada Aug 14, 2024
8e7d1df
Merge branch 'refs/heads/apache_main' into feature/Sort-For-Subqueries
mertak-synnada Aug 14, 2024
b139b02
remove redundant code block
mertak-synnada Aug 14, 2024
4472e15
if partition count is 1 put GlobalLimitExec
mertak-synnada Aug 14, 2024
eb96912
fix test cases
mertak-synnada Aug 14, 2024
128676e
Apply suggestions from code review
ozankabak Aug 14, 2024
782487c
fix syntax errors
mertak-synnada Aug 14, 2024
ff227a2
Simplify branches
ozankabak Aug 15, 2024
07820cc
Merge branch 'refs/heads/apache_main' into feature/Sort-For-Subqueries
mertak-synnada Aug 19, 2024
c6a9abc
remove redundant limit plans from merge
mertak-synnada Aug 19, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -1730,6 +1730,8 @@ pub(crate) mod tests {
config.optimizer.repartition_file_min_size = $REPARTITION_FILE_MIN_SIZE;
config.optimizer.prefer_existing_sort = $PREFER_EXISTING_SORT;
config.optimizer.prefer_existing_union = $PREFER_EXISTING_UNION;
// Use a small batch size, to trigger RoundRobin in tests
config.execution.batch_size = 1;

// NOTE: These tests verify the joint `EnforceDistribution` + `EnforceSorting` cascade
// because they were written prior to the separation of `BasicEnforcement` into
Expand Down
3 changes: 1 addition & 2 deletions datafusion/core/src/physical_optimizer/enforce_sorting.rs
Original file line number Diff line number Diff line change
Expand Up @@ -61,8 +61,7 @@ use crate::physical_plan::{Distribution, ExecutionPlan, InputOrderMode};

use datafusion_common::plan_err;
use datafusion_common::tree_node::{Transformed, TransformedResult, TreeNode};
use datafusion_physical_expr::Partitioning;
use datafusion_physical_expr::{PhysicalSortExpr, PhysicalSortRequirement};
use datafusion_physical_expr::{Partitioning, PhysicalSortExpr, PhysicalSortRequirement};
use datafusion_physical_plan::repartition::RepartitionExec;
use datafusion_physical_plan::sorts::partial_sort::PartialSortExec;
use datafusion_physical_plan::ExecutionPlanProperties;
Expand Down
18 changes: 10 additions & 8 deletions datafusion/core/src/physical_optimizer/sort_pushdown.rs
Original file line number Diff line number Diff line change
Expand Up @@ -37,18 +37,18 @@ use datafusion_physical_expr::{
LexRequirementRef, PhysicalSortExpr, PhysicalSortRequirement,
};

/// This is a "data class" we use within the [`EnforceSorting`] rule to push
/// down [`SortExec`] in the plan. In some cases, we can reduce the total
/// computational cost by pushing down `SortExec`s through some executors. The
/// object carries the parent required ordering, fetch value of the parent node as its data.
///
/// [`EnforceSorting`]: crate::physical_optimizer::enforce_sorting::EnforceSorting
#[derive(Default, Clone)]
pub struct ParentRequirements {
ordering_requirement: Option<Vec<PhysicalSortRequirement>>,
fetch: Option<usize>,
}

/// This is a "data class" we use within the [`EnforceSorting`] rule to push
/// down [`SortExec`] in the plan. In some cases, we can reduce the total
/// computational cost by pushing down `SortExec`s through some executors. The
/// object carries the parent required ordering as its data.
///
/// [`EnforceSorting`]: crate::physical_optimizer::enforce_sorting::EnforceSorting
pub type SortPushDown = PlanContext<ParentRequirements>;

/// Assigns the ordering requirement of the root node to the its children.
Expand Down Expand Up @@ -187,8 +187,10 @@ fn pushdown_requirement_to_children(
Ok(None)
}
} else if plan.fetch().is_some()
&& plan.maintains_input_order().len() == 1
&& plan.maintains_input_order()[0]
&& plan
.maintains_input_order()
.iter()
.all(|maintain| *maintain)
&& plan.supports_limit_pushdown()
{
let output_req = PhysicalSortRequirement::from_sort_exprs(
Expand Down
2 changes: 1 addition & 1 deletion datafusion/physical-plan/src/limit.rs
Original file line number Diff line number Diff line change
Expand Up @@ -816,7 +816,7 @@ mod tests {
#[tokio::test]
async fn test_row_number_statistics_for_local_limit() -> Result<()> {
let row_count = row_number_statistics_for_local_limit(4, 10).await?;
assert_eq!(row_count, Precision::Exact(10));
assert_eq!(row_count, Precision::Exact(40));

Ok(())
}
Expand Down