Projection Optimization Rule #3

Open
wants to merge 105 commits into
base: apache_main
Changes from 78 commits
1c02a68
Draft PR is moved to new repo
berkaysynnada Feb 15, 2024
e8bb633
Add new reproducer test
mustafasrepo Feb 16, 2024
fd29eea
Simplifications
mustafasrepo Feb 16, 2024
4233028
Simplifications
mustafasrepo Feb 16, 2024
410f44a
Simplifications
mustafasrepo Feb 16, 2024
60fea27
Simplifications
mustafasrepo Feb 16, 2024
e98326d
Simplifications
mustafasrepo Feb 16, 2024
2af385a
Simplifications
mustafasrepo Feb 16, 2024
56b3136
Minor changes
mustafasrepo Feb 16, 2024
26542ba
tpch fails
berkaysynnada Feb 20, 2024
10000fb
Delete docs.yaml
metesynnada Feb 22, 2024
6c76423
Merge pull request #5 from synnada-ai/ci-action-fixing
mustafasrepo Feb 22, 2024
04143ab
All tests pass, ready to be cleaned-up
berkaysynnada Feb 22, 2024
4eea8da
Merge branch 'main' into feature/optimize-projections
berkaysynnada Feb 22, 2024
599516b
Merge branch 'apache:main' into main
mustafasrepo Feb 23, 2024
b9a2dfb
Prevent projection removals causing uncertain column swaps
berkaysynnada Feb 23, 2024
b20b65c
Merge branch 'apache:main' into main
mustafasrepo Feb 26, 2024
f75cf6b
Minor changes
berkaysynnada Feb 26, 2024
b931c4e
Merge branch 'main' into feature/optimize-projections
berkaysynnada Feb 26, 2024
8b06852
Minor changes
berkaysynnada Feb 26, 2024
e632f7a
Update optimize_projections.rs
berkaysynnada Feb 27, 2024
3bda474
Mini Review before unwraps - Part 1
metesynnada Feb 27, 2024
4b4054b
Continue on review - Part 2
metesynnada Feb 27, 2024
5fb5202
Adding comments - Review Part 3
metesynnada Feb 27, 2024
1bb2d7e
Clone's and unwrap's are removed
berkaysynnada Mar 1, 2024
c8f39fe
Update optimize_projections.rs
berkaysynnada Mar 1, 2024
79c8ce5
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 1, 2024
aa296dc
add yaml
berkaysynnada Mar 1, 2024
8b46775
projection update
berkaysynnada Mar 1, 2024
4ccab8b
Pushdown through limits
berkaysynnada Mar 1, 2024
23e4954
fix clippy
berkaysynnada Mar 1, 2024
c9dce82
ProjectionMapping refactor
berkaysynnada Mar 4, 2024
5fe50fc
Sync all joins
berkaysynnada Mar 4, 2024
6ed6878
Adapt test changes
berkaysynnada Mar 4, 2024
f757233
Remove duplications
berkaysynnada Mar 4, 2024
5e2859d
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 5, 2024
12c4307
Update optimize_projections.rs
berkaysynnada Mar 5, 2024
d6b511b
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 12, 2024
0aa1426
Fix after merge
berkaysynnada Mar 12, 2024
fefd91f
Inherit hashjoin projection
berkaysynnada Mar 12, 2024
464d7bb
Minor changes
berkaysynnada Mar 12, 2024
ca5d843
Project csv only with requirements
berkaysynnada Mar 13, 2024
b586311
to check diff
berkaysynnada Mar 15, 2024
0bdd3b7
too many open files!
berkaysynnada Mar 18, 2024
5b4e107
tests pass
berkaysynnada Mar 19, 2024
1ad9303
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 19, 2024
e384bb1
get upstream changes
berkaysynnada Mar 19, 2024
023614e
test changes
berkaysynnada Mar 19, 2024
e2a41e4
Unify hashjoins
berkaysynnada Mar 20, 2024
b28e0cd
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 20, 2024
b209391
Addressing mete's latest feedback
berkaysynnada Mar 20, 2024
a1f7d30
Fixing the bug during benchmarks
berkaysynnada Mar 21, 2024
088f72d
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 21, 2024
f5e58e1
Update cte.slt
berkaysynnada Mar 21, 2024
7f77d50
tests added
berkaysynnada Mar 22, 2024
9d11497
checked till try_remove
berkaysynnada Mar 22, 2024
886e50b
Update optimize_projections.rs
berkaysynnada Mar 29, 2024
c3673a0
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Mar 29, 2024
37b4a0c
Update optimize_projections.rs
berkaysynnada Mar 29, 2024
9ad9834
Fix projection renaming
berkaysynnada Apr 15, 2024
52e6867
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Apr 15, 2024
28fcbdc
agg fixes
berkaysynnada Apr 15, 2024
73147b5
test updates
berkaysynnada Apr 15, 2024
906b73a
fixing count mismatch
berkaysynnada Apr 15, 2024
cac1d1b
Update aggregate_statistics.rs
berkaysynnada Apr 15, 2024
2f60c16
catch different names
berkaysynnada Apr 15, 2024
b650e00
Fix after merge
berkaysynnada Apr 15, 2024
ffebca5
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Apr 16, 2024
155a53f
fix after merge
berkaysynnada Apr 16, 2024
4d02e9e
fixing tpch queries
berkaysynnada Apr 17, 2024
74b8bcd
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Apr 17, 2024
23b306e
Update physical_planner.rs
berkaysynnada Apr 17, 2024
0bc2421
Update optimize_projections.rs
berkaysynnada Apr 18, 2024
09bfda2
Remove unnecessary projections
berkaysynnada Apr 18, 2024
cef1c36
Update optimize_projections.rs
berkaysynnada Apr 18, 2024
4f33cf0
Fixing all tests
berkaysynnada Apr 19, 2024
679b754
Self review part 1
berkaysynnada Apr 19, 2024
26cd56c
Self review part 2
berkaysynnada Apr 19, 2024
c63be56
Review
ozankabak Apr 24, 2024
41ccce5
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Apr 26, 2024
2128475
before test complete
berkaysynnada Apr 29, 2024
bb06baf
joins are fixed
berkaysynnada Apr 29, 2024
ee8dfa6
Update optimize_projections.rs
berkaysynnada Apr 29, 2024
213119d
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Apr 29, 2024
16d5e5e
Ready for review
berkaysynnada Apr 29, 2024
c87dbde
Investigation
berkaysynnada Apr 29, 2024
bd8f2b2
Revert "Investigation"
berkaysynnada Apr 29, 2024
5d5cf88
Review Part 2
ozankabak Apr 29, 2024
b826d29
Remove whitespace diffs
ozankabak Apr 29, 2024
fb8bf5a
Do not pushdown projections making calculations
berkaysynnada Apr 30, 2024
4315494
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada Apr 30, 2024
1611ca1
Update optimize_projections.rs
berkaysynnada Apr 30, 2024
1ff7d19
Update module comments for optimize_projections.rs
ozankabak Apr 30, 2024
8e2f487
Review
ozankabak Apr 30, 2024
7396744
Unify similar plans
berkaysynnada May 3, 2024
5dd4289
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada May 3, 2024
8bdd12f
Update optimize_projections.rs
berkaysynnada May 3, 2024
8fd5570
Refactor ExprMapping
ozankabak May 6, 2024
2c7af01
Use error macros
ozankabak May 6, 2024
c5d8d97
Simplify try_projection_insertion
ozankabak May 6, 2024
4b91806
Update optimize_projections.rs
berkaysynnada May 15, 2024
d6c7ad2
Merge branch 'apache_main' into feature/optimize-projections
berkaysynnada May 16, 2024
d209193
Fix the bug
berkaysynnada May 17, 2024
e782be9
Update cte.slt
berkaysynnada May 21, 2024
21568c2
Update cte.slt
berkaysynnada May 21, 2024
@@ -35,9 +35,6 @@ use datafusion_physical_plan::placeholder_row::PlaceholderRowExec;
 #[derive(Default)]
 pub struct AggregateStatistics {}

-/// The name of the column corresponding to [`COUNT_STAR_EXPANSION`]
-const COUNT_STAR_NAME: &str = "COUNT(*)";
-
 impl AggregateStatistics {
 #[allow(missing_docs)]
 pub fn new() -> Self {
@@ -144,7 +141,7 @@ fn take_optimizable(node: &dyn ExecutionPlan) -> Option<Arc<dyn ExecutionPlan>>
 fn take_optimizable_table_count(
 agg_expr: &dyn AggregateExpr,
 stats: &Statistics,
-) -> Option<(ScalarValue, &'static str)> {
+) -> Option<(ScalarValue, String)> {
 if let (&Precision::Exact(num_rows), Some(casted_expr)) = (
 &stats.num_rows,
 agg_expr.as_any().downcast_ref::<expressions::Count>(),
@@ -158,7 +155,7 @@ fn take_optimizable_table_count(
 if lit_expr.value() == &COUNT_STAR_EXPANSION {
 return Some((
 ScalarValue::Int64(Some(num_rows as i64)),
-COUNT_STAR_NAME,
+casted_expr.name().to_owned(),
 ));
 }
 }
@@ -427,7 +424,7 @@ pub(crate) mod tests {
 /// What name would this aggregate produce in a plan?
 fn column_name(&self) -> &'static str {
 match self {
-Self::CountStar => COUNT_STAR_NAME,
+Self::CountStar => "COUNT(*)",
 Self::ColumnA(_) => "COUNT(a)",
 }
 }
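
The hunks above drop the fixed `COUNT_STAR_NAME` constant and take the column name from the expression itself, which is why the return type changes from `&'static str` to `String`. A minimal sketch of that pattern follows. All names in it (`NamedAggregate`, `CountStar`, `count_from_stats_static`, `count_from_stats_owned`) are hypothetical stand-ins for illustration, not DataFusion's actual types or functions, and the alias scenario is an assumption about why an owned name might be preferable.

```rust
/// Hypothetical stand-in for an aggregate expression (not DataFusion's `AggregateExpr`).
trait NamedAggregate {
    /// Name of the output column this aggregate produces.
    fn name(&self) -> &str;
}

/// Hypothetical COUNT(*) expression that carries its own output name.
struct CountStar {
    name: String,
}

impl NamedAggregate for CountStar {
    fn name(&self) -> &str {
        &self.name
    }
}

/// Old shape: the name is a compile-time constant, independent of the expression.
const COUNT_STAR_NAME: &str = "COUNT(*)";

fn count_from_stats_static(
    _agg: &dyn NamedAggregate,
    num_rows: Option<usize>,
) -> Option<(i64, &'static str)> {
    // The expression's own name is ignored here.
    num_rows.map(|n| (n as i64, COUNT_STAR_NAME))
}

/// New shape: the name is copied out of the expression, so it must be owned.
fn count_from_stats_owned(
    agg: &dyn NamedAggregate,
    num_rows: Option<usize>,
) -> Option<(i64, String)> {
    // `to_owned` detaches the name from the borrow of `agg`.
    num_rows.map(|n| (n as i64, agg.name().to_owned()))
}
```

With an expression named `total`, the static variant would still report `COUNT(*)`, while the owned variant reports `total`. The cost is one small allocation per optimized aggregate.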
@@ -70,12 +70,12 @@ impl PhysicalOptimizerRule for CombinePartialFinalAggregate {
 AggregateMode::Partial
 ) && can_combine(
 (
-agg_exec.group_by(),
+agg_exec.group_expr(),
 agg_exec.aggr_expr(),
 agg_exec.filter_expr(),
 ),
 (
-input_agg_exec.group_by(),
+input_agg_exec.group_expr(),
 input_agg_exec.aggr_expr(),
 input_agg_exec.filter_expr(),
 ),
@@ -88,7 +88,7 @@ impl PhysicalOptimizerRule for CombinePartialFinalAggregate {
 };
 AggregateExec::try_new(
 mode,
-input_agg_exec.group_by().clone(),
+input_agg_exec.group_expr().clone(),
 input_agg_exec.aggr_expr().to_vec(),
 input_agg_exec.filter_expr().to_vec(),
 input_agg_exec.input().clone(),
4 changes: 2 additions & 2 deletions datafusion/core/src/physical_optimizer/convert_first_last.rs
@@ -79,7 +79,7 @@ fn get_common_requirement_of_aggregate_input(
 if let Some(aggr_exec) = plan.as_any().downcast_ref::<AggregateExec>() {
 let input = aggr_exec.input();
 let mut aggr_expr = try_get_updated_aggr_expr_from_child(aggr_exec);
-let group_by = aggr_exec.group_by();
+let group_by = aggr_exec.group_expr();
 let mode = aggr_exec.mode();

 let input_eq_properties = input.equivalence_properties();
@@ -113,7 +113,7 @@ fn get_common_requirement_of_aggregate_input(
 InputOrderMode::Linear
 };
 let projection_mapping =
-ProjectionMapping::try_new(group_by.expr(), &input.schema())?;
+ProjectionMapping::try_new(group_by.expr().to_vec(), &input.schema())?;

 let cache = AggregateExec::compute_properties(
 input,
@@ -461,7 +461,7 @@ fn reorder_aggregate_keys(
 ) -> Result<PlanWithKeyRequirements> {
 let parent_required = &agg_node.data;
 let output_columns = agg_exec
-.group_by()
+.group_expr()
 .expr()
 .iter()
 .enumerate()
@@ -474,15 +474,15 @@ fn reorder_aggregate_keys(
 .collect::<Vec<_>>();

 if parent_required.len() == output_exprs.len()
-&& agg_exec.group_by().null_expr().is_empty()
+&& agg_exec.group_expr().null_expr().is_empty()
 && !physical_exprs_equal(&output_exprs, parent_required)
 {
 if let Some(positions) = expected_expr_positions(&output_exprs, parent_required) {
 if let Some(agg_exec) =
 agg_exec.input().as_any().downcast_ref::<AggregateExec>()
 {
 if matches!(agg_exec.mode(), &AggregateMode::Partial) {
-let group_exprs = agg_exec.group_by().expr();
+let group_exprs = agg_exec.group_expr().expr();
 let new_group_exprs = positions
 .into_iter()
 .map(|idx| group_exprs[idx].clone())
@@ -55,7 +55,7 @@ impl LimitedDistinctAggregation {
 // We found what we want: clone, copy the limit down, and return modified node
 let new_aggr = AggregateExec::try_new(
 *aggr.mode(),
-aggr.group_by().clone(),
+aggr.group_expr().clone(),
 aggr.aggr_expr().to_vec(),
 aggr.filter_expr().to_vec(),
 aggr.input().clone(),
@@ -116,7 +116,7 @@ impl LimitedDistinctAggregation {
 if let Some(parent_aggr) =
 match_aggr.as_any().downcast_ref::<AggregateExec>()
 {
-if !parent_aggr.group_by().eq(aggr.group_by()) {
+if !parent_aggr.group_expr().eq(aggr.group_expr()) {
 // a partial and final aggregation with different groupings disqualifies
 // rewriting the child aggregation
 rewrite_applicable = false;
2 changes: 1 addition & 1 deletion datafusion/core/src/physical_optimizer/mod.rs
@@ -29,10 +29,10 @@ pub mod enforce_distribution;
 pub mod enforce_sorting;
 pub mod join_selection;
 pub mod limited_distinct_aggregation;
+mod optimize_projections;
 pub mod optimizer;
 pub mod output_requirements;
 pub mod pipeline_checker;
-mod projection_pushdown;
 pub mod pruning;
 pub mod replace_with_order_preserving_variants;
 mod sort_pushdown;