Add `evaluate_demo` and `range_analysis_demo` to Expr examples #8377
/// DataFusion also has APIs for analyzing predicates (boolean expressions) to
/// determine any ranges restrictions on the inputs required for the predicate
/// evaluate to true.
fn range_analysis_demo() -> Result<()> {
This is a really powerful feature of DataFusion and I don't think it is widely understood yet.
@@ -111,6 +115,22 @@ impl ExprBoundaries {
distinct_count: col_stats.distinct_count.clone(),
})
}

/// Create `ExprBoundaries` that represent no known bounds for all the columns `schema`
pub fn try_new_unknown(schema: &Schema) -> Result<Vec<Self>> {
This was added to make the demo easier to write (I ported it from IOx downstream).
Might unbounded be more obvious a name than unknown?
> Might unbounded be more obvious a name than unknown?
I agree -- will change.
cc @ozankabak / @berkaysynnada / @metesynnada as this PR adds examples to some of the great features you have added
These are really good demos. The utility functions you've added also make sense to me. Thanks, @alamb.
Co-authored-by: Raphael Taylor-Davies <[email protected]>
…e#8377)
* Add `evaluate_demo` and `range_analysis_demo` to Expr examples
* Prettier
* Update datafusion-examples/examples/expr_api.rs
Co-authored-by: Raphael Taylor-Davies <[email protected]>
* rename ExprBoundaries::try_new_unknown --> ExprBoundaries::try_new_unbounded
---------
Co-authored-by: Raphael Taylor-Davies <[email protected]>
Which issue does this PR close?
Part of #7013
Rationale for this change
As part of updating DataFusion internally in IOx (https://github.com/influxdata/influxdb_iox/pull/9428), I found that we had code that used the range analysis code directly and thus had to be changed due to #8276.
I found that DataFusion has a much nicer interface to do this, `analyze`, but I didn't think it was particularly obvious how to use it.
It also came up on #8306 (comment) that it was non-trivial to figure out how to evaluate expressions, so I added an example of how to do that as well.
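For readers unfamiliar with the idea behind range analysis, here is a rough, std-only Rust sketch of the concept: start from unbounded inputs and narrow the interval of values for which a predicate can be true. This is not DataFusion's implementation or API; `Interval`, `Pred`, and `analyze_toy` are hypothetical names made up for illustration, and the sketch only handles a single integer column.

```rust
/// Closed integer interval; `None` means "no known bound" on that side.
/// (Hypothetical type for illustration, not DataFusion's `ExprBoundaries`.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct Interval {
    lower: Option<i64>,
    upper: Option<i64>,
}

impl Interval {
    /// Starting point before any predicate is applied: nothing is known.
    fn unbounded() -> Self {
        Interval { lower: None, upper: None }
    }

    /// Raise the lower bound to at least `min`.
    fn with_min(self, min: i64) -> Self {
        let lower = Some(self.lower.map_or(min, |l| l.max(min)));
        Interval { lower, ..self }
    }

    /// Lower the upper bound to at most `max`.
    fn with_max(self, max: i64) -> Self {
        let upper = Some(self.upper.map_or(max, |u| u.min(max)));
        Interval { upper, ..self }
    }
}

/// Toy predicate over one integer column `x`.
enum Pred {
    Gt(i64),                   // x > v
    LtEq(i64),                 // x <= v
    And(Box<Pred>, Box<Pred>), // both sides must hold
}

/// Narrow `input` to the values of `x` for which `pred` can be true.
/// Overflow at `i64::MAX` is ignored to keep the sketch short.
fn analyze_toy(pred: &Pred, input: Interval) -> Interval {
    match pred {
        // For integers, x > v is the same as x >= v + 1.
        Pred::Gt(v) => input.with_min(*v + 1),
        Pred::LtEq(v) => input.with_max(*v),
        // A conjunction narrows by one side, then by the other.
        Pred::And(l, r) => analyze_toy(r, analyze_toy(l, input)),
    }
}
```

Under these assumptions, analyzing `x > 5 AND x <= 10` from `Interval::unbounded()` narrows the input to the closed range 6 to 10. The real `analyze` works on physical expressions and a schema, so see `range_analysis_demo` in the example file for the actual calls.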
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?