Handle type coercion in signature for ApproxPercentileCont
#12274
Changes from 2 commits
@@ -23,9 +23,8 @@ use std::fmt::Debug;
 use arrow::{datatypes::DataType, datatypes::Field};
 use arrow_schema::DataType::{Float64, UInt64};

-use datafusion_common::{not_impl_err, plan_err, Result};
+use datafusion_common::{not_impl_err, Result};
 use datafusion_expr::function::{AccumulatorArgs, StateFieldsArgs};
-use datafusion_expr::type_coercion::aggregates::NUMERICS;
 use datafusion_expr::utils::format_state_name;
 use datafusion_expr::{Accumulator, AggregateUDFImpl, Signature, Volatility};

@@ -63,7 +62,7 @@ impl ApproxMedian {
     /// Create a new APPROX_MEDIAN aggregate function
     pub fn new() -> Self {
         Self {
-            signature: Signature::uniform(1, NUMERICS.to_vec(), Volatility::Immutable),
+            signature: Signature::user_defined(Volatility::Immutable),
         }
     }
 }
@@ -97,11 +96,8 @@ impl AggregateUDFImpl for ApproxMedian {
         &self.signature
     }

-    fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
-        if !arg_types[0].is_numeric() {
-            return plan_err!("ApproxMedian requires numeric input types");
-        }
-        Ok(arg_types[0].clone())
+    fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> {
+        Ok(DataType::Float64)
     }

     fn accumulator(&self, acc_args: AccumulatorArgs) -> Result<Box<dyn Accumulator>> {
@@ -116,4 +112,8 @@ impl AggregateUDFImpl for ApproxMedian {
             acc_args.exprs[0].data_type(acc_args.schema)?,
         )))
     }
+
+    fn coerce_types(&self, _arg_types: &[DataType]) -> Result<Vec<DataType>> {
+        Ok(vec![DataType::Float64])

OOC, why this instead of just defining the signature as DataType::Float64? Afaik DF already tries to coerce inputs to the signature

I think you are referring to I think I could change the signature to

+    }
 }
This seems like a change in behavior: with this PR, median now always returns a float, but before it returned the same type as its input.
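To make the behavior change concrete, here is a minimal sketch of the two `return_type` rules from the diff. It assumes a toy `ToyType` enum as a stand-in for arrow's `DataType` and uses `String` errors instead of DataFusion's `Result`; both are hypothetical simplifications, and the sketch does not model how `coerce_types` treats the input.

```rust
/// Toy stand-in for a column type. This is NOT arrow's `DataType`.
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int64,
    Float64,
    Utf8,
}

impl ToyType {
    /// In this toy model, everything except Utf8 counts as numeric.
    fn is_numeric(&self) -> bool {
        !matches!(self, ToyType::Utf8)
    }
}

/// Old rule: reject non-numeric input, otherwise echo the input type.
fn return_type_before(arg_types: &[ToyType]) -> Result<ToyType, String> {
    if !arg_types[0].is_numeric() {
        return Err("ApproxMedian requires numeric input types".to_string());
    }
    Ok(arg_types[0].clone())
}

/// New rule: the argument types are ignored and the result is always Float64.
fn return_type_after(_arg_types: &[ToyType]) -> Result<ToyType, String> {
    Ok(ToyType::Float64)
}
```

Under the old rule an `Int64` input yields an `Int64` result, while under the new rule the same input yields `Float64`, which is the difference this comment points out.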
Yes, I changed the result to f64.
I think it is fine to return f64 for the median value. I checked DuckDB's result: it returns double for integer input. It returns decimal for decimal input, but since we don't support decimal for approx_median, there is no regression. We could support the decimal case later on.
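One way to see why a floating-point result suits integer input is the sketch below. It computes an exact median, not the approximate one this aggregate produces, and `median_as_f64` is a hypothetical helper, not part of DataFusion. It casts every input to `f64` first, loosely mirroring the coercion to `Float64` in this PR.

```rust
/// Exact median computed in f64, used here only as an illustration.
/// Returns None for empty input.
fn median_as_f64(values: &[i64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    // Cast every input to f64 before doing any arithmetic.
    let mut v: Vec<f64> = values.iter().map(|&x| x as f64).collect();
    v.sort_by(|a, b| a.total_cmp(b));
    let mid = v.len() / 2;
    if v.len() % 2 == 1 {
        // Odd count: the middle element is the median.
        Some(v[mid])
    } else {
        // Even count: average the two middle elements.
        Some((v[mid - 1] + v[mid]) / 2.0)
    }
}
```

For the inputs 1 and 2 this returns 1.5, a value that an integer return type could not represent.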