-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add Date32
/Date64
in aggregate fuzz testing
#13041
Conversation
Thanks @LeslieKid -- I plan to review this tomorrow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @LeslieKid -- I think this PR is ready to be rebased on main and up for review
@@ -392,6 +392,26 @@ impl RecordBatchGenerator { | |||
Float64Type | |||
) | |||
} | |||
DataType::Date32 => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
unsafe { | ||
std::ptr::read(&adjusted_value as *const i64 as *const N) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need to use unsafe -- perhaps adjusted_value.try_into() would work....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to use try_into().
But maybe it requires N: TryFrom<i64>
? This would limit other primitive types like f32 in PrimitiveArrayGenerator
.
And because of the A: ArrowPrimitiveType<Native = N>
and A::DATA_TYPE
is Date64, I think it can use unsafe
here as an alternative.
fdbe9c4
to
e5c70c6
Compare
e5c70c6
to
4a1eec3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you @LeslieKid -- this is looking close. If we can just figure out how to remove the unsafe it will be good to go
let date_value = self.rng.gen_range(i64::MIN..=i64::MAX); | ||
let millis_per_day: i64 = 86_400_000; | ||
let adjusted_value = date_value - (date_value % millis_per_day); | ||
// SAFETY: here we can convert i64 to A::Native safely since we determine that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this unsafe
block is only thing I am worried about.
I wonder if you could instead use
let date_value = self.rng.gen_range(i64::MIN..=i64::MAX);
let millis_per_day = 86_400_000;
let adjusted_value = date_value - (date_value % millis_per_day);
And then return adjusted value 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @alamb
We really need to avoid using unsafe.
I introduce a trait FromNative
and remove the unsafe
block to solve this problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @LeslieKid -- this is great ❤️
I also verified we have added additional coverage by running
$ cargo llvm-cov --html test --test fuzz aggregate_fuzz
And then looking at the report.
Are you interested in helping expand out the report (e.g. Timestamp
and Time
and Decimal
type support?
Date32
/Date64
in aggregate fuzz testing
OK! I'd love to take this. I will work on it in the next few days. @alamb |
Thanks again @LeslieKid 🙏 🚀 |
Which issue does this PR close?
Part of #12114 .
Rationale for this change
Supporting more types for dataset generator in fuzzer framework is needed according to this comment.
What changes are included in this PR?
PrimitiveArrayGenerator
to make it easier to introduce new primitive types.Are these changes tested?
Are there any user-facing changes?
No.