Remove unsafe Send impl from PriorityMap #12289
It's not necessary to use unsafe Send impl. It's enough to require the referenced trait objects as Send.
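A minimal sketch of the idea, not the actual DataFusion code: `HashTable`, `Heap`, `VecTable`, `FixedHeap`, and `Map` are hypothetical stand-ins for `ArrowHashTable`, `ArrowHeap`, and `PriorityMap`. Once each boxed trait object is declared `+ Send`, the compiler derives `Send` for the containing struct by itself.

```rust
use std::thread;

// Hypothetical stand-ins for the trait objects that PriorityMap holds.
trait HashTable {
    fn num_groups(&self) -> usize;
}

trait Heap {
    fn capacity(&self) -> usize;
}

struct VecTable(Vec<u64>);

impl HashTable for VecTable {
    fn num_groups(&self) -> usize {
        self.0.len()
    }
}

struct FixedHeap(usize);

impl Heap for FixedHeap {
    fn capacity(&self) -> usize {
        self.0
    }
}

// Both boxed trait objects are declared `+ Send`, so the compiler derives
// `Send` for the whole struct; no `unsafe impl Send` is needed.
struct Map {
    table: Box<dyn HashTable + Send>,
    heap: Box<dyn Heap + Send>,
}

// Moves the map into another thread, which only compiles if `Map: Send`.
fn summarize_on_worker(map: Map) -> (usize, usize) {
    thread::spawn(move || (map.table.num_groups(), map.heap.capacity()))
        .join()
        .expect("worker thread panicked")
}
```

Dropping `+ Send` from either field in this sketch makes `thread::spawn` reject the closure with the same kind of E0277 error, because the auto trait can no longer be inferred.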
Makes sense to me -- thank you @findepi
Hopefully @avantgardnerio can also have a look as I think he did the initial implementation of this topk heap
capacity: usize,
mapper: Vec<(usize, usize)>,
}

// JUSTIFICATION
@avantgardnerio -- can you please remind me how we tested this / how we can double check that this doesn't cause a performance regression?
I didn't verify benchmarks, but per static code analysis, PriorityMap is required to be Send. If you just remove the trait implementation (marker), the code won't compile.
With this PR, PriorityMap is still Send. The only difference is that this is now inferred by the compiler, which is safer.
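One way to double check the "still Send" claim without a benchmark is a compile-time assertion. This is only a sketch: `Table`, `EmptyTable`, `Checked`, and `assert_send` are hypothetical names used for illustration, not anything in DataFusion.

```rust
// Hypothetical trait standing in for ArrowHashTable / ArrowHeap.
trait Table {
    fn rows(&self) -> usize;
}

struct EmptyTable;

impl Table for EmptyTable {
    fn rows(&self) -> usize {
        0
    }
}

// Stand-in for PriorityMap: its only field is a `+ Send` trait object.
struct Checked {
    table: Box<dyn Table + Send>,
}

// Compiles only when `T: Send`; it does nothing at runtime.
fn assert_send<T: Send>() {}

fn check_auto_send() -> usize {
    // Passes because `Send` is inferred from the field type.
    assert_send::<Checked>();
    // Uncommenting the next line fails with E0277, since a bare
    // `Box<dyn Table>` makes no promise about `Send`:
    // assert_send::<Box<dyn Table>>();
    let checked = Checked { table: Box::new(EmptyTable) };
    checked.table.rows()
}
```

Since `Send` is a marker trait with no methods, the check happens entirely at compile time, so the generated code should not differ between the inferred and the `unsafe impl` versions.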
I believe I ran the benchmark, and it was slower than without the optimization. So I made this change, and it got faster by however much is listed in the comment.
It sounds from @findepi, though, like the question is now moot?
Removing the `unsafe impl Send for PriorityMap {}` line alone gives a compilation error, because rustc does not infer `PriorityMap` to be `Send`:
main *$ cargo build
Compiling datafusion-physical-plan v41.0.0 (/Users/findepi/repos/datafusion/datafusion/physical-plan)
error[E0277]: `(dyn ArrowHashTable + 'static)` cannot be sent between threads safely
--> datafusion/physical-plan/src/aggregates/mod.rs:249:9
|
249 | / match stream {
250 | | StreamType::AggregateStream(stream) => Box::pin(stream),
251 | | StreamType::GroupedHash(stream) => Box::pin(stream),
252 | | StreamType::GroupedPriorityQueue(stream) => Box::pin(stream),
253 | | }
| |_________^ `(dyn ArrowHashTable + 'static)` cannot be sent between threads safely
|
= help: the trait `std::marker::Send` is not implemented for `(dyn ArrowHashTable + 'static)`, which is required by `GroupedTopKAggregateStream: std::marker::Send`
= note: required for `std::ptr::Unique<(dyn ArrowHashTable + 'static)>` to implement `std::marker::Send`
note: required because it appears within the type `Box<(dyn ArrowHashTable + 'static)>`
--> /Users/findepi/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/boxed.rs:237:12
|
237 | pub struct Box<
| ^^^
note: required because it appears within the type `PriorityMap`
--> datafusion/physical-plan/src/aggregates/topk/priority_map.rs:27:12
|
27 | pub struct PriorityMap {
| ^^^^^^^^^^^
note: required because it appears within the type `GroupedTopKAggregateStream`
--> datafusion/physical-plan/src/aggregates/topk_stream.rs:39:12
|
39 | pub struct GroupedTopKAggregateStream {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: required for the cast from `Pin<Box<GroupedTopKAggregateStream>>` to `Pin<Box<dyn RecordBatchStream + std::marker::Send>>`
error[E0277]: `(dyn ArrowHeap + 'static)` cannot be sent between threads safely
--> datafusion/physical-plan/src/aggregates/mod.rs:249:9
|
249 | / match stream {
250 | | StreamType::AggregateStream(stream) => Box::pin(stream),
251 | | StreamType::GroupedHash(stream) => Box::pin(stream),
252 | | StreamType::GroupedPriorityQueue(stream) => Box::pin(stream),
253 | | }
| |_________^ `(dyn ArrowHeap + 'static)` cannot be sent between threads safely
|
= help: the trait `std::marker::Send` is not implemented for `(dyn ArrowHeap + 'static)`, which is required by `GroupedTopKAggregateStream: std::marker::Send`
= note: required for `std::ptr::Unique<(dyn ArrowHeap + 'static)>` to implement `std::marker::Send`
note: required because it appears within the type `Box<(dyn ArrowHeap + 'static)>`
--> /Users/findepi/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/alloc/src/boxed.rs:237:12
|
237 | pub struct Box<
| ^^^
note: required because it appears within the type `PriorityMap`
--> datafusion/physical-plan/src/aggregates/topk/priority_map.rs:27:12
|
27 | pub struct PriorityMap {
| ^^^^^^^^^^^
note: required because it appears within the type `GroupedTopKAggregateStream`
--> datafusion/physical-plan/src/aggregates/topk_stream.rs:39:12
|
39 | pub struct GroupedTopKAggregateStream {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^
= note: required for the cast from `Pin<Box<GroupedTopKAggregateStream>>` to `Pin<Box<dyn RecordBatchStream + std::marker::Send>>`
For more information about this error, try `rustc --explain E0277`.
error: could not compile `datafusion-physical-plan` (lib) due to 2 previous errors
However, removing `unsafe impl Send for PriorityMap {}` plus the other changes in this PR keeps `PriorityMap` as `Send`, so the code works exactly as it does on current main.
LGTM!
So the change here just replaces the unsafe Send with a safe Send, without any performance degradation? Looks good to me.
yes