-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support fallible eq and hasher in raw API #456
Comments
Another feature I'd appreciate is to be able to somehow pass a mutable environment through to both closures since I somehow need to reference the virtual machine being used. This currently can't be done because This could be accomplished by either passing an environment into the closures or using a different common trait for equality checks and hashing. impl RawTable {
pub fn find_or_find_insert_slot2<C: ?Sized, E>(
&mut self,
ctx: &mut C,
hash: u64,
mut eq: impl FnMut(&mut C, &T) -> Result<bool, E>,
hasher: impl Fn(&mut C, &T) -> Result<u64, E>,
) -> Result<Result<Bucket<T>, InsertSlot>, E> {
todo!()
}
} EDIT: Here are benchmarks and size comparisons for my branch which includes fallible Benches without the changes
Benches with the changes
Binary size differences when applying my branch to the [gimli-rs/object](https://github.com/gimli-rs/object) repo and doing a release build--- without.txt 2023-08-23 20:46:02.666016800 +0200
+++ with.txt 2023-08-23 20:46:11.031017000 +0200
@@ -1,28 +1,28 @@
-target/release/ar 4715408
+target/release/ar 4715472
target/release/ar.d 3315
target/release/build 4096
target/release/deps 4096
-target/release/dyldcachedump 4723616
+target/release/dyldcachedump 4723664
target/release/dyldcachedump.d 3337
-target/release/elfcopy 4888528
+target/release/elfcopy 4899232
target/release/elfcopy.d 3325
-target/release/elftoefi 4739656
+target/release/elftoefi 4739704
target/release/elftoefi.d 3327
target/release/examples 4096
target/release/incremental 4096
target/release/libobject.d 2965
-target/release/libobject.rlib 11035614
+target/release/libobject.rlib 11051504
target/release/libobject_examples.d 3287
-target/release/libobject_examples.rlib 3826960
-target/release/nm 4867848
+target/release/libobject_examples.rlib 3828668
+target/release/nm 4882424
target/release/nm.d 3315
-target/release/objcopy 5067632
+target/release/objcopy 5077600
target/release/objcopy.d 3325
-target/release/objdump 5046328
+target/release/objdump 5046240
target/release/objdump.d 3325
-target/release/objectmap 4859896
+target/release/objectmap 4859768
target/release/objectmap.d 3329
-target/release/pecopy 4724840
+target/release/pecopy 4724912
target/release/pecopy.d 3323
-target/release/readobj 5334536
+target/release/readobj 5334816
target/release/readobj.d 3325 |
I don't think supporting fallible hasher is worth adding this level of complexity to the API. In most cases this isn't needed, and you can always emulate it by returning a hash value of 0 and recording the error in an However I do think there is value in |
I considered this initially, but would this not result in potentially ill-defined downstream behaviors in fallible cases (same for |
|
Right, so the behavior I'd want to emulate is what would happen if the hash implementation panics today (which is how I believe this PR works). Think of this error propagation as my own "structured panics".
What do you mean? Are raw tables not exception safe?
Doesn't that depend on where it fails? If it fails during insertion (e.g. |
I couldn't produce a modification for a failing use hashbrown::raw::RawTable;
fn main() {
let mut table = RawTable::<()>::new();
unsafe {
// Insert three elements (with the same hash) to fill up initial capacity:
for _ in 0..3u64 {
let Err(slot) = table.find_or_find_insert_slot(0, |_| false, |_| 0) else {
panic!();
};
table.insert_in_slot(0, slot, ());
}
dbg!(table.capacity()); // 3
// Table does not grow when `hasher` fails like this:
let Err(()) = table.find_or_find_insert_slot_with(&mut (), 0, |_, _| Ok(false), |_, _| Err(())) else {
panic!();
};
dbg!(table.capacity()); // 3
// Table does grow when a dummy value is returned instead:
let Err(..) = table.find_or_find_insert_slot(0, |_| false, |_| 0) else {
panic!();
};
dbg!(table.capacity()); // 7
}
} |
Not completely: because we don't store the full hash in the table, we rely on the hasher when rehashing the table in-place. If the hasher panics then we have no way of knowing where to place elements in the table. It's exception-sense in the sense that you won't get UB, but you may lose elements in the table. Lines 2277 to 2280 in bb6521e
|
Right. Thanks for the clarity. So that is what I gathered from reading the implementation as well. The only extent I expect the table to be exception safe is that it doesn't permit anything memory unsafe to happen if we do end up catching a panic. But the exact end state is not guaranteed to be the same as prior to the attempted modification. I am still interested to ensure that my "errors" exhibit the same behaviour as a caught Rust panic, mainly because I want to provide as much feature parity between Rust and Rune as possible. So I still want to provide fallibility to the methods. It's nice that you're interested in passing the context around though (possibly the bigger change). But until fallible methods are supported or the exact behavior of returning "dummy values" is understood and tested* I'll probably end up maintaining a private fork of the raw table with those minor changes. *: To the degree that it's possible to emulate the behavior of an equivalent Rust panic. |
Hi!
I want to use the raw API of
hashbrown
as the underlying table implementation in Rune.One aspect which has proven to be awkward is that the hasher I need to pass into
RawTable::find_or_find_insert_slot
is infallible, which works well for a statically typed language, but less so in Rune where I want to store and deal with dynamically typed entries directly.What I need to do is something like this:
The change needed would be fairly straightforward (I think) but a bit tedious, what I think is needed is to refactor the API into a pair of methods like this (ignore naming):
The signatures leave something to be desired and such a change would have to be propagated through the rest of the API, the majority if which have proven to be internal. I would also hope that the compiler can easily prove the infallible variants are no-ops but this is something that should probably be investigated.
The text was updated successfully, but these errors were encountered: