-
Notifications
You must be signed in to change notification settings - Fork 293
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Switching to GxHash? #487
Comments
It seems like as of right now gxhash is still unsound, not checking for AES either at compile time or runtime. There also is no fallback to support any other platform. Though for initial experimentation it's probably fine, the latter at least, not sure about the unsoundness. |
I previously looked at the code of GxHash and it seems to be well-optimized for throughput (# of bytes hashed per second). Unfortunately in the case of hash tables what you really want is a low-latency hash function: keys are often just a I ran your benchmarks from #488 on my system (Intel(R) Core(TM) i7-10750H CPU @ 2.60GHz) and the results confirm my suspicions about throughput vs latency. I'm not sure why your results are so different, perhaps there is something different with the ARM implementation.
|
Thanks for your feedback! Unsoundness/fallback are definitely necessary. It will take some time but it can be done since it has been done for
Yes indeed, although not exclusively. I do not know why it performs worse on your i7 CPU but I'll investigate. It's very likely that the specialized write_u32 (and friends) allows for more straightforward hashing for small inputs. If you're open for at least "initial experimentation", I'll try to cover these first points and update this issue. |
I also tested this on an ARM system (Cortex-A72), which gives similar results to my i7 system.
|
Interesting. That may be due to the current implementation being bottlenecked on memory reads while it's less of a problem on the Apple M1 SoC which has very fast memory access. That's just a theory but as you said the GxHash Hasher has to handle u32 and other primitives differently than non Sized inputs. Otherwise, it's doing too many unnecessary operations. |
I runed a benchmark on my Apple M2 Max
|
this is a test on AMD EPYC 7282 16-Core Processor CPU @ 2.0GHz
|
I think I know why the performance looks so different: it's because the AES instructions aren't enabled in the baseline target feature, which means that the compiler cannot inline functions marked with However, for hashbrown's default hasher we want something that works on all platforms and doesn't depend on optional CPU extensions. We don't need "secure" hashing like the standard library has, just something fast. We would use FxHash but the low bits of the hash it produces are poor (if your inputs are multiples of 4, so is the hash). |
AHash is, by design, utilizing AES instructions for high-performance hashing, like GxHash does. A difference however is that aHash has fallbacks so that it runs on every hardware, even when there are no AES hardware acceleration. It is planned to have a fallback as well for GxHash, but it's not ready yet. This is for sure mandatory for an usecase such as an hashbrown default hasher. There is still work in progress before I publish new benchmark results here (ogxd/gxhash#34, ogxd/gxhash#44 and ogxd/gxhash#47). Btw, I noticed that the benchmark may be biased because of inputs being optimized within the hash functions utilized, I have created another issue for this: #493. |
Context
GxHash is a new non-cryptographic hashing algorithm that outperforms all counterparts (of the same class,
fxhash
is slightly faster for tiny inputs but at the cost of much worse distribution/avalanche/collision) for all input sizes on both ARM and X86.GxHash uses AES instrinsics for efficient bit mixing (like ahash does) but makes more extensive use of SIMD operations and ILP. This enables gxhash to have a much smaller bytecode than ahash (it is much simpler) while being faster. GxHash hashes are stable across ARM/X86/X86+AVX2 and the algorithm passes all SMHasher tests (ahash does not, at least on my ARM MacBook).
Benchmark
I ran the benchmarks with gxhash on my ARM MacBook m1 pro and here are the results (using GxHash 2.2.4)
From these results there is about +25/30% performance on lookup from using gxhash compared to ahash. This is using gxhash 2.2.4, but gxhash 3.0.0 is almost ready and performs even better. It would be interesting also to see that we benchmark results look like on other platforms.
I am joining a PR to this issue so that you can test it.
Todo?
GxHash security properties haven't been assessed yet, but given what was said in this issue, I was thinking that a solution for even faster hashing would be welcome.
There are probably a few things to address before gxhash can be considered as a default hasher (portability, no std, ...?) but I would like to get your opinion on this idea first.
The text was updated successfully, but these errors were encountered: