-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merkle: innerHash dominates CPU/RAM consumption by sha256.Sum256(append(bytes...)) but it can be further sped up and allocations reduced as well by using a pool and reusing hashes #43
Comments
This change optimizes SHA256 hashing for leafHash & innerHash by pooling and reusing hashes, which massively reduces on RAM consumption and even CPU time (benchmarking on my noisy machine is skewed but the results can be verified individually) producing results below: ```shell $ benchstat before.txt after.txt name old time/op new time/op delta HashAlternatives/recursive-8 77.6µs ± 2% 77.0µs ± 2% ~ (p=0.165 n=10+10) HashAlternatives/iterative-8 76.3µs ± 1% 76.2µs ± 3% ~ (p=0.720 n=9+10) name old alloc/op new alloc/op delta HashAlternatives/recursive-8 25.4kB ± 0% 6.4kB ± 0% -74.94% (p=0.000 n=10+10) HashAlternatives/iterative-8 28.1kB ± 0% 9.1kB ± 0% -67.78% (p=0.000 n=10+10) name old allocs/op new allocs/op delta HashAlternatives/recursive-8 497 ± 0% 199 ± 0% -59.96% (p=0.000 n=10+10) HashAlternatives/iterative-8 498 ± 0% 200 ± 0% -59.84% (p=0.000 n=10+10) ``` Fixes celestiaorg#43
I've mailed out a PR #44 |
This change optimizes SHA256 hashing for leafHash & innerHash by pooling and reusing hashes, which massively reduces on RAM consumption and even CPU time (benchmarking on my noisy machine is skewed but the results can be verified individually) producing results below: ```shell $ benchstat before.txt after.txt name old time/op new time/op delta HashAlternatives/recursive-8 77.6µs ± 2% 77.0µs ± 2% ~ (p=0.165 n=10+10) HashAlternatives/iterative-8 76.3µs ± 1% 76.2µs ± 3% ~ (p=0.720 n=9+10) name old alloc/op new alloc/op delta HashAlternatives/recursive-8 25.4kB ± 0% 6.4kB ± 0% -74.94% (p=0.000 n=10+10) HashAlternatives/iterative-8 28.1kB ± 0% 9.1kB ± 0% -67.78% (p=0.000 n=10+10) name old allocs/op new allocs/op delta HashAlternatives/recursive-8 497 ± 0% 199 ± 0% -59.96% (p=0.000 n=10+10) HashAlternatives/iterative-8 498 ± 0% 200 ± 0% -59.84% (p=0.000 n=10+10) ``` Fixes celestiaorg#43
This merkle tree implementation is forked from Tendermint/Comet and ideally such optimizations should also be submitted there. Btw, @odeke-em do you know any high quality open source merkle tree implementations as standalone modules? Maintaining yet another merkle tree in go-square doesnt seem like a good idea. Potentially, we could nudge cosmos ecosystem to make one, depend on it and submit all the improvements there. |
This change optimizes SHA256 hashing for leafHash & innerHash by pooling and reusing hashes, which massively reduces on RAM consumption and even CPU time (benchmarking on my noisy machine is skewed but the results can be verified individually) producing the results below: ```shell $ benchstat before.txt after.txt name old time/op new time/op delta HashAlternatives/recursive-8 77.6µs ± 2% 77.0µs ± 2% ~ (p=0.165 n=10+10) HashAlternatives/iterative-8 76.3µs ± 1% 76.2µs ± 3% ~ (p=0.720 n=9+10) name old alloc/op new alloc/op delta HashAlternatives/recursive-8 25.4kB ± 0% 6.4kB ± 0% -74.94% (p=0.000 n=10+10) HashAlternatives/iterative-8 28.1kB ± 0% 9.1kB ± 0% -67.78% (p=0.000 n=10+10) name old allocs/op new allocs/op delta HashAlternatives/recursive-8 497 ± 0% 199 ± 0% -59.96% (p=0.000 n=10+10) HashAlternatives/iterative-8 498 ± 0% 200 ± 0% -59.84% (p=0.000 n=10+10) ``` Fixes celestiaorg#43
I've gotten jaded from trying to contribute back to tendermint repositories; am not sure I have the time and bandwidth to try to push up that hill, but at least I've highlighted what y'all need to do :-) As for high quality Merkle tree implementations, I had found some but I am dashing for dinner though I'll look at my list tomorrow. |
We've reverted back to using the merkle tree in the tendermint/cometbft library. So this issue can be closed. If there are any other good suitors as a merkle library, I would also be interested to hear of them |
Summary of Bug
I was auditing the code for both security vulnerabilities and performance improvements and after reading through the literature on HashFromByteSlicesIterative at
go-square/merkle/tree.go
Line 62 in c20eda9
Analysis
If we examine the code for innerHash
go-square/merkle/hash.go
Lines 23 to 31 in c20eda9
we can notice these interesting properties
go-square/merkle/hash.go
Line 29 in c20eda9
that code uses a global call sha256.Sum256 whose code creates a new digest everytime and discards it per https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/crypto/sha256/sha256.go;l=253
and right there is our opportunity!
Remedy and optimization
We can reuse hashes and reset them each time, but even more, we can avoid the call to append each time but instead invoke
h.Write(byteslice)
by this code
Results
When we use that new routine, voila the results are remarkable
of which we shaved off 1+ second comparably against leafHash :-)
/cc @elias-orijtech @rootulp @liamsi @musalbas
Version
4135f35
Steps to Reproduce
Please read the above!
For Admin Use
The text was updated successfully, but these errors were encountered: