Dilithium/ML-DSA Stack Optimizations #340
Hi Junhao, thanks for reaching out. I will try to make the files in the stack version that include your work look similar to the ones from your PR, so that it is not confusing but we can still keep separate versions. Then both PRs should be mergeable. What do you think?
Hi @dop-amin, thanks for your work! Yes, I think these two PRs could then be merged into
87adbfe to ac6c1ef
* Based on ideas from https://eprint.iacr.org/2022/323.pdf, based on code by Matthias J. Kannwischer
* Sample A on-the-fly
* Compressed c
* Schoolbook mul for ct1

* Note: Reverts poly_uniform_pointwise_montgomery_polywadd_stack to prior state

* On-the-fly matrix generation
* Schoolbook for ct1
* Challenge compression

* Stack friendly hint decoding
* Eliminate second full poly
* Remove K-loop from hint unpacking

* Minor clean up
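
For readers unfamiliar with the "Compressed c" and "Schoolbook mul for ct1" items above, here is a minimal sketch of the general idea, not the code from this PR. The challenge polynomial has only a few nonzero coefficients, each ±1, so it can be stored as a list of positions plus sign bits and multiplied coefficient by coefficient without an NTT buffer. The names `challenge_compressed`, `challenge_compress` and `sparse_schoolbook_mul` and the struct layout are assumptions made for illustration. The sketch performs no reduction modulo q and is not tuned for the Cortex-M4.

```c
#include <stdint.h>
#include <string.h>

#define N 256      /* polynomial degree used by ML-DSA */
#define TAU_MAX 60 /* largest number of nonzero challenge coefficients (ML-DSA-87) */

/* Hypothetical compact challenge: positions of the nonzero coefficients
 * plus one sign bit per stored position (bit k set means pos[k] holds -1). */
typedef struct {
    uint8_t pos[TAU_MAX];
    uint64_t signs;
    uint8_t tau;
} challenge_compressed;

/* Compress a challenge whose coefficients are in {-1, 0, 1}.
 * Returns 0 on success, -1 if more than TAU_MAX coefficients are nonzero. */
static int challenge_compress(challenge_compressed *cc, const int32_t c[N]) {
    memset(cc, 0, sizeof *cc);
    for (int i = 0; i < N; i++) {
        if (c[i] == 0) {
            continue;
        }
        if (cc->tau == TAU_MAX) {
            return -1;
        }
        if (c[i] < 0) {
            cc->signs |= (uint64_t)1 << cc->tau;
        }
        cc->pos[cc->tau++] = (uint8_t)i;
    }
    return 0;
}

/* r = c * a in Z[X]/(X^N + 1), iterating only over the nonzero terms of c.
 * The result is left unreduced; a real implementation would reduce mod q. */
static void sparse_schoolbook_mul(int32_t r[N], const challenge_compressed *cc,
                                  const int32_t a[N]) {
    memset(r, 0, N * sizeof r[0]);
    for (int k = 0; k < cc->tau; k++) {
        int32_t s = ((cc->signs >> k) & 1) ? -1 : 1;
        int p = cc->pos[k];
        for (int j = 0; j < N; j++) {
            int idx = p + j;
            if (idx < N) {
                r[idx] += s * a[j];
            } else {
                r[idx - N] -= s * a[j]; /* wrap-around: X^N = -1 */
            }
        }
    }
}
```

In this sketch the compressed challenge occupies well under 100 bytes, compared with 1 KiB for a full polynomial of 32-bit coefficients, and the multiplication needs no additional full-size temporary. Both properties are the kind of saving a stack-optimized variant is after.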
a488505 to d401a15
Thank you @dop-amin. That all looks very good to me.
The stack usage is identical to my measurements and the cycle counts also match my expectation.
Thanks!
I see that all the files under dilithium3/m4fstack are actual files and not symlinks.
Hi, indeed this slipped through. The issue is addressed in #342.
This PR adds stack optimizations for ML-DSA based on the ideas from the paper "Dilithium for Memory Constrained Devices" and on code already written by @mkannwischer in a separate branch. It also supersedes the changes in #222.
I attach numbers on the stack utilization below.
Some remaining questions:
How should we handle my addition of the results from the paper "Revisiting Keccak and Dilithium Implementations on ARMv7-M" to this branch in light of #338?