Dilithium/ML-DSA Stack Optimizations #340

Merged: 32 commits, Apr 16, 2024
Commits
44e901c Init dilithium3 stack optimized variant (dop-amin, Mar 15, 2024)
80c9e07 Start stack optimization [Passing] (dop-amin, Mar 15, 2024)
5c5b868 Compress w (dop-amin, Mar 15, 2024)
926e957 Eliminate z, y (dop-amin, Mar 15, 2024)
302f7f2 Eliminate cp (dop-amin, Mar 15, 2024)
3c36dbe Eliminate s1, s2 (dop-amin, Mar 15, 2024)
f71e025 Eliminate second poly needed for A*y (dop-amin, Mar 15, 2024)
deeabab Inline sampling uniform and uniform_gamma1 (dop-amin, Mar 18, 2024)
cbc29cf Inline hint generation (dop-amin, Mar 18, 2024)
8468d60 Inline polyw subtraction (dop-amin, Mar 18, 2024)
b4505e7 Refactor decompose to high/lowbits (dop-amin, Mar 18, 2024)
f5a8a65 Inline Keccak state (dop-amin, Mar 18, 2024)
10d4766 Shared buffer for polynomials (dop-amin, Mar 18, 2024)
2804237 rm 257 FFT (dop-amin, Mar 18, 2024)
d30a766 Union for small and big poly (dop-amin, Mar 18, 2024)
a37b5a6 Eliminate some smaller buffers (dop-amin, Mar 18, 2024)
2bd00ad Remove asym small mul (dop-amin, Mar 18, 2024)
77a7572 Stack friendly uniform_gamma1 w/o add (dop-amin, Mar 18, 2024)
6609f82 Stack optimized Dilithium{2,5} (dop-amin, Mar 18, 2024)
59724a7 Switch to Plantard-based 769 NTT (dop-amin, Mar 19, 2024)
0dd789b First batch of stack opt for Verify (dop-amin, Mar 20, 2024)
a8c993f On-the-fly unpacking for z, h (dop-amin, Mar 20, 2024)
b7ded84 Compress w (dop-amin, Mar 20, 2024)
e6e164b rm tmp poly, subtract on wcomp (dop-amin, Mar 20, 2024)
6ef4fbc Verify Stack Optimizations (dop-amin, Mar 30, 2024)
9870bec rm buffers/unionize in Verify (dop-amin, Mar 31, 2024)
1d21996 Stack opt key pair (dop-amin, Apr 8, 2024)
76b16c1 Overlap buffers (dop-amin, Apr 8, 2024)
e718f2e Stack optimized challenge generation (dop-amin, Apr 8, 2024)
a37b311 Match 769 Plantard to m4f code (dop-amin, Apr 9, 2024)
d401a15 update skiplist (mkannwischer, Apr 15, 2024)
c013920 update benchmarks (mkannwischer, Apr 15, 2024)
Stack optimized challenge generation
dop-amin authored and mkannwischer committed Apr 15, 2024
commit e718f2eb3d4728e246ea3b5ecd9c848e9f017124
2 changes: 1 addition & 1 deletion crypto_sign/dilithium3/m4fstack/sign.c
@@ -384,7 +384,7 @@ int crypto_sign_verify(const uint8_t *sig,
     shake256_inc_absorb(&s256, mu, CRHBYTES);
 
     /* Matrix-vector multiplication; compute Az - c2^dt1 */
-    poly_challenge(&p, sig);
+    poly_challenge_stack(&p, sig);
     poly_challenge_compress(ccomp, &p);
 
     for (size_t k_idx = 0; k_idx < K; k_idx++) {
46 changes: 46 additions & 0 deletions crypto_sign/dilithium3/m4fstack/stack.c
@@ -666,4 +666,50 @@ void pack_sk_tr(unsigned char sk[CRYPTO_SECRETKEYBYTES],
     for (unsigned int i = 0; i < TRBYTES; ++i) {
         sk[i] = tr[i];
     }
 }
+
+/*************************************************
+* Name:        poly_challenge_stack
+*
+* Description: Implementation of H. Samples polynomial with TAU nonzero
+*              coefficients in {-1,1} using the output stream of
+*              SHAKE256(seed). Stack optimized.
+*
+* Arguments:   - poly *c: pointer to output polynomial
+*              - const uint8_t seed[]: byte array containing seed of length SEEDBYTES
+**************************************************/
+#define CHALLENGE_STACK_BUF_SIZE 8
+void poly_challenge_stack(poly *c, const uint8_t seed[SEEDBYTES]) {
+    unsigned int i, b, pos;
+    uint64_t signs;
+    uint8_t buf[CHALLENGE_STACK_BUF_SIZE];
+    shake256incctx state;
+
+    shake256_inc_init(&state);
+    shake256_inc_absorb(&state, seed, SEEDBYTES);
+    shake256_inc_finalize(&state);
+    shake256_inc_squeeze(buf, CHALLENGE_STACK_BUF_SIZE, &state);
+    signs = 0;
+    for(i = 0; i < 8; ++i)
+    {
+        signs |= (uint64_t)buf[i] << 8*i;
+    }
+    pos = 8;
+
+    for(i = 0; i < N; ++i)
+        c->coeffs[i] = 0;
+    for(i = N-TAU; i < N; ++i) {
+        do {
+            if(pos >= CHALLENGE_STACK_BUF_SIZE) {
+                shake256_inc_squeeze(buf, CHALLENGE_STACK_BUF_SIZE, &state);
+                pos = 0;
+            }
+
+            b = buf[pos++];
+        } while(b > i);
+
+        c->coeffs[i] = c->coeffs[b];
+        c->coeffs[b] = 1 - 2*(signs & 1);
+        signs >>= 1;
+    }
+}
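The sampling loop above can be exercised without the SHAKE256 code by swapping in a toy byte stream. The sketch below is only an illustration under stated assumptions: `toy_stream`, `toy_squeeze`, `toy_challenge`, `toy_weight` and `toy_selfcheck` are hypothetical names that do not exist in the repository, the stream is a plain linear congruential generator (not cryptographic), and `TAU` is set to 49 as for Dilithium3. It mirrors the control flow of `poly_challenge_stack` with the refill size as a parameter, which makes it possible to check that an 8-byte refill buffer and a 136-byte one (one SHAKE256 block) produce the same polynomial, and that exactly `TAU` coefficients end up in {-1,1}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define N 256
#define TAU 49 /* Dilithium3 value; assumption for this sketch */

/* Hypothetical stand-in for an incremental SHAKE256 squeeze:
 * a deterministic, byte-sequential stream. NOT cryptographic. */
typedef struct { uint64_t s; } toy_stream;

static void toy_squeeze(uint8_t *out, size_t len, toy_stream *st) {
    for (size_t i = 0; i < len; i++) {
        st->s = st->s * 6364136223846793005ULL + 1442695040888963407ULL;
        out[i] = (uint8_t)(st->s >> 56);
    }
}

/* Same control flow as poly_challenge_stack; `chunk` is the refill size. */
static void toy_challenge(int32_t c[N], uint64_t seed, size_t chunk) {
    uint8_t buf[136];
    toy_stream st = { seed };
    uint64_t signs = 0;
    size_t pos;
    unsigned int i, b;

    if (chunk == 0 || chunk > sizeof buf)
        chunk = sizeof buf;

    /* The first 8 stream bytes hold the sign bits, little-endian. */
    toy_squeeze(buf, 8, &st);
    for (i = 0; i < 8; i++)
        signs |= (uint64_t)buf[i] << (8 * i);
    pos = chunk; /* forces a refill on the first iteration */

    memset(c, 0, N * sizeof c[0]);
    for (i = N - TAU; i < N; i++) {
        do { /* rejection sampling: keep the first byte b with b <= i */
            if (pos >= chunk) {
                toy_squeeze(buf, chunk, &st);
                pos = 0;
            }
            b = buf[pos++];
        } while (b > i);

        c[i] = c[b];                 /* move whatever sat at position b */
        c[b] = (signs & 1) ? -1 : 1; /* place the new +-1 coefficient */
        signs >>= 1;
    }
}

/* Number of nonzero coefficients, or -1 if any lies outside {-1,0,1}. */
static int toy_weight(const int32_t c[N]) {
    int w = 0;
    for (int i = 0; i < N; i++) {
        if (c[i] < -1 || c[i] > 1)
            return -1;
        w += (c[i] != 0);
    }
    return w;
}

/* The refill size must not change the sampled polynomial. */
static void toy_selfcheck(void) {
    int32_t small[N], large[N];
    toy_challenge(small, 1, 8);
    toy_challenge(large, 1, 136);
    assert(memcmp(small, large, sizeof small) == 0);
    assert(toy_weight(small) == TAU);
}
```

Because the stream is consumed strictly byte by byte, shrinking the buffer only changes how often it is refilled, not which bytes are used. With SHAKE256 that trades extra squeeze calls for a smaller stack frame.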
1 change: 1 addition & 0 deletions crypto_sign/dilithium3/m4fstack/stack.h
@@ -28,6 +28,7 @@ void unpack_sk_s2(smallpoly *a, const uint8_t *sk, size_t idx);
 void poly_uniform_pointwise_montgomery_polywadd_stack(uint8_t wcomp[3*N], poly *b, const uint8_t seed[SEEDBYTES], uint16_t nonce, shake128incctx *state);
 void poly_uniform_gamma1_stack(poly *a, const uint8_t seed[CRHBYTES], uint16_t nonce, shake256incctx *state);
 void poly_uniform_gamma1_add_stack(poly *a, poly *b, const uint8_t seed[CRHBYTES], uint16_t nonce, shake256incctx *state);
+void poly_challenge_stack(poly *c, const uint8_t seed[SEEDBYTES]);
 
 size_t poly_make_hint_stack(poly *a, poly *t, uint8_t w[768]);
 int unpack_sig_h_indices(uint8_t h_i[OMEGA], unsigned int * number_of_hints, unsigned int idx, const unsigned char sig[CRYPTO_BYTES]);