Adds HAETAE #313

mmoeller23 · 2023-11-23T10:24:01Z

This commit implements the post-quantum signature scheme HAETAE from
https://eprint.iacr.org/2023/624
https://kpqc.cryptolab.co.kr/haetae

The stack strategy can be selected in config.h by setting STACK_STRATEGY
to the appropriate value (run "make clean" after the change).

* 0 or undefined: Optimized for speed (default).
* 1: Disable buffers for the polynomials of the verification key in
  crypto_sign_keypair() and crypto_sign(). This reduces speed,
  as the key needs to be recomputed after each rejection.
* 2: In addition to 1, sample the hyperball in multiple passes, such
  that some intermediate values are computed on demand, rather
  than being buffered. This roughly doubles the runtime of crypto_sign().
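The trade-off behind the stack strategies can be sketched with a compile-time switch. This is only an illustration under assumptions: `demo_expand_poly()`, `demo_use_key()`, and the `STRAT_DEMO_*` sizes are hypothetical and are not taken from the HAETAE code. Only the macro name `STACK_STRATEGY` and the meaning of its values come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Normally set in config.h; 0 or undefined selects the speed-optimized path. */
#ifndef STACK_STRATEGY
#define STACK_STRATEGY 0
#endif

/* Hypothetical sizes for illustration only, not the HAETAE parameters. */
#define STRAT_DEMO_N 256
#define STRAT_DEMO_POLYS 8

/* Hypothetical stand-in for expanding one key polynomial from a seed. */
static void demo_expand_poly(int32_t out[STRAT_DEMO_N],
                             const uint8_t seed[32], unsigned idx)
{
    for (size_t i = 0; i < STRAT_DEMO_N; i++) {
        out[i] = (int32_t)seed[i % 32] + (int32_t)idx + (int32_t)i;
    }
}

/* Sums all key coefficients and reports how many bytes were buffered. */
static int64_t demo_use_key(const uint8_t seed[32], size_t *buffered_bytes)
{
    int64_t acc = 0;
    assert(seed != NULL && buffered_bytes != NULL);
#if STACK_STRATEGY == 0
    /* Speed: expand every polynomial once and keep all of them on the stack. */
    int32_t key[STRAT_DEMO_POLYS][STRAT_DEMO_N];
    for (unsigned k = 0; k < STRAT_DEMO_POLYS; k++) {
        demo_expand_poly(key[k], seed, k);
    }
    for (unsigned k = 0; k < STRAT_DEMO_POLYS; k++) {
        for (size_t i = 0; i < STRAT_DEMO_N; i++) {
            acc += key[k][i];
        }
    }
    *buffered_bytes = sizeof key;
#else
    /* Low stack: keep one polynomial at a time and recompute it on each use. */
    int32_t tmp[STRAT_DEMO_N];
    for (unsigned k = 0; k < STRAT_DEMO_POLYS; k++) {
        demo_expand_poly(tmp, seed, k);
        for (size_t i = 0; i < STRAT_DEMO_N; i++) {
            acc += tmp[i];
        }
    }
    *buffered_bytes = sizeof tmp;
#endif
    return acc;
}
```

Both branches compute the same result. The non-zero branch uses one polynomial of stack instead of eight, but has to re-run the expansion every time the key is needed, which is where the speed penalty after each rejection comes from.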

M4F version corresponds to reference version of 2023-10-21.

…ministically
* Move challenge seed generation from crypto_sign() to poly_challenge().
* Sample the random byte b deterministically inside of polyfixveclk_sample_hyperball(). It is used to:
  * determine the sign in hyperball sampling (bit mask 0x01)
  * reject with 50% odds in the overlap region (bit mask 0x02)
* M4F version corresponds to reference version of 2023-11-20.
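How a single byte b can serve both purposes is sketched below. The masks 0x01 and 0x02 come from the commit message. The helper names, and the choice that a set sign bit means "negative", are assumptions made for illustration and may differ from the actual code in polyfixveclk_sample_hyperball().

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_SIGN_MASK   0x01u  /* bit 0: sign in hyperball sampling */
#define DEMO_REJECT_MASK 0x02u  /* bit 1: coin flip in the overlap region */

/* Returns -1 if the sign bit is set, +1 otherwise (polarity is assumed). */
static int demo_sign_from_byte(uint8_t b)
{
    return (b & DEMO_SIGN_MASK) ? -1 : 1;
}

/* Returns 1 if a sample should be rejected: only inside the overlap
 * region, and only if the reject bit of b is set. */
static int demo_reject_in_overlap(uint8_t b, int in_overlap)
{
    assert(in_overlap == 0 || in_overlap == 1);
    return in_overlap && (b & DEMO_REJECT_MASK) != 0;
}

/* Counts how many of the 256 byte values lead to a rejection in the
 * overlap region. */
static unsigned demo_count_rejecting_bytes(void)
{
    unsigned count = 0;
    for (unsigned v = 0; v < 256; v++) {
        count += (unsigned)demo_reject_in_overlap((uint8_t)v, 1);
    }
    return count;
}
```

Because the two masks select different bits, the sign and the rejection decision are independent for a uniformly distributed b. Exactly half of all byte values have bit 1 set, which gives the 50% odds.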

This implementation offers different stack strategies:
* 0: Optimized for speed.
* 1: Does not buffer the polynomials of the verification key in crypto_sign_keypair() and crypto_sign_signature(), thus reducing stack usage at the cost of some speed.
* 2: In addition to 1, the hyperballs are sampled in multiple passes in crypto_sign_signature(), which reduces the stack usage for temporary variables. This roughly doubles the execution time of crypto_sign_signature().

The clean implementation is only minimally changed from the reference implementation to conform with the PQM4 API. The clean implementation would run out of memory for HAETAE3 and HAETAE5 and is therefore not added for those modes.

This commit implements the post-quantum signature scheme HAETAE from
https://eprint.iacr.org/2023/624
https://kpqc.cryptolab.co.kr/haetae

The stack strategy can be chosen in config.h by setting STACK_STRATEGY to the appropriate value (run "make clean" when changing it).
* 0 or undefined: Optimized for speed (default).
* 1: Disable buffers for the polynomials of the verification key in crypto_sign_keypair() and crypto_sign(). This reduces speed, as the key needs to be recomputed after each rejection.
* 2: In addition to 1, sample the hyperball in multiple passes, such that some intermediate values are computed on demand, rather than being buffered. This roughly doubles the runtime of crypto_sign().

The scheme HAETAE2 contains a reference implementation, which has been renamed from "clean" in previous commits to "ref". The reference implementation would run out of memory for schemes HAETAE3 and HAETAE5 and is therefore not included for these schemes.

rpls · 2023-11-23T10:51:13Z

Thanks! I'll test it this week. Btw., any chance you also have a suitable pure C implementation for mupq?

rpls · 2023-11-24T08:07:04Z

Tests pass, but for Testvector-test we need a pure C implementation in mupq.

mmoeller23 · 2023-11-27T08:39:45Z

> Tests pass, but for Testvector-test we need a pure C implementation in mupq.

The reference implementation is included in the immediately preceding commit b48968e in the HAETAE2 directory. I did not include it for the other modes, as the reference implementation only works for HAETAE2 on the embedded system; for HAETAE3 and HAETAE5 it runs fine on the host, but runs out of memory on the embedded system.

In that commit, haetae2 works fine with the Testvector-test. If you copy the ref subdirectory to the haetae3 and haetae5 directories, respectively, and adjust HAETAE_MODE in the config.h files, the host implementation will produce the proper test vectors. However, when running on the test board, the reference implementation will run out of memory and not return in these modes.

How do we proceed from here? Does the constellation outlined above work for you, or do we need pure C implementations for all modes that are able to run on the limited resources of the embedded system? In the latter case, the code will have to deviate substantially from the reference implementation and will be closer to the M4F version.

markuskrausz · 2023-11-28T09:36:18Z

The memory was only a limitation on the stm32f4discovery for HAETAE3 and HAETAE5, right?
With the nucleo-l4r5zi and its 640KB of RAM, this should not be an issue.

The Testvector-test probably runs on the host anyway?!

stack usage (keypair/sign/verify): * haetae2: 26152 / 83128 / 29856

Add slightly modified reference implementations to haetae2, haetae3 and haetae5 with lower stack memory footprint than the original reference implementation. This enables the test vector comparison for all schemes.

CAVEAT: This commit modifies the following PQM4 core files:
* ldscripts/stm32f4discovery.ld
* ldscripts/stm32f4discovery_fullram.ld
* mk/stm32f4discovery.mk

The two linker scripts are modified as recommended in [issue 310](mupq#310 (comment)). The makefile is modified to use the full RAM for the implementations m4f and ref of scheme haetae5, as they would run out of memory otherwise, similar to dilithium5.

The stack memory footprint was reduced by:
* Storing A1 using uint16 instead of int32, halving its footprint
* Grouping some vectors inside `crypto_sign_signature()` whose live ranges do not overlap into unions.

The modification is light enough to easily verify consistency with the reference implementation.

Add slightly modified reference implementations to haetae2, haetae3 and haetae5, labeled as `ref`, with lower stack memory footprint than the original reference implementation. This enables running testvectors.py for all schemes.

CAVEAT: This commit modifies the following PQM4 core files:
* ldscripts/stm32f4discovery.ld
* ldscripts/stm32f4discovery_fullram.ld
* mk/stm32f4discovery.mk

The two linker scripts are modified as recommended in [issue 310](mupq#310 (comment)). The makefile is modified to use the full RAM for the implementations m4f and ref of scheme haetae5, as they would run out of memory otherwise, similar to dilithium5.

The stack memory footprint was reduced by:
* Storing A1 using uint16 instead of int32, halving its footprint
* Grouping some vectors inside `crypto_sign_signature()` whose live ranges do not overlap into unions.

The modification is light enough to easily verify consistency with the reference implementation.
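The two stack reductions named in the commit message can be illustrated with a small sketch. All type names and the `MEM_DEMO_*` dimensions are hypothetical and do not match the HAETAE parameter sets. The sketch also assumes that every stored coefficient fits into 16 bits, which is what makes the narrower storage type possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical dimensions for illustration only. */
#define MEM_DEMO_N 256
#define MEM_DEMO_K 2
#define MEM_DEMO_L 4

typedef struct { int32_t  coeffs[MEM_DEMO_N]; } demo_poly32;
typedef struct { uint16_t coeffs[MEM_DEMO_N]; } demo_poly16;

/* The same matrix stored with 32-bit and with 16-bit coefficients. */
typedef struct { demo_poly32 m[MEM_DEMO_K][MEM_DEMO_L]; } demo_mat32;
typedef struct { demo_poly16 m[MEM_DEMO_K][MEM_DEMO_L]; } demo_mat16;

/* Two temporaries that are never live at the same time. */
typedef struct { demo_poly32 vec[MEM_DEMO_L]; } demo_vecl;
typedef struct { demo_poly32 vec[MEM_DEMO_K]; } demo_veck;

/* Sharing one region: the union is as large as its largest member. */
typedef union {
    demo_vecl early;  /* used in the first phase only */
    demo_veck late;   /* used after 'early' is dead */
} demo_scratch;

/* Stack needed if both temporaries are declared separately. */
static size_t demo_separate_bytes(void)
{
    return sizeof(demo_vecl) + sizeof(demo_veck);
}

/* Stack needed if they share storage through the union. */
static size_t demo_shared_bytes(void)
{
    return sizeof(demo_scratch);
}

/* Coefficients are widened again when they are loaded for arithmetic. */
static int32_t demo_load_coeff(const demo_poly16 *p, size_t i)
{
    assert(p != NULL && i < MEM_DEMO_N);
    return (int32_t)p->coeffs[i];
}
```

With these example sizes the 16-bit matrix takes half the space of the 32-bit one, and the union costs only as much as the larger of its two members. The union is safe only as long as nothing reads `early` after `late` has been written.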

mmoeller23 · 2023-12-01T08:49:00Z

I have added slightly modified reference implementations to all schemes; testvectors.py works now.

CAVEAT: Commit f7aedf0 includes modifications to PQM4 core files that are required to make this work.

Applied the patch from issue 310 to
- ldscripts/stm32f4discovery.ld
- ldscripts/stm32f4discovery_fullram.ld
Patched mk/stm32f4discovery.mk for haetae5 to use the full RAM model for both implementations, just like dilithium5.

rpls · 2023-12-01T09:34:19Z

Could you add the ref implementations in mupq instead? All portable pure C stuff should go there.

The pure C reference implementations were removed from this pull request. A corresponding pull request in MUPQ/MUPQ has been initiated: mupq/mupq#131

mmoeller23 · 2023-12-01T10:45:20Z

I have removed the pure C reference implementation from this pull request and initiated a new pull request at mupq

mupq/mupq#131

mmoeller23 · 2023-12-01T12:02:29Z

Please do not pull this at the moment

mmoeller23 · 2023-12-01T14:51:10Z

> Please do not pull this at the moment

All good, you can pull again.

markuskrausz · 2023-12-04T08:18:09Z

This implements the HAETAE specification 2.0 and corresponds to the M4 implementation discussed in the 2.0 specification document.

mmoeller23 added 7 commits November 21, 2023 00:01

initial commit of HAETAE

564ac86

M4F version corresponds to reference version of 2023-10-21.

add clean implementation for HAETAE2

0bdc33e

The clean implementation is only minimally changed from the reference implementation to conform with the PQM4 API. The clean implementation would run out of memory for HAETAE3 and HAETAE5 and is therefore not added for those modes.

Merge branch 'haetae_dev' into haetae

fe44f74

mmoeller23 added 3 commits December 1, 2023 09:14

add ref implementation for haetae2

258a11f

stack usage (keypair/sign/verify): * haetae2: 26152 / 83128 / 29856

Move reference implementations to MUPQ/MUPQ

8719b8e

The pure C reference implementations were removed from this pull request. A corresponding pull request in MUPQ/MUPQ has been initiated: mupq/mupq#131

rpls merged commit 4ad3ef6 into mupq:master Jan 7, 2024

rpls mentioned this pull request Jan 17, 2024

Add HAETAE #273

Closed
