Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it Constant-Time #43

Merged
merged 17 commits into from
Dec 21, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
17 commits
Select commit Hold shift + click to select a range
4775a10
add dudect as git submodule based dependency
itzmeanjan Dec 17, 2023
dde96f4
use function parameters, instead of template parameters for computing…
itzmeanjan Dec 17, 2023
f4ce1d2
setup build infra for dudect based constant-time testing
itzmeanjan Dec 17, 2023
b5c9a95
add dudect based constant-timeness test for kyber512 KEM
itzmeanjan Dec 17, 2023
06fae52
test whether sampling of secret polynomial vector is timing leakage f…
itzmeanjan Dec 20, 2023
1d16c8e
get rid of division by non-power-of-2 value
itzmeanjan Dec 20, 2023
2609575
don't rely on result on comparison operator for reducing by prime mod…
itzmeanjan Dec 20, 2023
3551f37
refactor constant-time {byte array comparison and conditional memcpy}…
itzmeanjan Dec 20, 2023
2a7bbfd
test whether internal functions of Kyber512 KEM are timing leakage fr…
itzmeanjan Dec 20, 2023
3bda8f9
rename functions for sake of better readability
itzmeanjan Dec 20, 2023
9a621ff
integrate dudect based all timing leakage tests under single test sce…
itzmeanjan Dec 20, 2023
a62ad39
make it easy to run all dudect binaries
itzmeanjan Dec 20, 2023
7e5f3a7
explicitly declare common type
itzmeanjan Dec 20, 2023
28196dd
update how dudect tests are executed
itzmeanjan Dec 21, 2023
af36cb9
add timing leakage tests for kyber768 KEM
itzmeanjan Dec 21, 2023
1bf3d64
add dudect based timing leakage tests for kyber1024 KEM
itzmeanjan Dec 21, 2023
ace1a3c
mention about availability of `dudect` based timing leakage test
itzmeanjan Dec 21, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions .gitmodules
Original file line number Diff line number Diff line change
Expand Up @@ -4,3 +4,6 @@
[submodule "subtle"]
path = subtle
url = https://github.com/itzmeanjan/subtle.git
[submodule "dudect"]
path = dudect
url = https://github.com/oreparaz/dudect.git
19 changes: 18 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -8,22 +8,28 @@ UBSAN_FLAGS = -g -O1 -fno-omit-frame-pointer -fno-optimize-sibling-calls -fsanit

SHA3_INC_DIR = ./sha3/include
SUBTLE_INC_DIR = ./subtle/include
DUDECT_INC_DIR = ./dudect/src
I_FLAGS = -I ./include
DEP_IFLAGS = -I $(SHA3_INC_DIR) -I $(SUBTLE_INC_DIR)
DUDECT_DEP_IFLAGS = $(DEP_IFLAGS) -I $(DUDECT_INC_DIR)

SRC_DIR = include
KYBER_SOURCES := $(wildcard $(SRC_DIR)/*.hpp)
BUILD_DIR = build
DUDECT_BUILD_DIR = $(BUILD_DIR)/dudect
ASAN_BUILD_DIR = $(BUILD_DIR)/asan
UBSAN_BUILD_DIR = $(BUILD_DIR)/ubsan

TEST_DIR = tests
DUDECT_TEST_DIR = $(TEST_DIR)/dudect
TEST_SOURCES := $(wildcard $(TEST_DIR)/*.cpp)
DUDECT_TEST_SOURCES := $(wildcard $(DUDECT_TEST_DIR)/*.cpp)
TEST_OBJECTS := $(addprefix $(BUILD_DIR)/, $(notdir $(patsubst %.cpp,%.o,$(TEST_SOURCES))))
ASAN_TEST_OBJECTS := $(addprefix $(ASAN_BUILD_DIR)/, $(notdir $(patsubst %.cpp,%.o,$(TEST_SOURCES))))
UBSAN_TEST_OBJECTS := $(addprefix $(UBSAN_BUILD_DIR)/, $(notdir $(patsubst %.cpp,%.o,$(TEST_SOURCES))))
TEST_LINK_FLAGS = -lgtest -lgtest_main
TEST_BINARY = $(BUILD_DIR)/test.out
DUDECT_TEST_BINARIES := $(addprefix $(DUDECT_BUILD_DIR)/, $(notdir $(patsubst %.cpp,%.out,$(DUDECT_TEST_SOURCES))))
ASAN_TEST_BINARY = $(ASAN_BUILD_DIR)/test.out
UBSAN_TEST_BINARY = $(UBSAN_BUILD_DIR)/test.out

Expand All @@ -37,6 +43,9 @@ PERF_BINARY = $(BUILD_DIR)/perf.out

all: test

$(DUDECT_BUILD_DIR):
mkdir -p $@

$(ASAN_BUILD_DIR):
mkdir -p $@

Expand All @@ -49,6 +58,8 @@ $(BUILD_DIR):
$(SHA3_INC_DIR):
git submodule update --init

$(DUDECT_INC_DIR): $(SHA3_INC_DIR)

$(SUBTLE_INC_DIR): $(SHA3_INC_DIR)

$(BUILD_DIR)/%.o: $(TEST_DIR)/%.cpp $(BUILD_DIR) $(SHA3_INC_DIR) $(SUBTLE_INC_DIR)
Expand All @@ -63,6 +74,9 @@ $(UBSAN_BUILD_DIR)/%.o: $(TEST_DIR)/%.cpp $(UBSAN_BUILD_DIR) $(SHA3_INC_DIR) $(S
$(TEST_BINARY): $(TEST_OBJECTS)
$(CXX) $(OPT_FLAGS) $(LINK_FLAGS) $^ $(TEST_LINK_FLAGS) -o $@

$(DUDECT_BUILD_DIR)/%.out: $(DUDECT_TEST_DIR)/%.cpp $(DUDECT_BUILD_DIR) $(SHA3_INC_DIR) $(SUBTLE_INC_DIR) $(DUDECT_INC_DIR)
$(CXX) $(CXX_FLAGS) $(WARN_FLAGS) $(OPT_FLAGS) $(I_FLAGS) $(DUDECT_DEP_IFLAGS) $(LINK_FLAGS) $< -o $@

$(ASAN_TEST_BINARY): $(ASAN_TEST_OBJECTS)
$(CXX) $(ASAN_FLAGS) $^ $(TEST_LINK_FLAGS) -o $@

Expand All @@ -72,6 +86,9 @@ $(UBSAN_TEST_BINARY): $(UBSAN_TEST_OBJECTS)
test: $(TEST_BINARY)
./$< --gtest_shuffle --gtest_random_seed=0

dudect_test: $(DUDECT_TEST_BINARIES)
$(foreach binary,$^,timeout 3.0m ./$(binary);)

asan_test: $(ASAN_TEST_BINARY)
./$< --gtest_shuffle --gtest_random_seed=0

Expand Down Expand Up @@ -100,5 +117,5 @@ perf: $(PERF_BINARY)
clean:
rm -rf $(BUILD_DIR)

format: $(KYBER_SOURCES) $(TEST_SOURCES) $(BENCHMARK_SOURCES)
format: $(KYBER_SOURCES) $(TEST_SOURCES) $(DUDECT_TEST_SOURCES) $(BENCHMARK_SOURCES)
clang-format -i $^
93 changes: 76 additions & 17 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
> [!CAUTION]
> This Kyber implementation is conformant with Kyber specification https://pq-crystals.org/kyber/data/kyber-specification-round3-20210804.pdf and I also *try* to make it constant-time but be informed that it is not yet audited. *If you consider using it in production, be careful !*
> This Kyber implementation is conformant with Kyber specification https://pq-crystals.org/kyber/data/kyber-specification-round3-20210804.pdf and I also *try* to make it timing leakage free, using **dudect** (see https://github.com/oreparaz/dudect) but be informed that it is not yet audited. *If you consider using it in production, be careful !*

# kyber
CRYSTALS-Kyber: Post-Quantum Public-key Encryption &amp; Key-establishment Algorithm
Expand All @@ -10,22 +10,19 @@ Kyber is being standardized by NIST as post-quantum secure key encapsulation mec

Kyber offers an *IND-CCA2-secure* Key Encapsulation Mechanism - its security is based on the hardness of solving the learning-with-errors (LWE) problem in module (i.e. structured) lattices.

Kyber Key Encapsulation Mechanism is built on top of *IND-CPA-secure Kyber Public Key Encryption*, where two communicating parties, both generating their key pairs, while publishing their public keys to each other, can encrypt fixed length ( = 32 -bytes ) message using peer's public key. Cipher text can be decrypted by corresponding secret key ( which is private to the keypair owner ) and 32 -bytes message can be recovered back. Then a slightly tweaked Fujisaki–Okamoto (FO) transform is applied on *IND-CPA-secure Kyber PKE* - giving us the *IND-CCA2-secure KEM* construction. In KEM scheme, two parties interested in establishing a secure communication channel over public & insecure channel, can generate a shared secret key ( of arbitrary byte length ) from a key derivation function ( i.e. KDF which is SHAKE256 Xof in this context ) which is obtained by both of these parties as result of seeding SHAKE256 Xof with same secret. This secret is 32 -bytes and that's what is communicated by sender to receiver using underlying Kyber PKE scheme.
Kyber Key Encapsulation Mechanism is built on top of *IND-CPA-secure Kyber Public Key Encryption*, where two communicating parties, both generating their key pairs, while publishing only their public keys to each other, can encrypt fixed length ( = 32 -bytes ) message using peer's public key. Cipher text can be decrypted by corresponding secret key ( which is private to the keypair owner ) and 32 -bytes message can be recovered back. Then a slightly tweaked Fujisaki–Okamoto (FO) transform is applied on *IND-CPA-secure Kyber PKE* - giving us the *IND-CCA2-secure KEM* construction. In KEM scheme, two parties interested in establishing a secure communication channel over public & insecure channel, can generate a shared secret key ( of arbitrary byte length ) from a key derivation function ( i.e. KDF which is SHAKE256 Xof in this context ) which is obtained by both of these parties as result of seeding SHAKE256 Xof with same secret. This secret is 32 -bytes and that's what is communicated by sender to receiver using underlying Kyber PKE scheme.

Algorithm | Input | Output
--- | :-: | --:
KEM KeyGen | - | Public Key and Secret Key
Encapsulation | Public Key | Cipher Text and SHAKE256 KDF
Decapsulation | Secret Key and Cipher Text | SHAKE256 KDF

Here I'm maintaining `kyber` - a header-only and easy-to-use ( see more in [usage](#usage) ) C++ library implementing Kyber KEM, supporting Kyber-{512, 768, 1024} parameter sets, as defined in table 1 of Kyber specification. `sha3` and `subtle` are two dependencies of this library, which are pinned to specific git commits, using git submodule.
Here I'm maintaining `kyber` - a header-only and easy-to-use ( see more in [usage](#usage) ) C++ library implementing Kyber KEM, supporting Kyber-{512, 768, 1024} parameter sets, as defined in table 1 of Kyber specification. `sha3`, `subtle` and `dudect` (for timing leakage tests) are dependencies of this library, which are pinned to specific git commits, using git submodule.

> [!NOTE]
> Find Kyber specification https://pq-crystals.org/kyber/data/kyber-specification-round3-20210804.pdf - this is the document that I followed when implementing Kyber. I suggest you go through the specification to get an in-depth understanding of Kyber PQC suite.

> [!NOTE]
> Find progress of NIST PQC standardization effort @ https://csrc.nist.gov/projects/post-quantum-cryptography.

## Prerequisites

- A C++ compiler with C++20 standard library such as `clang++`/ `g++`.
Expand Down Expand Up @@ -74,37 +71,99 @@ make ubsan_test -j # Run tests with UndefinedBehaviourSanitizer enabled
```

```bash
Note: Randomizing tests' orders with a seed of 61247 .
Note: Randomizing tests' orders with a seed of 50193 .
[==========] Running 10 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 10 tests from KyberKEM
[ RUN ] KyberKEM.ArithmeticOverZq
[ OK ] KyberKEM.ArithmeticOverZq (116 ms)
[ RUN ] KyberKEM.NumberTheoreticTransform
[ OK ] KyberKEM.NumberTheoreticTransform (0 ms)
[ OK ] KyberKEM.ArithmeticOverZq (126 ms)
[ RUN ] KyberKEM.Kyber768KeygenEncapsDecaps
[ OK ] KyberKEM.Kyber768KeygenEncapsDecaps (0 ms)
[ RUN ] KyberKEM.Kyber512KeygenEncapsDecaps
[ OK ] KyberKEM.Kyber512KeygenEncapsDecaps (0 ms)
[ RUN ] KyberKEM.Kyber768KnownAnswerTests
[ OK ] KyberKEM.Kyber768KnownAnswerTests (8 ms)
[ RUN ] KyberKEM.Kyber512KnownAnswerTests
[ OK ] KyberKEM.Kyber512KnownAnswerTests (5 ms)
[ RUN ] KyberKEM.CompressDecompressZq
[ OK ] KyberKEM.CompressDecompressZq (94 ms)
[ OK ] KyberKEM.CompressDecompressZq (98 ms)
[ RUN ] KyberKEM.Kyber1024KnownAnswerTests
[ OK ] KyberKEM.Kyber1024KnownAnswerTests (13 ms)
[ RUN ] KyberKEM.Kyber768KeygenEncapsDecaps
[ OK ] KyberKEM.Kyber768KeygenEncapsDecaps (0 ms)
[ RUN ] KyberKEM.Kyber512KeygenEncapsDecaps
[ OK ] KyberKEM.Kyber512KeygenEncapsDecaps (0 ms)
[ RUN ] KyberKEM.NumberTheoreticTransform
[ OK ] KyberKEM.NumberTheoreticTransform (0 ms)
[ RUN ] KyberKEM.PolynomialSerialization
[ OK ] KyberKEM.PolynomialSerialization (0 ms)
[ RUN ] KyberKEM.Kyber1024KeygenEncapsDecaps
[ OK ] KyberKEM.Kyber1024KeygenEncapsDecaps (0 ms)
[----------] 10 tests from KyberKEM (238 ms total)
[----------] 10 tests from KyberKEM (253 ms total)

[----------] Global test environment tear-down
[==========] 10 tests from 1 test suite ran. (238 ms total)
[==========] 10 tests from 1 test suite ran. (253 ms total)
[ PASSED ] 10 tests.
```

In case you're interested in running timing leakage tests using `dudect`, execute following

> ![NOTE]
> `dudect` is integrated into this library implementation of Kyber KEM to find any sort of timing leakages. It checks for constant-timeness of all *vital* functions including Fujisaki-Okamoto transform, used in decapsulation step. It doesn't check constant-timeness of function which samples public matrix `A`, because that fails the check anyway, due to use of uniform rejection sampling. As matrix `A` is public, it's not critical that it must be *strictly* constant-time.

```bash
make dudect_test -j # Only on x86_64 machine
# Each executable is run for at max 3 mins.
```

> ![TIP]
> `dudect` documentation says if `t` statistic is < 10, we're *probably* good, yes **probably**. You may want to read `dudect` documentation @ https://github.com/oreparaz/dudect. Also you might find the original paper @ https://ia.cr/2016/1123 interesting.

```bash
meas: 0.10 M, max t: +2.35, max tau: 7.27e-03, (5/tau)^2: 4.73e+05. For the moment, maybe constant time.
meas: 0.12 M, max t: +1.89, max tau: 5.57e-03, (5/tau)^2: 8.06e+05. For the moment, maybe constant time.
meas: 3.10 M, max t: +2.48, max tau: 1.41e-03, (5/tau)^2: 1.26e+07. For the moment, maybe constant time.
meas: 2.07 M, max t: +1.72, max tau: 1.20e-03, (5/tau)^2: 1.75e+07. For the moment, maybe constant time.
meas: 2.10 M, max t: +1.66, max tau: 1.14e-03, (5/tau)^2: 1.91e+07. For the moment, maybe constant time.
meas: 6.01 M, max t: +1.67, max tau: 6.82e-04, (5/tau)^2: 5.37e+07. For the moment, maybe constant time.
meas: 7.31 M, max t: +1.67, max tau: 6.18e-04, (5/tau)^2: 6.54e+07. For the moment, maybe constant time.
meas: 7.96 M, max t: +2.04, max tau: 7.22e-04, (5/tau)^2: 4.80e+07. For the moment, maybe constant time.
meas: 9.41 M, max t: +1.70, max tau: 5.54e-04, (5/tau)^2: 8.14e+07. For the moment, maybe constant time.
meas: 9.89 M, max t: +1.59, max tau: 5.05e-04, (5/tau)^2: 9.78e+07. For the moment, maybe constant time.
meas: 0.99 M, max t: +2.16, max tau: 2.18e-03, (5/tau)^2: 5.28e+06. For the moment, maybe constant time.
meas: 0.14 M, max t: +2.04, max tau: 5.44e-03, (5/tau)^2: 8.45e+05. For the moment, maybe constant time.
meas: 2.31 M, max t: +2.90, max tau: 1.90e-03, (5/tau)^2: 6.89e+06. For the moment, maybe constant time.
meas: 3.03 M, max t: +3.55, max tau: 2.04e-03, (5/tau)^2: 5.99e+06. For the moment, maybe constant time.
meas: 3.56 M, max t: +3.23, max tau: 1.71e-03, (5/tau)^2: 8.56e+06. For the moment, maybe constant time.
meas: 4.18 M, max t: +2.42, max tau: 1.18e-03, (5/tau)^2: 1.78e+07. For the moment, maybe constant time.
meas: 7.16 M, max t: +2.40, max tau: 8.96e-04, (5/tau)^2: 3.12e+07. For the moment, maybe constant time.
meas: 8.25 M, max t: +2.21, max tau: 7.68e-04, (5/tau)^2: 4.24e+07. For the moment, maybe constant time.
meas: 9.20 M, max t: +2.27, max tau: 7.48e-04, (5/tau)^2: 4.47e+07. For the moment, maybe constant time.
meas: 10.23 M, max t: +2.45, max tau: 7.66e-04, (5/tau)^2: 4.26e+07. For the moment, maybe constant time.
meas: 6.93 M, max t: +2.54, max tau: 9.65e-04, (5/tau)^2: 2.69e+07. For the moment, maybe constant time.
meas: 7.49 M, max t: +2.54, max tau: 9.30e-04, (5/tau)^2: 2.89e+07. For the moment, maybe constant time.
meas: 8.04 M, max t: +2.16, max tau: 7.61e-04, (5/tau)^2: 4.32e+07. For the moment, maybe constant time.
meas: 8.57 M, max t: +2.08, max tau: 7.10e-04, (5/tau)^2: 4.96e+07. For the moment, maybe constant time.
meas: 9.15 M, max t: +2.03, max tau: 6.72e-04, (5/tau)^2: 5.54e+07. For the moment, maybe constant time.
meas: 0.15 M, max t: +1.80, max tau: 4.60e-03, (5/tau)^2: 1.18e+06. For the moment, maybe constant time.
meas: 8.04 M, max t: +1.90, max tau: 6.70e-04, (5/tau)^2: 5.57e+07. For the moment, maybe constant time.
meas: 10.31 M, max t: +2.04, max tau: 6.35e-04, (5/tau)^2: 6.20e+07. For the moment, maybe constant time.
meas: 10.38 M, max t: +2.05, max tau: 6.35e-04, (5/tau)^2: 6.19e+07. For the moment, maybe constant time.
meas: 9.19 M, max t: +1.99, max tau: 6.56e-04, (5/tau)^2: 5.80e+07. For the moment, maybe constant time.
meas: 9.24 M, max t: +2.04, max tau: 6.69e-04, (5/tau)^2: 5.58e+07. For the moment, maybe constant time.
meas: 1.02 M, max t: +1.98, max tau: 1.97e-03, (5/tau)^2: 6.47e+06. For the moment, maybe constant time.
meas: 2.10 M, max t: +2.10, max tau: 1.45e-03, (5/tau)^2: 1.19e+07. For the moment, maybe constant time.
meas: 1.40 M, max t: +1.81, max tau: 1.52e-03, (5/tau)^2: 1.08e+07. For the moment, maybe constant time.
meas: 1.41 M, max t: +2.21, max tau: 1.86e-03, (5/tau)^2: 7.22e+06. For the moment, maybe constant time.
meas: 1.81 M, max t: +2.95, max tau: 2.19e-03, (5/tau)^2: 5.20e+06. For the moment, maybe constant time.
meas: 2.54 M, max t: +2.96, max tau: 1.86e-03, (5/tau)^2: 7.26e+06. For the moment, maybe constant time.
meas: 3.15 M, max t: +2.77, max tau: 1.56e-03, (5/tau)^2: 1.02e+07. For the moment, maybe constant time.
meas: 4.94 M, max t: +2.46, max tau: 1.11e-03, (5/tau)^2: 2.04e+07. For the moment, maybe constant time.
meas: 0.91 M, max t: +2.06, max tau: 2.17e-03, (5/tau)^2: 5.32e+06. For the moment, maybe constant time.
meas: 1.21 M, max t: +2.19, max tau: 1.99e-03, (5/tau)^2: 6.32e+06. For the moment, maybe constant time.
meas: 1.44 M, max t: +2.24, max tau: 1.87e-03, (5/tau)^2: 7.17e+06. For the moment, maybe constant time.
meas: 8.74 M, max t: +2.32, max tau: 7.87e-04, (5/tau)^2: 4.04e+07. For the moment, maybe constant time.
meas: 9.65 M, max t: +2.42, max tau: 7.80e-04, (5/tau)^2: 4.11e+07. For the moment, maybe constant time.
meas: 10.57 M, max t: +2.22, max tau: 6.82e-04, (5/tau)^2: 5.38e+07. For the moment, maybe constant time.
meas: 11.71 M, max t: +2.45, max tau: 7.16e-04, (5/tau)^2: 4.88e+07. For the moment, maybe constant time.
```

## Benchmarking

For benchmarking Kyber KEM routines ( i.e. keygen, encaps and decaps ) for various suggested parameter sets, you have to issue.
Expand Down
16 changes: 8 additions & 8 deletions benchmarks/bench_kem.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@ void
bench_keygen(benchmark::State& state)
{
constexpr size_t slen = 32;
constexpr size_t pklen = kyber_utils::get_kem_public_key_len<k>();
constexpr size_t sklen = kyber_utils::get_kem_secret_key_len<k>();
constexpr size_t pklen = kyber_utils::get_kem_public_key_len(k);
constexpr size_t sklen = kyber_utils::get_kem_secret_key_len(k);

std::vector<uint8_t> d(slen);
std::vector<uint8_t> z(slen);
Expand Down Expand Up @@ -44,9 +44,9 @@ void
bench_encapsulate(benchmark::State& state)
{
constexpr size_t slen = 32;
constexpr size_t pklen = kyber_utils::get_kem_public_key_len<k>();
constexpr size_t sklen = kyber_utils::get_kem_secret_key_len<k>();
constexpr size_t ctlen = kyber_utils::get_kem_cipher_len<k, du, dv>();
constexpr size_t pklen = kyber_utils::get_kem_public_key_len(k);
constexpr size_t sklen = kyber_utils::get_kem_secret_key_len(k);
constexpr size_t ctlen = kyber_utils::get_kem_cipher_len(k, du, dv);
constexpr size_t klen = 32;

std::vector<uint8_t> d(slen);
Expand Down Expand Up @@ -94,9 +94,9 @@ void
bench_decapsulate(benchmark::State& state)
{
constexpr size_t slen = 32;
constexpr size_t pklen = kyber_utils::get_kem_public_key_len<k>();
constexpr size_t sklen = kyber_utils::get_kem_secret_key_len<k>();
constexpr size_t ctlen = kyber_utils::get_kem_cipher_len<k, du, dv>();
constexpr size_t pklen = kyber_utils::get_kem_public_key_len(k);
constexpr size_t sklen = kyber_utils::get_kem_secret_key_len(k);
constexpr size_t ctlen = kyber_utils::get_kem_cipher_len(k, du, dv);
constexpr size_t klen = 32;

std::vector<uint8_t> d(slen);
Expand Down
1 change: 1 addition & 0 deletions dudect
Submodule dudect added at 04235f
17 changes: 10 additions & 7 deletions include/compression.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -12,20 +12,23 @@ namespace kyber_utils {
//
// See top of page 5 of Kyber specification
// https://pq-crystals.org/kyber/data/kyber-specification-round3-20210804.pdf
//
// Following implementation collects inspiration from https://github.com/FiloSottile/mlkem768/blob/cffbfb96c407b3cfc9f6e1749475b673794402c1/mlkem768.go#L395-L425.
template<size_t d>
static inline constexpr field::zq_t
compress(const field::zq_t x)
requires(kyber_params::check_d(d))
{
constexpr uint16_t t0 = 1u << d;
constexpr uint32_t t1 = field::Q >> 1;
constexpr uint16_t mask = (1u << d) - 1;

const uint32_t t2 = x.raw() << d;
const uint32_t t3 = t2 + t1;
const uint16_t t4 = static_cast<uint16_t>(t3 / field::Q);
const uint16_t t5 = t4 & (t0 - 1);
const auto dividend = x.raw() << d;
const auto quotient0 = static_cast<uint32_t>((static_cast<uint64_t>(dividend) * field::R) >> (field::RADIX_BIT_WIDTH * 2));
const auto remainder = dividend - quotient0 * field::Q;

const auto quotient1 = quotient0 + ((((field::Q / 2) - remainder) >> 31) & 1);
const auto quotient2 = quotient1 + (((field::Q + (field::Q / 2) - remainder) >> 31) & 1);

return field::zq_t(t5);
return field::zq_t(static_cast<uint16_t>(quotient2) & mask);
}

// Given an element x ∈ [0, 2^d) | d < round(log2(q)), this routine decompresses
Expand Down
Loading
Loading