From e12620a8a909805755ae1e8b8552c187202a9f3f Mon Sep 17 00:00:00 2001 From: Frederic Lecaille Date: Fri, 2 Aug 2024 11:05:45 +0200 Subject: [PATCH] BUG/MINOR: quic: Too shord datagram during O-RTT handshakes (aws-lc only) By "aws-lc only", one means that this bug was first revealed by aws-lc stack. This does not mean it will not appeared for new versions of other TLS stacks which have never revealed this bug. This bug was reported by Ilya (@chipitsine) in GH #2657 where some QUIC interop tests (resumption, zerortt) could lead to crash with haproxy compiled against aws-lc TLS stack. These crashed were triggered by this BUG_ON() which detects that too short datagrams with at least one ack-eliciting Initial packet inside could be built. <0>2024-07-31T15:13:42.562717+02:00 [01|quic|5|quic_tx.c:739] qc_prep_pkts(): next encryption level : qc@0x61d000041080 idle_timer_task@0x60d000006b80 flags=0x6000058 FATAL: bug condition "first_pkt->type == QUIC_PACKET_TYPE_INITIAL && (first_pkt->flags & (1UL << 0)) && length < 1200" matched at src/quic_tx.c:163 call trace(12): | 0x563ea447bc02 [ba d9 00 00 00 48 8d 35]: main-0x1958ce | 0x563ea4482703 [e9 73 fe ff ff ba 03 00]: qc_send+0x17e4/0x1b5d | 0x563ea4488ab4 [85 c0 0f 85 00 f6 ff ff]: quic_conn_io_cb+0xab1/0xf1c | 0x563ea468e6f9 [48 c7 c0 f8 55 ff ff 64]: run_tasks_from_lists+0x173/0x9c2 | 0x563ea468f24a [8b 7d a0 29 c7 85 ff 0f]: process_runnable_tasks+0x302/0x6e6 | 0x563ea4610893 [83 3d aa 65 44 00 01 0f]: run_poll_loop+0x6e/0x57b | 0x563ea4611043 [48 8b 1d 46 c7 1d 00 48]: main-0x48d | 0x7f64d05fb609 [64 48 89 04 25 30 06 00]: libpthread:+0x8609 | 0x7f64d0520353 [48 89 c7 b8 3c 00 00 00]: libc:clone+0x43/0x5e That said everything was correctly done by qc_prep_ptks() to prevent such a case. But this relied on the hypothesis that the list of encryption levels it used was always built in the same order as follows for 0-RTT sessions: initial, early-data, handshake, application But this order is determined but the order the TLS stack derives the secrets for these encryption levels. For aws-lc, this order is not the same but as follows: initial, handshake, application, early-data During 0-RTT sessions, the server may have to build three ack-eliciting packets (with CRYPTO data inside) to reply to the first client packet: initial, hanshake, application. qc_prep_pkts() adds a PADDING frame to the last built packet for the last encryption level in the list. But after application level encryption, there is early-data encryption level. This prevented qc_prep_pkts() to build a padded applicaiton level last packet to send a 1200-bytes datagram. To fix this, always insert early-data encryption level after the initial encryption level into the encryption levels list when initializing this encryption level from quic_conn_enc_level_init(). Must be backported as far as 2.9. --- src/quic_tls.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/src/quic_tls.c b/src/quic_tls.c index 5174c2dc9e64c..bac6c0c9da8a9 100644 --- a/src/quic_tls.c +++ b/src/quic_tls.c @@ -243,7 +243,19 @@ static int quic_conn_enc_level_init(struct quic_conn *qc, goto err; } - LIST_APPEND(&qc->qel_list, &qel->list); + /* Ensure early-data encryption is not inserted at the end of this ->qel_list + * list. This would perturbate the sender during handshakes. This latter adds + * PADDING frames to datagrams from the last encryption level in this list, + * for datagram with at least an ack-eliciting Initial packet inside. + * But a QUIC server has nothing to send from this early-data encryption + * level, contrary to the client. + * Here early-data is added after the Initial encryption level which is + * always already present. + */ + if (level == ssl_encryption_early_data) + LIST_APPEND(&qc->iel->list, &qel->list); + else + LIST_APPEND(&qc->qel_list, &qel->list); *el = qel; ret = 1; leave: