The framing format for V5 is based on the internode messaging system as it was redesigned for 4.0 in CASSANDRA-15066. In prior versions, the relationship between Message and Frame is 1:1, each CQL message being contained in exactly one frame and every frame containing only a single message. The changes introduced here allow a frame to contain multiple small messages and allow an oversized message to be split across multiple frames.
The introduction of this new format brings an unfortunate naming clash between frames from the previous format, which have a 1:1 relationship to messages, and the new outer frames. Taking inspiration from SMTP (RFC-5321), what was previously referred to as a Frame is better described as an Envelope, which comprises a CQL message along with some attendant metadata.
In general, a frame comprises a header, payload and trailer; this proposal introduces two specific frame formats, compressed and uncompressed. In both cases, the payload is a stream of CQL Envelopes, each containing a single CQL Message. In effect, the new framing format is a simple wrapper around the previous protocol.
In all cases, a frame may or may not be self contained. If self contained, the payload includes one or more complete CQL envelopes and can be fully processed immediately. Otherwise, the payload contains some part of a large CQL envelope, which has been split into its own sequence of outer frames. These are expected to be transmitted/received in order, so a processor can accumulate them as they arrive and process them once all have been received.
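The packing rule described above can be sketched as follows. This is a minimal illustration, not the Cassandra implementation: the names Frame, pack_envelopes and reassemble are hypothetical, and the receiver is assumed to already know the total size of a large envelope (here it is simply passed in as expected_size).

```python
from dataclasses import dataclass
from typing import Iterable, List

MAX_PAYLOAD = (1 << 17) - 1  # largest value a 17 bit length field can hold


@dataclass
class Frame:
    self_contained: bool
    payload: bytes


def pack_envelopes(envelopes: Iterable[bytes], max_payload: int = MAX_PAYLOAD) -> List[Frame]:
    """Group small envelopes into self contained frames; split large ones."""
    frames: List[Frame] = []
    pending = bytearray()  # small envelopes waiting to share one frame

    def flush() -> None:
        if pending:
            frames.append(Frame(True, bytes(pending)))
            pending.clear()

    for env in envelopes:
        if len(env) > max_payload:
            # A large envelope gets its own sequence of frames,
            # none of which is self contained.
            flush()
            for start in range(0, len(env), max_payload):
                frames.append(Frame(False, env[start:start + max_payload]))
        else:
            if len(pending) + len(env) > max_payload:
                flush()
            pending.extend(env)
    flush()
    return frames


def reassemble(frames: Iterable[Frame], expected_size: int) -> bytes:
    """Accumulate in-order frames of one large envelope until it is complete."""
    buf = bytearray()
    for frame in frames:
        if frame.self_contained:
            raise ValueError("unexpected self contained frame inside a large envelope")
        buf.extend(frame.payload)
        if len(buf) >= expected_size:
            break
    if len(buf) != expected_size:
        raise ValueError("frames do not add up to the expected envelope size")
    return bytes(buf)
```

With a deliberately tiny max_payload of 32, two envelopes of 10 and 20 bytes share one self contained frame, while a 70 byte envelope is split across three frames that must be accumulated before processing.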
The header contains length information for the payload, whether or not the frame is self contained and a CRC to protect the integrity of the header itself. There are slight variations in the header format between the compressed and uncompressed variants.
The payload is opaque as far as the framing format is concerned, modulo the self contained variation.
The trailer contains a CRC to protect the integrity of the payload. Because the payload contains complete CQL envelopes, including both the message and the attached metadata such as the stream ID, flags and opcode, this adds a level of protection that previously wasn't available (for example from the implementation in CASSANDRA-13304).
The uncompressed variant uses a 6 byte header containing the payload length, the self contained flag and a CRC24 for the header. The max size for the payload is 128KiB (the 17 bit length field can express up to 2^17 - 1 bytes), and the payload is followed by its CRC32.
1. Payload length (17 bits)
2. isSelfContained flag (1 bit)
3. Header padding (6 bits)
4. CRC24 of the header (24 bits)
5. Payload (up to 2^17 - 1 bytes)
6. Payload CRC32 (32 bits)

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |          Payload Length         |C|           |               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
              CRC24 of Header       |                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
    |                                                               |
    +                                                               +
    |                            Payload                            |
    +                                                               +
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                        CRC32 of Payload                       |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
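To make the field widths concrete, the following sketch packs the first three header fields (length, flag, padding) into a single 24 bit value and unpacks them again. The bit placement is an assumption made for illustration (length in the low 17 bits, flag at bit 17, padding left as zero); the byte order on the wire and the CRC24 computation are not described in this section and are deliberately left out. The function names are hypothetical.

```python
from typing import Tuple

PAYLOAD_LENGTH_BITS = 17
MAX_PAYLOAD_LENGTH = (1 << PAYLOAD_LENGTH_BITS) - 1  # 131071
SELF_CONTAINED_BIT = 1 << PAYLOAD_LENGTH_BITS        # assumed position: bit 17
PADDING_BITS = 6
CRC24_BITS = 24

# 17 + 1 + 6 + 24 bits add up to the 6 byte header.
HEADER_BYTES = (PAYLOAD_LENGTH_BITS + 1 + PADDING_BITS + CRC24_BITS) // 8


def pack_header_fields(payload_length: int, self_contained: bool) -> int:
    """Return the 24 bit value holding length, flag and zeroed padding."""
    if not 0 <= payload_length <= MAX_PAYLOAD_LENGTH:
        raise ValueError("payload length does not fit in 17 bits")
    value = payload_length
    if self_contained:
        value |= SELF_CONTAINED_BIT
    return value


def unpack_header_fields(value: int) -> Tuple[int, bool]:
    """Inverse of pack_header_fields: return (payload_length, self_contained)."""
    return value & MAX_PAYLOAD_LENGTH, bool(value & SELF_CONTAINED_BIT)
```

A real encoder would append a CRC24 computed over these three bytes before writing the payload and its CRC32.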
The variant with LZ4 compression uses an 8 byte header, containing both the compressed and uncompressed lengths of the payload, the self contained flag and a CRC24 for the header. As with uncompressed frames, the max payload size is 128KiB and is followed by a CRC32 trailer. This is the CRC of the compressed payload.
1. Compressed length (17 bits)
2. Uncompressed length (17 bits)
3. isSelfContained flag (1 bit)
4. Header padding (5 bits)
5. CRC24 of Header contents (24 bits)
6. Compressed Payload (up to 2^17 - 1 bytes)
7. CRC32 of Compressed Payload (32 bits)

     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |        Compressed Length        |     Uncompressed Length
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |C|         |                CRC24 of Header                |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                                                               |
    +                                                               +
    |                       Compressed Payload                      |
    +                                                               +
    |                                                               |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |                  CRC32 of Compressed Payload                  |
    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
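The compressed header carries two lengths, so its leading fields occupy 40 bits rather than 24. The sketch below models them with a small dataclass. As before, the bit placement (compressed length in the low 17 bits, uncompressed length in the next 17, flag at bit 34) is an assumption for illustration, the class name is hypothetical, and the CRC24 and wire byte order are out of scope.

```python
from dataclasses import dataclass

LENGTH_BITS = 17
LENGTH_MASK = (1 << LENGTH_BITS) - 1
SELF_CONTAINED_BIT = 1 << (2 * LENGTH_BITS)  # assumed position: bit 34
FIELD_BITS = 2 * LENGTH_BITS + 1 + 5         # two lengths, flag, 5 padding bits


@dataclass(frozen=True)
class CompressedHeaderFields:
    compressed_length: int
    uncompressed_length: int
    self_contained: bool

    def encode(self) -> int:
        """Pack the fields into a 40 bit integer (padding bits left as zero)."""
        for length in (self.compressed_length, self.uncompressed_length):
            if not 0 <= length <= LENGTH_MASK:
                raise ValueError("length does not fit in 17 bits")
        value = self.compressed_length | (self.uncompressed_length << LENGTH_BITS)
        if self.self_contained:
            value |= SELF_CONTAINED_BIT
        return value

    @classmethod
    def decode(cls, value: int) -> "CompressedHeaderFields":
        """Inverse of encode."""
        return cls(
            compressed_length=value & LENGTH_MASK,
            uncompressed_length=(value >> LENGTH_BITS) & LENGTH_MASK,
            self_contained=bool(value & SELF_CONTAINED_BIT),
        )
```

Together with the 24 bit CRC24, the 40 bits of fields account for the 8 byte header.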
Other than the enclosure of CQL envelopes in frames, the changes required to the protocol itself are minimal. All message exchanges follow the same REQUEST/RESPONSE specifications as before and no changes are required to either the CQL envelope (frame) or CQL message formats. CQL envelope headers contain some information which is now redundant and currently ignored. The CQL envelope header contains the protocol version, which was always to some extent redundant as the version is set and enforced at the connection level. It was also previously possible to enable compression at the individual CQL envelope level. This is no longer an option: the outer framing format is responsible for compression, which is set for the lifetime of a connection and applies to all messages transmitted throughout it. To that end, the compression flag on the CQL envelope header is ignored in V5.
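As an illustration of ignoring the envelope-level compression flag, the sketch below parses an envelope header and decides whether the body should be treated as individually compressed. The 9 byte layout (version, flags, stream ID, opcode, body length) and the 0x01 compression flag are taken from the existing native protocol specification rather than from this section; the function and class names are hypothetical.

```python
import struct
from typing import NamedTuple

COMPRESSION_FLAG = 0x01
# Big-endian: version (1 byte), flags (1), stream ID (2), opcode (1), body length (4)
ENVELOPE_HEADER = struct.Struct(">BBhBi")


class EnvelopeHeader(NamedTuple):
    version: int
    flags: int
    stream_id: int
    opcode: int
    body_length: int


def parse_envelope_header(data: bytes) -> EnvelopeHeader:
    """Decode the fixed-size header at the start of an envelope."""
    return EnvelopeHeader(*ENVELOPE_HEADER.unpack_from(data))


def body_is_compressed(header: EnvelopeHeader, protocol_version: int) -> bool:
    """Honour the envelope compression flag only before V5.

    From V5 on, compression is a property of the outer frame, so the
    flag on the envelope header is ignored.
    """
    if protocol_version >= 5:
        return False
    return bool(header.flags & COMPRESSION_FLAG)
```

The same header bytes therefore yield different answers depending on the negotiated version: the flag is honoured on a V4 connection and ignored on a V5 one.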
- TODO - note on conditional compression
In order to support both V5 and earlier formats, the V5 outer framing format is not applied to message exchanges before a STARTUP exchange is completed. This means that the initial STARTUP message and any OPTIONS messages which precede it are expected without the outer framing. Likewise, the responses returned by the server (SUPPORTED for OPTIONS and either READY or AUTHENTICATE for STARTUP) are transmitted without the outer framing.
After sending the response to a STARTUP (READY or AUTHENTICATE), the server will begin encoding and decoding further transmissions according to the protocol version of that STARTUP message. Compression of the outer frames is dictated by the COMPRESSION option sent in the STARTUP message. Only LZ4 compression is currently supported for V5.
Note: OPTIONS requests may be sent by the client at any time in the connection lifecycle, both before and after the STARTUP exchange. As mentioned, those transmitted before STARTUP, as well as the SUPPORTED responses the server returns, are not enclosed in outer frames. Any OPTIONS/SUPPORTED exchanges after the STARTUP exchange are formatted according to the protocol version, so in V5 these will be enclosed in outer frames.
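The connection lifecycle described above amounts to a small piece of per-connection state. The following sketch tracks it on the server side; the class and method names are hypothetical, and the lowercase "lz4" option value is an assumption about how the COMPRESSION option is spelled.

```python
from typing import Dict, Optional


class ConnectionFraming:
    """Tracks which wire format applies to the next message on a connection."""

    def __init__(self) -> None:
        self.version: Optional[int] = None
        self.compression: Optional[str] = None
        self.startup_complete = False

    def on_startup(self, version: int, options: Dict[str, str]) -> None:
        # The STARTUP message itself still arrives without outer framing.
        self.version = version
        self.compression = options.get("COMPRESSION")

    def on_startup_response_sent(self) -> None:
        # READY or AUTHENTICATE has gone out unframed; framing applies from here on.
        self.startup_complete = True

    def frame_format(self) -> str:
        """Return 'unframed', 'uncompressed' or 'compressed'."""
        if not self.startup_complete or self.version is None or self.version < 5:
            return "unframed"
        if self.compression is None:
            return "uncompressed"
        if self.compression.lower() == "lz4":
            return "compressed"
        raise ValueError("only LZ4 compression is supported for V5")
```

An OPTIONS request handled before on_startup_response_sent is treated as unframed, while one handled afterwards on a V5 connection uses whichever outer frame format was negotiated.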