Skip to content

Latest commit

 

History

History
101 lines (74 loc) · 7.88 KB

cql_protocol_V5_framing.asc

File metadata and controls

101 lines (74 loc) · 7.88 KB

Framing Format Changes For Native Protocol V5

The framing format for V5 is based on the internode messaging system as it was redesigned for 4.0 in CASSANDRA-15066. In prior versions, the relationship between Message and Frame is 1:1, each CQL message being contained in exactly one frame and every frame containing only a single message. The changes introduced here enable frames to contain multiple small messages and for oversized messages to be broken into multiple frames.

The introduction of this new format, brings an unfortunate naming clash between frames from the previous format, with a 1:1 relationship to messages, and the new outer frames. Taking inspiration from SMTP (RFC-5321), what was previously referred to as a Frame is better described as an Envelope which comprises a CQL message along with some attendant metadata.

New Frame Format

In general, a frame comprises a header, payload and trailer; this proposal introduces two specific frame formats, compressed and uncompressed. In both cases, the payload is a stream of CQL Envelopes, each containing a single CQL Message. In effect, the new framing format is a simple wrapper around the previous protocol.

In all cases, a frame may or may not be self contained. If self contained, the payload includes one or more complete CQL envelopes and can be fully processed immediately. Otherwise, the payload contains some part of a large CQL envelope, which has been split into its own sequence of outer frames. These are expected to be transmitted/received in order, so a processor can accumulate them as they arrive and process them once all have been received.

The header contains length information for the payload, whether or not the frame is self contained and a CRC to protect the integrity of the header itself. There are slight variations in the header format between the compressed and uncompressed variants.

The payload is opaque as far as the framing format is concerned, modulo the self contained variation.

The trailer contains a CRC to protect the integrity of the payload. As the payload includes the CQL envelopes including both the message and the attached metadata, such as the stream ID, flags and opcode, this adds a level of protection that previously wasn’t available (for example from the implementation in CASSANDRA-13304).

Uncompressed Format

The uncompressed variant uses a 6 byte header containing payload length, self contained flag and CRC24 for the header. The max size for the payload is 128KiB, and is followed by its CRC32.

 1. Payload length               (17 bits)
 2. isSelfContained flag         (1 bit)
 3. Header padding               (6 bits)
 4. CRC24 of the header          (24 bits)
 5. Payload                      (up to 2 ^ 17 - 1 bits)
 6. Payload CRC32                (32 bits)

  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |          Payload Length         |C|           |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
           CRC24 of Header       |                               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               +
 |                                                               |
 +                                                               +
 |                            Payload                            |
 +                                                               +
 |                                                               |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 |                        CRC32 of Payload                       |
 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

LZ4 Compressed Format

The variant with LZ4 compression uses an 8 byte header, containing both the compressed and uncompressed lengths of the payload, the self contained flag and a CRC24 for the header. As with uncompressed frames, the max payload size is 128KiB and is followed by a CRC32 trailer. This is the CRC of the compressed payload.

1. Compressed length            (17 bits)
2. Uncompressed length          (17 bits)
3. {@code isSelfContained} flag (1 bit)
4. Header padding               (5 bits)
5. CRC24 of Header contents     (24 bits)
6. Compressed Payload           (up to 2 ^ 17 - 1 bits)
7. CRC32 of Compressed Payload  (32 bits)

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|        Compressed Length        |     Uncompressed Length
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
    |C|         |                 CRC24 of Header               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                                                               +
|                      Compressed Payload                       |
+                                                               +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                  CRC32 of Compressed Payload                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Protocol Changes

Other than the enclosure of CQL envelopes in frames, the changes required to the protocol itself are minimal. All message exchanges follow the same REQUEST/RESPONSE specifications as before and no changes are required to either the CQL envelope (frame) or CQL message formats. CQL envelope headers contain some information which is now redundant and currently ignored. The CQL envelope header contains the protocol version, which was always to some extent redundant as the version is set and enforced at the connection level. It was also previously possible to enable compression at the individual CQL envelope level. This is no longer an option, the outer framing format being responsible for compression, which is set for the lifetime of a connection and applies to all messages transmitted throughout it. To that end, the compression flag on the CQL envelope header is ignored in V5.

  • TODO - note on conditional compression

Protocol Negotiation

In order to support both V5 and earlier formats, the V5 outer framing format is not applied to message exchanges before a STARTUP exchange is completed. This means that the initial STARTUP message and any OPTIONS messages which precede it are expected without the outer framing. Likewise, the responses returned by the server (SUPPORTED for OPTIONS and either READY or AUTHENTICATE for STARTUP) are transmitted without the outer framing.

After sending the response to a STARTUP (READY or AUTHENTICATE), the server will begin encoding and decoding further transmissions according to the protocol version of that STARTUP message. Compression of the outer frames is dictated by the COMPRESSION option sent in the STARTUP message. Only LZ4 compression is currently supported for V5.

Note: OPTIONS requests may be sent by the client at any time in the connection lifecycle, both before and after the STARTUP exchange. As mentioned, those transmitted before STARTUP, as well as the SUPPORTED responses the server returns are not enclosed in outer frames. Any OPTIONS/SUPPORTED exchanges after the STARTUP exchange are formatted according to the protocol version. So in V5, these will be enclosed in outer frames.