Skip to content

Latest commit

 

History

History
100 lines (60 loc) · 16.6 KB

key_agreement.md

File metadata and controls

100 lines (60 loc) · 16.6 KB

Key-Agreement for Secure Communication

Before messages can be exchanged securely in a communication system, the involved parties need to identify & authenticate each other, and then run any needed preliminary steps for further message exchanges. In most modern systems, this preliminary step involves a key-agreement protocol where a shared key is derived such that no single party has an undue influence in the final outcome. Despite being fundamental to most modern communication systems, some variation to this initial step of authentication and key-agreement is still fairly common. These variations can stem from differences in required security guarantees, message type and throughput, liveness of endpoints, compatibility with other protocols, ease of integration with existing systems and so on. Without necessary caution, a design can easily go from "fulfills minimum requirements" to one riddled with security loopholes, costly, unappealing and difficult to scale.

Lets begin by providing a bit more detail on the above mentioned parameters that help pick suitable key-agreement protocol for a communication system.

  1. Message Type and Throughput

    The reason message exchange is done with symmetric encryption is because it incurs less computational cost for messages of arbitrary length compared to doing it with asymmetric encryption. This makes key-agreement protocols attractive for systems requiring larger throughput. Encrypting message with a shared key is useful under private group messaging with several participants or under multi-device per user setting. Instead of sending a message to each participant by encrypting with their public key, a single message can be sent that is readable by all.

  2. Security Requirements

    Messages exchanged in systems using key-agreement protocols are not publicly verifiable (or have deniability property) which means the encrypted message could have been generated by any of the parties in possession of the shared key making it difficult to know the source of encrypted message. Deniability property is favored in cases like secure e2e messaging while it is unacceptable for cryptographic logging schemes, ransomware encryption, etc.

    Additionally end to end encrypted systems implement Forward Secrecy and Post Compromise Security features. These features ensure that the protocol refreshes shared key periodically (also when necessary). That way a bad actor in possession of shared key won't be able to deciper any other message encrypted with a different shared key. This property might also be useful if messages are logged in blockchain, which are like public bulletin boards. Without key refreshing, every single message exchanged so far, which is available in blockchains, can be decrypted.

  3. Liveness of Endpoints

    Senders and receivers may not be online to run key-agreement protocol at the same time. This makes running the protocol difficult because the one who initiates the exchange has to hold secret values securely until the intended recepient can agree and send back an acknowledgement. Messaging in such situation is termed as asynchronous messaging and is common when communicating through smartphones, blockchains, etc. Protocols like Extended Triple Diffie Hellman (X3DH, in short) are designed for asynchronous messaging and has found usage in e2e messaging applications like Signal, Whatsapp and even in web3 applications like XMTP, Lens Protocol, etc.

  4. Protocol Compatibility

    Key-agreement protocols enforce security features necessitated by the usecase but they form only a cog in the overall design. Any other algorithms that precedes or comes down the line should also preserve the features required by the system. For example X3DH is used for key-agreement in Signal Messaging app. It is used in unison with Double Ratchet algorithm, which forms the basis for payload exchange all the while preserving the same security features. Similarly, anonymity and end-to-encryption is preserved by zero knowledge proof based systems.

  5. Ease of Integration

    This feature is loosely connected to key-agreement protocol, and instead aligns more with authentication mechanisms. The term integration here refers to more than technical compatibilty, which was already covered in the point above. Instead, here, we intend to discuss how a suitable protocol helps aid the communication system to more conveniently integrate with other applications and end-users. For example, lets consider a system where if you needed to run a key-agreement protocol, you'd need to be physically together, or consider a system where key-agreements can be run but only with the users which are already on your phone's contact list. Such designs make it difficult to scale the application to a larger userbase which in turn makes the application itself less appealing to new and existing users. Had authentication mechanisms been token gated such that holders of some NFT could authenticate and participate in the communication, then we'd be importing social value derived elsewhere in the ecosystem directly onto our system. In short, technical design should be wary of and facilitate inflow of business value.

Next we'll be taking a closer look on some of these protocols.

Extended Triple Diffie Hellman (X3DH) Protocol

X3DH protocol is a key agreement protocol used by secure messaging applications like Signal. It offers forward secrecy, deniability and is designed to work under asynchronous settings. In the following section, we progress from basic key-agreement protocol up to X3DH protocol describing the benefits of improvements in each step.

A basic Diffie-Hellman (DH) key-agreement protocol involves sender and receiver using their identity keys to calculate a shared key. A slight variation uses signed ephemeral key to calculate shared key. In the following diagrams, A and B are identity keys and a and b are ephemeral keys for a couple of users. The first (left) diagram shows ephemeral keys (a and b) signed by identity keys (A and B) and used for key-agreement. The benefit of using ephemeral key, which is renewed between protocol runs, is that it enables forward secrecy and provides protection against security breach if identity key was compromised.

An improvement to this protocol is shown on the next (right) diagram where instead of using identity key to sign respective ephemeral keys, three Diffie-Hellman shared keys are calculated and then combined to form a final shared key. This latter approach improves deniability and reduces payload size.

otr-current

otr-simplified

The description so far is for synchronous messaging, where both endpoints are online at the same time. For asynchronous settings, a user publishes a pre-key bundle, which the other user uses to establish shared key with the first user. For example, Bob publishes pre-key bundle which consists of the following.

  • Bob's identity key IKB generated once
  • Bob's signed prekey SPKB and prekey signature Sig(IKB, Encode(SPKB)) renewed periodically (weeks, months)
  • A set of Bob's one-time prekeys (OPKB1, OPKB2, OPKB3, ...) with each one for a separate protocol run

The use of ephemeral and one-time prekeys help enforce forward secrecy and avoid protocol replay. Alice fetches the pre-key bundle, uses her identity key, generates a new ephemeral key and calculates a shared key with the steps given below. Here DH(PK1, PK2) represents Elliptic Curve Diffie Hellman Function whose output is a shared key derived from key-pair PK1 and public-key PK2, SK is the combined shared key used for message exchange and KDF is Key-Derivation function.

CalculationSteps Keys KeyDescription
DH1 = DH(IKA, SPKB) IKA Alice's Identity Key
DH2 = DH(EKA, IKB) EKA Alice's Ephemeral Key
DH3 = DH(EKA, SPKB) IKB Bob's Identity Key
DH4 = DH(EKA, OPKB) SPKB Bob's Signed Prekey
SK = KDF(DH1 || DH2 || DH3 || DH4) OPKB Bob's One Time Prekey

Afterwards, Alice provides Bob with a message containing her identity and ephemeral keys, Bob's prekey used and additional information encrypted with shared key. Bob receives the message and will be able to calculate shared key to communicate with Alice further. The protocol is widely used for e2e encrypted messaging and has found its use in web3 applications as well where it has been used to exchange textual message, gameplay interaction signals, computational results and secrets, multimedia files, NFT metadata, application assets, etc.

Private Group Messaging Protocols

Secure group communication means users should be able to communicate such that the message is readable only to a member of the group. Despite its widespread prevalance in messaging applications, a fully secure, efficient and standard method of private group communication is still in works. The difficulty lies primarly with implementing an efficient way of managing groups (i.e. initial key exchange, addition and removal of users, update to existing user keys) in an end-to-end encrypted messaging protocol. An explanation of some known approaches given below will help establish the point.

  1. Pairwise Channels

    Pairwise Channel method means each user forms a pair with every other user and establishes a unique secure channel. A message dropped to the group will have to be separately channeled to every partipant. Assume a group of N users who want to participate in a group chat. Each user will provide his public keys to every other user which means it takes N2 messages in total to exchange all keys. To send a group message, a user will send separate encrypted copies of the same message to all members such that each message can only be read by a specific user, thus taking N rounds for a single message. Now if a user needs to be added, each of the existing users will have to provide their keys to the new user and also obtain the new user's keys taking 2N rounds. For an existing member who wants to update his keys, the broadcast will take N rounds. Finally if a user is removed from the group, the entire key exchange has to be redone taking N2 rounds. Despite being the most secure way of group communication, the protocol is the least scalable. The communication overhead of maintaining secure keys can deny the ability to exchange application messages in a large group.

  1. Sender Keys

    Instead of forming a separate channel to every other member, a user will form a single channel by passing the same symmetric sender key to each of them. Now if a user wants to send a message, he will be able to boardcast a single message encrypted with the sender key. The overhead with this approach lies mainly with the key-exchange which has to be done initially and every time a user is removed from the group. During key-exchange, a user's sender key will have to be securely provided to every other user which can only be done by encrypting with the recepient's respective publick keys. As such the key-exchange step still takes N2 rounds in total considering all participants. The efficiency gained with this approach is realized during exchanging application messages only. This approach to group messaging is the one followed by popular group messaging applications like Signal and Whatsapp.

  1. Trees

    Tree based solutions like Asynchronous Ratcheting Trees, TreeKEM and Ratchet Tree protocols aim to reduce the number of necessary updates during key-exchange, which we saw take O(N2) steps during initial key-exchange or when a member of a group was removed. Though variants of tree based solutions have some differences amongst them, the primary approach has been to arrange the key-value pairs for each of the users on leaf nodes of a binary tree; the intermediate nodes aggregate nodes below them recursively until we reach the root node, which holds the group secret used to derive shared encryption key for securely exchanging messages within the group. Consequently, when a user has to be removed from the group, only some select nodes, totalling to O(log N) in average and O(N) in worst case, need to be updated.

    The solution is very useful especially when the size of groups tend to be very large and frequent change in group members' keys is very likely making it a scalable approach. We can also reason that an efficient group management solution is eseential for a private group messaging system to support integral security guarantees. The ability to add or remove groups or to update existing users' secrets is a fundamental necessity and any solution that can not achieve it efficiently can be susceptible to attacks from bad actors given that there is sufficient incentive to do so. As an added benefit, the shared secret derived from root node can be paired with other approach to provide different security guarantees based on requirement. For example, the shared secret, itself, could be used to exchange all messages by all members making key management easier; the shared secret could be used as a seed to ratcheting protocols that derive and refresh message encryption key periodically, thus, providing added security guarantees.

    As such tree solutions are group management solutions which understand that the need for scalable group management is more of a basic security requirement than a nice-to-have feature. Usecases that do not guarantee that they will not scale and as such won't need efficient solution or do not have other safeguards in place to thwart attacks should consider worst case situations early in their design process.

MLS Protocol defines a standard for private group messaging which defines the following:

  • Ratchet Tree as core alogirthm to represent and manage group
  • Extensions to generate message encryption keys, with secret trees to be specific
  • Message Types (metadata, handshake and application messages) and their formats
  • Control flow for group management, e.g. how changes to group members' list are proposed and commited, how a user can either request being added or can be offered to get added, how authentication credentials' type (or version) and content are agreed upon, etc
  • Enablers of security requirements

MLS Protocol, in itself, is vast enough to require a separate discussion to thoroughly understands its ins and outs. For now, we'll mention few existing resources that cover aspects of it, which are:

Summary

Picking the right key-agreement protocol depends very much on the usecase you're trying to implement it on and what features it requires. Paired with how authentication mechanism, key-agreement forms the initial gateway to onboard the intended user(s) from a pool of potentially malicious actor, and only afterwards can we securely and efficiently build applications on top.