Note: This framework was built while working on the book Patterns of Distributed Systems. I use this to teach replication techniques in the workshops that I conduct.
/* message1 +--------+
+-------------->+ |
| | node2 |
| +-message2-+ |
| | | |
+--+----v+ +--------+
+------+ request-response | |
| | |node1 |
|client| <---------------> | |
| | | |
+------+ +-+----+-+
| ^ +---------+
| | | |
| +--message4--+ node3 |
| | |
+--message3-------> |
+---------+
*/
This is a basic framework for quickly building and testing replication algorithms. It doesn't require any additional setup (e.g., Docker) for setting up a cluster and allows writing simple JUnit tests for testing replication mechanisms. The framework was created to learn and teach various distributed system techniques, enabling the testing of working code while exploring distributed systems concepts. It provides mechanisms for establishing message-passing communication between replicas and test utilities for quickly forming a cluster of replicas, introducing network failures, and asserting the state of the replicas. This repository also contains example code for basic replication algorithms like basic Majority Quorum, Paxos, MultiPaxos and Viewstamped Replication.
This framework allows you to implement replication algorithms quickly and write JUnit tests. It also offers basic ways to introduce faults like process crashes, network disconnections, network delays, and clock skews.
The Replica class implements essential building blocks for a networked service, including:
- Listening on provided IP addresses and ports.
- Managing a single-threaded executor for request handling, Implementing the Singular Update Queue pattern.
- Providing a basic implementation of the Request Waiting List pattern.
- Supporting serialization and deserialization of messages (currently using JSON).
These building blocks are sufficient for implementing and testing any networked service.
Utilities are provided to create multiple Replica
instances. These instances, being Java objects, are easy to inspect and test. Check out QuorumKVStoreTest for an example of how to write tests.
The cluster can be formed by creating multiple instances of the Replica
implementations what you create. For example, a three node cluster
with replicas named "athens", "byzantium" and "cyrene" is
created as following:
class QuorumKVStoreTest {
QuorumKVStore athens;
QuorumKVStore byzantium;
QuorumKVStore cyrene;
@Override
public void setUp() throws IOException {
//no. servers = no. of replicas.
this.nodes = TestUtils.startCluster(Arrays.asList("athens",
"byzantium", "cyrene"),
(name, config, clock, clientConnectionAddress, peerConnectionAddress, peerAddresses) -> new QuorumKVStore(name, config, clock, clientConnectionAddress, peerConnectionAddress, peerAddresses));
athens = nodes.get("athens");
byzantium = nodes.get("byzantium");
cyrene = nodes.get("cyrene");
}
}
The Replica
class allows you to introduce network failures to other nodes, with utility methods for dropping or delaying messages. Examples of testing with introduced network failures can be found in QuorumKVStoreTest.
For example, to drop messages from the node 'athens' to 'byzantium'
athens.dropMessagesTo(byzantium);
The connection can be restablished using
athens.reconnectTo(cyrene);
This repository contains example replication mechanisms, including:
- Basic Read/Write Quorum
- Quorum Consensus
- Single-value Paxos
- Paxos-based Key-Value Store
- Paxos-based Replicated Log
- MultiPaxos
- View Stamped Replication
Explore these algorithms to understand and experiment with different replication techniques.