✅ Shut down the second replica node for the cassandra partition:
Wait for the node to shut down.
✅ Restart the first node you shut down:
✅ Start cqlsh and connect to a running node:
What do you think will happen when you try to retrieve the cassandra partition from the videos_by_tag table now?
Answer
The query should return *no data*, since you deleted the data files for the replica you restarted and the other replica is not running.
✅ Run the query:
SELECT * FROM killrvideo.videos_by_tag WHERE tag = 'cassandra';
✅ Quit cqlsh:
QUIT
✅ Restart the other replica node:
✅ Start cqlsh:
./node1/bin/cqlsh
✅ Set the consistency level to TWO:
Solution
CONSISTENCY TWO;
A consistency level of TWO causes the coordinator to read from both replicas, calculate a checksum of each result, and compare the checksums to check whether the data on the two replica nodes is in sync. If it is not, Cassandra invokes a read repair, which replaces the data on the node that has the older write time or that lacks the data completely (for example, because someone deleted its data directory).
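The read-repair behaviour described above can be sketched as a small in-memory simulation. This is an illustrative model under stated assumptions, not Cassandra's actual implementation: the names `Cell`, `digest`, and `read_with_repair` are hypothetical, replicas are plain dictionaries, and MD5 is used only as a convenient checksum. A real coordinator is more economical and typically asks one replica for the full data and the others for a digest only.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One stored value plus its write timestamp (hypothetical model)."""
    value: str
    writetime: int  # larger means written more recently


def digest(rows):
    """Checksum a partition's rows so two replicas can be compared cheaply."""
    h = hashlib.md5()
    for key in sorted(rows):
        cell = rows[key]
        h.update(f"{key}|{cell.value}|{cell.writetime}".encode())
    return h.hexdigest()


def read_with_repair(replicas, partition):
    """Read `partition` from every replica and repair them if they disagree.

    Each replica is a dict of the form {partition_key: {row_key: Cell}}.
    """
    results = [replica.get(partition, {}) for replica in replicas]

    # Identical checksums mean the replicas are in sync: nothing to repair.
    if len({digest(rows) for rows in results}) == 1:
        return results[0]

    # Otherwise merge the results, keeping the newest write for each row.
    merged = {}
    for rows in results:
        for key, cell in rows.items():
            if key not in merged or cell.writetime > merged[key].writetime:
                merged[key] = cell

    # Read repair: write the merged partition back to every replica.
    for replica in replicas:
        replica[partition] = dict(merged)
    return merged


# One replica holds the partition, the other had its data files deleted.
healthy = {"cassandra": {"video-1": Cell("Intro", 100)}}
wiped = {}
rows = read_with_repair([healthy, wiped], "cassandra")
```

After the call, `wiped` holds the same `cassandra` partition as `healthy`. This mirrors the lab: once both replicas have taken part in a single read, the repaired node can serve the data on its own.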
✅ Execute the following query:
SELECT * FROM killrvideo.videos_by_tag WHERE tag = 'cassandra';
It should return the rows in the cassandra partition:
✅ Quit cqlsh:
QUIT
✅ Shut down the replica node whose data files you did not delete:
✅ Start cqlsh:
✅ Verify that the consistency level is ONE:
CONSISTENCY;
✅ Execute the following query:
SELECT * FROM killrvideo.videos_by_tag WHERE tag = 'cassandra';
This time we get our data because the previous invocation of the query triggered a read repair, which wrote the data to the videos_by_tag table on the node with the deleted data files!