What is Cassandra Read Repair
Cassandra Read Repair, Weak Consistency and Strong Consistency
Read repair is a technique that brings the nodes in a Cassandra cluster back in sync with the latest version of the data. For example, during a read operation Cassandra might detect that several nodes in the cluster hold older versions of the requested data. Upon detecting stale data, Cassandra marks those nodes with a Read Repair flag, which triggers the process of synchronizing them with the newest version of the requested data. The check for inconsistent data is implemented by comparing the clock value (the write timestamp) of the data requested in the read operation. Any node whose clock value is older than that of the newest data is flagged as being out of sync.
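The clock comparison described above can be sketched in a few lines of Python. This is a simplified in-memory model, not Cassandra's actual implementation: the Replica class and the find_stale and repair functions are hypothetical names introduced only for illustration, and the clock value is assumed to be a plain integer.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one node's copy of a piece of data."""
    name: str
    value: str
    timestamp: int  # assumed: the clock value of the stored version


def find_stale(replicas):
    """Return the newest replica and the list of replicas older than it."""
    newest = max(replicas, key=lambda r: r.timestamp)
    stale = [r for r in replicas if r.timestamp < newest.timestamp]
    return newest, stale


def repair(newest, stale):
    """Overwrite every stale replica with the newest value and clock."""
    for replica in stale:
        replica.value = newest.value
        replica.timestamp = newest.timestamp
```

Calling find_stale on the replicas contacted by a read yields the nodes that would be flagged, and repair brings them up to date.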
Weak consistency optimizes the performance of read operations by responding with the requested data before repairing the inconsistent nodes in the cluster. In other words, a read operation returns immediately and triggers an asynchronous process that repairs the stale nodes at a later time. This can be much faster than strong consistency, but the response may contain stale data.
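A minimal sketch of this respond-first, repair-later behavior follows. It is a toy model under stated assumptions: the read answers from the first replica in the list, and the asynchronous repair is modeled as a list of pending work that is drained later rather than as a real background process. Replica, weak_read and run_pending_repairs are hypothetical names, not Cassandra APIs.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one node's copy of a piece of data."""
    value: str
    timestamp: int  # assumed: the clock value of the stored version


def weak_read(replicas, pending_repairs):
    """Answer from the first replica and defer repair of stale replicas."""
    answer = replicas[0].value  # may be stale; returned without waiting
    newest = max(replicas, key=lambda r: r.timestamp)
    for replica in replicas:
        if replica.timestamp < newest.timestamp:
            # Record the repair instead of performing it now.
            pending_repairs.append((replica, newest.value, newest.timestamp))
    return answer


def run_pending_repairs(pending_repairs):
    """Apply the deferred repairs (stands in for the asynchronous process)."""
    while pending_repairs:
        replica, value, timestamp = pending_repairs.pop()
        replica.value = value
        replica.timestamp = timestamp
```

In this model a caller can receive the old value from weak_read, and the replicas only converge once run_pending_repairs has run.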
Strong consistency guarantees that the data returned by a read is consistent across the Cassandra cluster. When a read operation is invoked with strong consistency and stale data is detected in the cluster, the read operation does not respond until the inconsistent nodes have been repaired with the newest data. This can slow down the read operation, but the response always reflects data in a consistent state.
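The blocking behavior can be sketched the same way. Again this is a simplified in-memory model rather than Cassandra's implementation: Replica and strong_read are hypothetical names, and the repair is a direct in-process update where a real cluster would send writes to the stale nodes.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one node's copy of a piece of data."""
    value: str
    timestamp: int  # assumed: the clock value of the stored version


def strong_read(replicas):
    """Repair every stale replica first, then return the newest value."""
    newest = max(replicas, key=lambda r: r.timestamp)
    for replica in replicas:
        if replica.timestamp < newest.timestamp:
            # The caller waits while each stale replica is brought up to date.
            replica.value = newest.value
            replica.timestamp = newest.timestamp
    return newest.value
```

Because the function returns only after the loop finishes, the caller never sees an old value, at the cost of waiting for the repairs.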