Overview
Consistency in distributed systems defines how data changes propagate across multiple nodes and what guarantees clients receive when reading and writing data. When data lives on multiple servers and the network can partition, delay messages, or lose nodes, maintaining a coherent view of that data becomes complex. Distributed systems must make explicit trade-offs between consistency, availability, and partition tolerance.
Traditional databases provide strong consistency through ACID transactions, where all operations appear to execute in a single location. Distributed systems cannot maintain this illusion without significant costs. The challenge intensifies when systems scale across data centers, geographic regions, or operate under network failures.
Distributed consistency emerged as a distinct field when systems began replicating data across multiple nodes. Early distributed databases attempted to provide the same guarantees as single-node systems through distributed transactions and consensus protocols. As systems scaled to internet proportions, architects recognized that strong consistency often conflicted with availability and performance requirements.
The CAP theorem, conjectured by Eric Brewer and later proved by Seth Gilbert and Nancy Lynch, crystallized the trade-offs: a distributed system cannot simultaneously provide consistency, availability, and partition tolerance. Because partitions cannot be ruled out on a real network, the practical choice is between consistency and availability while a partition lasts. Systems must choose which guarantees to prioritize based on application requirements. A banking system might prioritize consistency over availability, ensuring accurate balances even if some operations temporarily fail. A social media feed might prioritize availability, accepting that users see slightly stale data.
Modern distributed systems offer a spectrum of consistency models. Strong consistency guarantees that every read reflects the most recent completed write, as if there were a single copy of the data. Eventual consistency guarantees that given enough time without updates, all nodes converge to the same state. Between these extremes lie causal consistency, monotonic reads, read-your-writes, and session consistency.
# Example: Reading from a distributed cache with eventual consistency
class DistributedCache
def initialize(nodes)
@nodes = nodes
end
def write(key, value)
# Write to all nodes asynchronously
@nodes.each do |node|
node.async_write(key, value)
end
end
def read(key)
# Read from any node - might return stale data
@nodes.sample.read(key)
end
end
# With strong consistency
class StrongCache
  def initialize(nodes)
    @nodes = nodes
  end

  def write(key, value)
    # Send to every node and require acknowledgment from a majority (quorum write)
    acks = @nodes.map { |node| node.write(key, value) }
    raise "Write failed" unless acks.count(true) > @nodes.size / 2
  end

  def read(key)
    # Ask every node and keep the highest version (a superset of a quorum read)
    values = @nodes.map { |node| node.read(key) }
    values.compact.max_by { |v| v.version }
  end
end
Understanding consistency models enables architects to design systems matching business requirements. An analytics system processing historical data tolerates eventual consistency for better performance. A reservation system preventing double-bookings requires strong consistency despite lower throughput.
Key Principles
Consistency models describe the apparent ordering of operations in a distributed system. When multiple clients read and write data across multiple nodes, consistency models define what values clients observe and when they observe them.
Linearizability represents the strongest consistency model. Operations appear to execute atomically at a single point between invocation and completion. If one client writes a value and completes, any subsequent read by any client returns that value or a newer one. Linearizability makes distributed systems behave like a single machine. The cost involves significant coordination overhead and reduced availability during network partitions.
Sequential consistency relaxes linearizability by dropping the real-time requirement. All processes observe a single total order of operations, and each process's own operations appear in that order in program order, but the order need not match the wall-clock times at which operations on different processes completed. A read may therefore return a value that another process has already overwritten in real time. Sequential consistency permits more concurrency than linearizability but still requires substantial coordination.
Causal consistency preserves cause-effect relationships. If operation A causally precedes operation B, all processes observe A before B. Operations without causal relationships can be observed in different orders by different processes. Causal consistency captures intuitive dependencies while allowing independent operations to proceed concurrently.
# Causal consistency with vector clocks
class VectorClock
  def initialize(node_id, nodes)
    @node_id = node_id
    @clock = Hash.new(0)
    nodes.each { |n| @clock[n] = 0 }
  end

  # Local event: advance this node's entry and return a snapshot
  def increment
    @clock[@node_id] += 1
    @clock.dup
  end

  # Receive event: take the element-wise maximum, then count the receive
  def update(other_clock)
    other_clock.each do |node, timestamp|
      @clock[node] = [@clock[node], timestamp].max
    end
    increment
  end

  def happens_before?(other_clock)
    # This clock precedes other if every entry is <= and at least one is <
    nodes = @clock.keys | other_clock.keys
    all_less_equal = nodes.all? { |node| @clock[node] <= other_clock.fetch(node, 0) }
    some_less = nodes.any? { |node| @clock[node] < other_clock.fetch(node, 0) }
    all_less_equal && some_less
  end

  # True once this clock has seen every event recorded in other_clock
  def covers?(other_clock)
    other_clock.all? { |node, timestamp| @clock[node] >= timestamp }
  end
end

# Usage in a distributed system
class CausalStore
  def initialize(node_id, nodes)
    @clock = VectorClock.new(node_id, nodes)
    @store = {}
  end

  def write(key, value)
    version = @clock.increment
    @store[key] = { value: value, version: version }
    { value: value, version: version }
  end

  def read_with_causal_dependency(key, causal_version)
    # Wait until this node has seen all events that causally precede the given version.
    # Replication must advance @clock on another thread, or this loop never exits.
    sleep 0.01 until @clock.covers?(causal_version)
    @store[key]
  end
end
Eventual consistency guarantees that if no new updates occur, all replicas eventually converge to the same value. The system makes no guarantees about intermediate states or convergence time. Eventual consistency maximizes availability and performance by allowing replicas to diverge temporarily. Applications must handle conflicts when concurrent updates produce different values.
Monotonic reads ensure that if a process reads a value, subsequent reads return that value or newer values, never older ones. This prevents reading stale data after seeing fresh data. Monotonic reads support use cases where users expect to see their timeline move forward.
Read-your-writes consistency guarantees that a process always observes its own writes. After writing a value, subsequent reads by that process return the written value or newer values. This matches user expectations in interactive applications where users modify data and immediately view results.
Session consistency extends read-your-writes within a session boundary. Operations within a session observe monotonic reads, read-your-writes, and other guarantees. Different sessions might observe different orderings. Session consistency balances consistency guarantees with performance.
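The quorum approach provides tunable consistency by requiring each operation to involve a configurable number of nodes. A write succeeds when W nodes acknowledge it, and a read succeeds when R nodes respond. Setting R + W > N, where N is the number of replicas, guarantees that every read quorum overlaps every write quorum in at least one node, so a read sees the latest acknowledged write. For example, with N = 3, choosing W = 2 and R = 2 satisfies the condition. Lowering R or W trades consistency for availability and latency.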
# Quorum-based read and write
class QuorumDataStore
def initialize(nodes, read_quorum, write_quorum)
@nodes = nodes
@r = read_quorum
@w = write_quorum
raise "Invalid quorum" unless @r + @w > @nodes.size
end
def write(key, value, version)
responses = []
threads = @nodes.map do |node|
Thread.new do
begin
responses << node.write(key, value, version)
rescue => e
responses << nil
end
end
end
threads.each(&:join)
success = responses.compact.size >= @w
raise "Write quorum not met" unless success
true
end
def read(key)
responses = []
threads = @nodes.map do |node|
Thread.new do
begin
responses << node.read(key)
rescue => e
responses << nil
end
end
end
threads.each(&:join)
valid_responses = responses.compact
raise "Read quorum not met" unless valid_responses.size >= @r
# Return value with highest version
valid_responses.max_by { |r| r[:version] }
end
end
Conflict resolution mechanisms handle divergent replicas. Last-write-wins uses timestamps to choose a winner, risking data loss when clocks drift. Version vectors track causality, identifying concurrent updates. Application-specific merge functions combine conflicting values, such as merging shopping cart contents or taking the union of sets.
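As a rough sketch of an application-specific merge function, the snippet below reconciles two divergent shopping-cart replicas by taking the per-item maximum quantity. The `merge_carts` name and the hash-of-quantities cart shape are assumptions made for illustration; a real cart would also need a policy for removed items.

```ruby
# Hypothetical merge function: each cart maps item name => quantity.
# Taking the per-item maximum keeps every item that either replica added.
def merge_carts(cart_a, cart_b)
  cart_a.merge(cart_b) { |_item, qty_a, qty_b| [qty_a, qty_b].max }
end

replica_1 = { 'book' => 1, 'pen' => 2 }
replica_2 = { 'book' => 1, 'lamp' => 1 }

# Both replicas end up with book, pen, and lamp regardless of merge direction.
merged = merge_carts(replica_1, replica_2)
```

Because the merge is commutative, both replicas reach the same cart whichever side initiates the exchange.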
Consensus algorithms like Paxos and Raft enable multiple nodes to agree on values despite failures. In their common forms, these algorithms elect a leader that coordinates updates, ensuring all nodes see operations in the same order. Consensus provides strong consistency but requires majority agreement, so the minority side of a partition cannot make progress.
Design Considerations
Selecting a consistency model requires analyzing business requirements, failure tolerance, and performance needs. Strong consistency simplifies application logic by providing intuitive semantics but limits availability and throughput. Weaker consistency models improve performance and availability but require applications to handle conflicts and anomalies.
Financial transactions require strong consistency. Banks cannot allow race conditions where concurrent withdrawals overdraw accounts. Double-spending in payment systems causes direct financial loss. These applications choose consistency over availability, accepting that operations might fail during network partitions rather than risking incorrect balances.
User-generated content often tolerates eventual consistency. Social media posts, comments, and likes need not appear instantaneously on all devices. Users accept slight delays in exchange for continuous availability. The system allows concurrent updates, resolving conflicts through application logic like merging comment lists or counting likes.
Collaborative editing benefits from causal consistency. When multiple users edit a document, operations causally dependent on each other must appear in order. User A's edit responding to User B's comment must appear after the comment. Independent edits can merge in any order. Operational transformation or CRDTs handle concurrent edits.
Session-based applications balance consistency with user experience. Web applications maintain session consistency so users see their own actions immediately. Different sessions might observe different states, which users accept because they don't expect real-time global synchronization. Shopping carts, preferences, and draft documents work well with session consistency.
The choice between synchronous and asynchronous replication affects consistency and performance. Synchronous replication waits for replicas to acknowledge writes before confirming to clients, providing strong consistency at the cost of latency. Asynchronous replication returns immediately, improving performance but allowing replicas to lag.
# Design pattern: Configurable consistency
class ConfigurableDataStore
def initialize(nodes)
@nodes = nodes
@consistency_mode = :eventual
end
def set_consistency(mode)
@consistency_mode = mode
end
def write(key, value)
case @consistency_mode
when :strong
write_strong(key, value)
when :eventual
write_eventual(key, value)
when :session
write_session(key, value)
end
end
private
def write_strong(key, value)
# Wait for all nodes
@nodes.each { |n| n.write(key, value) }
end
def write_eventual(key, value)
# Fire and forget to all nodes
@nodes.each do |n|
Thread.new { n.write(key, value) rescue nil }
end
end
def write_session(key, value)
# Write to primary synchronously, others async
primary = @nodes.first
primary.write(key, value)
@nodes[1..].each do |n|
Thread.new { n.write(key, value) rescue nil }
end
end
end
Geographic distribution complicates consistency. Cross-datacenter replication involves significant latency. Synchronous replication across continents adds hundreds of milliseconds to operations. Many systems use asynchronous replication between regions for disaster recovery while maintaining strong consistency within regions.
Read-heavy versus write-heavy workloads favor different strategies. Read-heavy systems benefit from eventual consistency with heavy caching and read replicas. Write-heavy systems need efficient coordination mechanisms. Some systems optimize for one type of operation, accepting suboptimal performance for the other.
Multi-tenancy considerations affect consistency choices. Systems serving multiple customers must ensure one tenant's actions don't affect another's consistency guarantees. Strong consistency per tenant with eventual consistency across tenants often provides good balance.
The failure model shapes consistency guarantees. Systems experiencing only node crashes can use simpler algorithms than those handling Byzantine failures where nodes act maliciously. Network partitions require careful handling to prevent split-brain scenarios where different partitions accept conflicting updates.
Performance versus correctness trade-offs appear throughout consistency design. Caching improves performance but introduces staleness. Optimistic concurrency allows concurrent operations but requires conflict resolution. Pessimistic locking prevents conflicts but reduces parallelism.
# Trade-off example: Optimistic vs pessimistic concurrency
class OptimisticStore
def initialize
@data = {}
@versions = {}
end
def read(key)
{ value: @data[key], version: @versions[key] || 0 }
end
def write(key, value, expected_version)
current_version = @versions[key] || 0
if current_version != expected_version
raise "Conflict: expected version #{expected_version}, got #{current_version}"
end
@data[key] = value
@versions[key] = current_version + 1
end
end
class PessimisticStore
def initialize
@data = {}
@locks = Hash.new { |h, k| h[k] = Mutex.new }
end
def write(key, value)
@locks[key].synchronize do
@data[key] = value
end
end
def read(key)
@locks[key].synchronize do
@data[key]
end
end
end
Implementation Approaches
Distributed systems implement consistency through various architectural patterns and protocols. The choice depends on scale, failure tolerance, and consistency requirements.
Primary-backup replication designates one node as primary, handling all writes. The primary propagates changes to backup replicas. Reads can target any replica, with the primary providing strong consistency and backups providing eventual consistency. If the primary fails, a backup is promoted to primary. This approach simplifies consistency but makes the primary a write bottleneck, and the system cannot accept writes until failover completes.
# Primary-backup pattern
class PrimaryBackupStore
def initialize(primary, backups)
@primary = primary
@backups = backups
@is_primary = false
end
def write(key, value)
raise "Not primary" unless @is_primary
# Write to local storage
@primary.write(key, value)
# Replicate to backups
replication_threads = @backups.map do |backup|
Thread.new do
backup.replicate(key, value) rescue nil
end
end
# Wait for every replication attempt to finish; failures are swallowed, so this confirms no acknowledgment
replication_threads.each(&:join)
true
end
def promote_to_primary
@is_primary = true
end
def read(key)
# Read from local node (primary or backup)
@primary.read(key)
end
end
Multi-master replication allows writes to multiple nodes simultaneously. Conflicts occur when different masters accept concurrent writes to the same data. Systems resolve conflicts through timestamps (last-write-wins), version vectors, or application-specific logic. Multi-master replication improves availability and write throughput but complicates conflict resolution.
Chain replication organizes nodes in a chain where writes flow through the head to tail, and reads come from the tail. Updates propagate sequentially, ensuring the tail has all committed updates. Chain replication provides strong consistency with good throughput for read-heavy workloads. Node failures require chain reconfiguration.
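The head-to-tail flow can be sketched in a few lines. The `ChainNode` and `Chain` classes below are hypothetical in-memory stand-ins, and the sketch assumes no failures, so chain reconfiguration is left out.

```ruby
# Hypothetical in-memory chain: writes enter at the head and are forwarded
# node by node; reads are served by the tail only.
class ChainNode
  attr_accessor :successor

  def initialize
    @store = {}
    @successor = nil
  end

  # Apply locally, then forward down the chain. The call returns only after
  # the tail has applied the write, so an acknowledged write is on every node.
  def write(key, value)
    @store[key] = value
    successor ? successor.write(key, value) : true
  end

  def read(key)
    @store[key]
  end
end

class Chain
  def initialize(length)
    @nodes = Array.new(length) { ChainNode.new }
    @nodes.each_cons(2) { |node, nxt| node.successor = nxt }
  end

  def write(key, value)
    @nodes.first.write(key, value)
  end

  def read(key)
    @nodes.last.read(key)
  end
end

chain = Chain.new(3)
chain.write('x', 1)
value = chain.read('x') # served by the tail
```

Reading only from the tail is what gives the strong guarantee here: the tail never holds a write that an earlier node is missing.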
Consensus-based systems use algorithms like Raft or Paxos to coordinate updates. Nodes elect a leader that sequences operations. The leader proposes operations, and followers accept or reject. Operations commit when a majority acknowledges. This provides linearizability but requires majority availability.
# Simplified Raft-like consensus
class RaftNode
def initialize(node_id, cluster)
@node_id = node_id
@cluster = cluster
@state = :follower
@current_term = 0
@log = []
@commit_index = 0
end
def request_vote(term, candidate_id)
if term > @current_term
@current_term = term
@state = :follower
return { vote_granted: true, term: @current_term }
end
{ vote_granted: false, term: @current_term }
end
def append_entries(term, leader_id, entries, leader_commit)
if term < @current_term
return { success: false, term: @current_term }
end
@state = :follower
@current_term = term
# Append entries to log
entries.each { |entry| @log << entry }
# Update commit index
if leader_commit > @commit_index
@commit_index = [leader_commit, @log.size - 1].min
end
{ success: true, term: @current_term }
end
def propose(command)
raise "Not leader" unless @state == :leader
entry = { term: @current_term, command: command }
@log << entry
# Send to majority of followers
acks = 0
@cluster.each do |node|
next if node.equal?(self) # @cluster holds node objects, so compare by identity rather than by ID
response = node.append_entries(@current_term, @node_id, [entry], @commit_index)
acks += 1 if response[:success]
end
if acks >= @cluster.size / 2
@commit_index = @log.size - 1
return true
end
false
end
end
Eventual consistency with anti-entropy allows replicas to diverge temporarily. Background processes periodically synchronize replicas, detecting and resolving differences. Merkle trees efficiently identify divergent data ranges. Anti-entropy provides high availability and performance while ensuring eventual convergence.
Conflict-free replicated data types (CRDTs) guarantee convergence without coordination. Their operations or merge functions are commutative and associative (and, for state-based CRDTs, idempotent), so replicas reach the same result regardless of the order in which updates arrive. Examples include grow-only counters, sets with add/remove operations, and collaborative text editing structures. CRDTs enable highly available systems with automatic conflict resolution.
# CRDT: Grow-only counter
class GCounter
def initialize(node_id)
@node_id = node_id
@counts = Hash.new(0)
end
def increment
@counts[@node_id] += 1
end
def value
@counts.values.sum
end
def merge(other)
other_counts = other.instance_variable_get(:@counts)
other_counts.each do |node, count|
@counts[node] = [@counts[node], count].max
end
end
end
# CRDT: Two-phase set (add and remove)
require 'set' # Set is not loaded automatically on older Ruby versions
class TwoPhaseSet
def initialize
@added = Set.new
@removed = Set.new
end
def add(element)
@added.add(element)
end
def remove(element)
@removed.add(element) if @added.include?(element)
end
def include?(element)
@added.include?(element) && !@removed.include?(element)
end
def merge(other)
other_added = other.instance_variable_get(:@added)
other_removed = other.instance_variable_get(:@removed)
@added.merge(other_added)
@removed.merge(other_removed)
end
def to_a
(@added - @removed).to_a
end
end
Hybrid approaches combine multiple strategies. Systems use strong consistency within datacenters and eventual consistency across datacenters. Critical data receives strong consistency while derived or cached data uses weaker models. Applications specify consistency requirements per operation.
Compensation and saga patterns handle distributed transactions without two-phase commit. Long-running transactions execute as a series of local transactions. If later steps fail, compensating transactions undo earlier steps. Sagas trade atomicity for availability, accepting that intermediate states are visible.
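A minimal saga runner might look like the following. The `Saga` class, its `step` method, and the order-processing step names are all hypothetical, and the sketch assumes compensations themselves never fail.

```ruby
# Hypothetical saga runner: each step pairs an action with a compensation.
class Saga
  Step = Struct.new(:name, :action, :compensation)

  def initialize
    @steps = []
  end

  def step(name, action:, compensation:)
    @steps << Step.new(name, action, compensation)
    self
  end

  # Runs steps in order. On failure, runs the compensations of the steps
  # that already completed, in reverse order, and reports the error.
  def run
    completed = []
    @steps.each do |s|
      s.action.call
      completed << s
    end
    { status: :completed }
  rescue StandardError => e
    completed.reverse_each { |s| s.compensation.call }
    { status: :compensated, error: e.message }
  end
end

log = []
saga = Saga.new
  .step(:reserve_stock, action: -> { log << 'reserve' }, compensation: -> { log << 'release' })
  .step(:charge_card, action: -> { raise 'card declined' }, compensation: -> { log << 'refund' })
  .step(:ship, action: -> { log << 'ship' }, compensation: -> { log << 'cancel shipment' })

result = saga.run
# log is now ['reserve', 'release']: the reservation was visible, then undone
```

Between the reservation and its release, other readers could observe the reserved stock, which is the visible intermediate state that sagas accept.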
Common Patterns
Distributed systems employ recurring patterns to implement consistency guarantees. These patterns address common challenges in maintaining coherent state across nodes.
Read-your-writes pattern ensures users see their own updates immediately. After writing data, the session records the write version. Subsequent reads specify the minimum version, forcing the system to wait until replicas catch up. This pattern matches user expectations in interactive applications.
# Read-your-writes implementation
class ReadYourWritesStore
def initialize(replicas)
@replicas = replicas
@session_writes = {}
end
def write(session_id, key, value)
version = Process.clock_gettime(Process::CLOCK_REALTIME, :nanosecond) # whole-second timestamps collide for rapid writes
# Write to primary
primary = @replicas.first
primary.write(key, value, version)
# Track session's write version
@session_writes[session_id] ||= {}
@session_writes[session_id][key] = version
# Async replicate to others
@replicas[1..].each do |r|
Thread.new { r.write(key, value, version) }
end
version
end
def read(session_id, key)
required_version = @session_writes.dig(session_id, key) || 0
# Read from any replica that has the required version
@replicas.each do |replica|
data = replica.read(key)
return data if data && data[:version] >= required_version
end
# Fall back to primary if no replica is caught up
@replicas.first.read(key)
end
end
Monotonic reads pattern prevents reading stale data after seeing fresh data. Sessions track the highest version observed. Future reads specify this version as a minimum. The pattern prevents confusing scenarios where data appears to revert to old values.
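One way to sketch this is a session object that remembers a per-key high-water mark. The `MonotonicSession` class is hypothetical, and replicas are modeled as plain hashes from key to `{ value:, version: }` purely for illustration.

```ruby
# Hypothetical session wrapper; each replica is a hash of key => { value:, version: }.
class MonotonicSession
  def initialize(replicas)
    @replicas = replicas
    @high_water = Hash.new(0) # highest version seen per key
  end

  # Tries replicas starting at the preferred index and returns the first result
  # that is at least as new as anything this session has already observed.
  # Returns nil if no replica qualifies.
  def read(key, preferred: 0)
    @replicas.rotate(preferred).each do |replica|
      record = replica[key]
      next if record.nil? || record[:version] < @high_water[key]

      @high_water[key] = record[:version]
      return record[:value]
    end
    nil
  end
end

fresh = { 'feed' => { value: 'post 2', version: 2 } }
stale = { 'feed' => { value: 'post 1', version: 1 } }
session = MonotonicSession.new([fresh, stale])

first  = session.read('feed', preferred: 0) # served by the fresh replica
second = session.read('feed', preferred: 1) # the stale replica is skipped
```

Both reads return 'post 2', so the session never sees its feed move backward even when the second request lands on the lagging replica first.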
Sticky sessions route requests from the same client to the same server, simplifying consistency. The server maintains session state and provides consistent views. Load balancers use cookies or connection tracking to implement sticky sessions. This pattern works until server failures force session migration.
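Where a load balancer has no cookie support, stickiness can be approximated by hashing the session ID. The `StickyRouter` class and the server names below are assumptions for illustration, not a feature of any particular load balancer.

```ruby
require 'zlib'

# Hypothetical router: maps a session ID to a backend with a stable hash.
class StickyRouter
  def initialize(servers)
    @servers = servers
  end

  # Zlib.crc32 is deterministic across processes, unlike String#hash,
  # so every router instance picks the same server for a given session.
  def server_for(session_id)
    @servers[Zlib.crc32(session_id) % @servers.size]
  end
end

router = StickyRouter.new(%w[app-1 app-2 app-3])
first_hop  = router.server_for('session-abc')
second_hop = router.server_for('session-abc') # same server as first_hop
```

Simple modulo hashing remaps most sessions when the server list changes, which is the migration cost mentioned above; consistent hashing reduces it.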
Version vectors track causal dependencies across replicas. Each replica maintains a vector of logical clocks. Updates increment the local clock and merge vectors from other replicas. Comparing vectors determines whether events are causally related or concurrent. Version vectors enable causal consistency and detect conflicts.
# Version vector implementation
class VersionVector
def initialize(replicas)
@vector = Hash.new(0)
replicas.each { |r| @vector[r] = 0 }
end
def increment(replica_id)
@vector[replica_id] += 1
self
end
def update(other_vector)
other_vector.each do |replica, version|
@vector[replica] = [@vector[replica], version].max
end
self
end
def compare(other)
less_equal = @vector.all? { |r, v| v <= other[r] }
greater_equal = @vector.all? { |r, v| v >= other[r] }
return :before if less_equal && !greater_equal
return :after if greater_equal && !less_equal
return :equal if less_equal && greater_equal
:concurrent
end
def [](replica)
@vector[replica]
end
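def each(&block)
@vector.each(&block)
end
# Called by dup/clone: copy the hash so stored snapshots stop tracking later increments
def initialize_copy(source)
super
@vector = @vector.dup
end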
end
# Using version vectors for conflict detection
class VersionedStore
def initialize(replica_id, replicas)
@replica_id = replica_id
@data = {}
@clock = VersionVector.new(replicas)
end
def write(key, value)
@clock.increment(@replica_id)
@data[key] = {
value: value,
version: @clock.dup
}
end
def merge(key, remote_value, remote_version)
local = @data[key]
unless local
@data[key] = { value: remote_value, version: remote_version }
@clock.update(remote_version)
return :merged
end
relation = local[:version].compare(remote_version)
case relation
when :before
@data[key] = { value: remote_value, version: remote_version }
@clock.update(remote_version)
:merged
when :after, :equal # the local copy is already as new as the remote one
:ignored
when :concurrent
# Conflict detected - need application logic
:conflict
end
end
end
Quorum reads and writes tune consistency by requiring a configurable number of replicas to respond. Setting read quorum R and write quorum W such that R + W > N ensures reads see recent writes. Lower quorums increase availability but risk reading stale data. Dynamo-style systems make quorums configurable per operation.
Hinted handoff maintains availability during failures. When a replica is unavailable, another node stores updates intended for it. When the replica recovers, stored updates transfer to it. Hinted handoff prevents data loss during temporary failures while preserving eventual consistency.
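The mechanism can be sketched with an in-memory coordinator. The `Replica` and `HintedCoordinator` classes are hypothetical, and for brevity the coordinator holds the hints itself, whereas Dynamo-style systems store them on another storage node.

```ruby
# Hypothetical replica that can be marked down to simulate a failure.
class Replica
  attr_reader :store
  attr_accessor :up

  def initialize
    @store = {}
    @up = true
  end

  def write(key, value)
    raise 'replica unavailable' unless up

    @store[key] = value
  end
end

class HintedCoordinator
  def initialize(replicas)
    @replicas = replicas
    @hints = Hash.new { |h, replica| h[replica] = [] }
  end

  # Writes to every replica; for a replica that is down, keeps a hint instead.
  def write(key, value)
    @replicas.each do |replica|
      begin
        replica.write(key, value)
      rescue StandardError
        @hints[replica] << [key, value]
      end
    end
  end

  # Replays stored hints, in order, to replicas that are reachable again.
  def deliver_hints
    @hints.each do |replica, pending|
      next unless replica.up

      pending.each { |key, value| replica.write(key, value) }
      pending.clear
    end
  end
end

a = Replica.new
b = Replica.new
coordinator = HintedCoordinator.new([a, b])

b.up = false
coordinator.write('k', 'v1') # b misses this write; a hint is stored
b.up = true
coordinator.deliver_hints    # b catches up
```

Until the hints are delivered, reads served by the recovering replica are stale, which is why hinted handoff preserves eventual rather than strong consistency.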
Anti-entropy with Merkle trees efficiently synchronizes replicas. Merkle trees create hierarchical hashes of data ranges. Replicas exchange tree hashes to identify divergent ranges, then transfer only differing data. This pattern reduces synchronization overhead in eventually consistent systems.
# Simplified Merkle tree for anti-entropy
require 'digest'
class MerkleTree
def initialize(data)
@data = data.sort
@tree = build_tree(@data)
end
def root_hash
@tree[:hash]
end
def find_differences(other_tree, path = [])
return [] if root_hash == other_tree.root_hash
# Recursively find differing ranges
differences = []
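# Stop descending when either side is a leaf; trees of different sizes have different shapes
if leaf? || other_tree.instance_variable_get(:@tree)[:left].nil?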
differences << { path: path, data: @data }
else
left_diff = @tree[:left].find_differences(
other_tree.instance_variable_get(:@tree)[:left],
path + [:left]
)
right_diff = @tree[:right].find_differences(
other_tree.instance_variable_get(:@tree)[:right],
path + [:right]
)
differences.concat(left_diff).concat(right_diff)
end
differences
end
private
def build_tree(data)
return { hash: hash_data(data), data: data } if data.size <= 2
mid = data.size / 2
left = build_tree(data[0...mid])
right = build_tree(data[mid..])
{
hash: hash_combine(left[:hash], right[:hash]),
left: self.class.new(data[0...mid]),
right: self.class.new(data[mid..])
}
end
def hash_data(data)
Digest::SHA256.hexdigest(data.to_s)
end
def hash_combine(left, right)
Digest::SHA256.hexdigest("#{left}#{right}")
end
def leaf?
@data.size <= 2
end
end
Last-write-wins (LWW) resolves conflicts using timestamps. The update with the highest timestamp wins. LWW requires synchronized clocks and accepts data loss when concurrent writes occur. Despite limitations, LWW's simplicity makes it popular for non-critical data.
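The data-loss risk is easy to reproduce. The sketch below uses a hypothetical `LwwRegister` class with caller-supplied integer timestamps, and assumes one node's clock runs five units behind the other's.

```ruby
# Hypothetical LWW register: keeps the value with the highest timestamp.
class LwwRegister
  attr_reader :value, :timestamp

  def initialize
    @value = nil
    @timestamp = 0
  end

  # Ignores any write whose timestamp is not newer than the stored one.
  def set(value, timestamp)
    return self if timestamp <= @timestamp

    @value = value
    @timestamp = timestamp
    self
  end

  def merge(other)
    set(other.value, other.timestamp)
  end
end

# Node B's clock runs behind, so its later write carries a lower timestamp.
node_a = LwwRegister.new.set('written first', 1_000)
node_b = LwwRegister.new.set('written second', 995)

node_a.merge(node_b)
node_b.merge(node_a)
# Both replicas converge on 'written first'; the later write is silently lost.
```

Ties are resolved here by keeping the local value, which can leave replicas disagreeing; production systems usually add a deterministic tie-breaker such as a node ID.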
Conditional updates prevent lost updates through preconditions. Clients specify expected values or versions when writing. Operations fail if preconditions don't match, allowing clients to retry with current data. This pattern implements optimistic concurrency control.
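The retry loop that clients wrap around a conditional write can be sketched as follows. `VersionedCell` and `update_with_retry` are hypothetical names, and the cell is a single in-process object standing in for a remote store.

```ruby
# Hypothetical versioned store with a compare-and-set write.
class VersionedCell
  ConflictError = Class.new(StandardError)

  def initialize(value)
    @value = value
    @version = 0
    @lock = Mutex.new
  end

  def read
    @lock.synchronize { [@value, @version] }
  end

  # Succeeds only if nobody has written since the caller's read.
  def write(value, expected_version)
    @lock.synchronize do
      raise ConflictError, 'stale version' unless @version == expected_version

      @value = value
      @version += 1
    end
  end
end

# Read, compute, write; on conflict, re-read and try again.
def update_with_retry(cell, max_attempts: 5)
  max_attempts.times do
    value, version = cell.read
    begin
      cell.write(yield(value), version)
      return true
    rescue VersionedCell::ConflictError
      next
    end
  end
  false
end

counter = VersionedCell.new(0)
threads = Array.new(4) do
  Thread.new { 25.times { update_with_retry(counter, max_attempts: 1_000) { |v| v + 1 } } }
end
threads.each(&:join)
final_value, final_version = counter.read
```

No increment is lost because a write that raced with another simply fails its precondition and is recomputed from the fresh value.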
Ruby Implementation
Ruby applications interact with distributed systems through client libraries. While consistency guarantees come from the underlying distributed store, Ruby code must handle errors, retries, and configuration to achieve desired consistency.
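Redis provides configurable consistency in Ruby applications. Redis replication is asynchronous by default, providing eventual consistency. The WAIT command blocks until the requested number of replicas acknowledge preceding writes. This narrows the window for data loss but does not make Redis strongly consistent, because an acknowledged write can still be lost during failover.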
require 'redis'
class RedisConsistencyWrapper
def initialize(redis)
@redis = redis
end
# Eventual consistency - default Redis behavior
def write_eventual(key, value)
@redis.set(key, value)
end
# Stronger durability using WAIT (reduces, but does not eliminate, the risk of losing a write on failover)
def write_strong(key, value, num_replicas: 2, timeout: 1000)
@redis.set(key, value)
# Wait for replicas to acknowledge
replicas_acked = @redis.wait(num_replicas, timeout)
if replicas_acked < num_replicas
raise "Only #{replicas_acked} replicas acknowledged"
end
true
end
# Read with potential staleness
def read_eventual(key)
@redis.get(key)
end
# Read from primary for consistency
def read_strong(key)
# Assumes @redis is connected to the primary; a replica connection may return stale data
@redis.get(key)
end
end
# Usage
redis = Redis.new(host: 'localhost', port: 6379)
store = RedisConsistencyWrapper.new(redis)
# Write with strong consistency
store.write_strong('user:123:balance', '100.00', num_replicas: 2)
# Read from primary
balance = store.read_strong('user:123:balance')
The DynamoDB Ruby SDK exposes consistency choices through API parameters. Read operations accept a consistent_read parameter. Setting it to true performs a strongly consistent read that reflects all prior successful writes. The default eventually consistent reads may return stale data from replicas.
require 'aws-sdk-dynamodb'
class DynamoDBConsistencyExample
def initialize
@dynamodb = Aws::DynamoDB::Client.new(region: 'us-west-2')
@table_name = 'Users'
end
# Eventually consistent read (default)
def read_eventual(user_id)
resp = @dynamodb.get_item({
table_name: @table_name,
key: { 'UserId' => user_id },
consistent_read: false # or omit this parameter
})
resp.item
end
# Strongly consistent read
def read_strong(user_id)
resp = @dynamodb.get_item({
table_name: @table_name,
key: { 'UserId' => user_id },
consistent_read: true # forces read from primary
})
resp.item
end
# Conditional write with optimistic locking
def update_with_version_check(user_id, new_email, expected_version)
@dynamodb.update_item({
table_name: @table_name,
key: { 'UserId' => user_id },
update_expression: 'SET Email = :email, Version = :new_version',
condition_expression: 'Version = :expected_version',
expression_attribute_values: {
':email' => new_email,
':new_version' => expected_version + 1,
':expected_version' => expected_version
}
})
rescue Aws::DynamoDB::Errors::ConditionalCheckFailedException
raise "Version mismatch - data was modified concurrently"
end
end
# Usage
db = DynamoDBConsistencyExample.new
# Write followed by strong read
db.update_with_version_check('user123', 'new@email.com', 5)
user = db.read_strong('user123')
Cassandra offers tunable consistency per query. Consistency levels for reads and writes include ONE (a single replica), QUORUM (a majority of replicas), and ALL (every replica), in increasing order of strength. The Ruby driver for Cassandra accepts a consistency option on each request.
require 'cassandra'
class CassandraConsistencyManager
def initialize
@cluster = Cassandra.cluster(hosts: ['127.0.0.1'])
@session = @cluster.connect('my_keyspace')
end
def write_with_quorum(user_id, email)
statement = @session.prepare(
'UPDATE users SET email = ? WHERE user_id = ?'
)
@session.execute(
statement,
arguments: [email, user_id],
consistency: :quorum # requires majority acknowledgment
)
end
def write_with_all(user_id, email)
statement = @session.prepare(
'UPDATE users SET email = ? WHERE user_id = ?'
)
@session.execute(
statement,
arguments: [email, user_id],
consistency: :all # requires all replicas
)
end
def read_with_one(user_id)
statement = @session.prepare(
'SELECT * FROM users WHERE user_id = ?'
)
result = @session.execute(
statement,
arguments: [user_id],
consistency: :one # read from any single replica
)
result.first
end
def read_with_quorum(user_id)
statement = @session.prepare(
'SELECT * FROM users WHERE user_id = ?'
)
result = @session.execute(
statement,
arguments: [user_id],
consistency: :quorum # read from majority
)
result.first
end
# Lightweight transaction for compare-and-set
def conditional_update(user_id, new_email, expected_email)
result = @session.execute(
'UPDATE users SET email = ? WHERE user_id = ? IF email = ?',
arguments: [new_email, user_id, expected_email],
consistency: :quorum
)
result.rows.first['[applied]']
end
end
The etcd Ruby client provides strongly consistent reads and writes through the Raft consensus protocol. Writes and default reads go through consensus, ensuring linearizability; etcd also offers serializable reads, which any member can answer locally at the risk of staleness. etcd stores configuration data where consistency matters more than throughput.
require 'etcdv3'
class EtcdConsistencyExample
def initialize
@client = Etcdv3.new(endpoints: 'http://localhost:2379')
end
def write_linearizable(key, value)
# All writes go through Raft consensus
@client.put(key, value)
end
def read_linearizable(key)
# Linearizable read (etcd's default); serializable reads are the weaker, opt-in mode
@client.get(key).kvs.first&.value
end
def watch_changes(key)
# Watch provides consistent notification of changes
@client.watch(key) do |events|
events.each do |event|
puts "Key: #{event.kv.key}, Value: #{event.kv.value}"
end
end
end
def transaction_with_cas(key, old_value, new_value)
# Compare-and-swap transaction
txn = @client.transaction do |txn|
txn.compare = [
txn.value(key, :equal, old_value)
]
txn.success = [
txn.put(key, new_value)
]
txn.failure = []
end
txn.succeeded
end
end
PostgreSQL with Ruby provides ACID transactions and serializable isolation. Distributed PostgreSQL systems like Citus maintain strong consistency within shards. Cross-shard operations use two-phase commit.
require 'pg'
require 'json' # provides to_json, used below
class PostgreSQLConsistency
def initialize
@conn = PG.connect(dbname: 'myapp')
end
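# Serializable transaction with a bounded retry on serialization failures
def transfer_funds(from_account, to_account, amount, max_attempts: 3)
attempts = 0
begin
attempts += 1
@conn.transaction do |conn|
# Must be the first statement in the transaction
conn.exec('SET TRANSACTION ISOLATION LEVEL SERIALIZABLE')
# Read and lock the source balance
from_balance = conn.exec_params(
'SELECT balance FROM accounts WHERE id = $1 FOR UPDATE',
[from_account]
).first['balance'].to_f
raise 'Insufficient funds' if from_balance < amount
# Update balances
conn.exec_params(
'UPDATE accounts SET balance = balance - $1 WHERE id = $2',
[amount, from_account]
)
conn.exec_params(
'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
[amount, to_account]
)
end
rescue PG::TRSerializationFailure
# A concurrent transaction conflicted; the block was rolled back.
# Retry a limited number of times instead of looping forever.
retry if attempts < max_attempts
raise
end
end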
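# Serialize concurrent updates to the same user with an advisory lock
def update_with_session_consistency(user_id, data)
@conn.transaction do |conn|
# The transaction-scoped lock is released at commit or rollback.
# Later reads on this same primary connection see the committed write.
# Use a stable integer key: Object#hash differs between Ruby processes.
# Assumes user_id is an integer.
conn.exec_params(
'SELECT pg_advisory_xact_lock($1)',
[Integer(user_id)]
)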
conn.exec_params(
'UPDATE users SET data = $1, updated_at = NOW() WHERE id = $2',
[data.to_json, user_id]
)
end
end
end
Implementing client-side consistency tracking helps when server-side guarantees are insufficient. Clients track version numbers or timestamps, verifying operations satisfy required consistency levels.
class ClientSideConsistencyTracker
def initialize(client)
@client = client
@last_seen_version = {}
end
def write_with_tracking(key, value)
# Wall-clock timestamps are a simple version source but are sensitive to clock skew
version = Time.now.to_f
@client.write(key, value, version)
@last_seen_version[key] = version
version
end
def read_monotonic(key)
loop do
result = @client.read(key)
# Ensure we don't read older version than previously seen
if result[:version] >= (@last_seen_version[key] || 0)
@last_seen_version[key] = result[:version]
return result[:value]
end
# Retry if stale data
sleep 0.01
end
end
def read_your_writes(key)
required_version = @last_seen_version[key]
return @client.read(key)[:value] unless required_version
loop do
result = @client.read(key)
return result[:value] if result[:version] >= required_version
sleep 0.01
end
end
end
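The retry loops in the tracker above never give up if a replica stays behind. A minimal sketch of a bounded variant follows, assuming the same hypothetical `client.read(key)` interface that returns a hash with `:value` and `:version`. `FlakyReplica`, `StaleReadError`, and the `max_attempts` and `delay` parameters are illustrative names, not part of any library.

```ruby
# Raised when no sufficiently fresh read arrives within the attempt budget.
class StaleReadError < StandardError; end

# Hypothetical stand-in for a lagging replica: it serves an old version
# for the first `lag` reads, then the current one.
class FlakyReplica
  def initialize(stale:, fresh:, lag:)
    @stale = stale
    @fresh = fresh
    @lag = lag
    @reads = 0
  end

  def read(_key)
    @reads += 1
    @reads <= @lag ? @stale : @fresh
  end
end

# Reads `key` until the returned version is at least `min_version`,
# giving up after `max_attempts` tries instead of looping forever.
def read_at_least(client, key, min_version, max_attempts: 5, delay: 0.01)
  max_attempts.times do |attempt|
    result = client.read(key)
    return result[:value] if result[:version] >= min_version

    sleep(delay) if attempt < max_attempts - 1
  end
  raise StaleReadError, "no version >= #{min_version} for #{key} after #{max_attempts} attempts"
end
```

Raising after a fixed budget lets the caller decide whether to fall back to the primary, serve stale data, or surface an error.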
Tools & Ecosystem
The distributed systems ecosystem provides databases, coordination services, and frameworks implementing various consistency models. Ruby integrates with these tools through client libraries.
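Redis serves as an in-memory data store with primary-replica replication. Replication is asynchronous, so Redis Cluster, which shards data across nodes, can lose acknowledged writes during failover and offers only eventual consistency by default. Redis Sentinel provides high availability with automatic failover. The redis-rb gem supports Sentinel and Cluster deployments; connection pooling is usually added with the connection_pool gem.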
Apache Cassandra implements a peer-to-peer architecture with tunable consistency. Each node handles reads and writes without a primary. The cassandra-driver gem supports prepared statements, batching, and per-query consistency levels. Cassandra suits large-scale deployments requiring high availability.
etcd provides a strongly consistent key-value store using the Raft consensus algorithm. Service discovery, configuration management, and distributed coordination use etcd. The etcdv3 Ruby gem interacts with etcd clusters. Kubernetes uses etcd for cluster state.
Apache ZooKeeper coordinates distributed systems through a hierarchical namespace. ZooKeeper guarantees sequential consistency and provides primitives for leader election, distributed locks, and configuration. The zk gem provides Ruby bindings.
DynamoDB offers a fully managed NoSQL database with configurable consistency. Global tables replicate data across regions with eventual consistency. Single-region operations support strong consistency. The AWS SDK provides Ruby integration.
MongoDB supports configurable read and write concerns. Write concerns specify acknowledgment requirements from replica sets. Read concerns control data staleness. The mongo Ruby driver exposes these options.
require 'mongo'
class MongoDBConsistencyConfig
def initialize
@client = Mongo::Client.new(
['localhost:27017'],
database: 'myapp',
replica_set: 'myapp-rs'
)
end
# Write with majority acknowledgment
def write_majority(collection, document)
@client[collection].insert_one(
document,
write_concern: { w: :majority }
)
end
# Read from primary
def read_primary(collection, filter)
@client[collection].find(
filter,
read: { mode: :primary }
).first
end
# Read from secondary (eventual consistency)
def read_secondary(collection, filter)
@client[collection].find(
filter,
read: { mode: :secondary_preferred }
).first
end
# Linearizable read
def read_linearizable(collection, filter)
@client[collection].find(
filter,
read_concern: { level: :linearizable }
).first
end
end
Consul provides service discovery, health checking, and key-value storage with strong consistency. Consul uses Raft for consensus. The diplomat gem offers Ruby client functionality.
Riak implements an eventually consistent distributed database based on Amazon's Dynamo paper. Riak uses consistent hashing and vector clocks. The riak-client gem provides Ruby access.
CockroachDB delivers distributed SQL with serializable isolation and strong consistency. It uses Raft consensus per range of keys. CockroachDB's PostgreSQL compatibility enables usage with the pg gem.
Hazelcast provides distributed data structures like maps, queues, and locks. Its maps and queues favor availability, while the CP Subsystem offers strongly consistent locks and atomic primitives built on Raft. The hazelcast Java library interoperates with JRuby.
ActiveRecord with multiple databases enables read/write splitting. Rails applications can route writes to primary databases and reads to replicas. This provides eventual consistency for reads with strong consistency for writes.
# Rails 6+ multiple database configuration
class ApplicationRecord < ActiveRecord::Base
self.abstract_class = true
connects_to database: {
writing: :primary,
reading: :replica
}
end
class User < ApplicationRecord
# Read from replica by default
def self.find_for_display(id)
connected_to(role: :reading) do
find(id)
end
end
# Write to primary
def self.create_user(attributes)
connected_to(role: :writing) do
create(attributes)
end
end
# Ensure read-your-writes
def self.find_after_write(id)
connected_to(role: :writing) do
find(id)
end
end
end
Sidekiq with Redis runs background jobs with at-least-once semantics: failed jobs retry automatically, so a job may execute more than once. Unique-job gems reduce duplicate enqueueing but cannot rule out duplicate execution. Idempotent jobs ensure correct behavior during retries.
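Idempotency under retries can be sketched in plain Ruby. The example below is not Sidekiq API: `ProcessedKeys` is a hypothetical in-memory stand-in for a durable deduplication store, and `CreditAccountJob` and its idempotency key are illustrative names.

```ruby
require 'set'

# In-memory stand-in for a durable deduplication store. A real deployment
# would need an atomic, persistent check (for example a unique index).
class ProcessedKeys
  def initialize
    @seen = Set.new
  end

  # Returns true only the first time a key is claimed.
  def claim(key)
    !@seen.add?(key).nil?
  end
end

# Hypothetical job: credits an account at most once per idempotency key,
# even if the job runner delivers the same job several times.
class CreditAccountJob
  def initialize(store, balances)
    @store = store
    @balances = balances
  end

  def perform(idempotency_key, account_id, amount)
    return :skipped unless @store.claim(idempotency_key)

    @balances[account_id] = @balances.fetch(account_id, 0) + amount
    :applied
  end
end
```

In a real system the claim and the side effect should commit atomically (for instance in one database transaction); otherwise a crash between the two can drop the effect.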
Kafka streams events with strong ordering guarantees per partition. Producers write to partition leaders. Consumers read committed messages. The ruby-kafka gem provides Ruby integration. Kafka suits event sourcing architectures.
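Per-partition ordering only helps if related events land in the same partition. The sketch below models that idea with an in-memory log; it is not a Kafka client, and the CRC32-modulo rule is an assumption chosen for illustration (real clients use their own partitioners). `partition_for` and `PartitionedLog` are hypothetical names.

```ruby
require 'zlib'

# Maps a message key to a partition. Hashing the key means every event for
# the same key lands in the same partition and so keeps its relative order.
def partition_for(key, partition_count)
  Zlib.crc32(key.to_s) % partition_count
end

# Tiny in-memory model of a partitioned log (not a Kafka client).
class PartitionedLog
  attr_reader :partitions

  def initialize(partition_count)
    @partitions = Array.new(partition_count) { [] }
  end

  # Appends the event to the partition chosen by its key and returns the index.
  def append(key, event)
    index = partition_for(key, @partitions.size)
    @partitions[index] << [key, event]
    index
  end
end
```

Events for different keys may interleave arbitrarily across partitions, so consumers that need a global order cannot rely on this scheme.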
gRPC with Ruby enables efficient inter-service communication. gRPC supports streaming and bidirectional communication. Services can implement distributed protocols over gRPC. The grpc gem provides Ruby support.
Reference
Consistency Models Comparison
| Model | Ordering Guarantees | Visibility | Coordination Cost | Use Cases |
|---|---|---|---|---|
| Linearizability | Total order across all operations | Immediate global | High | Banking, inventory |
| Sequential | Single total order that respects each process's program order | Eventually global | Medium-High | Collaborative apps |
| Causal | Preserves causality | Eventually global | Medium | Social media, messaging |
| Eventual | No guarantees | Eventually global | Low | Caches, analytics |
| Monotonic Reads | No reversals for reader | Per session | Low-Medium | User timelines |
| Read-Your-Writes | User sees own writes | Per session | Low-Medium | Web applications |
| Session | Multiple guarantees per session | Per session | Medium | Shopping carts |
Consistency Levels in Popular Systems
| System | Strongest Level | Weakest Level | Tunable | Default |
|---|---|---|---|---|
| Redis | WAIT command | Async replication | No | Eventual |
| Cassandra | ALL | ONE | Yes | ONE |
| DynamoDB | Consistent read | Eventually consistent | Per operation | Eventual |
| MongoDB | Linearizable | Secondaries | Yes | Primary reads |
| Riak | ALL | ONE | Per operation | QUORUM |
| PostgreSQL | Serializable | Read committed (Read uncommitted is accepted but treated as Read committed) | Per transaction | Read committed |
| etcd | Linearizable | Serializable | Per read request | Linearizable |
Quorum Configurations
| R + W Configuration | Consistency Level | Availability Impact | Read Latency | Write Latency |
|---|---|---|---|---|
| R=N, W=1 | Strong while all N replicas are reachable (R + W > N) | High writes, low reads | High | Very Low |
| R=1, W=N | Strong if all survive | Low writes, high reads | Very Low | High |
| R=W=N/2+1 | Strong (overlapping quorums) | Medium | Medium | Medium |
| R=1, W=1 | Eventual | Very High | Very Low | Very Low |
| R=N, W=N | Strong | Very Low | High | High |
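The rows above follow from simple arithmetic on N, R, and W. A minimal sketch, with `quorum_properties` as a hypothetical helper name; quorum overlap is a necessary condition for strong reads, and real systems also need mechanisms such as read repair and version comparison.

```ruby
# Classifies a replication setup with n replicas, read quorum r, write quorum w.
def quorum_properties(n:, r:, w:)
  unless (1..n).cover?(r) && (1..n).cover?(w)
    raise ArgumentError, 'r and w must be between 1 and n'
  end

  {
    # Every read set intersects every write set, so a read contacts at
    # least one replica holding the latest acknowledged write.
    read_sees_latest_write: r + w > n,
    # Any two write sets intersect, so conflicting writes meet on some replica.
    writes_overlap: 2 * w > n,
    # Replica failures that still leave enough nodes for each operation.
    read_failures_tolerated: n - r,
    write_failures_tolerated: n - w
  }
end
```

For example, `quorum_properties(n: 3, r: 2, w: 2)` reports overlapping quorums with one tolerated failure for both reads and writes, while `n: 3, r: 1, w: 1` reports no overlap.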
Vector Clock Operations
| Operation | Purpose | Time Complexity | Space Complexity |
|---|---|---|---|
| Increment | Record local event | O(1) | O(N) |
| Merge | Synchronize replicas | O(N) | O(N) |
| Compare | Detect concurrency | O(N) | O(N) |
| Prune | Remove obsolete entries | O(N) | O(N) |
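The increment, merge, and compare operations in the table can be sketched as a small immutable class. `VectorClock` here is an illustrative implementation, not a library API, and pruning is omitted.

```ruby
# Minimal vector clock: a map from node id to event counter.
class VectorClock
  attr_reader :counters

  def initialize(counters = {})
    @counters = counters.dup.freeze
  end

  # Records a local event on `node`. The counter update is O(1); this
  # sketch also copies the map because clocks are kept immutable.
  def increment(node)
    VectorClock.new(@counters.merge(node => @counters.fetch(node, 0) + 1))
  end

  # Element-wise maximum of two clocks.
  def merge(other)
    VectorClock.new(@counters.merge(other.counters) { |_node, a, b| [a, b].max })
  end

  # Returns :equal, :before, :after, or :concurrent.
  def compare(other)
    nodes = @counters.keys | other.counters.keys
    less = nodes.any? { |n| @counters.fetch(n, 0) < other.counters.fetch(n, 0) }
    greater = nodes.any? { |n| @counters.fetch(n, 0) > other.counters.fetch(n, 0) }
    return :concurrent if less && greater
    return :before if less
    return :after if greater

    :equal
  end
end
```

A `:concurrent` result is the signal that two replicas wrote independently and a conflict resolution strategy has to decide the outcome.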
Conflict Resolution Strategies
| Strategy | Mechanism | Data Loss Risk | Implementation Complexity | Best For |
|---|---|---|---|---|
| Last-Write-Wins | Timestamp comparison | High (losing writes) | Low | Non-critical data |
| Version Vectors | Causal tracking | None (detects conflicts) | Medium | Structured data |
| CRDTs | Commutative operations | None (automatic merge) | Medium-High | Counters, sets |
| Application Merge | Custom logic | None (preserves all) | High | Business logic |
| Multi-Version | Keep all versions | None (requires resolution) | Medium | User intervention |
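As one concrete instance of the CRDT row, a grow-only counter merges without conflicts because its merge is commutative, associative, and idempotent. The `GCounter` class below is a minimal sketch with illustrative names, not a library API.

```ruby
# Grow-only counter (G-Counter): each node increments only its own slot,
# and merge takes the per-node maximum.
class GCounter
  attr_reader :counts

  def initialize(counts = {})
    @counts = counts.dup
  end

  def increment(node, by = 1)
    raise ArgumentError, 'G-Counter can only grow' if by.negative?

    @counts[node] = @counts.fetch(node, 0) + by
    self
  end

  # The counter's value is the sum of all per-node slots.
  def value
    @counts.values.sum
  end

  # Returns a new counter; merging in any order, any number of times,
  # yields the same result.
  def merge(other)
    GCounter.new(@counts.merge(other.counts) { |_node, a, b| [a, b].max })
  end
end
```

Replicas can exchange state in any order and still converge, which is why the table lists no data loss risk for this strategy.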
Consensus Algorithm Properties
| Algorithm | Fault Tolerance | Message Complexity | Leader Election | Recovery Time |
|---|---|---|---|---|
| Paxos | f < N/2 | O(N²) | Separate phase | Slow |
| Raft | f < N/2 | O(N) | Integrated | Fast |
| ZAB | f < N/2 | O(N) | Integrated | Fast |
| Viewstamped Replication | f < N/2 | O(N) | Integrated | Medium |
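The shared bound f < N/2 in the Fault Tolerance column is majority arithmetic. A small sketch, with `majority_quorum` and `tolerated_failures` as hypothetical helper names:

```ruby
# Smallest majority of a cluster with n voting members.
def majority_quorum(n)
  raise ArgumentError, 'cluster needs at least one member' if n < 1

  n / 2 + 1
end

# Largest number of crashed members that still leaves a majority (f < n/2).
def tolerated_failures(n)
  n - majority_quorum(n)
end
```

Both a 3-node and a 4-node cluster tolerate one failure under this rule, which is one reason odd cluster sizes are the usual choice.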
CAP Theorem Trade-offs
| System Type | Consistency | Availability | Partition Tolerance | Examples |
|---|---|---|---|---|
| CP | Strong | Sacrificed during partition | Tolerates partitions | etcd, ZooKeeper, HBase |
| AP | Eventual | Maintained during partition | Tolerates partitions | Cassandra, Riak, DynamoDB |
| CA | Strong | High | Fails on partition | Single-site RDBMS |
Ruby Gem Consistency Features
| Gem | System | Consistency Control | Connection Pooling | Retry Logic | Async Support |
|---|---|---|---|---|---|
| redis-rb | Redis | WAIT command | Via connection_pool gem | Optional | Limited |
| cassandra-driver | Cassandra | Per-query levels | Yes | Configurable | Yes |
| aws-sdk-dynamodb | DynamoDB | consistent_read parameter | Yes | Automatic | Yes |
| mongo | MongoDB | Read/write concerns | Yes | Configurable | Limited |
| etcdv3 | etcd | Built-in strong | Yes | Automatic | Limited |
| pg | PostgreSQL | Isolation levels | No (use connection_pool or ActiveRecord) | Manual | Yes (send_query, async_exec) |
Replication Patterns
| Pattern | Write Path | Read Path | Consistency | Failover Complexity | Scaling |
|---|---|---|---|---|---|
| Primary-Backup | Primary only | Primary or backups | Strong on primary | Medium | Vertical |
| Multi-Master | Any master | Any replica | Eventual | High | Horizontal |
| Chain Replication | Head node | Tail node | Strong | High | Horizontal |
| Quorum | W nodes | R nodes | Tunable | Low | Horizontal |
Distributed Transaction Approaches
| Approach | Atomicity | Isolation | Availability Impact | Complexity | Performance |
|---|---|---|---|---|---|
| Two-Phase Commit | Yes | Depends on participants' concurrency control | High (blocks on failure) | Medium | Low |
| Saga Pattern | Eventual | None | Low | High | High |
| Compensation | Eventual | None | Low | High | High |
| CRDT | None needed | None | Very Low | Medium | Very High |
| Consensus | Yes | Linearizable | Medium | High | Medium |
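The Saga and Compensation rows trade isolation for availability by undoing completed steps rather than holding locks. A minimal in-process sketch, assuming each step supplies its own compensation; the `Saga` class and step names are illustrative, and a production saga would also persist progress so that it survives a crash.

```ruby
# Runs steps in order; if one fails, runs the compensations of the
# already-completed steps in reverse order.
class Saga
  Step = Struct.new(:name, :action, :compensation)

  def initialize
    @steps = []
  end

  def step(name, action:, compensation:)
    @steps << Step.new(name, action, compensation)
    self
  end

  # Returns [:committed, names_completed] or [:compensated, names_undone].
  def run
    completed = []
    @steps.each do |s|
      s.action.call
      completed << s
    end
    [:committed, completed.map(&:name)]
  rescue StandardError
    undone = completed.reverse.each { |s| s.compensation.call }
    [:compensated, undone.map(&:name)]
  end
end
```

Between a step and its compensation, other clients can observe the intermediate state, which is the missing isolation the table refers to.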
Error Handling Strategies
| Error Type | Detection Method | Recovery Strategy | Client Action | System Action |
|---|---|---|---|---|
| Network Timeout | Operation timeout | Retry with backoff | Retry or fail | Queue for later |
| Split Brain | Quorum failure | Elect new leader | Wait and retry | Reconcile state |
| Version Conflict | Version mismatch | Re-read and retry | Merge or choose | Store both versions |
| Node Failure | Health check | Replica promotion | Redirect to new node | Failover |
| Partition | Quorum loss | Degrade gracefully | Cache or queue | Anti-entropy sync |
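The "Retry with backoff" strategy for network timeouts can be sketched as a small helper. `with_retries` and its parameters are hypothetical names; the `sleeper` argument exists only so the delay can be replaced in tests.

```ruby
# Retries the block on the listed errors with exponential backoff and jitter.
# Re-raises the last error once `max_attempts` is reached.
def with_retries(on:, max_attempts: 4, base_delay: 0.05, max_delay: 1.0,
                 sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *on
    raise if attempt >= max_attempts

    # Exponential cap: base_delay, 2x, 4x, ... bounded by max_delay.
    cap = [base_delay * (2**(attempt - 1)), max_delay].min
    # Full jitter: sleep a random fraction of the cap to avoid retry storms.
    sleeper.call(rand * cap)
    retry
  end
end
```

Retrying is only safe for idempotent operations, because a timed-out request may still have been applied by the server.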