CrackedRuby - CAP Theorem

Overview

The CAP theorem, also known as Brewer's theorem, states that a distributed data store cannot simultaneously provide more than two out of three guarantees: Consistency, Availability, and Partition tolerance. Eric Brewer introduced this concept in 2000, and Seth Gilbert and Nancy Lynch later proved it formally in 2002.

The theorem addresses fundamental constraints in distributed systems where data is replicated across multiple nodes. When network failures occur (partitions), the system must choose between maintaining consistency across all nodes or remaining available for read and write operations. This choice has significant implications for system design and behavior.

The three properties are:

Consistency: Every read receives the most recent write or an error. All nodes see the same data at the same time.

Availability: Every request receives a non-error response, without guarantee that it contains the most recent write. The system remains operational for reads and writes.

Partition Tolerance: The system continues to operate despite network partitions that prevent communication between nodes.

In practice, partition tolerance is not optional for distributed systems operating across networks. Network failures occur in real-world deployments, making the practical choice between consistency (CP systems) and availability (AP systems) when partitions happen.

# Conceptual representation of CAP trade-offs
class DistributedSystem
  def initialize(mode)
    @mode = mode # :cp or :ap
  end
  
  def handle_partition
    case @mode
    when :cp
      # Reject writes to maintain consistency
      raise PartitionError, "System unavailable during partition"
    when :ap
      # Accept writes, allow temporary inconsistency
      accept_write_with_eventual_consistency
    end
  end
end

Key Principles

The CAP theorem rests on specific definitions of its three properties that developers must understand to apply the theorem correctly.

Consistency in CAP Context

Consistency in the CAP theorem refers to linearizability, a specific consistency model stronger than eventual consistency or other weaker models. Linearizability requires that operations appear to occur instantaneously at some point between their invocation and completion. Once a write completes, all subsequent reads must return that value or a newer one. This differs from ACID consistency in databases, which concerns maintaining invariants and constraints within a single database.

Consider a distributed counter across two nodes. With linearizability, if Node A increments the counter from 5 to 6, and this operation completes, then any read from any node must return 6 or higher. No node can return 5 after the increment completes, even if network delays occur.

Availability in CAP Context

Availability means every request to a non-failing node must result in a response. The theorem defines this strictly: the system cannot refuse requests or delay indefinitely. A system that responds with stale data remains available, while a system that refuses requests to maintain consistency is not available in CAP terms.

This definition differs from common operational availability metrics like "five nines" (99.999% uptime). CAP availability is binary per request: either the system responds or it does not. Operational availability measures the percentage of time the system responds.

Partition Tolerance in CAP Context

Partition tolerance means the system continues operating when network partitions prevent some nodes from communicating. A partition splits the network into subgroups where nodes within a subgroup can communicate, but nodes in different subgroups cannot. The system must handle these scenarios because networks are unreliable.

Partition tolerance is mandatory for distributed systems. The meaningful choice is between consistency and availability when partitions occur. Systems cannot avoid partitions in environments spanning multiple data centers, cloud availability zones, or unreliable networks.

The Two-Out-Of-Three Choice

When a partition occurs, the system faces a binary decision:

CP systems sacrifice availability to maintain consistency. During a partition, nodes that cannot communicate with others refuse requests rather than risk serving stale data. This ensures all successful reads return the most recent write.

AP systems sacrifice consistency to maintain availability. During a partition, all nodes continue serving requests, accepting that different nodes may have different data temporarily. The system achieves eventual consistency after the partition heals.

Network Partitions Are Inevitable

Real-world networks experience various failure modes: broken cables, failed switches, routing issues, firewall misconfigurations, and cloud provider outages. These failures create partitions where some nodes become unreachable. Systems must account for these scenarios rather than assume perfect networks.

The duration of partitions varies from milliseconds to hours. Short partitions may go unnoticed by users, while extended partitions require explicit handling. The CAP theorem forces architects to decide how their system behaves during these inevitable events.

Design Considerations

Selecting between CP and AP models requires analyzing application requirements, user expectations, and business constraints. The choice affects system behavior, implementation complexity, and operational characteristics.

When to Choose CP (Consistency over Availability)

Financial systems, inventory management, and systems maintaining critical invariants typically require CP characteristics. An ATM that dispenses cash must ensure the account balance is accurate before completing the transaction. Displaying stale balance data and allowing withdrawals could enable overdrafts or duplicate spending.

Configuration management systems often choose consistency. If an application reads configuration determining access control rules, stale configuration could grant unauthorized access. Refusing requests during partitions provides better security than serving stale, potentially incorrect data.

Distributed locking and leader election require consistency. Multiple nodes cannot believe they hold the same lock simultaneously. These coordination primitives sacrifice availability to guarantee correctness.

# CP system example: Strong consistency requirement
class BankAccount
  def initialize(balance, consensus_system)
    @balance = balance
    @consensus = consensus_system
  end
  
  def withdraw(amount)
    # Must achieve consensus before proceeding
    lock = @consensus.acquire_lock(account_id)
    
    unless lock
      raise UnavailableError, "Cannot acquire lock during partition"
    end
    
    current_balance = @consensus.read_committed_value(:balance)
    
    if current_balance >= amount
      new_balance = current_balance - amount
      @consensus.write_committed_value(:balance, new_balance)
      lock.release
      new_balance
    else
      lock.release
      raise InsufficientFundsError
    end
  end
end

When to Choose AP (Availability over Consistency)

Social media feeds, caching layers, and analytics systems often prioritize availability. Users accept temporary inconsistencies in exchange for uninterrupted service. A user seeing a slightly outdated follower count causes minimal harm compared to the entire service becoming unavailable.

Session storage and shopping carts benefit from availability. A user adding items to a cart expects the operation to succeed immediately. If different data center nodes have slightly different cart contents temporarily, the system can reconcile differences when the user checks out.

Content delivery networks and DNS prioritize availability. Users receiving slightly stale cached content is preferable to failed requests. These systems use short TTLs and eventual consistency to converge on current data.

# AP system example: Prioritizing availability
class ShoppingCart
  def initialize(user_id, replica_set)
    @user_id = user_id
    @replicas = replica_set
  end
  
  def add_item(item_id, quantity)
    # Write to any available replica
    available_replica = @replicas.find { |r| r.healthy? }
    
    unless available_replica
      # Fall back to local cache if all replicas unavailable
      return add_to_local_cache(item_id, quantity)
    end
    
    # Write succeeds even if other replicas unreachable
    available_replica.write(
      user_id: @user_id,
      item_id: item_id,
      quantity: quantity,
      timestamp: Time.now.to_f
    )
    
    # Asynchronously propagate to other replicas
    @replicas.each do |replica|
      replica.async_replicate(user_id: @user_id) if replica != available_replica
    end
    
    true
  end
  
  def get_items
    # Return merged results from all available replicas
    items_by_replica = @replicas.select(&:healthy?).map(&:read)
    merge_with_conflict_resolution(items_by_replica)
  end
end

Hybrid Approaches and Tunable Consistency

Some systems allow tuning consistency versus availability per operation. Databases like Cassandra provide consistency levels ranging from ONE (high availability) to ALL (strong consistency). Applications can choose appropriate levels for each operation type.

Read-modify-write operations often require strong consistency, while simple reads may accept eventual consistency. A social media application might require strong consistency when creating posts but accept eventual consistency when loading feeds.

# Tunable consistency example
class TunableDataStore
  def write(key, value, consistency_level: :quorum)
    case consistency_level
    when :one
      # Write to one node, return immediately
      @replicas.first.write(key, value)
    when :quorum
      # Write to majority before returning
      required = (@replicas.size / 2) + 1
      write_to_n_replicas(key, value, required)
    when :all
      # Write to all nodes before returning
      write_to_all_replicas(key, value)
    end
  end
  
  def read(key, consistency_level: :quorum)
    case consistency_level
    when :one
      # Read from first available node
      @replicas.find(&:healthy?).read(key)
    when :quorum
      # Read from majority, return most recent
      required = (@replicas.size / 2) + 1
      values = read_from_n_replicas(key, required)
      resolve_conflicts(values)
    when :all
      # Read from all nodes, return most recent
      values = read_from_all_replicas(key)
      resolve_conflicts(values)
    end
  end
end

Operational Trade-offs

CP systems require more operational complexity during failures. Administrators must monitor partition detection, understand when the system becomes unavailable, and have procedures for recovering from split-brain scenarios where different subsets of nodes cannot communicate.

AP systems require conflict resolution mechanisms. When partitions heal, the system must merge divergent data states. This requires application-level logic to resolve conflicts, tombstones for deletions, and vector clocks or similar mechanisms to track causality.

Implementation Approaches

Different architectural patterns and algorithms implement CAP trade-offs. The choice affects performance, complexity, and failure handling characteristics.

Consensus Protocols for CP Systems

Consensus protocols like Raft and Paxos enable CP systems by ensuring nodes agree on state changes before committing them. These protocols elect a leader that coordinates writes. Followers reject writes during partitions when they cannot reach the leader, sacrificing availability for consistency.

Raft divides time into terms, each with at most one leader. Candidates become leaders by receiving votes from a majority of nodes. Write requests go to the leader, which replicates them to followers. Once a majority acknowledges the write, the leader commits it. This ensures consistency because any future leader must have all committed writes.

# Simplified Raft-inspired consensus
class ConsensusNode
  attr_reader :state, :current_term, :voted_for
  
  def initialize(node_id, cluster_members)
    @node_id = node_id
    @cluster = cluster_members
    @state = :follower
    @current_term = 0
    @voted_for = nil
    @log = []
    @commit_index = 0
  end
  
  def request_vote(term, candidate_id, last_log_index, last_log_term)
    # Reject if candidate's term is older
    return false if term < @current_term
    
    # Update term if candidate's term is newer
    if term > @current_term
      @current_term = term
      @voted_for = nil
      @state = :follower
    end
    
    # Grant vote if haven't voted and candidate's log is up-to-date
    if @voted_for.nil? && log_is_current?(last_log_index, last_log_term)
      @voted_for = candidate_id
      return true
    end
    
    false
  end
  
  def append_entries(term, leader_id, prev_log_index, prev_log_term, entries, leader_commit)
    # Reject if leader's term is older
    return false if term < @current_term
    
    # Accept leader and update term
    @current_term = term
    @state = :follower
    
    # Verify log consistency
    unless log_matches?(prev_log_index, prev_log_term)
      return false
    end
    
    # Append new entries
    append_to_log(entries, prev_log_index)
    
    # Update commit index
    if leader_commit > @commit_index
      @commit_index = [leader_commit, @log.size - 1].min
    end
    
    true
  end
  
  def client_write(data)
    unless @state == :leader
      raise NotLeaderError, "Forward to current leader"
    end
    
    # Append to log
    entry = { term: @current_term, data: data }
    @log << entry
    
    # Replicate to majority
    replicas_acked = replicate_to_followers(entry)
    
    if replicas_acked >= majority_size
      @commit_index = @log.size - 1
      apply_to_state_machine(entry)
      return true
    else
      # Could not reach majority - system unavailable
      raise UnavailableError, "Cannot achieve consensus"
    end
  end
  
  private
  
  def majority_size
    (@cluster.size / 2) + 1
  end
end

Last Write Wins for AP Systems

AP systems using eventual consistency often employ Last Write Wins (LWW) conflict resolution. Each write includes a timestamp. When conflicts arise, the write with the latest timestamp wins. This strategy is simple but can lose concurrent updates.

LWW works well for operations that naturally overwrite previous values, such as updating user profiles or configuration settings. It performs poorly for operations that should accumulate, like incrementing counters or adding items to sets.

CRDTs for AP Systems

Conflict-free Replicated Data Types (CRDTs) enable AP systems to merge concurrent updates without conflicts. CRDTs define data structures with merge operations that are commutative, associative, and idempotent. Replicas can apply updates in any order and converge to the same state.

G-Counter (Grow-only Counter) CRDTs track increments across replicas. Each replica maintains a vector of counts, one per replica. The total count is the sum of all vector elements. When replicas merge, they take the maximum count for each position.

# CRDT G-Counter implementation
class GCounter
  def initialize(replica_id, num_replicas)
    @replica_id = replica_id
    @counts = Array.new(num_replicas, 0)
  end
  
  def increment(amount = 1)
    @counts[@replica_id] += amount
  end
  
  def value
    @counts.sum
  end
  
  def merge(other_counter)
    @counts.each_with_index do |count, i|
      @counts[i] = [count, other_counter.counts[i]].max
    end
  end
  
  protected
  
  attr_reader :counts
end

# Usage across partitions
counter_a = GCounter.new(0, 3)
counter_b = GCounter.new(1, 3)

counter_a.increment(5)
counter_b.increment(3)

# After partition heals, merge states
counter_a.merge(counter_b)
counter_a.value # => 8, no data lost

Quorum-Based Replication

Quorum systems balance consistency and availability by requiring acknowledgment from multiple replicas. A write quorum of W replicas must acknowledge writes. A read quorum of R replicas must respond to reads. Setting W + R > N (total replicas) ensures reads always see the latest write.

With N=5, W=3, R=3, the system tolerates two replica failures. Writes require three replicas, and reads query three replicas, guaranteeing overlap with the most recent write. Setting W=3, R=1 prioritizes read availability over read consistency.

# Quorum-based storage
class QuorumStore
  def initialize(replicas, write_quorum, read_quorum)
    @replicas = replicas
    @write_quorum = write_quorum
    @read_quorum = read_quorum
  end
  
  def write(key, value)
    version = generate_version
    successful_writes = 0
    
    @replicas.each do |replica|
      begin
        replica.write(key, value, version)
        successful_writes += 1
        break if successful_writes >= @write_quorum
      rescue ReplicaError
        next
      end
    end
    
    if successful_writes >= @write_quorum
      true
    else
      raise InsufficientReplicasError, "Could not reach write quorum"
    end
  end
  
  def read(key)
    responses = []
    
    @replicas.each do |replica|
      begin
        responses << replica.read(key)
        break if responses.size >= @read_quorum
      rescue ReplicaError
        next
      end
    end
    
    if responses.size >= @read_quorum
      # Return value with highest version
      responses.max_by { |r| r[:version] }[:value]
    else
      raise InsufficientReplicasError, "Could not reach read quorum"
    end
  end
end

Compensating Transactions

Some systems use compensating transactions to provide eventual consistency while appearing available. The system accepts operations immediately and executes them asynchronously. If conflicts or failures occur, the system reverses the operation with a compensating transaction.

Payment systems may immediately confirm a purchase, then asynchronously verify the charge with the payment processor. If verification fails, the system cancels the order and notifies the customer. This approach maximizes availability while maintaining consistency through compensation.

Practical Examples

Concrete scenarios demonstrate how CAP trade-offs manifest in real systems and guide implementation decisions.

Distributed Counter Service

A web analytics service tracks page view counts across multiple data centers. The system must decide how to handle network partitions between data centers.

CP approach: The system designates one data center as the leader. All increments route to the leader, which replicates to followers. During a partition isolating the leader, the system rejects increments from isolated data centers. This guarantees accurate counts but makes the service unavailable in some regions during partitions.

class CPAnalyticsCounter
  def initialize(node_id, is_leader, peer_nodes)
    @node_id = node_id
    @is_leader = is_leader
    @peers = peer_nodes
    @counts = {}
  end
  
  def increment(page_id)
    unless @is_leader
      # Forward to leader
      leader = @peers.find(&:leader?)
      unless leader&.reachable?
        raise UnavailableError, "Cannot reach leader"
      end
      return leader.increment(page_id)
    end
    
    # Leader increments locally
    @counts[page_id] ||= 0
    @counts[page_id] += 1
    
    # Replicate to majority of followers
    acks = replicate_to_followers(page_id, @counts[page_id])
    
    unless acks >= majority_required
      # Rollback and reject
      @counts[page_id] -= 1
      raise UnavailableError, "Cannot reach quorum"
    end
    
    @counts[page_id]
  end
  
  def get_count(page_id)
    @counts[page_id] || 0
  end
end

AP approach: Each data center maintains its own counter. Increments succeed locally without coordination. The system periodically exchanges counts between data centers and merges them. During partitions, each data center continues accepting increments. Counts may temporarily diverge but eventually converge.

class APAnalyticsCounter
  def initialize(node_id, peer_nodes)
    @node_id = node_id
    @peers = peer_nodes
    @local_counts = {}  # This node's increments
    @merged_counts = {}  # Merged view from all nodes
  end
  
  def increment(page_id)
    # Always succeeds locally
    @local_counts[page_id] ||= 0
    @local_counts[page_id] += 1
    
    # Asynchronously propagate to peers
    @peers.each do |peer|
      peer.async_notify_increment(page_id, @node_id) if peer.reachable?
    end
    
    # Return local view (may be incomplete)
    get_count(page_id)
  end
  
  def get_count(page_id)
    local = @local_counts[page_id] || 0
    remote = @merged_counts[page_id] || 0
    local + remote
  end
  
  def merge_remote_counts(node_id, counts)
    counts.each do |page_id, count|
      @merged_counts[page_id] ||= 0
      @merged_counts[page_id] += count
    end
  end
  
  def sync_with_peers
    # Periodic full sync to handle missed increments
    @peers.each do |peer|
      next unless peer.reachable?
      
      peer_counts = peer.get_local_counts
      merge_remote_counts(peer.node_id, peer_counts)
    end
  end
end

Shopping Cart in Multi-Region Deployment

An e-commerce platform deploys across three geographic regions: US, EU, and ASIA. Users add items to carts, which must remain available during network issues between regions.

The system chooses an AP approach because cart operations should never fail, and temporary inconsistencies are acceptable. Each region maintains cart state locally. When users add items, the operation succeeds immediately in their region. The system asynchronously replicates cart changes to other regions.

During a partition between US and EU regions, a user's session may migrate from US to EU due to CDN routing. The EU region may not have the latest cart state. The system merges cart contents from both regions using timestamps and conflict resolution rules:

class DistributedCart
  def initialize(user_id, region_id, peer_regions)
    @user_id = user_id
    @region_id = region_id
    @peers = peer_regions
    @items = {}  # item_id => { quantity, timestamp, region_id }
  end
  
  def add_item(item_id, quantity)
    timestamp = Time.now.to_f
    
    @items[item_id] = {
      quantity: quantity,
      timestamp: timestamp,
      region_id: @region_id
    }
    
    # Replicate asynchronously
    @peers.each do |peer|
      peer.async_replicate_cart_item(@user_id, item_id, quantity, timestamp, @region_id)
    end
    
    true  # Always succeeds
  end
  
  def remove_item(item_id)
    timestamp = Time.now.to_f
    
    # Mark as tombstone rather than deleting
    @items[item_id] = {
      quantity: 0,
      timestamp: timestamp,
      region_id: @region_id,
      deleted: true
    }
    
    # Replicate deletion
    @peers.each do |peer|
      peer.async_replicate_cart_removal(@user_id, item_id, timestamp, @region_id)
    end
    
    true
  end
  
  def merge_remote_update(item_id, quantity, timestamp, source_region_id, deleted: false)
    existing = @items[item_id]
    
    # No existing entry - accept remote update
    unless existing
      @items[item_id] = {
        quantity: quantity,
        timestamp: timestamp,
        region_id: source_region_id,
        deleted: deleted
      }
      return
    end
    
    # Compare timestamps for conflict resolution
    if timestamp > existing[:timestamp]
      # Remote update is newer
      @items[item_id] = {
        quantity: quantity,
        timestamp: timestamp,
        region_id: source_region_id,
        deleted: deleted
      }
    elsif timestamp == existing[:timestamp]
      # Concurrent updates - use region ID as tiebreaker
      if source_region_id < existing[:region_id]
        @items[item_id] = {
          quantity: quantity,
          timestamp: timestamp,
          region_id: source_region_id,
          deleted: deleted
        }
      end
      # Otherwise keep existing
    end
    # Older updates are ignored
  end
  
  def get_items
    @items.reject { |_, item| item[:deleted] }
           .transform_values { |item| item[:quantity] }
  end
end

Session Management Across Data Centers

A web application maintains user sessions across multiple data centers. Sessions store authentication tokens, user preferences, and temporary state. The system must decide how to handle sessions during data center partitions.

CP approach creates problems: if the session data center becomes unreachable, users cannot authenticate until connectivity restores. This makes the application effectively unavailable.

AP approach with sticky sessions: users consistently route to one data center based on geographic proximity. Sessions replicate asynchronously to other data centers. During partitions, users continue accessing their local data center with current session state. If a data center fails entirely, users fail over to another data center with slightly stale session state, requiring re-authentication at worst.

class ReplicatedSession
  def initialize(session_id, primary_dc, replica_dcs)
    @session_id = session_id
    @primary_dc = primary_dc
    @replicas = replica_dcs
    @data = {}
    @version = 0
  end
  
  def set(key, value)
    @version += 1
    @data[key] = { value: value, version: @version }
    
    # Write to primary (local) immediately
    @primary_dc.write_session(@session_id, key, value, @version)
    
    # Async replicate to backups
    @replicas.each do |replica|
      replica.async_replicate_session(@session_id, key, value, @version)
    end
    
    true
  end
  
  def get(key)
    # Read from local store
    item = @data[key]
    item ? item[:value] : nil
  end
  
  def sync_from_replica(replica_data)
    # Merge data from replica using version numbers
    replica_data.each do |key, item|
      local_item = @data[key]
      
      if !local_item || item[:version] > local_item[:version]
        @data[key] = item
        @version = [@version, item[:version]].max
      end
    end
  end
end

Inventory Management System

An inventory system tracks product quantities across warehouses. The business requirement is that products cannot oversell (go negative). This necessitates a CP approach for decrement operations.

class CPInventorySystem
  def initialize(product_id, warehouses, consensus_service)
    @product_id = product_id
    @warehouses = warehouses
    @consensus = consensus_service
  end
  
  def reserve_stock(quantity, warehouse_id)
    # Must acquire distributed lock
    lock = @consensus.acquire_lock("inventory:#{@product_id}")
    
    unless lock
      raise UnavailableError, "Cannot acquire lock during partition"
    end
    
    begin
      # Read current inventory with strong consistency
      current_stock = @consensus.read_committed(
        "inventory:#{@product_id}:#{warehouse_id}"
      )
      
      if current_stock >= quantity
        new_stock = current_stock - quantity
        @consensus.write_committed(
          "inventory:#{@product_id}:#{warehouse_id}",
          new_stock
        )
        return { success: true, remaining: new_stock }
      else
        return { success: false, error: "Insufficient stock" }
      end
    ensure
      lock.release
    end
  end
  
  def check_stock(warehouse_id)
    # Reads can be eventually consistent
    @warehouses[warehouse_id].read_local("inventory:#{@product_id}")
  end
end

Ruby Implementation

Ruby applications interact with distributed systems implementing various CAP trade-offs. Ruby provides libraries and patterns for building both CP and AP systems.

Using Redis for AP Caching

Redis operates as an AP system when deployed in master-replica configuration without wait commands. Writes succeed on the master without waiting for replica acknowledgment. During partitions, stale reads may occur from replicas.

require 'redis'

class APCacheService
  def initialize
    @master = Redis.new(url: ENV['REDIS_MASTER_URL'])
    @replicas = [
      Redis.new(url: ENV['REDIS_REPLICA_1_URL']),
      Redis.new(url: ENV['REDIS_REPLICA_2_URL'])
    ]
  end
  
  def set(key, value, ttl: 3600)
    # Write to master, don't wait for replication
    @master.setex(key, ttl, value)
    true
  rescue Redis::CannotConnectError
    # Master unavailable - system becomes unavailable for writes
    false
  end
  
  def get(key)
    # Try master first
    begin
      return @master.get(key)
    rescue Redis::CannotConnectError
      # Fall back to replicas
    end
    
    # Try replicas in order
    @replicas.each do |replica|
      begin
        return replica.get(key)
      rescue Redis::CannotConnectError
        next
      end
    end
    
    # All nodes unavailable
    nil
  end
  
  def get_with_health_check(key)
    # Select healthy replica
    available_nodes = [@master] + @replicas.select do |r|
      r.ping == 'PONG' rescue false
    end
    
    return nil if available_nodes.empty?
    
    # Read from random healthy node
    available_nodes.sample.get(key)
  end
end

Using etcd for CP Configuration

etcd provides strong consistency using the Raft consensus algorithm. Ruby applications use etcd for configuration, service discovery, and distributed locks.

require 'etcdv3'

class CPConfigurationService
  def initialize
    @etcd = Etcdv3.new(
      endpoints: ENV['ETCD_ENDPOINTS'].split(','),
      command_timeout: 5
    )
  end
  
  def get_config(key)
    response = @etcd.get(key)
    response.kvs.first&.value
  rescue GRPC::Unavailable, GRPC::DeadlineExceeded
    # etcd cluster unavailable or can't reach quorum
    raise ConfigUnavailableError, "Cannot reach etcd cluster"
  end
  
  def set_config(key, value)
    @etcd.put(key, value)
    true
  rescue GRPC::Unavailable
    # Cannot reach quorum - refuse operation
    raise ConfigUnavailableError, "Cannot reach etcd quorum"
  end
  
  def acquire_lock(key, ttl: 30)
    lease = @etcd.lease_grant(ttl)
    
    # Attempt to acquire lock using compare-and-swap
    success = @etcd.transaction do |txn|
      txn.compare = [
        txn.create_revision(key) == 0  # Key doesn't exist
      ]
      txn.success = [
        txn.put(key, Process.pid.to_s, lease: lease.id)
      ]
      txn.failure = []
    end
    
    if success.succeeded
      Lock.new(@etcd, key, lease)
    else
      @etcd.lease_revoke(lease.id)
      nil
    end
  rescue GRPC::Unavailable
    raise LockUnavailableError, "Cannot acquire lock during partition"
  end
end

class Lock
  def initialize(etcd, key, lease)
    @etcd = etcd
    @key = key
    @lease = lease
  end
  
  def release
    @etcd.lease_revoke(@lease.id)
  end
  
  def keep_alive
    @etcd.lease_keep_alive_once(@lease.id)
  end
end

Building a CRDT in Ruby

Ruby applications can implement CRDTs for AP characteristics with automatic conflict resolution.

# PN-Counter (Positive-Negative Counter) CRDT
class PNCounter
  def initialize(replica_id, num_replicas)
    @replica_id = replica_id
    @positive = Array.new(num_replicas, 0)
    @negative = Array.new(num_replicas, 0)
  end
  
  def increment(amount = 1)
    @positive[@replica_id] += amount
  end
  
  def decrement(amount = 1)
    @negative[@replica_id] += amount
  end
  
  def value
    @positive.sum - @negative.sum
  end
  
  def merge(other)
    @positive.each_with_index do |count, i|
      @positive[i] = [count, other.positive[i]].max
    end
    
    @negative.each_with_index do |count, i|
      @negative[i] = [count, other.negative[i]].max
    end
    
    self
  end
  
  def to_h
    {
      replica_id: @replica_id,
      positive: @positive,
      negative: @negative
    }
  end
  
  def self.from_h(hash)
    counter = allocate
    counter.instance_variable_set(:@replica_id, hash[:replica_id])
    counter.instance_variable_set(:@positive, hash[:positive])
    counter.instance_variable_set(:@negative, hash[:negative])
    counter
  end
  
  protected
  
  attr_reader :positive, :negative
end

# Usage in distributed system
replica_1 = PNCounter.new(0, 3)
replica_2 = PNCounter.new(1, 3)

# Concurrent operations during partition
replica_1.increment(10)
replica_1.decrement(3)

replica_2.increment(5)
replica_2.decrement(2)

# Merge after partition heals
replica_1.merge(replica_2)
replica_1.value  # => 10

Implementing Circuit Breaker for Partition Handling

Circuit breakers detect partition-like failures and fail fast rather than timing out repeatedly.

class CircuitBreaker
  STATES = [:closed, :open, :half_open].freeze
  
  def initialize(failure_threshold: 5, timeout: 60, half_open_attempts: 3)
    @failure_threshold = failure_threshold
    @timeout = timeout
    @half_open_attempts = half_open_attempts
    @state = :closed
    @failure_count = 0
    @last_failure_time = nil
    @half_open_successes = 0
  end
  
  def call
    case @state
    when :open
      if time_to_retry?
        attempt_reset
        yield
      else
        raise CircuitOpenError, "Circuit breaker is open"
      end
    when :half_open
      execute_half_open { yield }
    when :closed
      execute_closed { yield }
    end
  end
  
  private
  
  def execute_closed
    yield.tap { on_success }
  rescue => e
    on_failure
    raise e
  end
  
  def execute_half_open
    yield.tap do
      @half_open_successes += 1
      if @half_open_successes >= @half_open_attempts
        reset_circuit
      end
    end
  rescue => e
    trip_circuit
    raise e
  end
  
  def on_success
    @failure_count = 0
  end
  
  def on_failure
    @failure_count += 1
    @last_failure_time = Time.now
    
    trip_circuit if @failure_count >= @failure_threshold
  end
  
  def trip_circuit
    @state = :open
    @last_failure_time = Time.now
  end
  
  def attempt_reset
    @state = :half_open
    @half_open_successes = 0
  end
  
  def reset_circuit
    @state = :closed
    @failure_count = 0
  end
  
  def time_to_retry?
    @last_failure_time && (Time.now - @last_failure_time) >= @timeout
  end
end

# Usage with distributed service
class DistributedService
  def initialize
    @circuit_breaker = CircuitBreaker.new(
      failure_threshold: 3,
      timeout: 30
    )
  end
  
  def call_remote_service(data)
    @circuit_breaker.call do
      # Call that may fail during partitions
      HTTParty.post('https://remote-service/api', body: data, timeout: 5)
    end
  rescue CircuitOpenError
    # Circuit is open - fail fast with cached/default response
    get_cached_response(data)
  end
end

Real-World Applications

Production systems demonstrate CAP trade-offs in practice across industries and architectures.

Cassandra - Tunable AP Database

Apache Cassandra provides tunable consistency levels allowing per-operation CAP trade-offs. Applications using Cassandra through Ruby clients choose consistency based on operation criticality.

Social media timelines use eventual consistency (CL=ONE) for high availability. Users accept slight delays in seeing new posts. Account authentication uses quorum reads (CL=QUORUM) to prevent login with stale credentials. Financial transactions use strong consistency (CL=ALL) to prevent duplicate charges.

Cassandra's implementation uses consistent hashing for data distribution and vector clocks for conflict detection. During partitions, different consistency levels behave differently: ONE continues operating with potentially stale data, QUORUM refuses operations if the quorum becomes unreachable, ALL requires all replicas.

DynamoDB - AP with Conditional CP

Amazon DynamoDB defaults to eventual consistency for reads, providing high availability. Applications requiring stronger consistency use strongly consistent reads, trading some availability. DynamoDB's conditional writes provide CP characteristics for operations requiring atomicity.

E-commerce sites use DynamoDB for product catalogs with eventual consistency. Catalog reads remain available during network issues, accepting slightly stale data. Order placement uses conditional writes to prevent overselling, choosing consistency over availability for critical operations.

PostgreSQL with Replication - CP Database

PostgreSQL with synchronous replication provides CP characteristics. The primary database waits for replicas to acknowledge writes before committing. During partitions, if replicas become unreachable, writes fail to maintain consistency.

Financial services use PostgreSQL for transactional data requiring strong consistency. Account balances, transaction histories, and audit logs must remain consistent. The system rejects writes during partitions rather than risk data inconsistency.

Consul - CP Service Discovery

HashiCorp Consul provides service discovery and configuration with strong consistency using Raft consensus. Services register with Consul and query for other services. During partitions, services cannot register or update if isolated from the Consul cluster.

Microservice architectures accept reduced availability during partitions in exchange for consistent service routing. Stale service discovery could route traffic to failed instances or outdated endpoints, causing cascading failures.

Memcached Cluster - AP Cache

Memcached clusters provide no consistency guarantees across nodes. Each node operates independently. During partitions, clients may read stale data or fail to invalidate caches across all nodes. Applications accept this trade-off for cache availability.

Web applications use Memcached for session storage and page fragments. Stale cached data causes minor issues like showing outdated page views or requiring re-login. The alternative - unavailable cache during partitions - would degrade performance severely.

Reference

CAP Theorem Properties

Property	Definition	System Behavior
Consistency	All nodes see the same data simultaneously	Reads return the most recent write or error
Availability	Every request receives a response	System accepts all requests to non-failing nodes
Partition Tolerance	System operates despite network failures	Continues functioning when nodes cannot communicate

System Classifications

Type	Guarantees	Trade-offs	Use Cases
CP	Consistency + Partition Tolerance	Refuses requests during partitions	Financial systems, inventory, configuration
AP	Availability + Partition Tolerance	Serves potentially stale data	Social media, caching, session storage
CA	Consistency + Availability	Cannot handle partitions	Single-node systems only

Consistency Models

Model	Definition	Latency	Use Case
Linearizability	Operations appear instantaneous	High	Coordination, locking
Sequential	Operations execute in some total order	Medium	General purpose
Causal	Causally related operations ordered	Medium	Collaborative editing
Eventual	All replicas converge eventually	Low	High availability systems
Read Your Writes	Clients see their own writes	Low	Session data

Consensus Algorithms

Algorithm	Leader Election	Partition Behavior	Complexity
Raft	Yes	Rejects writes without majority	Moderate
Paxos	Yes	Rejects writes without majority	High
ZAB	Yes	Rejects writes without quorum	Moderate
Viewstamped Replication	Yes	Rejects writes without quorum	Moderate

Conflict Resolution Strategies

Strategy	Description	Data Loss Risk	Complexity
Last Write Wins	Newest timestamp wins	Concurrent updates lost	Low
Vector Clocks	Track causality per replica	None if combined with merge	Medium
CRDTs	Mathematically convergent types	None by design	Medium
Application Logic	Custom merge rules	Depends on implementation	High

Quorum Configuration Trade-offs

Configuration	R	W	N	Read Latency	Write Latency	Consistency
Strong	3	3	5	Medium	Medium	Strong
Balanced	2	3	5	Low	Medium	Eventually strong
Read Optimized	1	5	5	Low	High	Eventually strong
Write Optimized	5	1	5	High	Low	Weak

Ruby Libraries for Distributed Systems

Library	CAP Profile	Use Case	Key Features
redis-rb	AP	Caching, sessions	Master-replica replication
etcdv3	CP	Configuration, locking	Raft consensus
cassandra-driver	Tunable	Distributed database	Per-operation consistency levels
bunny	Depends on RabbitMQ	Message queuing	Clustering with mirrored queues
connection_pool	N/A	Connection management	Handles failover logic

Operational Monitoring Metrics

Metric	Purpose	Alert Threshold
Partition Detection Rate	Track network failures	Baseline + 2 std dev
Quorum Achievement Rate	Monitor consensus health	Below 99.9%
Replication Lag	Measure data freshness	Above 1 second
Failed Request Rate	Track availability	Above 0.1%
Conflict Resolution Rate	Monitor AP system health	Sudden increases
Leader Election Time	CP system recovery speed	Above 10 seconds

CAP Theorem