CrackedRuby - Column-Family Stores

Overview

Column-family stores represent a NoSQL database model that organizes data into column families rather than traditional row-based tables. Unlike relational databases that store complete rows together, column-family stores group related columns into families and distribute data across nodes based on row keys. This architecture originated from Google's Bigtable paper and powers systems handling massive datasets across distributed clusters.

The fundamental distinction lies in how data is physically stored and accessed. A row-oriented relational database typically stores an entire row contiguously on disk, so a read fetches all columns even when only specific attributes are needed. Column-family stores organize data so that columns within a family are stored together, enabling efficient retrieval of specific attributes across millions of rows without scanning unnecessary data.

Consider a user profile system. A relational database stores each user as a complete row:

UserID | Name | Email | LastLogin | PreferenceA | PreferenceB | ...
1      | Alice| a@... | 2025-01-15| true        | false       | ...

A column-family store structures the same data differently:

RowKey: user:1
  Identity:Name = "Alice"
  Identity:Email = "a@example.com"
  Activity:LastLogin = "2025-01-15"
  Preferences:A = "true"
  Preferences:B = "false"

Each column family (Identity, Activity, Preferences) can be stored and retrieved independently. Reading login activity does not require loading preference data, reducing I/O operations dramatically when dealing with wide rows containing hundreds of attributes.

Column-family stores excel in scenarios requiring high write throughput, flexible schemas, and the ability to scale horizontally across commodity hardware. Time-series data, user activity tracking, content management systems, and recommendation engines commonly use this model. The architecture trades ACID guarantees and complex joins for scalability and availability, following the principles outlined in the CAP theorem.

Key Principles

The column-family data model organizes information into a hierarchical structure of keyspaces, column families, rows, and columns. A keyspace functions as the top-level namespace, similar to a database in relational systems, containing configuration for replication and data distribution strategies. Within a keyspace, column families define logical groupings of related data with shared physical storage characteristics.

Each row is identified by a unique row key, functioning as the primary access path to data. Row keys determine data distribution across cluster nodes through partitioning functions. The system hashes row keys to assign data to specific nodes, ensuring balanced distribution and parallel processing capabilities. Rows within a column family need not contain the same columns, providing schema flexibility absent in relational models.
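As a rough illustration of hash partitioning, the sketch below maps row keys to nodes by hashing them. The node names and the modulo placement are assumptions made for brevity; production systems such as Cassandra place keys on a token ring with their own partitioner rather than taking an MD5 value modulo the node count.

```ruby
require 'digest'

# Hypothetical three-node cluster; the names are illustrative only.
NODES = %w[node-a node-b node-c].freeze

# Hash the row key to a large integer, then map it onto a node.
def node_for(row_key, nodes = NODES)
  token = Digest::MD5.hexdigest(row_key).to_i(16)
  nodes[token % nodes.size]
end

keys = (1..9).map { |i| "user:#{i}" }
placement = keys.group_by { |key| node_for(key) }
placement.each { |node, owned| puts "#{node}: #{owned.join(', ')}" }
```

The same key always maps to the same node, which is what lets any client or coordinator locate a row without a central directory.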

Columns consist of three components: a name, value, and timestamp. The timestamp enables versioning, allowing the database to maintain multiple versions of the same column and resolve conflicts in distributed environments. Column names themselves carry semantic meaning and often include dynamic elements:

# Column structure representation
{
  name: "temperature:2025-01-15:12:00",
  value: "72.5",
  timestamp: 1736946000000
}

This structure allows for sparse data storage. If a row lacks a particular column, nothing is stored for that column, unlike fixed-schema relational tables, where every row carries a slot or NULL marker for each declared column. This property makes column-family stores efficient for datasets where different rows contain varying attributes.

Column families group columns with similar access patterns. Data within a column family is stored contiguously on disk, compressed together, and cached as a unit. Proper column family design directly impacts performance. Frequently accessed columns should reside in the same family, while rarely accessed large objects should occupy separate families to avoid loading unnecessary data.

Many column-family stores, Cassandra among them, favor eventual consistency over strict ACID transactions. A write is acknowledged once the number of replicas required by the configured consistency level has responded, and it reaches the remaining replicas asynchronously. Conflicting versions are reconciled later using timestamps (last write wins) or application-defined resolution strategies. This approach prioritizes availability and partition tolerance over immediate consistency.
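Timestamp-based reconciliation can be sketched in a few lines. This is a minimal illustration, assuming a last-write-wins rule and ignoring tie-breaking; the Column struct and resolve helper are hypothetical names, not a driver API.

```ruby
# A column version as described above: name, value, and write timestamp.
Column = Struct.new(:name, :value, :timestamp)

# Pick the winning version among replica responses: the highest timestamp wins.
def resolve(versions)
  versions.max_by(&:timestamp)
end

replica_responses = [
  Column.new("email", "old@example.com", 1_736_946_000_000),
  Column.new("email", "new@example.com", 1_736_946_005_000),
  Column.new("email", "old@example.com", 1_736_946_000_000)
]

winner = resolve(replica_responses)
puts winner.value
```

Because the outcome depends only on timestamps, clock skew between writers can cause an older value to win, which is one reason these systems rely on synchronized clocks.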

Data is written to an append-only commit log for durability, then stored in memory structures called memtables. When memtables reach size thresholds, the system flushes them to disk as immutable SSTable files. Background compaction processes merge SSTables, removing obsolete versions and deleted data. This log-structured merge tree (LSM tree) architecture optimizes write performance:

# Write path conceptual flow (in-memory stand-ins for the real structures)
commit_log = []                                      # append-only log
memtable = Hash.new { |rows, key| rows[key] = {} }   # row key => columns

write_request = { row_key: "user:1000", column: "email", value: "user@example.com" }

# 1. Append to commit log (sequential write)
commit_log << write_request

# 2. Write to memtable (memory structure)
memtable[write_request[:row_key]][write_request[:column]] = write_request[:value]

# 3. Return success to client
# 4. Asynchronously flush memtable to SSTable when full
# 5. Background compaction merges SSTables

Reads query the memtable first, then search SSTables from newest to oldest until the requested data is found. Bloom filters accelerate this process by quickly ruling out SSTables that cannot contain a particular row key, avoiding unnecessary disk reads. A Bloom filter can return false positives but never false negatives, so a negative answer is always safe to trust.
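The flush and read path can be modeled with a toy in-memory store. This is a sketch under simplifying assumptions: ToyStore and ToySSTable are hypothetical classes, the commit log is omitted, and an exact Set stands in for the Bloom filter, so it never produces the false positives a real filter would.

```ruby
require 'set'

# Toy SSTable: an immutable hash plus a key set standing in for a Bloom filter.
class ToySSTable
  attr_reader :reads

  def initialize(data)
    @data = data.dup.freeze
    @filter = Set.new(data.keys)
    @reads = 0
  end

  def might_contain?(key)
    @filter.include?(key)
  end

  def get(key)
    @reads += 1 # count simulated disk reads
    @data[key]
  end
end

class ToyStore
  attr_reader :sstables

  def initialize(memtable_limit: 2)
    @memtable = {}
    @sstables = [] # newest first
    @memtable_limit = memtable_limit
  end

  def put(key, value)
    @memtable[key] = value
    flush if @memtable.size >= @memtable_limit
  end

  def get(key)
    return @memtable[key] if @memtable.key?(key)

    @sstables.each do |table|
      next unless table.might_contain?(key) # skipped without a disk read
      return table.get(key)
    end
    nil
  end

  private

  # Freeze the current memtable into a new SSTable and start a fresh memtable.
  def flush
    @sstables.unshift(ToySSTable.new(@memtable))
    @memtable = {}
  end
end

store = ToyStore.new
store.put("user:1", "alice")
store.put("user:2", "bob")       # memtable full: flushed to the first SSTable
store.put("user:1", "alice-v2")
store.put("user:3", "carol")     # flushed to a second, newer SSTable
puts store.get("user:1")         # the newest SSTable holds the current version
puts store.get("user:2")         # found in the older SSTable
```

Looking up "user:2" never touches the newer SSTable, because its filter reports that the key is absent. That skipped read is the saving a Bloom filter provides.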

Replication strategies determine how data copies distribute across the cluster. The replication factor specifies the number of replicas, typically three for production systems. Simple strategy assigns replicas to consecutive nodes in the cluster ring, suitable for single-datacenter deployments. Network topology strategy places replicas across different racks and datacenters, providing resilience against hardware and facility failures.
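Replica placement under the simple strategy can be illustrated by walking a ring. The sketch below assumes a hypothetical five-node ring listed in token order and a primary node that has already been chosen by the partitioner; it does not model racks or datacenters.

```ruby
# Hypothetical ring of five nodes in token order.
RING = %w[n1 n2 n3 n4 n5].freeze

# Return the primary node plus the next (replication_factor - 1) nodes clockwise.
def replicas_for(primary_index, replication_factor, ring = RING)
  (0...replication_factor).map { |offset| ring[(primary_index + offset) % ring.size] }
end

p replicas_for(3, 3)
```

With the primary at index 3 and a replication factor of three, the replicas are n4, n5, and n1, wrapping around the end of the ring.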

Design Considerations

Column-family stores suit specific access patterns and scaling requirements. The decision to adopt this model depends on workload characteristics, consistency requirements, and operational complexity tolerance.

Write-heavy workloads with append-dominated patterns benefit significantly from column-family stores. The LSM tree architecture converts random writes into sequential log appends, achieving high throughput on rotational disks and SSDs alike. Systems ingesting sensor data, event logs, or user activity streams can sustain substantially higher write rates than B-tree based storage engines, though the margin depends on hardware and workload. However, this advantage comes with read amplification, as queries may examine multiple SSTables to construct the current state.

Schema flexibility addresses evolving data requirements without expensive migrations. Adding new columns to existing rows requires no schema alterations or table locks. Different rows can contain completely different column sets, accommodating heterogeneous data naturally. This property benefits content management systems, product catalogs, and multi-tenant applications where data attributes vary significantly between entities:

# Different products with varying attributes
electronics = {
  row_key: "product:1000",
  columns: {
    "basic:name" => "Laptop",
    "basic:price" => "999.99",
    "specs:cpu" => "Intel i7",
    "specs:ram" => "16GB",
    "specs:warranty" => "3 years"
  }
}

clothing = {
  row_key: "product:2000",
  columns: {
    "basic:name" => "Shirt",
    "basic:price" => "29.99",
    "specs:size" => "M",
    "specs:color" => "Blue",
    "specs:material" => "Cotton"
    # No CPU, RAM, or warranty columns needed
  }
}

Time-series data aligns naturally with the column-family model. Timestamps embedded in column names create self-organizing time-ordered data, enabling efficient range queries without secondary indexes. IoT systems, financial tick data, and monitoring applications commonly use this pattern. The wide-row approach stores all measurements for a device or entity in a single row with timestamp-qualified columns, reducing the number of row keys and improving cache locality.
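The range-query property follows from column names sorting lexicographically. A minimal sketch, assuming an in-memory hash as the wide row and a hypothetical column_slice helper; a real store keeps columns sorted on disk and seeks to the start of the range rather than sorting on every read.

```ruby
# One wide row for a device. Column names embed the timestamp, so sorting the
# names lexicographically also sorts each metric by time.
row = {
  "temperature:2025-01-15:02:00" => 73.1,
  "humidity:2025-01-15:00:00"    => 45,
  "temperature:2025-01-15:00:00" => 72.5,
  "temperature:2025-01-15:01:00" => 72.8
}

# Return [name, value] pairs whose column names fall within [from, to].
def column_slice(row, from, to)
  row.keys.sort
     .select { |name| name >= from && name <= to }
     .map { |name| [name, row[name]] }
end

slice = column_slice(row, "temperature:2025-01-15:00:00", "temperature:2025-01-15:01:00")
slice.each { |name, value| puts "#{name} = #{value}" }
```

This only works when timestamps are written in a fixed-width, most-significant-first format, as they are here.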

Conversely, column-family stores present challenges for certain workload types. Complex queries requiring joins across multiple entity types perform poorly. The absence of foreign key relationships and query optimizers means applications must orchestrate multi-step reads and implement join logic. OLAP workloads with ad-hoc analytical queries suffer from the lack of flexible indexing and aggregation capabilities present in relational systems.

Transactions spanning multiple rows or column families require application-level coordination. While most column-family stores support lightweight transactions for single-row updates, distributed transactions across partitions are either unavailable or carry significant performance penalties. Applications requiring strong consistency guarantees across related entities should carefully evaluate whether denormalization can eliminate cross-partition transaction needs or whether a different database model better fits requirements.

Data modeling in column-family stores inverts relational design principles. Denormalization becomes standard practice, duplicating data across multiple rows to support different access patterns. Query patterns drive schema design rather than entity relationships. The same logical entity may appear in multiple column families optimized for specific queries:

# Denormalized design for different access patterns

# User profile access by user ID
user_by_id = {
  row_key: "user:#{user_id}",
  column_family: "profiles",
  columns: {
    "name" => "Alice",
    "email" => "alice@example.com",
    "created" => "2024-01-15"
  }
}

# User lookup by email
user_by_email = {
  row_key: "email:alice@example.com",
  column_family: "email_index",
  columns: {
    "user_id" => "user:12345"
  }
}

This duplication increases storage costs but eliminates the need for secondary indexes and scatter-gather queries. Write operations must update all denormalized copies atomically or accept eventual consistency between related rows.

Operational complexity increases compared to managed relational databases. Column-family stores require cluster management, compaction tuning, replication monitoring, and capacity planning across distributed nodes. Teams must develop expertise in partition key design, consistency level selection, and cluster topology configuration. Small datasets that fit comfortably on a single server rarely justify this operational overhead.

Implementation Approaches

Data modeling strategies in column-family stores differ fundamentally from relational normalization. The approach begins with access pattern identification rather than entity relationship modeling. Each distinct query pattern may require a separate physical table optimized for that specific read or write path.

The wide-row pattern stores related data in a single row with many columns, typically using compound column names to encode multiple dimensions. Time-series applications use this extensively:

RowKey: sensor:device123
Columns:
  temperature:2025-01-15:00:00 = 72.5
  temperature:2025-01-15:01:00 = 72.8
  temperature:2025-01-15:02:00 = 73.1
  humidity:2025-01-15:00:00 = 45
  humidity:2025-01-15:01:00 = 46
  ...

This pattern supports efficient range queries over time intervals without scanning multiple rows. The tradeoff involves row size management, as infinitely growing rows eventually degrade performance. Bucketing strategies partition time-series data across multiple rows:

require 'time' # provides Time.parse

# Time-bucketed approach
def generate_row_key(device_id, timestamp)
  bucket = timestamp.strftime("%Y-%m-%d") # Daily buckets
  "sensor:#{device_id}:#{bucket}"
end

# Creates separate rows for each day
row_2025_01_15 = generate_row_key("device123", Time.parse("2025-01-15"))
# => "sensor:device123:2025-01-15"

row_2025_01_16 = generate_row_key("device123", Time.parse("2025-01-16"))
# => "sensor:device123:2025-01-16"
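Reading a time range under this scheme means visiting one row per bucket. A minimal sketch, assuming the same key layout as generate_row_key above; row_keys_for_range is a hypothetical helper, not part of any driver.

```ruby
require 'date'

# Same key layout as generate_row_key: "sensor:<device>:<YYYY-MM-DD>".
def row_keys_for_range(device_id, from_date, to_date)
  (from_date..to_date).map do |day|
    "sensor:#{device_id}:#{day.strftime('%Y-%m-%d')}"
  end
end

keys = row_keys_for_range("device123", Date.new(2025, 1, 14), Date.new(2025, 1, 16))
puts keys
```

The bucket size sets the tradeoff: daily buckets keep rows small but make a 30-day query touch 30 rows, which the application can fetch concurrently.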

The composite key pattern incorporates multiple attributes into the row key to support different query dimensions. A messaging application might use sender-receiver-timestamp combinations:

RowKey: message:user123:user456:2025-01-15:14:30
Columns:
  subject = "Meeting reminder"
  body = "Don't forget about tomorrow's meeting"
  status = "read"

This approach enables efficient queries for messages between specific users within time ranges. However, querying all messages for a user across all conversations requires maintaining a separate index structure.

Inverted index patterns implement secondary access paths by maintaining lookup tables:

# Primary table: content by document ID
document_table = {
  row_key: "doc:12345",
  columns: {
    "title" => "Ruby Performance",
    "content" => "...",
    "tags" => ["ruby", "performance", "optimization"]
  }
}

# Inverted index: documents by tag
tag_index = {
  row_key: "tag:ruby",
  columns: {
    "doc:12345" => "",  # Column name is document ID, value is empty or metadata
    "doc:12346" => "",
    "doc:12350" => ""
  }
}

Applications must update both tables during writes to maintain consistency. The empty column values or lightweight metadata minimize storage overhead while the column names serve as the actual index entries.
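The dual write can be sketched with plain hashes standing in for the two tables. The save_document helper is hypothetical; against a real cluster it would issue one write for the document row and one per tag row.

```ruby
# In-memory stand-ins for the primary table and the inverted index.
documents = {}
tag_index = Hash.new { |index, key| index[key] = {} }

# Write the document row, then one index column per tag.
def save_document(documents, tag_index, doc_id, title, tags)
  documents[doc_id] = { "title" => title, "tags" => tags }
  tags.each { |tag| tag_index["tag:#{tag}"][doc_id] = "" }
end

save_document(documents, tag_index, "doc:12345", "Ruby Performance", %w[ruby performance])
save_document(documents, tag_index, "doc:12346", "Ruby Testing", %w[ruby testing])

p tag_index["tag:ruby"].keys
```

The sketch covers inserts only. Removing a tag from a document also requires deleting the matching index column, otherwise the index returns stale entries.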

Materialized view patterns precompute aggregations and transformations:

# Raw events table
events_table = {
  row_key: "event:2025-01-15:12:30:123",
  columns: {
    "user_id" => "user:789",
    "action" => "purchase",
    "amount" => "99.99"
  }
}

# Materialized daily summary
daily_summary = {
  row_key: "summary:user:789:2025-01-15",
  columns: {
    "total_purchases" => "3",
    "total_amount" => "284.97",
    "last_activity" => "2025-01-15:18:45"
  }
}

Background processes or streaming pipelines continuously update materialized views from the raw event stream, trading storage space and write amplification for fast read access to aggregated data.
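A summary row of this kind can be maintained by folding each event into a running total. The sketch below is illustrative only: apply_event is a hypothetical helper, the "time" field is an assumed addition to the event shape, and a hash stands in for the summary table.

```ruby
require 'bigdecimal'

# Summary rows are created on first access with zeroed totals.
summaries = Hash.new do |rows, key|
  rows[key] = { "total_purchases" => 0, "total_amount" => BigDecimal("0"), "last_activity" => nil }
end

# Fold one raw event into the per-user, per-day summary row.
def apply_event(summaries, event)
  return unless event["action"] == "purchase"

  day = event["time"][0, 10] # "YYYY-MM-DD" prefix of the event time
  row = summaries["summary:#{event['user_id']}:#{day}"]
  row["total_purchases"] += 1
  row["total_amount"] += BigDecimal(event["amount"])
  row["last_activity"] = [row["last_activity"], event["time"]].compact.max
end

events = [
  { "user_id" => "user:789", "action" => "purchase", "amount" => "99.99", "time" => "2025-01-15:12:30" },
  { "user_id" => "user:789", "action" => "view",     "amount" => "0",     "time" => "2025-01-15:13:00" },
  { "user_id" => "user:789", "action" => "purchase", "amount" => "85.00", "time" => "2025-01-15:18:45" }
]
events.each { |event| apply_event(summaries, event) }

summary = summaries["summary:user:789:2025-01-15"]
puts summary["total_purchases"]
puts summary["total_amount"].to_s("F")
```

Applying the same event twice would double-count it, so a real pipeline needs either exactly-once delivery or idempotent updates keyed by event ID.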

Partition key design critically impacts cluster performance and data distribution. Poorly chosen partition keys create hotspots where a small number of nodes handle disproportionate traffic:

# Bad: Sequential keys create hotspots
bad_row_key = "user:#{Time.now.to_i}:#{user_id}"
# All recent writes go to the same partition

# Good: Hashed prefix distributes load
require 'digest'
hash_prefix = Digest::MD5.hexdigest(user_id.to_s)[0..3]
good_row_key = "#{hash_prefix}:user:#{user_id}"
# Distributes users across partitions evenly

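The hashed prefix spreads keys across the cluster while still allowing efficient lookups when the full user ID is known, since the prefix can be recomputed from the ID. This matters most for stores that partition by key range, such as HBase; hash-partitioned stores such as Cassandra already hash the partition key. Time-based bucketing with hash prefixes combines temporal locality with load distribution: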

def bucketed_key(entity_id, timestamp)
  bucket = timestamp.strftime("%Y-%m-%d")
  hash = Digest::MD5.hexdigest(entity_id.to_s)[0..1]
  "#{hash}:#{entity_id}:#{bucket}"
end

Ruby Implementation

Ruby applications interact with column-family stores through driver libraries that handle connection pooling, request routing, and data serialization. The cassandra-driver gem, DataStax's Ruby driver, is the standard client for Apache Cassandra:

require 'cassandra'

cluster = Cassandra.cluster(
  hosts: ['10.0.1.1', '10.0.1.2', '10.0.1.3'],
  port: 9042,
  consistency: :quorum,
  timeout: 10
)

session = cluster.connect('ecommerce')

The cluster object manages connections to multiple nodes, routing queries to appropriate coordinators based on partition keys. The consistency level determines how many replicas must respond before the operation returns, trading latency for data accuracy guarantees.

Schema definition uses CQL (Cassandra Query Language), a SQL-like language adapted for column-family semantics:

# Create keyspace with replication strategy
session.execute(<<-CQL)
  CREATE KEYSPACE IF NOT EXISTS ecommerce
  WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 3
  }
CQL

# Create column family (table)
session.execute(<<-CQL)
  CREATE TABLE IF NOT EXISTS products (
    category text,
    product_id uuid,
    name text,
    price decimal,
    attributes map<text, text>,
    created_at timestamp,
    PRIMARY KEY (category, product_id)
  )
  WITH CLUSTERING ORDER BY (product_id ASC)
CQL

The PRIMARY KEY declaration defines both the partition key (category) and clustering columns (product_id). All products within the same category reside on the same partition, enabling efficient category-scoped queries. Clustering columns determine sort order within the partition.

Write operations use prepared statements to improve performance and prevent injection attacks:

# Prepare statement once
insert_product = session.prepare(<<-CQL)
  INSERT INTO products (category, product_id, name, price, attributes, created_at)
  VALUES (?, ?, ?, ?, ?, ?)
  USING TTL ?
CQL

# Execute multiple times with different values
require 'securerandom'

session.execute(insert_product, arguments: [
  'electronics',
  Cassandra::Uuid.new(SecureRandom.uuid), # the driver expects Cassandra::Uuid, not a String
  'Wireless Mouse',
  BigDecimal('29.99'),
  { 'color' => 'black', 'connectivity' => 'bluetooth' },
  Time.now,
  86400 # TTL in seconds, data expires after 24 hours
])

The driver maintains a pool of prepared statements, sending only parameter values for subsequent executions. TTL (time-to-live) automatically removes data after the specified duration, useful for session data or temporary caches.

Filtering on a non-key column such as price requires ALLOW FILTERING:

# Query all electronics products with price filtering
result = session.execute(
  "SELECT * FROM products WHERE category = ? AND price > ? ALLOW FILTERING",
  arguments: ['electronics', BigDecimal('50.00')]
)

result.each do |row|
  puts "#{row['name']}: $#{row['price']}"
  row['attributes'].each { |k, v| puts "  #{k}: #{v}" }
end

The ALLOW FILTERING clause permits non-indexed column filters but may scan large partitions. Production systems should add secondary indexes or design tables specifically for common query patterns rather than relying on filtering.

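Batch operations group multiple writes so they are applied together. Logged batches are atomic, and a batch confined to a single partition is also applied in isolation:

batch = session.batch do |b|
  # Logged batch (the default): all statements are applied or none are.
  # Only statements that hit the same partition are also isolated from concurrent reads.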
  b.add(
    "UPDATE products SET price = ? WHERE category = ? AND product_id = ?",
    arguments: [BigDecimal('24.99'), 'electronics', product_id]
  )
  
  b.add(
    "INSERT INTO price_history (category, product_id, changed_at, old_price, new_price) 
     VALUES (?, ?, ?, ?, ?)",
    arguments: ['electronics', product_id, Time.now, BigDecimal('29.99'), BigDecimal('24.99')]
  )
end

session.execute(batch)

Logged batches spanning multiple partitions remain atomic but lose isolation, and the coordinator must write a batchlog first, which degrades performance. Prefer batches for related writes to the same partition key.

Asynchronous operations improve throughput when issuing multiple independent queries:

# Fire multiple queries concurrently
futures = categories.map do |category|
  session.execute_async(
    "SELECT * FROM products WHERE category = ? LIMIT 10",
    arguments: [category]
  )
end

# Wait for all results
results = futures.map(&:get)

# Process combined results (each future resolves to an enumerable result set)
results.flat_map(&:to_a).each do |row|
  process_product(row)
end

The driver manages connection pooling and request pipelining transparently, maximizing parallelism across cluster nodes.

Counter columns provide distributed counters without read-modify-write cycles:

session.execute(<<-CQL)
  CREATE TABLE IF NOT EXISTS view_counts (
    content_id uuid PRIMARY KEY,
    views counter
  )
CQL

# Increment counter (no read required)
session.execute(
  "UPDATE view_counts SET views = views + ? WHERE content_id = ?",
  arguments: [1, content_id]
)

# Read counter value
result = session.execute(
  "SELECT views FROM view_counts WHERE content_id = ?",
  arguments: [content_id]
)

puts "Views: #{result.first['views']}"

Counter updates propagate eventually across replicas, with the system resolving conflicts by summing updates. Counters sacrifice strict accuracy for high-throughput concurrent increments.

Collections (sets, lists, maps) store structured data within columns:

# Add items to a set (assumes products also has a tags set<text> column)
session.execute(
  "UPDATE products SET tags = tags + ? WHERE category = ? AND product_id = ?",
  arguments: [Set.new(['wireless', 'ergonomic']), 'electronics', product_id]
)

# Update map element
session.execute(
  "UPDATE products SET attributes['color'] = ? WHERE category = ? AND product_id = ?",
  arguments: ['silver', 'electronics', product_id]
)

# Append to list
session.execute(
  "UPDATE product_reviews SET review_ids = review_ids + ? WHERE product_id = ?",
  arguments: [[review_id], product_id]
)

Collections are intended for small amounts of data: the entire collection is read at once, and each element is limited to 64KB under older native protocol versions. Large or unbounded collections should use separate rows instead.

Performance Considerations

Write performance in column-family stores stems from the sequential append-only architecture. Writes hit the commit log and memtable immediately, avoiding random disk seeks. SSD deployments can reach on the order of 100,000 writes per second per node or more, though actual figures depend heavily on hardware, row size, and consistency level. Throughput scales close to linearly with cluster size because each node handles its own subset of partitions independently.

However, the LSM tree structure introduces read amplification. A query may examine the memtable plus multiple SSTables before locating requested data. Compaction strategies balance read performance, write amplification, and disk space overhead:

Size-tiered compaction (STCS) groups SSTables of similar size, creating larger files progressively. This approach optimizes write throughput but can temporarily require 2x disk space during major compactions. STCS suits write-once, read-rarely workloads like time-series data:

# Configure STCS in table definition
session.execute(<<-CQL)
  CREATE TABLE sensor_data (
    device_id text,
    reading_time timestamp,
    temperature decimal,
    PRIMARY KEY (device_id, reading_time)
  )
  WITH compaction = {
    'class': 'SizeTieredCompactionStrategy',
    'min_threshold': 4,
    'max_threshold': 32
  }
CQL
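How size-tiered compaction picks candidates can be approximated as grouping SSTables whose sizes lie within a band around a bucket's average. The sketch below is a simplification under stated assumptions: bucket_sstables is a hypothetical helper, the 0.5 and 1.5 bounds mirror the strategy's bucket_low and bucket_high options, and the real implementation has additional rules that are not modeled here.

```ruby
# Group SSTable sizes into buckets of similar size. A size joins the first
# bucket whose current average it falls within [low, high] times of.
def bucket_sstables(sizes, bucket_low: 0.5, bucket_high: 1.5)
  buckets = []
  sizes.sort.each do |size|
    bucket = buckets.find do |candidate|
      average = candidate.sum.to_f / candidate.size
      size >= average * bucket_low && size <= average * bucket_high
    end
    bucket ? bucket << size : buckets << [size]
  end
  buckets
end

sizes_mb = [10, 12, 11, 9, 100, 110, 400]
buckets = bucket_sstables(sizes_mb)
min_threshold = 4 # same value as in the table definition above

# Only buckets holding at least min_threshold SSTables are compacted.
candidates = buckets.select { |bucket| bucket.size >= min_threshold }
p buckets
p candidates
```

Here the four small SSTables form a bucket that meets the threshold and would be merged into one larger file, while the two mid-sized files and the single large one wait for more peers.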

Leveled compaction (LCS) organizes SSTables into levels, with each level containing non-overlapping data ranges. Reads examine fewer files at the cost of higher write amplification from more frequent compaction. LCS benefits read-heavy workloads:

# Configure LCS for read-optimized table
session.execute(<<-CQL)
  CREATE TABLE user_profiles (
    user_id uuid PRIMARY KEY,
    username text,
    email text,
    preferences map<text, text>
  )
  WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 160
  }
CQL

Time-window compaction (TWCS) groups data by time windows, expiring entire SSTables when all contained data exceeds TTL. This strategy minimizes compaction overhead for time-series data with TTLs, since expired SSTables are dropped whole rather than rewritten:

session.execute(<<-CQL)
  CREATE TABLE application_logs (
    app_id text,
    log_time timestamp,
    message text,
    PRIMARY KEY (app_id, log_time)
  )
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_size': 1,
    'compaction_window_unit': 'DAYS'
  }
  AND default_time_to_live = 604800
CQL

Bloom filters significantly improve read performance by avoiding disk I/O for absent keys. Each SSTable maintains a probabilistic data structure indicating key presence. Queries skip SSTables with negative bloom filter results:

# Larger bloom filters reduce false positives at memory cost
session.execute(<<-CQL)
  ALTER TABLE products
  WITH bloom_filter_fp_chance = 0.01
CQL

The false positive probability trades memory overhead for I/O reduction. Values of 0.01-0.1 balance resource usage effectively.
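The memory side of this tradeoff can be estimated with the textbook formula for an optimally sized Bloom filter, bits per key = -ln(p) / (ln 2)^2. This is a back-of-the-envelope sketch; bloom_bits_per_key is a hypothetical helper and an actual implementation may size its filters differently.

```ruby
# Bits per key for an optimally configured Bloom filter with false positive rate fp_chance.
def bloom_bits_per_key(fp_chance)
  -Math.log(fp_chance) / (Math.log(2)**2)
end

[0.1, 0.01, 0.001].each do |chance|
  puts format("fp_chance=%.3f -> %.1f bits per key", chance, bloom_bits_per_key(chance))
end
```

Each tenfold reduction in the false positive rate costs roughly 4.8 additional bits per key, so moving from 0.1 to 0.01 about doubles the filter's memory footprint.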

Partition size directly impacts query latency. Partitions exceeding 100MB create hotspots and slow queries that must scan large column sets. Wide-row patterns must implement bucketing to bound partition growth:

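# Approximate partition sizes by row count.
# This is a full scan, so run it sparingly; GROUP BY needs Cassandra 3.10+.
result = session.execute(<<-CQL)
  SELECT 
    token(category) AS partition_token,
    category,
    COUNT(*) AS product_count
  FROM products
  GROUP BY category
CQL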

result.each do |row|
  if row['product_count'] > 100_000
    puts "Warning: Large partition #{row['category']} with #{row['product_count']} products"
  end
end

Refactoring large partitions requires splitting data across multiple partition keys, typically adding time buckets or hash prefixes.

Consistency levels trade latency for data accuracy. LOCAL_QUORUM requires acknowledgment from a majority of replicas in the local datacenter. When both reads and writes use it, clients in that datacenter see their latest writes without paying cross-datacenter latency:

# Query with specific consistency level
result = session.execute(
  "SELECT * FROM products WHERE category = ?",
  arguments: ['electronics'],
  consistency: :local_quorum
)

Read repair and anti-entropy processes ensure eventual consistency across replicas. Applications must handle stale reads when using lower consistency levels like ONE or LOCAL_ONE.
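The overlap rule behind quorum consistency is simple arithmetic: a read is guaranteed to see the latest write when the number of replicas read plus the number written exceeds the replication factor. A minimal sketch, with quorum and overlapping? as hypothetical helper names:

```ruby
# Quorum size for a given replication factor: a strict majority.
def quorum(replication_factor)
  replication_factor / 2 + 1
end

# True when every read replica set must intersect every write replica set.
def overlapping?(read_replicas, write_replicas, replication_factor)
  read_replicas + write_replicas > replication_factor
end

rf = 3
puts quorum(rf)                               # replicas needed for QUORUM
puts overlapping?(quorum(rf), quorum(rf), rf) # QUORUM reads with QUORUM writes
puts overlapping?(1, 1, rf)                   # ONE reads with ONE writes
```

With a replication factor of three, quorum is two, so quorum reads and writes always share at least one replica. Reading and writing at ONE gives no such guarantee, which is why stale reads are possible there.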

Connection pooling parameters affect throughput under concurrent load:

cluster = Cassandra.cluster(
  hosts: nodes,
  connections_per_local_node: 2,    # Connections to local DC nodes
  connections_per_remote_node: 1,   # Connections to remote DC nodes
  requests_per_connection: 128,     # Concurrent requests per connection
  heartbeat_interval: 30,
  idle_timeout: 120
)

Insufficient connections create queuing delays, while excessive connections waste resources. Tune based on request rate and latency requirements through load testing.

Tools & Ecosystem

Apache Cassandra represents the most widely deployed column-family store, with production clusters at Netflix, Apple, and Discord handling millions of operations per second. Cassandra provides masterless architecture where every node can handle reads and writes, eliminating single points of failure. The system automatically rebalances data when adding or removing nodes.

The cassandra-driver gem, maintained by DataStax, is the most widely used Ruby client:

# Gemfile
gem 'cassandra-driver', '~> 3.2'

# Connection with advanced configuration
cluster = Cassandra.cluster(
  hosts: ENV['CASSANDRA_HOSTS'].split(','),
  port: 9042,
  compression: :lz4,
  protocol_version: 4,
  page_size: 1000,
  load_balancing_policy: Cassandra::LoadBalancing::Policies::TokenAware.new(
    Cassandra::LoadBalancing::Policies::RoundRobin.new
  ),
  retry_policy: Cassandra::Retry::Policies::DowngradingConsistency.new
)

Token-aware load balancing routes queries directly to nodes owning the partition, eliminating coordinator hops and reducing latency. Downgrading retry policies automatically reduce consistency levels when insufficient replicas respond, maintaining availability during partial outages.

ScyllaDB reimplements Cassandra in C++ with a thread-per-core architecture and optimized data structures; its vendor benchmarks report up to 10x higher throughput on equivalent hardware. The cassandra-driver gem works with ScyllaDB clusters without modification due to CQL protocol compatibility.

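HBase builds on Hadoop HDFS for storage, integrating with the Hadoop ecosystem for batch processing and analytics. Ruby clients usually reach it through HBase's Thrift or REST gateway. The following snippet illustrates the general shape of such a client; exact class and method names vary by gem and version, so check the documentation of the gem in use: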

require 'hbase'

client = HBase::Client.new(
  host: 'hbase-master.example.com',
  port: 9090
)

# HBase uses different terminology: tables contain column families
table = client.table('products')

# Put operation
table.put('electronics:12345', {
  'info:name' => 'Laptop',
  'info:price' => '999.99',
  'specs:cpu' => 'Intel i7'
})

# Get operation  
result = table.get('electronics:12345', columns: ['info:name', 'info:price'])
puts result['info:name']

HBase provides strong consistency by serving each region from a single RegionServer, with ZooKeeper coordinating cluster membership and failover. It sacrifices availability during network partitions, following CP semantics rather than Cassandra's AP approach.

DataStax Enterprise extends Cassandra with integrated search, analytics, and graph capabilities. The dse-driver gem (the DataStax Enterprise Ruby driver) adds DSE-specific features:

require 'dse'

cluster = Dse.cluster(
  hosts: nodes,
  graph_name: 'product_graph'
)

# Execute graph query using Gremlin
session = cluster.connect
result = session.execute_graph(
  "g.V().hasLabel('product').has('category', 'electronics').valueMap()"
)

The integrated search functionality uses Solr indexes for full-text and geospatial queries without external systems.

Cequel provides an ActiveRecord-like ORM for Cassandra:

# Gemfile
gem 'cequel'

# Model definition
class Product
  include Cequel::Record
  
  key :category, :text
  key :product_id, :uuid
  column :name, :text
  column :price, :decimal
  map :properties, :text, :text # collection column; named to avoid clashing with the record's #attributes method
  column :created_at, :timestamp
  
  validates :name, presence: true
end

# Usage
product = Product.new(
  category: 'electronics',
  product_id: SecureRandom.uuid,
  name: 'Wireless Keyboard',
  price: 49.99
)
product.save

# Query interface
products = Product.where(category: 'electronics').limit(10)

Cequel handles connection management, query generation, and result mapping, reducing boilerplate for applications primarily performing CRUD operations.

Monitoring tools track cluster health and performance. Cassandra exposes metrics through JMX, accessible via tools like DataDog, Prometheus, or New Relic. Key metrics include read latency percentiles and pending compaction tasks, as in this example:

# Example monitoring script using JMX (jmx4r runs on JRuby)
require 'jmx4r'

JMX::MBean.establish_connection(
  host: 'cassandra-node-1',
  port: 7199
)

# Read operation metrics
read_latency = JMX::MBean.find_by_name(
  'org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency'
)
puts "Read latency 99th percentile: #{read_latency['99thPercentile']} microseconds"

# Compaction metrics
pending_tasks = JMX::MBean.find_by_name(
  'org.apache.cassandra.metrics:type=Compaction,name=PendingTasks'
)
puts "Pending compactions: #{pending_tasks['Value']}"

Grafana dashboards visualize metrics over time, alerting on anomalies like increasing read latency, growing compaction queues, or unbalanced partition distribution.

Reference

CQL Data Types

Type	Ruby Type	Description	Example
text	String	UTF-8 encoded string	username
uuid	Cassandra::Uuid	Type 4 UUID	Primary keys
timeuuid	Cassandra::TimeUuid	Type 1 UUID with timestamp	Event ordering
int	Integer	32-bit signed integer	Age, count
bigint	Integer	64-bit signed integer	Large counters
decimal	BigDecimal	Arbitrary precision decimal	Currency
timestamp	Time	Date and time with millisecond precision	created_at
boolean	TrueClass, FalseClass	True or false	is_active
blob	String with ASCII-8BIT encoding	Binary data	Image data
counter	Integer	Distributed counter	Page views
map	Hash	Key-value pairs	User preferences
set	Set	Unique unordered values	Tags
list	Array	Ordered values allowing duplicates	Comments

Consistency Levels

Level	Replicas	Use Case	Latency
ANY	1 hinted handoff	Maximum write availability	Lowest
ONE	1 replica	Low latency, eventual consistency	Low
TWO	2 replicas	Balanced consistency and latency	Medium
QUORUM	Majority of replicas	Strong consistency	Medium
LOCAL_QUORUM	Majority in local datacenter	Multi-DC strong consistency	Medium
EACH_QUORUM	Majority in each datacenter	Cross-DC consistency	High
ALL	All replicas	Maximum consistency	Highest
LOCAL_ONE	1 replica in local datacenter	Geo-local low latency	Lowest

Compaction Strategies

Strategy	Optimized For	Write Amplification	Read Performance	Space Overhead
SizeTieredCompactionStrategy	Write-heavy, time-series	Low	Moderate	High during compaction
LeveledCompactionStrategy	Read-heavy, updates	High	High	Low
TimeWindowCompactionStrategy	Time-series with TTL	Low	High for recent data	Low

Primary Key Components

Component	Purpose	Determines	Example
Partition Key	Data distribution	Which nodes store the row	category
Clustering Columns	Data ordering	Sort order within partition	product_id, created_at
Composite Partition Key	Multi-attribute distribution	Partition by multiple columns	category, subcategory

Common CQL Operations

Operation	Syntax	Notes
Insert	INSERT INTO table (cols) VALUES (vals)	Creates or overwrites row
Update	UPDATE table SET col = val WHERE key = ?	Creates row if absent
Delete	DELETE FROM table WHERE key = ?	Writes tombstone
Select	SELECT cols FROM table WHERE key = ?	Partition key required
Batch	BEGIN BATCH ... APPLY BATCH	Logged batches are atomic; isolated only within a partition

Connection Pool Settings

Parameter	Default	Purpose	Tuning Guidance
connections_per_local_node	1	Connections to local DC	Increase for high concurrency
connections_per_remote_node	1	Connections to remote DC	Keep low, prefer local
requests_per_connection	128 (protocol v2), 1024 (v3+)	Concurrent requests per connection	Match application concurrency
heartbeat_interval	30	Seconds between keepalives	Reduce for fast failure detection
idle_timeout	60	Seconds before closing idle connections	Increase for bursty traffic

Performance Tuning Checklist

Aspect	Action	Impact
Partition Size	Keep under 100MB	Prevents hotspots and slow scans
Compaction Strategy	Match workload pattern	Optimizes read or write performance
Bloom Filter	Set false positive rate 0.01-0.1	Reduces unnecessary disk reads
Consistency Level	Use LOCAL_QUORUM for most workloads	Balances latency and consistency
Connection Pooling	Tune based on load testing	Prevents connection exhaustion
Query Pagination	Use driver paging (page_size and paging state)	Avoids loading entire result sets; CQL has no OFFSET
Batch Operations	Keep within same partition	Preserves isolation and avoids batchlog overhead
TTL	Set on temporary data	Reduces compaction and storage
Monitoring	Track read/write latency percentiles	Identifies performance degradation
Replication Factor	Use 3 for production	Provides fault tolerance
