Overview
Database storage engines organize data in two primary patterns: row-based storage and columnar storage. Row-based storage stores complete records contiguously on disk, grouping all attributes of a single entity together. Columnar storage inverts this relationship, storing all values of a single attribute contiguously across multiple records. This architectural decision affects query performance, compression ratios, storage costs, and the types of workloads the database can handle efficiently.
Traditional relational databases like PostgreSQL and MySQL use row-based storage by default. These systems optimize for transactional workloads where applications frequently read and write complete records. A typical e-commerce transaction reads a customer record, updates inventory, and writes an order—all operations that benefit from accessing complete rows quickly.
Analytical databases like Amazon Redshift, Google BigQuery, and Apache Parquet use columnar storage. These systems optimize for analytical queries that aggregate values across millions of records but only access a few columns. A typical analytics query might calculate average purchase amounts by region, reading only two columns from billions of rows.
# Row storage conceptual representation
users_table = [
{ id: 1, name: "Alice", email: "alice@example.com", age: 30 },
{ id: 2, name: "Bob", email: "bob@example.com", age: 25 },
{ id: 3, name: "Carol", email: "carol@example.com", age: 35 }
]
# Stored on disk: [1,Alice,alice@example.com,30][2,Bob,bob@example.com,25][3,Carol,carol@example.com,35]
# Columnar storage conceptual representation
users_columnar = {
id: [1, 2, 3],
name: ["Alice", "Bob", "Carol"],
email: ["alice@example.com", "bob@example.com", "carol@example.com"],
age: [30, 25, 35]
}
# Stored on disk: id:[1,2,3] name:[Alice,Bob,Carol] email:[alice@...,bob@...,carol@...] age:[30,25,35]
The storage pattern choice impacts more than query speed. Columnar formats achieve superior compression because identical data types stored contiguously compress more efficiently than heterogeneous row data. A column of integers compresses better than alternating integers, strings, and dates within rows. Columnar databases routinely achieve 10x compression ratios on analytical data, reducing storage costs substantially.
Write patterns differ dramatically between formats. Row storage writes complete records efficiently—insert a row, update a row, delete a row. Columnar storage must fragment these operations across multiple column files, making single-record writes expensive. Most columnar systems batch writes and periodically reorganize data to maintain query performance.
Key Principles
Row-based storage organizes data in a record-oriented fashion. Each row contains all columns for a single entity, stored contiguously in memory or on disk. The database engine reads complete rows sequentially, which maps naturally to how applications think about data. When an application requests user ID 42, the system seeks to one location and reads all attributes in a single operation.
Physical storage layout determines access patterns. Row stores place related data close together physically. All attributes of record N sit adjacent to each other, followed by all attributes of record N+1. This layout optimizes for scenarios where the application needs multiple columns from the same row. The disk head performs one seek operation and reads a contiguous block containing the entire record.
Columnar storage inverts this organization. All values for column A appear together, followed by all values for column B. The system stores each column in a separate structure, often in separate files or disk pages. Reading a single complete record requires accessing multiple locations—one per column—but reading a single column across all records requires accessing only one structure.
Compression characteristics differ fundamentally between formats. Row storage compresses heterogeneous data—integers next to strings next to timestamps. General-purpose compression algorithms like LZ4 or Snappy work reasonably well but cannot exploit patterns within specific data types. A compressed row block might achieve 2-3x compression.
Columnar storage compresses homogeneous data. A column of timestamps compresses with delta encoding—store the first timestamp and subsequent deltas. A column of repeated values uses run-length encoding—store value and count. A column with low cardinality uses dictionary encoding—store unique values once and references elsewhere. These type-specific encodings achieve 10-20x compression on analytical data.
# Compression advantage in columnar format
row_data = [
[1, "pending", "2024-01-15"],
[2, "pending", "2024-01-15"],
[3, "pending", "2024-01-15"],
[4, "completed", "2024-01-16"]
]
# Row compression: compress mixed data types together
# Limited patterns to exploit
columnar_data = {
id: [1, 2, 3, 4], # Sequential integers: delta encode
status: ["pending", "pending", "pending", "completed"], # Low cardinality: dictionary encode
date: ["2024-01-15", "2024-01-15", "2024-01-15", "2024-01-16"] # Repeated values: RLE
}
# Columnar compression exploits patterns within each type
Query execution patterns diverge based on storage format. Analytical queries typically filter on few columns and aggregate across many rows. A query computing monthly revenue by product category reads two columns—amount and category—from millions of rows. Columnar storage reads only those two column files, ignoring unused columns entirely. Row storage reads complete rows, discarding unused columns after reading them from disk.
Transactional queries exhibit opposite characteristics. OLTP workloads read and write complete records frequently. An e-commerce checkout reads customer data (name, address, payment method), updates inventory (quantity, location), and writes an order (items, totals, timestamp). Each operation touches many columns from few rows. Row storage handles these access patterns efficiently with single seeks and writes.
Write amplification affects columnar systems significantly. Inserting a single row in columnar storage requires updating multiple column files. The system must locate the appropriate position in each column structure and insert the new value, potentially triggering reorganization to maintain sort order or compression. Row stores write the complete record to one location, requiring one disk operation.
Most columnar systems batch writes to mitigate this overhead. New data accumulates in a row-oriented write buffer, then periodically flushes to columnar format. This approach trades write latency for write throughput—individual writes are slower due to buffering, but overall throughput increases by amortizing reorganization costs across many rows.
Performance Considerations
Query selectivity determines which storage format performs better. Selectivity measures the ratio of columns accessed to total columns. Low selectivity queries—accessing few columns from many rows—favor columnar storage. High selectivity queries—accessing many columns from few rows—favor row storage.
Consider a table with 100 columns and 10 million rows. An analytical query calculating average salary by department reads two columns from all rows: selectivity is 2%. Columnar storage reads 2GB of data (2 columns × 1GB average per column). Row storage reads 100GB (all columns × all rows). The columnar system processes 50x less data.
Conversely, a transactional query retrieving complete customer records for 100 specific IDs reads all columns from few rows. Row storage performs 100 seeks and reads 100 records contiguously. Columnar storage performs 100 × 100 = 10,000 seeks (100 column files × 100 records), fragmenting access across many locations. Row storage dominates for this workload.
# Performance comparison simulation
class RowStore
def initialize(rows:, columns:)
@data = Array.new(rows) { |i| Array.new(columns) { |j| "value_#{i}_#{j}" } }
end
def aggregate_column(col_index)
@data.map { |row| row[col_index] }.sum # Reads all rows, accesses target column
end
def fetch_row(row_index)
@data[row_index] # Single access returns complete row
end
end
class ColumnStore
def initialize(rows:, columns:)
@data = Array.new(columns) { |j| Array.new(rows) { |i| "value_#{i}_#{j}" } }
end
def aggregate_column(col_index)
@data[col_index].sum # Single column access, no wasted I/O
end
def fetch_row(row_index)
@data.map { |col| col[row_index] } # Must access every column file
end
end
# Analytical query: aggregate one column
row_store = RowStore.new(rows: 1_000_000, columns: 100)
col_store = ColumnStore.new(rows: 1_000_000, columns: 100)
# ColumnStore reads 1M values from one array
# RowStore reads 1M rows × 100 columns = 100M values, keeps 1M
Cache efficiency differs between storage patterns. Modern CPUs cache recently accessed memory in hierarchical levels (L1, L2, L3). Row storage places related data contiguously, improving spatial locality when accessing multiple columns from the same row. One cache line load brings multiple attributes into cache simultaneously.
Columnar storage optimizes for temporal locality within single columns. Reading sequential values from one column keeps the cache warm for that specific access pattern. Switching between columns requires loading different cache lines, but analytical queries typically process one column completely before moving to the next.
Compression affects both storage size and query performance. Compressed data reduces disk I/O—reading compressed blocks from disk transfers less data than uncompressed blocks. The CPU decompression cost trades off against reduced I/O time. On I/O-bound workloads, compression improves performance despite CPU overhead.
Columnar compression enables predicate pushdown optimizations. Many columnar formats store metadata about each column block—minimum value, maximum value, null count. The query executor checks metadata before decompressing blocks. A query filtering for values greater than 1000 skips blocks where maximum value is 500, avoiding decompression entirely.
# Predicate pushdown with columnar metadata
class ColumnBlock
attr_reader :data, :min_value, :max_value
def initialize(data)
@data = data
@min_value = data.min
@max_value = data.max
end
def matches_predicate?(predicate)
case predicate[:op]
when :greater_than
@max_value > predicate[:value] # Skip block if max <= threshold
when :less_than
@min_value < predicate[:value] # Skip block if min >= threshold
when :equals
@min_value <= predicate[:value] && @max_value >= predicate[:value]
end
end
end
ages = [
ColumnBlock.new([18, 22, 25, 29]),
ColumnBlock.new([45, 52, 58, 61]),
ColumnBlock.new([31, 35, 38, 42])
]
# Query: WHERE age > 50
predicate = { op: :greater_than, value: 50 }
ages.select { |block| block.matches_predicate?(predicate) }
# => [ColumnBlock with ages 45-61]
# Skips first and third blocks without reading data
Update performance degrades significantly in columnar stores. Modifying a single column value requires locating and updating that value within a compressed, encoded column file. The system may need to decompress the block, update the value, recompress the block, and potentially reorganize adjacent blocks if size changes.
Row stores update complete records atomically. The database locates the row, modifies the target column value in place or writes a new version, and commits. Most row stores use versioning systems (MVCC) that append new row versions rather than updating in place, maintaining both read and write performance.
Parallel processing capabilities favor columnar formats for analytical workloads. Column-oriented data partitions naturally—different threads or processes read different column files simultaneously without coordination. A query aggregating three columns spawns three parallel readers, each processing one column file independently.
Row storage parallelizes by partitioning rows across threads. Each thread processes a subset of rows, reading all necessary columns from its partition. This approach works well but requires more coordination when threads need to aggregate results, as each thread produces partial results across all columns.
Design Considerations
Workload characteristics determine the appropriate storage format. Online Transaction Processing (OLTP) workloads perform frequent small reads and writes of complete records. Applications insert new orders, update inventory quantities, and retrieve customer profiles. These operations touch many columns from few rows, making row storage the natural choice.
Online Analytical Processing (OLAP) workloads scan large portions of tables, aggregating values across millions of rows while accessing few columns. Business intelligence queries compute sales by region, customer lifetime value, or inventory turnover. These queries read 2-5 columns from billions of rows, where columnar storage excels.
# Workload pattern analysis
class WorkloadAnalyzer
def analyze(queries)
queries.map do |query|
{
query: query[:sql],
selectivity: query[:columns_accessed].size.to_f / query[:total_columns],
row_count: query[:rows_scanned],
pattern: classify_pattern(query)
}
end
end
private
def classify_pattern(query)
selectivity = query[:columns_accessed].size.to_f / query[:total_columns]
row_ratio = query[:rows_scanned].to_f / query[:total_rows]
if selectivity > 0.5 && row_ratio < 0.01
:oltp # Many columns, few rows
elsif selectivity < 0.2 && row_ratio > 0.1
:olap # Few columns, many rows
else
:mixed
end
end
end
analyzer = WorkloadAnalyzer.new
queries = [
{ sql: "SELECT * FROM users WHERE id = 42",
columns_accessed: ["id", "name", "email", "address", "phone"],
total_columns: 5, rows_scanned: 1, total_rows: 1_000_000 },
{ sql: "SELECT AVG(price) FROM sales WHERE date >= '2024-01'",
columns_accessed: ["price", "date"],
total_columns: 20, rows_scanned: 500_000, total_rows: 1_000_000 }
]
results = analyzer.analyze(queries)
# => First query: :oltp pattern (100% selectivity, 0.0001% rows)
# => Second query: :olap pattern (10% selectivity, 50% rows)
Data modification frequency impacts format selection. Tables receiving frequent inserts, updates, or deletes require efficient write patterns. Row storage handles these operations naturally—write complete rows, update in place or append new versions, mark rows as deleted. Write throughput remains high even with continuous modifications.
Columnar systems struggle with frequent modifications. Each write operation updates multiple column files, and deletes create fragmentation requiring periodic compaction. Most columnar databases append changes to separate delta stores, periodically merging deltas into main columnar storage. This architecture introduces latency—recent writes remain in row-oriented delta stores until the merge completes.
Query latency requirements constrain format choice. Interactive applications requiring sub-second response times need predictable, low-latency queries. Row storage provides consistent performance for point queries—lookup by primary key returns results in milliseconds. Columnar systems introduce variability—cache state, compression format, and data organization affect query time.
Analytical workloads tolerate higher latency. Business intelligence queries computing monthly trends or customer segments complete in seconds or minutes. Users accept longer execution times in exchange for scanning larger datasets. Columnar storage trades individual query latency for overall throughput—queries run longer but process more data.
Hybrid approaches combine both formats. Some databases maintain row and columnar representations simultaneously, routing queries to the appropriate format based on access pattern. Amazon Aurora maintains a row-oriented primary store and asynchronously replicates to columnar format for analytics. Applications write to the row store and query from the columnar store, avoiding format-specific tradeoffs.
# Hybrid storage router
class HybridStore
def initialize
@row_store = RowStore.new(rows: 1000, columns: 50)
@column_store = ColumnStore.new(rows: 1000, columns: 50)
end
def execute_query(query)
strategy = select_strategy(query)
case strategy
when :row_optimized
@row_store.execute(query)
when :column_optimized
@column_store.execute(query)
when :hybrid
combine_results(
@row_store.execute(query[:row_part]),
@column_store.execute(query[:column_part])
)
end
end
private
def select_strategy(query)
if query[:columns_accessed] > query[:total_columns] * 0.7
:row_optimized
elsif query[:rows_scanned] > query[:total_rows] * 0.1
:column_optimized
else
:hybrid
end
end
end
Data freshness requirements affect storage decisions. Real-time applications need immediate visibility of writes. Row storage provides read-after-write consistency—once a transaction commits, subsequent reads see the new data. Columnar systems often introduce replication lag—writes go to row-oriented buffers, then asynchronously convert to columnar format.
Some use cases tolerate eventual consistency. Data warehouse applications accept minutes or hours of replication lag. Overnight batch processes load daily transaction data into columnar format for next-day analytics. The delay between data generation and analytical availability matches business requirements.
Storage costs versus query costs create economic tradeoffs. Columnar compression reduces storage by 10x compared to row storage, significantly lowering storage costs. However, columnar databases often require more powerful hardware for write operations and delta merging. The total cost of ownership includes storage, compute, and operational complexity.
Row storage consumes more disk space but runs efficiently on standard hardware. The operational complexity remains lower—no delta merging, simpler backup procedures, straightforward replication. Organizations with low storage costs and high operational costs may prefer row storage despite higher disk usage.
Ruby Implementation
Ruby applications interact with both row and columnar databases through standard interfaces. The database adapter layer abstracts storage format, presenting consistent APIs regardless of underlying storage. Applications query PostgreSQL (row storage) and Redshift (columnar storage) using similar ActiveRecord or Sequel syntax.
Row-based database access patterns match Ruby's object-oriented nature. ORM frameworks map database rows to Ruby objects, where each object represents one complete record with all attributes accessible as instance variables or hash keys. This mapping aligns naturally with row storage—one database row becomes one Ruby object.
require 'pg'
class UserStore
def initialize(connection_string)
@conn = PG.connect(connection_string)
end
def find_user(user_id)
result = @conn.exec_params(
'SELECT id, name, email, created_at FROM users WHERE id = $1',
[user_id]
)
return nil if result.ntuples.zero?
row = result[0]
{
id: row['id'].to_i,
name: row['name'],
email: row['email'],
created_at: Time.parse(row['created_at'])
}
end
def create_user(name:, email:)
@conn.exec_params(
'INSERT INTO users (name, email, created_at) VALUES ($1, $2, $3) RETURNING id',
[name, email, Time.now]
)
end
def update_user(user_id, attributes)
set_clause = attributes.keys.map.with_index { |k, i| "#{k} = $#{i + 2}" }.join(', ')
@conn.exec_params(
"UPDATE users SET #{set_clause} WHERE id = $1",
[user_id] + attributes.values
)
end
end
# Row storage optimizes these patterns
store = UserStore.new('postgresql://localhost/myapp')
user = store.find_user(42) # Single seek, read complete row
store.update_user(42, name: 'Alice Smith') # Single row update
Columnar database access requires different query patterns. Ruby applications querying analytical databases should minimize column selection and maximize row scanning. The query layer specifies only necessary columns, allowing the columnar engine to read fewer column files.
require 'pg'
class AnalyticsQuery
def initialize(redshift_connection)
@conn = redshift_connection
end
def monthly_revenue_by_category
result = @conn.exec(<<~SQL)
SELECT
DATE_TRUNC('month', order_date) AS month,
category,
SUM(amount) AS revenue
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1, 2
ORDER BY 1 DESC, 3 DESC
SQL
result.map do |row|
{
month: Date.parse(row['month']),
category: row['category'],
revenue: row['revenue'].to_f
}
end
end
def customer_segmentation
# Columnar query pattern: few columns, many rows, heavy aggregation
result = @conn.exec(<<~SQL)
SELECT
customer_id,
COUNT(*) AS order_count,
SUM(amount) AS total_spent,
MAX(order_date) AS last_order_date
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '24 months'
GROUP BY customer_id
HAVING COUNT(*) >= 5
SQL
result.map do |row|
{
customer_id: row['customer_id'].to_i,
order_count: row['order_count'].to_i,
total_spent: row['total_spent'].to_f,
last_order_date: Date.parse(row['last_order_date'])
}
end
end
end
Ruby's CSV library demonstrates columnar data transformation. Applications can convert row-oriented CSV files into columnar structures for analysis, or export columnar query results to row-oriented formats for reporting.
require 'csv'
class ColumnarTransformer
def self.csv_to_columnar(filename)
columns = {}
CSV.foreach(filename, headers: true) do |row|
row.each do |column_name, value|
columns[column_name] ||= []
columns[column_name] << value
end
end
columns
end
def self.columnar_to_csv(columns, filename)
return if columns.empty?
row_count = columns.values.first.size
headers = columns.keys
CSV.open(filename, 'w', write_headers: true, headers: headers) do |csv|
row_count.times do |i|
row = headers.map { |h| columns[h][i] }
csv << row
end
end
end
end
# Convert CSV to columnar for analysis
columnar_data = ColumnarTransformer.csv_to_columnar('sales.csv')
# => { "date" => [...], "product" => [...], "amount" => [...] }
# Analyze single column without loading complete rows into memory
total_revenue = columnar_data['amount'].map(&:to_f).sum
# Reads only amount column, ignores other columns
# Export columnar results to CSV
ColumnarTransformer.columnar_to_csv(columnar_data, 'output.csv')
The Parquet gem provides Ruby access to Apache Parquet files, a standard columnar format. Parquet files store metadata describing column encodings, compression types, and statistics. Ruby applications read Parquet files directly or generate them for consumption by other systems.
require 'parquet'
class ParquetHandler
def write_columnar_data(data, filename)
schema = infer_schema(data)
Parquet::Writer.open(filename, schema) do |writer|
row_count = data.values.first.size
row_count.times do |i|
row = data.transform_values { |values| values[i] }
writer << row
end
end
end
def read_columnar_projection(filename, columns)
result = Hash.new { |h, k| h[k] = [] }
Parquet::Reader.open(filename) do |reader|
reader.each do |row|
columns.each do |col|
result[col] << row[col] if row.key?(col)
end
end
end
result
end
private
def infer_schema(data)
schema = {}
data.each do |column_name, values|
sample = values.first
schema[column_name] = case sample
when Integer then :int64
when Float then :double
when String then :binary
when Date then :int32 # Store as days since epoch
end
end
schema
end
end
handler = ParquetHandler.new
# Write columnar data to Parquet
sales_data = {
'order_id' => [1, 2, 3],
'amount' => [100.50, 250.75, 75.25],
'date' => [Date.today, Date.today, Date.today]
}
handler.write_columnar_data(sales_data, 'sales.parquet')
# Read specific columns only (columnar projection)
revenue_data = handler.read_columnar_projection('sales.parquet', ['amount'])
# Reads only amount column from file
Ruby applications interfacing with cloud analytics platforms handle columnar data implicitly. Amazon Athena, Google BigQuery, and Snowflake present SQL interfaces while using columnar storage internally. Ruby code submits queries and receives results without managing storage format directly.
require 'aws-sdk-athena'
class AthenaAnalytics
def initialize
@client = Aws::Athena::Client.new(region: 'us-east-1')
@database = 'analytics_db'
@output_location = 's3://my-athena-results/'
end
def execute_columnar_query(sql)
response = @client.start_query_execution(
query_string: sql,
query_execution_context: { database: @database },
result_configuration: { output_location: @output_location }
)
query_execution_id = response.query_execution_id
wait_for_completion(query_execution_id)
fetch_results(query_execution_id)
end
def aggregate_metrics
# Athena uses columnar Parquet files in S3
# Query reads only necessary columns from millions of rows
sql = <<~SQL
SELECT
date_trunc('day', timestamp) AS day,
COUNT(*) AS event_count,
COUNT(DISTINCT user_id) AS unique_users
FROM events
WHERE timestamp >= current_date - interval '30' day
GROUP BY 1
ORDER BY 1
SQL
execute_columnar_query(sql)
end
private
def wait_for_completion(query_execution_id)
loop do
response = @client.get_query_execution(query_execution_id: query_execution_id)
status = response.query_execution.status.state
break if ['SUCCEEDED', 'FAILED', 'CANCELLED'].include?(status)
sleep 1
end
end
def fetch_results(query_execution_id)
results = []
next_token = nil
loop do
response = @client.get_query_results(
query_execution_id: query_execution_id,
next_token: next_token
)
response.result_set.rows.each do |row|
results << row.data.map(&:var_char_value)
end
next_token = response.next_token
break unless next_token
end
results
end
end
Tools & Ecosystem
PostgreSQL dominates row-oriented database deployments. The database engine stores complete rows in heap files, using B-tree indexes for efficient lookups. PostgreSQL extensions like TimescaleDB add time-series optimizations while maintaining row storage fundamentals. Ruby applications use the pg gem for native connectivity.
require 'pg'
require 'sequel'
# Sequel provides ORM layer over PostgreSQL
DB = Sequel.connect('postgres://localhost/production')
class Order < Sequel::Model
# Sequel assumes row storage: operations work on complete records
plugin :timestamps
plugin :validation_helpers
many_to_one :customer
one_to_many :line_items
end
# Operations match row storage strengths
order = Order[42] # Retrieves complete row efficiently
order.update(status: 'shipped', shipped_at: Time.now) # Updates single row
new_order = Order.create(customer_id: 100, total: 250.00) # Inserts complete row
Amazon Redshift implements columnar storage for analytics. Based on PostgreSQL, Redshift replaces the storage engine with a columnar format optimized for scanning large tables. The query interface remains SQL, but internal execution differs substantially. Ruby applications connect using the pg gem with Redshift-specific connection strings.
Redshift organizes data into 1MB blocks, each containing values from a single column. The system applies compression automatically—choosing run-length encoding for low-cardinality columns, delta encoding for sequential values, and dictionary encoding for repeated strings. Query performance depends on column encodings and sort keys.
require 'pg'
class RedshiftLoader
def initialize(connection_string)
@conn = PG.connect(connection_string)
end
def load_from_s3(table:, s3_path:, columns:)
# Redshift COPY command loads data efficiently into columnar format
copy_command = <<~SQL
COPY #{table} (#{columns.join(', ')})
FROM '#{s3_path}'
IAM_ROLE 'arn:aws:iam::account:role/RedshiftRole'
FORMAT AS PARQUET
SQL
@conn.exec(copy_command)
end
def analyze_table(table)
# Update column statistics for query optimization
@conn.exec("ANALYZE #{table}")
end
def vacuum_table(table)
# Reclaim space and resort columnar data
@conn.exec("VACUUM SORT ONLY #{table}")
end
end
loader = RedshiftLoader.new('postgresql://redshift-cluster:5439/analytics')
loader.load_from_s3(
table: 'events',
s3_path: 's3://data-bucket/events/2024/',
columns: ['event_id', 'user_id', 'event_type', 'timestamp']
)
loader.analyze_table('events')
Apache Parquet serves as the standard columnar file format. Parquet files contain column chunks with embedded metadata, supporting predicate pushdown and column pruning. Spark, Hive, and Presto read Parquet natively. Ruby applications use the parquet gem or shell out to command-line tools.
DuckDB provides an embedded columnar database similar to SQLite. The database runs in-process without a separate server, querying Parquet files directly. DuckDB excels at analytical queries on local data, making it suitable for data science workflows and batch processing.
require 'duckdb'
class LocalAnalytics
def initialize
@db = DuckDB::Database.open(':memory:')
@conn = @db.connect
end
def query_parquet(filename, sql)
# DuckDB queries Parquet files directly without loading into tables
full_sql = sql.gsub('SOURCE', filename)
result = @conn.query(full_sql)
result.map do |row|
Hash[result.columns.zip(row)]
end
end
def aggregate_sales(parquet_files)
# Query across multiple Parquet files
file_list = parquet_files.map { |f| "'#{f}'" }.join(', ')
sql = <<~SQL
SELECT
product_category,
SUM(amount) AS total_revenue,
COUNT(*) AS order_count
FROM read_parquet([#{file_list}])
WHERE order_date >= CURRENT_DATE - INTERVAL 90 DAY
GROUP BY product_category
ORDER BY total_revenue DESC
SQL
@conn.query(sql).to_a
end
end
analytics = LocalAnalytics.new
results = analytics.aggregate_sales([
'data/sales_2024_q1.parquet',
'data/sales_2024_q2.parquet'
])
ClickHouse implements a columnar database optimized for real-time analytics. The system ingests data continuously, stores it in columnar format, and executes queries with sub-second latency. ClickHouse uses specialized column encodings and aggressive compression. Ruby applications interface through HTTP or native TCP protocols.
BigQuery provides managed columnar analytics at scale. Google's service stores data in a proprietary columnar format, automatically partitioning and clustering tables for optimal query performance. Ruby applications use the google-cloud-bigquery gem to submit queries and retrieve results.
require 'google/cloud/bigquery'
class BigQueryAnalytics
def initialize(project_id:, dataset:)
@bigquery = Google::Cloud::Bigquery.new(project_id: project_id)
@dataset = @bigquery.dataset(dataset)
end
def scan_large_table(table_name, columns, filter)
# BigQuery charges by bytes scanned: limit columns to reduce cost
query = @dataset.query(<<~SQL)
SELECT #{columns.join(', ')}
FROM `#{table_name}`
WHERE #{filter}
SQL
query.map { |row| row.to_h }
end
def create_partitioned_table(table_name)
# Partition by date column for efficient filtering
table = @dataset.create_table(table_name) do |schema|
schema.timestamp 'timestamp', mode: :required
schema.string 'user_id'
schema.float 'amount'
end
table.time_partitioning_type = 'DAY'
table.time_partitioning_field = 'timestamp'
end
end
ActiveRecord adapters abstract storage differences. The same Ruby code queries both PostgreSQL (row storage) and Redshift (columnar storage). Performance characteristics differ, but the application interface remains consistent. Developers switch databases by changing connection configuration.
Practical Examples
E-commerce order processing demonstrates row storage optimization. The application handles customer checkouts, inventory updates, and order fulfillment—all operations touching complete records. Each transaction reads and writes many columns from few rows.
require 'sequel'
DB = Sequel.connect('postgres://localhost/ecommerce')
class OrderProcessor
def checkout(customer_id:, items:)
DB.transaction do
# Read complete customer record (many columns, one row)
customer = DB[:customers].where(id: customer_id).first
raise "Customer not found" unless customer
# Calculate order total
total = items.sum { |item| item[:quantity] * item[:price] }
# Create order record (insert complete row)
order_id = DB[:orders].insert(
customer_id: customer_id,
order_date: Time.now,
status: 'pending',
subtotal: total,
tax: total * 0.08,
total: total * 1.08,
shipping_address: customer[:default_address],
billing_address: customer[:billing_address]
)
# Update inventory for each item (multiple row updates)
items.each do |item|
rows_updated = DB[:inventory]
.where(product_id: item[:product_id])
.where { quantity >= item[:quantity] }
.update(quantity: Sequel[:quantity] - item[:quantity])
raise "Insufficient inventory" if rows_updated.zero?
# Insert line item (complete row)
DB[:order_items].insert(
order_id: order_id,
product_id: item[:product_id],
quantity: item[:quantity],
unit_price: item[:price],
total: item[:quantity] * item[:price]
)
end
order_id
end
end
def get_order_details(order_id)
# Read complete order record plus related items
order = DB[:orders].where(id: order_id).first
items = DB[:order_items].where(order_id: order_id).all
{
order: order,
items: items,
customer: DB[:customers].where(id: order[:customer_id]).first
}
end
end
processor = OrderProcessor.new
order_id = processor.checkout(
customer_id: 42,
items: [
{ product_id: 100, quantity: 2, price: 29.99 },
{ product_id: 101, quantity: 1, price: 149.99 }
]
)
# Row storage handles this efficiently: complete records, atomic updates
Business intelligence reporting demonstrates columnar storage advantages. Queries aggregate millions of rows, computing metrics across specific dimensions while ignoring most columns.
require 'pg'
class SalesAnalytics
def initialize(redshift_connection)
@conn = redshift_connection
end
def monthly_sales_report(year:)
# Scans millions of rows but reads only 4 columns
result = @conn.exec_params(<<~SQL, [year])
SELECT
DATE_TRUNC('month', order_date) AS month,
product_category,
SUM(total) AS revenue,
COUNT(*) AS order_count,
COUNT(DISTINCT customer_id) AS unique_customers
FROM orders
WHERE EXTRACT(YEAR FROM order_date) = $1
GROUP BY 1, 2
ORDER BY 1, 3 DESC
SQL
result.map do |row|
{
month: Date.parse(row['month']),
category: row['product_category'],
revenue: row['revenue'].to_f,
orders: row['order_count'].to_i,
customers: row['unique_customers'].to_i
}
end
end
def customer_lifetime_value
# Reads 3 columns from potentially billions of rows
result = @conn.exec(<<~SQL)
SELECT
customer_id,
MIN(order_date) AS first_order,
MAX(order_date) AS last_order,
COUNT(*) AS total_orders,
SUM(total) AS lifetime_value
FROM orders
GROUP BY customer_id
HAVING COUNT(*) >= 5
SQL
result.map do |row|
{
customer_id: row['customer_id'].to_i,
tenure_days: (Date.parse(row['last_order']) - Date.parse(row['first_order'])).to_i,
orders: row['total_orders'].to_i,
ltv: row['lifetime_value'].to_f
}
end
end
def product_performance_matrix
# Complex aggregation reading few columns from many rows
result = @conn.exec(<<~SQL)
WITH monthly_sales AS (
SELECT
product_id,
DATE_TRUNC('month', order_date) AS month,
SUM(quantity) AS units_sold,
SUM(total) AS revenue
FROM order_items
JOIN orders USING (order_id)
WHERE order_date >= CURRENT_DATE - INTERVAL '12 months'
GROUP BY 1, 2
)
SELECT
product_id,
AVG(units_sold) AS avg_monthly_units,
STDDEV(units_sold) AS units_volatility,
SUM(revenue) AS total_revenue,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY revenue) AS median_revenue
FROM monthly_sales
GROUP BY product_id
ORDER BY total_revenue DESC
LIMIT 100
SQL
result.to_a
end
end
analytics = SalesAnalytics.new(PG.connect('postgresql://redshift:5439/warehouse'))
# Columnar storage reads only necessary columns
# Row storage would read complete order records (20+ columns)
report = analytics.monthly_sales_report(year: 2024)
ltv_data = analytics.customer_lifetime_value
Log analysis benefits from columnar storage when analyzing structured logs at scale. Applications write logs with many attributes but queries typically filter and aggregate on few fields.
require 'duckdb'
require 'csv'
class LogAnalyzer
def initialize(log_directory)
@db = DuckDB::Database.open(':memory:')
@conn = @db.connect
@log_directory = log_directory
end
def load_logs
# Convert CSV logs to Parquet for columnar analysis
Dir.glob("#{@log_directory}/*.csv").each do |csv_file|
parquet_file = csv_file.sub('.csv', '.parquet')
@conn.execute(<<~SQL)
COPY (SELECT * FROM read_csv_auto('#{csv_file}'))
TO '#{parquet_file}' (FORMAT PARQUET, COMPRESSION ZSTD)
SQL
end
end
def error_rate_by_service
# Read only timestamp, service, and status columns
result = @conn.query(<<~SQL)
SELECT
service_name,
COUNT(*) AS total_requests,
SUM(CASE WHEN status_code >= 500 THEN 1 ELSE 0 END) AS errors,
SUM(CASE WHEN status_code >= 500 THEN 1 ELSE 0 END)::FLOAT / COUNT(*) AS error_rate
FROM read_parquet('#{@log_directory}/*.parquet')
WHERE timestamp >= CURRENT_TIMESTAMP - INTERVAL 1 HOUR
GROUP BY service_name
ORDER BY error_rate DESC
SQL
result.to_a
end
def latency_percentiles(service:)
result = @conn.query(<<~SQL)
SELECT
PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY response_time_ms) AS p50,
PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY response_time_ms) AS p95,
PERCENTILE_CONT(0.99) WITHIN GROUP (ORDER BY response_time_ms) AS p99,
MAX(response_time_ms) AS max
FROM read_parquet('#{@log_directory}/*.parquet')
WHERE service_name = '#{service}'
AND timestamp >= CURRENT_TIMESTAMP - INTERVAL 24 HOUR
SQL
result.first.to_h
end
end
analyzer = LogAnalyzer.new('/var/log/application')
analyzer.load_logs
errors = analyzer.error_rate_by_service
latencies = analyzer.latency_percentiles(service: 'api-gateway')
Time-series data collection uses hybrid storage. Recent data remains in row-oriented tables for fast inserts. Older data moves to columnar format for efficient analytical queries. The application queries both stores and merges results.
require 'sequel'
class TimeSeriesStore
def initialize
@pg = Sequel.connect('postgres://localhost/timeseries')
@redshift = Sequel.connect('postgresql://redshift:5439/analytics')
end
def record_metric(metric_name:, value:, tags:)
# Recent data: fast writes to row storage
@pg[:metrics].insert(
metric_name: metric_name,
value: value,
tags: Sequel.pg_jsonb(tags),
timestamp: Time.now
)
end
def query_metric(metric_name:, start_time:, end_time:, aggregation: :avg)
cutoff = Time.now - 86400 # 24 hours ago
# Recent data from row storage
recent_data = @pg[:metrics]
.where(metric_name: metric_name)
.where { timestamp >= [start_time, cutoff].max }
.where { timestamp <= end_time }
.select(:timestamp, :value)
.all
# Historical data from columnar storage
historical_data = if start_time < cutoff
@redshift[:metrics_history]
.where(metric_name: metric_name)
.where { timestamp >= start_time }
.where { timestamp < cutoff }
.select(:timestamp, :value)
.all
else
[]
end
# Merge and aggregate
all_data = recent_data + historical_data
aggregate_values(all_data, aggregation)
end
def archive_old_data
cutoff = Time.now - 86400
# Move data from row storage to columnar
old_data = @pg[:metrics].where { timestamp < cutoff }.all
@redshift[:metrics_history].multi_insert(old_data)
@pg[:metrics].where { timestamp < cutoff }.delete
old_data.count
end
private
def aggregate_values(data, aggregation)
values = data.map { |row| row[:value] }
case aggregation
when :avg
values.sum.to_f / values.size
when :sum
values.sum
when :max
values.max
when :min
values.min
end
end
end
store = TimeSeriesStore.new
# Fast writes to row storage
store.record_metric(
metric_name: 'api.response_time',
value: 45.2,
tags: { service: 'api-gateway', region: 'us-east-1' }
)
# Query spans both stores, columnar handles historical aggregation
avg_response = store.query_metric(
metric_name: 'api.response_time',
start_time: Time.now - 7 * 86400,
end_time: Time.now,
aggregation: :avg
)
# Periodic archival moves data to columnar format
archived_count = store.archive_old_data
Reference
Storage Format Comparison
| Aspect | Row Storage | Columnar Storage |
|---|---|---|
| Data Organization | Complete records stored together | Column values stored together |
| Read Pattern | Entire row in single I/O operation | Multiple I/O operations per row |
| Write Pattern | Single write per row | Multiple writes across columns |
| Compression Ratio | 2-3x typical | 10-20x typical |
| Insert Performance | Excellent single row inserts | Poor single row inserts |
| Update Performance | Efficient in-place updates | Expensive cross-column updates |
| Scan Performance | Poor for column subsets | Excellent for column subsets |
| Point Query | Fast for complete records | Slow for complete records |
| Analytical Query | Inefficient column scanning | Optimized column scanning |
| OLTP Suitability | Excellent | Poor |
| OLAP Suitability | Poor | Excellent |
Query Selectivity Guidelines
| Columns Accessed | Rows Scanned | Recommended Format | Reasoning |
|---|---|---|---|
| Many (>50%) | Few (<1%) | Row | Reading complete records efficiently |
| Few (<20%) | Many (>10%) | Columnar | Column subset with massive scans |
| Many (>50%) | Many (>10%) | Row or Hybrid | Full table scan of all data |
| Few (<20%) | Few (<1%) | Row | Point queries benefit from row layout |
| Many (>50%) | Medium (1-10%) | Row | Record-oriented access pattern |
| Few (<20%) | Medium (1-10%) | Columnar | Analytical aggregation pattern |
Compression Techniques
| Technique | Best For | Compression Ratio | Columnar Advantage |
|---|---|---|---|
| Dictionary Encoding | Low cardinality columns | 5-10x | Entire column shares dictionary |
| Run Length Encoding | Repeated values | 10-100x | Long runs in sorted columns |
| Delta Encoding | Sequential values | 5-20x | Timestamps or IDs in column |
| Bit Packing | Small integer ranges | 2-8x | Uniform type per column |
| LZ4 Compression | General purpose | 2-3x | Works on any format |
| Snappy Compression | Fast general purpose | 1.5-2x | Works on any format |
Database System Characteristics
| System | Storage Format | Primary Use Case | Ruby Gem | Write Throughput |
|---|---|---|---|---|
| PostgreSQL | Row | OLTP transactions | pg | Excellent |
| MySQL | Row | OLTP transactions | mysql2 | Excellent |
| Amazon Redshift | Columnar | Data warehouse analytics | pg | Moderate |
| Google BigQuery | Columnar | Managed analytics | google-cloud-bigquery | Good |
| ClickHouse | Columnar | Real-time analytics | clickhouse | Excellent |
| DuckDB | Columnar | Embedded analytics | duckdb | Good |
| Apache Parquet | Columnar file | Data interchange | parquet | Batch only |
| Snowflake | Columnar | Cloud data warehouse | odbc | Good |
Workload Classification
| Pattern | Column Access | Row Access | Updates | Queries | Best Format |
|---|---|---|---|---|---|
| OLTP | High (80%+) | Low (<1%) | Frequent | Simple | Row |
| OLAP | Low (<20%) | High (50%+) | Rare | Complex aggregations | Columnar |
| Hybrid Transaction | Medium (40-60%) | Low (<5%) | Moderate | Mixed | Row or Hybrid |
| Hybrid Analytics | Low-Medium (20-40%) | Medium (10-50%) | Moderate | Mixed | Hybrid or Columnar |
| Append-Only Logs | Few (2-5) | Very High (100%) | None | Time-based aggregations | Columnar |
| User Profile Lookup | High (90%+) | Very Low (<0.1%) | Moderate | Primary key lookups | Row |
Ruby Adapter Methods
| Operation | Row-Optimized Pattern | Columnar-Optimized Pattern |
|---|---|---|
| Single Record Fetch | Model.find(id) | Avoid, use batch queries |
| Batch Fetch | Model.where(id: ids).all | Acceptable with few columns |
| Aggregation | Scan entire table | SELECT few columns with GROUP BY |
| Insert Single | Model.create(attrs) | Batch into buffer |
| Insert Batch | Model.import(records) | Optimal, load in batch |
| Update Single | Model.update(id, attrs) | Expensive, consider append |
| Update Batch | Update individual rows | Append new versions |
| Column Subset | Wasteful, reads all columns | SELECT specific columns only |
Performance Tuning Guidelines
| Scenario | Row Storage Optimization | Columnar Storage Optimization |
|---|---|---|
| Point Queries | Add B-tree indexes | Avoid, use row storage |
| Range Scans | Index on filter columns | Partition by common filters |
| Aggregations | Pre-compute in tables | Use column statistics |
| Joins | Index foreign keys | Broadcast small tables |
| Sorting | Sort on indexed columns | Define sort keys |
| Filtering | Compound indexes | Predicate pushdown |
| Updates | MVCC for concurrency | Batch updates, vacuum |
| Inserts | Single row commits | Batch load thousands |
| Compression | Enable page compression | Choose encoding per column |
| Partitioning | Range partition large tables | Partition by date or key column |