CrackedRuby

Overview

Time-series databases store and retrieve data points indexed by time. Unlike traditional relational databases optimized for random access and complex joins, time-series databases optimize for write-heavy workloads where data arrives chronologically and queries typically scan temporal ranges. These databases handle metrics, events, and measurements that accumulate over time, such as server metrics, sensor readings, financial tick data, and application logs.

The primary distinction between time-series databases and general-purpose databases lies in data access patterns. Time-series workloads exhibit sequential writes, range-based queries, and frequent aggregations over time windows. A monitoring system might write thousands of metric points per second but query data only for specific time ranges with aggregations like averages or percentiles. Traditional databases struggle with this pattern due to index overhead and lack of time-aware optimizations.

Time-series databases emerged from the need to handle increasing volumes of temporal data in monitoring, IoT, and analytics applications. Early solutions repurposed relational databases with specialized schemas, but dedicated time-series databases provide native support for temporal operations, automatic data retention policies, and compression algorithms designed for sequential numeric data.

# Traditional approach: storing metrics in relational database
class Metric < ActiveRecord::Base
  # Writes require index updates, slow at scale
  # Queries lack time-aware optimizations
end

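# Time-series approach: optimized for temporal patterns
influx_client = InfluxDB2::Client.new('http://localhost:8086', ENV['INFLUXDB_TOKEN'],
                                      org: 'my_org')
write_api = influx_client.create_write_api

# One point: measurement name, a tag, and a field; the server assigns the timestamp
point = InfluxDB2::Point.new(name: 'cpu').add_tag('host', 'web-01').add_field('usage', 45.2)
write_api.write(data: point, bucket: 'metrics')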

The core architectural difference manifests in storage layout. Time-series databases organize data by time, often using specialized storage engines that compress sequential values and eliminate redundant timestamps through delta encoding. This approach can reduce storage requirements by an order of magnitude compared to traditional databases (figures of 10-20x are commonly cited, though the ratio depends heavily on the data) while accelerating temporal queries through time-based partitioning.

Key Principles

Time-series databases operate on several fundamental principles that distinguish them from other database types. Understanding these principles clarifies design decisions and usage patterns.

Time as Primary Index

Every data point in a time-series database associates with a timestamp. The database treats time as the primary organizational axis, not as just another column. Storage engines partition data temporally, creating segments for specific time ranges. This organization enables efficient writes, as new data appends to the most recent segment without affecting older data. Queries specifying time ranges directly access relevant segments without scanning the entire dataset.
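A minimal sketch of this idea, assuming a hypothetical in-memory `SegmentedStore` with one-hour segments (the class name, the segment width, and the integer timestamps are illustrative choices, not any database's actual engine). Writes append to the segment that covers the timestamp, and a range query opens only the segments that overlap the requested window.

```ruby
# Hypothetical in-memory store that partitions points into hourly segments.
class SegmentedStore
  SEGMENT_SECONDS = 3600 # assumed segment width: one hour

  attr_reader :segments

  def initialize
    @segments = Hash.new { |hash, key| hash[key] = [] }
  end

  # Append a point to the segment that covers its timestamp.
  def write(timestamp, value)
    @segments[segment_start(timestamp)] << [timestamp, value]
  end

  # Read only the segments that overlap [from, to]; others are never touched.
  def range(from, to)
    keys = @segments.keys.select do |start|
      start <= to && start + SEGMENT_SECONDS > from
    end
    keys.sort.flat_map do |start|
      @segments[start].select { |timestamp, _| timestamp.between?(from, to) }
    end
  end

  private

  def segment_start(timestamp)
    timestamp - (timestamp % SEGMENT_SECONDS)
  end
end

store = SegmentedStore.new
store.write(0, 1.0)      # segment starting at 0
store.write(3_599, 2.0)  # same segment
store.write(3_600, 3.0)  # segment starting at 3600
store.write(7_200, 4.0)  # segment starting at 7200

p store.segments.keys        # [0, 3600, 7200]
p store.range(3_000, 4_000)  # [[3599, 2.0], [3600, 3.0]]
```

The query for 3,000 to 4,000 reads two of the three segments; the segment starting at 7,200 is skipped without being scanned.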

Immutable Data Model

Time-series databases assume data represents historical facts that do not change. A temperature reading at 10:00 AM remains constant regardless of subsequent events. This immutability enables aggressive caching, simplified replication, and efficient compression. Most engines treat in-place updates as an exception rather than a core operation: a correction is typically made by rewriting the point for the same series and timestamp (where the engine allows it, as InfluxDB does) or by deleting a time range and re-ingesting it.

Tags and Fields

Time-series databases separate metadata from measurements. Tags identify the data series through key-value pairs (server name, region, sensor ID), while fields contain actual measurements. Tags serve as dimensions for grouping and filtering, indexed for query performance. Fields store numeric values or strings representing the measurements themselves.

# Tags identify what generated the data
tags = {
  host: 'web-server-01',
  region: 'us-east',
  environment: 'production'
}

# Fields contain the measurements
fields = {
  cpu_usage: 45.2,
  memory_mb: 2048,
  request_count: 1523
}

# Timestamp indicates when
timestamp = Time.now.to_i

Tags create the series cardinality—the number of unique combinations of tag values. High cardinality (millions of unique series) challenges time-series databases differently than high data volume. A database might handle billions of points efficiently but struggle with millions of distinct series due to index overhead.
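The arithmetic behind cardinality can be sketched in a few lines. The helper names `max_cardinality` and `observed_cardinality` are hypothetical, as are the sample tag values; the first computes the worst case (every combination of tag values appears), the second counts the distinct tag sets actually present in a batch of points.

```ruby
# Upper bound: the product of the number of distinct values per tag key.
def max_cardinality(tag_values)
  tag_values.values.map(&:size).reduce(1, :*)
end

# Observed cardinality: distinct tag sets, independent of key order.
def observed_cardinality(points)
  points.map { |point| point[:tags].sort }.uniq.size
end

tag_values = {
  host: %w[web-01 web-02 web-03],
  region: %w[us-east eu-west],
  environment: %w[production staging]
}

points = [
  { tags: { host: 'web-01', region: 'us-east' } },
  { tags: { region: 'us-east', host: 'web-01' } }, # same series, different key order
  { tags: { host: 'web-02', region: 'us-east' } }
]

puts max_cardinality(tag_values)   # 12
puts observed_cardinality(points)  # 2
```

Adding one more tag key with 1,000 possible values would raise the upper bound from 12 to 12,000, which is why a single unbounded tag dominates the series count.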

Write Optimization

Time-series databases prioritize write performance since data typically arrives in real-time streams. They achieve high write throughput through several mechanisms: in-memory buffers that batch writes before persisting to disk, append-only storage files that avoid random I/O, and relaxed consistency models that defer replication or indexing.

The write path typically buffers incoming points in memory, sorted by time, until reaching a threshold for flushing to disk. This batching amortizes disk I/O costs across many points. Some databases accept data out of temporal order within a window, buffering and sorting before writing to maintain temporal locality on disk.
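The buffering step can be sketched as follows. `WriteBuffer` is a hypothetical class, the threshold of three points is only for illustration, and the `flushed_batches` array stands in for files written to disk.

```ruby
# Hypothetical write buffer: collects points in memory and flushes sorted batches.
class WriteBuffer
  attr_reader :flushed_batches

  def initialize(flush_threshold:)
    @flush_threshold = flush_threshold
    @pending = []
    @flushed_batches = [] # stands in for segment files on disk
  end

  def write(timestamp, value)
    @pending << [timestamp, value]
    flush if @pending.size >= @flush_threshold
  end

  # Sort by timestamp so late arrivals land in temporal order on "disk".
  def flush
    return if @pending.empty?

    @flushed_batches << @pending.sort_by(&:first)
    @pending = []
  end

  def pending_count
    @pending.size
  end
end

buffer = WriteBuffer.new(flush_threshold: 3)
buffer.write(105, 0.7)
buffer.write(101, 0.5) # arrives out of order
buffer.write(103, 0.6) # third point triggers a flush
buffer.write(110, 0.9) # stays in memory until the next flush

p buffer.flushed_batches # [[[101, 0.5], [103, 0.6], [105, 0.7]]]
p buffer.pending_count   # 1
```

Real engines also write a log before acknowledging, so that buffered points survive a crash; that part is omitted here.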

Downsampling and Retention

Raw data accumulates rapidly—a single metric collected every second generates 86,400 points daily. Time-series databases provide automatic downsampling to reduce storage requirements for historical data. Downsampling aggregates high-resolution data into lower-resolution summaries, storing 5-minute averages instead of individual second-level points for older data.

Retention policies automatically delete data exceeding specified ages. A database might retain raw data for 7 days, hourly aggregates for 90 days, and daily summaries indefinitely. These policies execute during background compaction, reclaiming storage without impacting write or query performance.

# Define retention policy with downsampling
retention_config = {
  duration: '7d',           # Keep raw data for 7 days
  shard_duration: '1h',     # Partition by hour
  replication: 1,
  downsampling: [
    { duration: '90d', aggregation: 'mean', interval: '1h' },
    { duration: 'INF', aggregation: 'mean', interval: '1d' }
  ]
}
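What downsampling and retention do to the data can be shown with plain arrays. This is a sketch under simple assumptions: points are `[timestamp, value]` pairs with integer second timestamps, the aggregation is a mean, and the function names `downsample` and `apply_retention` are hypothetical.

```ruby
# Aggregate [timestamp, value] pairs into fixed windows using the mean.
def downsample(points, interval_seconds)
  points
    .group_by { |timestamp, _| timestamp - (timestamp % interval_seconds) }
    .sort
    .map { |window_start, group| [window_start, group.sum { |_, value| value } / group.size.to_f] }
end

# Keep only points that are at most max_age_seconds old relative to now.
def apply_retention(points, now:, max_age_seconds:)
  points.select { |timestamp, _| timestamp >= now - max_age_seconds }
end

raw = [[0, 10.0], [60, 20.0], [299, 30.0], [300, 40.0], [540, 60.0]]

p downsample(raw, 300)
# [[0, 20.0], [300, 50.0]]

p apply_retention(raw, now: 600, max_age_seconds: 300)
# [[300, 40.0], [540, 60.0]]
```

Five raw points collapse into two five-minute averages, and the retention pass drops everything older than the cutoff.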

Compression

Sequential numeric data exhibits patterns that compression algorithms exploit. Delta encoding stores differences between consecutive values rather than absolute values, reducing storage when values change gradually. Run-length encoding compresses repeated values. Specialized codecs like Gorilla compression (Facebook) report an average of about 1.37 bytes per point, roughly a 12x reduction from a 16-byte timestamp-value pair, with minimal CPU overhead.

Compression operates transparently during writes. The database compresses data blocks before writing to disk, decompressing during reads. Query engines operate on compressed data where possible, reducing I/O.
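Delta encoding and run-length encoding are easy to demonstrate on regularly spaced timestamps. The following is a simplified sketch with hypothetical helper names; it is not the Gorilla codec, which uses delta-of-delta timestamps and XOR-ed floating-point values packed at the bit level.

```ruby
# Store the first value, then the difference to each following value.
def delta_encode(values)
  return [] if values.empty?

  [values.first] + values.each_cons(2).map { |previous, current| current - previous }
end

# Rebuild the original values with a running sum.
def delta_decode(deltas)
  total = 0
  deltas.map { |delta| total += delta }
end

# Collapse runs of identical values into [value, count] pairs.
def run_length_encode(values)
  values.chunk_while { |left, right| left == right }.map { |run| [run.first, run.size] }
end

timestamps = [1_700_000_000, 1_700_000_010, 1_700_000_020, 1_700_000_030, 1_700_000_040]

deltas = delta_encode(timestamps)
p deltas                        # [1700000000, 10, 10, 10, 10]
p run_length_encode(deltas)     # [[1700000000, 1], [10, 4]]
p delta_decode(deltas) == timestamps # true
```

Five ten-digit timestamps reduce to one absolute value plus a single run, which is why a fixed collection interval compresses so well.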

Implementation Approaches

Implementing time-series storage requires choosing between several architectural approaches, each with distinct trade-offs for specific workloads.

Specialized Time-Series Databases

Dedicated time-series databases like InfluxDB, TimescaleDB, and Prometheus optimize all components for temporal workloads. These systems provide native time-series data types, query languages with temporal functions, and storage engines designed for sequential access patterns.

InfluxDB uses a custom storage engine (TSM) that organizes data into time-sharded files with columnar compression. Each shard covers a specific time range, and the query engine knows to access only relevant shards for temporal queries. The line protocol for data ingestion provides high write throughput with minimal parsing overhead.
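Line protocol is a plain-text format: a measurement name, comma-separated tags, a space, comma-separated fields, another space, and a timestamp. The sketch below is a simplified formatter with hypothetical helper names; it covers the common escaping rules and field types but is not a replacement for the client library's own serializer.

```ruby
# Escape commas, equals signs, and spaces in tag keys, tag values, and field keys.
def escape_key(text)
  text.to_s.gsub(/[,= ]/) { |char| "\\#{char}" }
end

# Integers carry an "i" suffix, strings are double-quoted, floats and booleans are bare.
def format_field(value)
  case value
  when Integer then "#{value}i"
  when Float, true, false then value.to_s
  else
    escaped = value.to_s.gsub(/["\\]/) { |char| "\\#{char}" }
    %("#{escaped}")
  end
end

def to_line_protocol(measurement, tags, fields, timestamp_ns)
  name = measurement.to_s.gsub(/[, ]/) { |char| "\\#{char}" }
  # Tags are sorted by key, which the format recommends for ingest performance.
  tag_part = tags.sort_by { |key, _| key.to_s }
                 .map { |key, value| ",#{escape_key(key)}=#{escape_key(value)}" }
                 .join
  field_part = fields.map { |key, value| "#{escape_key(key)}=#{format_field(value)}" }.join(',')
  "#{name}#{tag_part} #{field_part} #{timestamp_ns}"
end

line = to_line_protocol(
  'cpu',
  { region: 'us-east', host: 'web 01' },
  { usage: 45.2, cores: 8 },
  1_700_000_000_000_000_000
)
puts line
# cpu,host=web\ 01,region=us-east usage=45.2,cores=8i 1700000000000000000
```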

TimescaleDB extends PostgreSQL with time-series optimizations while retaining SQL compatibility and PostgreSQL features. It automatically partitions tables into chunks based on time ranges, creating a hypertable that appears as a single table but distributes data across many partitions. This approach combines time-series performance with relational database capabilities.

Prometheus targets monitoring and alerting, pulling metrics from instrumented applications. Its local storage uses TSDB blocks with aggressive compression, while the query language (PromQL) provides temporal aggregation and alerting rules. Prometheus assumes eventual consistency and prioritizes availability over durability.

Relational Database with Time-Series Schema

Traditional relational databases can store time-series data through careful schema design. Wide tables with timestamp columns, proper indexing, and partitioning provide acceptable performance for moderate workloads. This approach leverages existing database infrastructure and operational knowledge.

Partitioning divides tables by time ranges, typically daily or weekly partitions. Queries restricted to recent data access only relevant partitions, avoiding full table scans. Older partitions can be archived or dropped without affecting active data. Index strategies focus on time-range queries, often using BRIN indexes in PostgreSQL that summarize ranges rather than indexing every value.

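class CreateMetricsTable < ActiveRecord::Migration[7.0]
  # up/down rather than change: raw SQL run through execute cannot be reversed automatically
  def up
    create_table :metrics, id: false do |t|
      t.bigint :id, null: false
      t.datetime :timestamp, null: false, precision: 6
      t.string :metric_name, null: false
      t.jsonb :tags, default: {}
      t.float :value, null: false
    end

    # BRIN summarizes block ranges (cheap for append-ordered timestamps); GIN indexes tag keys.
    # The composite key includes timestamp so the table can be partitioned by time later;
    # this migration does not create partitions itself.
    execute <<~SQL
      CREATE INDEX idx_metrics_time ON metrics USING BRIN (timestamp);
      CREATE INDEX idx_metrics_tags ON metrics USING GIN (tags);
      ALTER TABLE metrics ADD PRIMARY KEY (id, timestamp);
    SQL
  end

  def down
    drop_table :metrics
  end
end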

The relational approach works when write volumes stay moderate (10,000 points per second is a rough rule of thumb, not a hard limit) and query patterns align with SQL capabilities. It struggles with high-cardinality tag sets and lacks native compression for temporal data.

Distributed Time-Series Architectures

Large-scale deployments distribute time-series data across clusters. Systems like M3DB or distributed InfluxDB provide horizontal scalability through sharding and replication. Each node stores a subset of series, determined by consistent hashing of series identifiers.
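Consistent hashing can be sketched with a sorted ring of hash positions. `HashRing` is a hypothetical class, the node names are made up, and using the first 32 bits of an MD5 digest with 64 virtual nodes per server is an assumption for illustration, not what M3DB or InfluxDB does internally.

```ruby
require 'digest'

# Hypothetical hash ring: each node owns several positions (virtual nodes) on the ring.
class HashRing
  def initialize(nodes, virtual_nodes: 64)
    @ring = nodes.flat_map do |node|
      (0...virtual_nodes).map { |index| [hash_of("#{node}-#{index}"), node] }
    end.sort_by(&:first)
  end

  # A series belongs to the first ring position at or after its hash, wrapping around.
  def node_for(series_key)
    target = hash_of(series_key)
    entry = @ring.bsearch { |candidate| candidate.first >= target } || @ring.first
    entry.last
  end

  private

  def hash_of(text)
    Digest::MD5.hexdigest(text)[0, 8].to_i(16)
  end
end

series_keys = (1..200).map { |i| "cpu,host=web-#{i}" }
full_ring = HashRing.new(%w[node-a node-b node-c])
reduced_ring = HashRing.new(%w[node-a node-b])

moved = series_keys.reject { |key| full_ring.node_for(key) == reduced_ring.node_for(key) }
puts "#{moved.size} of #{series_keys.size} series move when node-c leaves"
puts moved.all? { |key| full_ring.node_for(key) == 'node-c' } # only node-c's series are reassigned
```

The point of the design is visible in the last line: removing a node reassigns only the series that node held, rather than reshuffling every series as a plain `hash % node_count` scheme would.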

Distributed architectures introduce complexity in coordinating queries across nodes, maintaining data consistency during replication, and rebalancing data when nodes join or leave. Query coordinators fan out requests to relevant nodes and aggregate results. The CAP theorem applies—systems typically prioritize availability and partition tolerance over strong consistency for time-series workloads.

Sharding strategies affect query performance. Range-based sharding by time distributes writes across nodes but concentrates queries on nodes holding relevant time ranges. Series-based sharding distributes both reads and writes but requires querying all nodes for time-range queries spanning multiple series.

Hybrid Approaches

Some systems combine approaches. OpenTSDB layers on HBase, providing time-series semantics over a distributed key-value store. VictoriaMetrics offers InfluxDB-compatible APIs with custom storage optimized for high cardinality and long-term retention.

Cloud-managed services like Amazon Timestream or Azure Time Series Insights abstract infrastructure complexity, providing time-series capabilities without managing databases. These services charge based on writes, storage, and queries, suitable for applications preferring operational simplicity over cost optimization.

Ruby Implementation

Ruby applications interact with time-series databases through client libraries and Ruby-specific patterns. Several gems provide idiomatic interfaces to popular time-series databases.

InfluxDB Integration

The influxdb-client gem provides Ruby access to InfluxDB 2.x. It supports batched writes, Flux queries, and background flushing of buffered points, so application threads do not block on every write.

require 'influxdb-client'

# Initialize client with connection parameters
client = InfluxDB2::Client.new(
  'http://localhost:8086',
  ENV['INFLUXDB_TOKEN'],
  bucket: 'system_metrics',
  org: 'my_org',
  precision: InfluxDB2::WritePrecision::NANOSECOND
)

# Create write API with batching
write_api = client.create_write_api(
  write_options: InfluxDB2::WriteOptions.new(
    write_type: InfluxDB2::WriteType::BATCHING,  # the default is synchronous
    batch_size: 1000,
    flush_interval: 10_000,  # milliseconds
    retry_interval: 5_000    # milliseconds
  )
)

# Write points with tags and fields
def record_metric(write_api, host, metric_name, value)
  point = InfluxDB2::Point.new(name: metric_name)
    .add_tag('host', host)
    .add_tag('environment', Rails.env)
    .add_field('value', value)
    .time(Time.now, InfluxDB2::WritePrecision::MILLISECOND)
  
  write_api.write(data: point)
end

# Query with Flux language
query_api = client.create_query_api
flux_query = <<-FLUX
  from(bucket: "system_metrics")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu_usage")
    |> filter(fn: (r) => r.host == "web-01")
    |> aggregateWindow(every: 5m, fn: mean)
FLUX

# query returns a Hash of table index => FluxTable, so iterate over the values
result = query_api.query(query: flux_query)
result.each_value do |table|
  table.records.each do |record|
    puts "#{record.time}: #{record.value}"
  end
end

client.close!

The gem handles connection management, automatic retries on failures, and background flushing of buffered writes. Applications should reuse client instances rather than creating new connections for each operation.

TimescaleDB with ActiveRecord

TimescaleDB extends PostgreSQL, making it accessible through standard Ruby database libraries. The timescaledb gem provides ActiveRecord integration with hypertable management.

# Migration creating hypertable
class CreateSensorReadings < ActiveRecord::Migration[7.0]
  def up
    create_table :sensor_readings, id: false do |t|
      t.bigint :id, null: false
      t.datetime :time, null: false, precision: 6
      t.string :sensor_id, null: false
      t.float :temperature
      t.float :humidity
      t.integer :battery_level
    end
    
    execute "SELECT create_hypertable('sensor_readings', 'time');"
    execute "ALTER TABLE sensor_readings ADD PRIMARY KEY (id, time);"
    
    add_index :sensor_readings, [:sensor_id, :time]
  end
  
  def down
    drop_table :sensor_readings
  end
end

# Model with time-series queries
class SensorReading < ApplicationRecord
  self.primary_key = [:id, :time]
  
  scope :recent, -> { where('time > ?', 1.hour.ago) }
  scope :for_sensor, ->(sensor_id) { where(sensor_id: sensor_id) }
  
  def self.hourly_average(sensor_id, start_time, end_time)
    select(
      "time_bucket('1 hour', time) AS hour",
      "AVG(temperature) as avg_temp",
      "AVG(humidity) as avg_humidity"
    )
    .where(sensor_id: sensor_id)
    .where(time: start_time..end_time)
    .group("hour")
    .order("hour")
  end
  
  # name and query are interpolated as-is; pass trusted values only
  def self.continuous_aggregate(name, query)
    connection.execute <<~SQL
      CREATE MATERIALIZED VIEW #{name}
      WITH (timescaledb.continuous) AS
      #{query}
    SQL
  end
end

# Usage
SensorReading.create!(
  id: SecureRandom.random_number(2**62),  # id is a bigint column, so a UUID string would not fit
  time: Time.current,
  sensor_id: 'temp-sensor-01',
  temperature: 22.5,
  humidity: 65.0,
  battery_level: 87
)

averages = SensorReading.hourly_average(
  'temp-sensor-01',
  24.hours.ago,
  Time.current
)

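TimescaleDB-specific functions like time_bucket integrate with ActiveRecord through raw SQL or Arel. The composite primary_key assignment requires Rails 7.1 or later; the key includes the time column because TimescaleDB requires unique indexes on a hypertable to contain the partitioning column.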

Prometheus Client

Applications expose metrics for Prometheus scraping using the prometheus-client gem. This approach inverts the typical client-server relationship—applications provide HTTP endpoints that Prometheus polls.

require 'prometheus/client'
require 'prometheus/middleware/exporter'

# Initialize registry
prometheus = Prometheus::Client.registry

# Define metrics
http_requests = prometheus.counter(
  :http_requests_total,
  docstring: 'Total HTTP requests',
  labels: [:method, :path, :status]
)

request_duration = prometheus.histogram(
  :http_request_duration_seconds,
  docstring: 'HTTP request duration',
  labels: [:method, :path],
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5]
)

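# Instrument application
class ApplicationController < ActionController::Base
  around_action :track_metrics

  private

  def track_metrics
    # Monotonic clock avoids skew from wall-clock adjustments
    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status = 500 # reported if the action raises before a response is set
    yield
    status = response.status
  ensure
    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

    # Look the metrics up in the registry; top-level local variables are not visible here
    registry = Prometheus::Client.registry

    # request.path is unbounded for routes with IDs; prefer a route pattern in production
    labels = {
      method: request.method,
      path: request.path,
      status: status
    }

    registry.get(:http_requests_total).increment(labels: labels)
    registry.get(:http_request_duration_seconds)
            .observe(duration, labels: labels.except(:status))
  end
end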

# Expose /metrics through the exporter Rack middleware (e.g. in config/application.rb)
require 'prometheus/middleware/exporter'
Rails.application.config.middleware.use Prometheus::Middleware::Exporter

The prometheus-client gem provides thread-safe metric collection suitable for multi-threaded Ruby servers. Metrics accumulate in-process, and the exporter serializes them in Prometheus text format when scraped.
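For orientation, the text format that a scrape returns looks like the output of the sketch below. `render_counter` is a hypothetical helper written for illustration; the gem ships its own formatter, and this version skips the escaping of backslashes, quotes, and newlines in label values.

```ruby
# Render one counter family in the Prometheus text exposition format.
def render_counter(name, docstring, samples)
  lines = ["# HELP #{name} #{docstring}", "# TYPE #{name} counter"]
  samples.each do |labels, value|
    label_text = labels.map { |key, label_value| %(#{key}="#{label_value}") }.join(',')
    lines << "#{name}{#{label_text}} #{value.to_f}"
  end
  lines.join("\n") + "\n"
end

samples = {
  { method: 'GET', status: '200' } => 3,
  { method: 'POST', status: '500' } => 1
}

puts render_counter('http_requests_total', 'Total HTTP requests', samples)
# # HELP http_requests_total Total HTTP requests
# # TYPE http_requests_total counter
# http_requests_total{method="GET",status="200"} 3.0
# http_requests_total{method="POST",status="500"} 1.0
```

Each distinct label combination becomes its own line, and therefore its own series once Prometheus ingests it.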

Background Processing for Writes

High-frequency metric collection should avoid blocking application threads. Background processing with Sidekiq or similar frameworks batches metric writes efficiently.

class MetricsWriter
  include Sidekiq::Worker
  
  sidekiq_options queue: :metrics, retry: 3
  
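  def perform(batch)
    # INFLUXDB_BUCKET and INFLUXDB_ORG are assumed variable names; writes need both
    client = InfluxDB2::Client.new(
      ENV['INFLUXDB_URL'],
      ENV['INFLUXDB_TOKEN'],
      bucket: ENV['INFLUXDB_BUCKET'],
      org: ENV['INFLUXDB_ORG']
    )

    write_api = client.create_write_api

    points = batch.map do |metric|
      InfluxDB2::Point.new(name: metric['name'])
        .add_tag('source', metric['source'])
        .add_field('value', metric['value'])
        .time(Time.parse(metric['timestamp']),
              InfluxDB2::WritePrecision::MILLISECOND)
    end

    write_api.write(data: points)
  ensure
    # Close the connection even when the write raises, so retries start clean
    client&.close!
  end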
end

# Application code
class MetricsCollector
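  def self.record(name, value, tags = {})
    # String keys: Sidekiq serializes arguments as JSON, and the worker reads string keys
    metric = {
      'name' => name.to_s,
      'value' => value,
      'source' => (tags[:source] || 'app').to_s,
      'timestamp' => Time.now.iso8601(3)
    }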
    
    METRICS_BUFFER << metric
    
    if METRICS_BUFFER.size >= 100
      batch = METRICS_BUFFER.shift(100)
      MetricsWriter.perform_async(batch)
    end
  end
end

METRICS_BUFFER = Concurrent::Array.new

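Thread-safe data structures like Concurrent::Array (from the concurrent-ruby gem) handle metric collection across threads. The size check and the shift are separate operations, so under contention a batch can contain fewer than 100 points; that is harmless here. Batching reduces network overhead and database load.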

Performance Considerations

Time-series database performance depends on write patterns, query characteristics, cardinality, and retention policies. Understanding these factors guides optimization.

Write Performance

Time-series databases achieve high write throughput through batching and sequential disk writes. Individual point writes incur significant overhead from network round-trips and transaction processing. Batching amortizes these costs across multiple points.

# Inefficient: individual writes
1000.times do |i|
  point = InfluxDB2::Point.new(name: 'temperature')
    .add_field('value', 20 + rand(10))
  write_api.write(data: point)
end
# With a synchronous write API, each call is a separate HTTP request,
# so throughput is bounded by round-trip latency

# Efficient: batched writes
points = 1000.times.map do |i|
  InfluxDB2::Point.new(name: 'temperature')
    .add_field('value', 20 + rand(10))
end
write_api.write(data: points)
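# One request carries all 1,000 points; measure against your own
# deployment, but expect far higher throughput than per-point writes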

Batch sizes between 1,000 and 10,000 points optimize throughput without excessive memory usage. Larger batches improve write throughput but increase latency and memory consumption. Applications should tune batch sizes based on data arrival rates and acceptable latency.

Write amplification occurs when storage engines reorganize data during compaction. Frequently writing small batches creates many small storage files that require merging. Configure appropriate flush intervals to balance write latency against compaction overhead.

Query Optimization

Time-series queries perform best when restricted to specific time ranges and series. Query patterns that scan all series or long time ranges require careful optimization.

Time-range selection dramatically affects query performance. A query scanning one hour of data might execute in milliseconds, while scanning one year could take minutes. Applications should limit query ranges and page through results for large temporal spans.

# Slow: scans a full year of raw data
# (Flux rejects a from() that has no range() at all)
flux_query = <<-FLUX
  from(bucket: "metrics")
    |> range(start: -1y)
    |> filter(fn: (r) => r._measurement == "cpu")
    |> mean()
FLUX

# Fast: bounded time range
flux_query = <<-FLUX
  from(bucket: "metrics")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu")
    |> mean()
FLUX

Tag filtering reduces the series scanned during queries. Queries should filter on indexed tags before applying computations. The order of filters affects performance in some databases—place highly selective filters first.
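One way to picture tag filtering is an inverted index from tag pairs to series IDs. The sketch below is an assumption about how such an index could work, not the implementation of any particular database; `TagIndex` and its methods are hypothetical. It intersects the smallest posting set first, which is the "most selective filter first" rule in code.

```ruby
require 'set'

# Hypothetical inverted index: [tag_key, tag_value] => set of series IDs.
class TagIndex
  def initialize
    @postings = Hash.new { |hash, key| hash[key] = Set.new }
  end

  def add(series_id, tags)
    tags.each { |key, value| @postings[[key, value]] << series_id }
  end

  # Intersect posting sets, starting with the smallest to keep intermediate results small.
  def lookup(filters)
    sets = filters.map { |key, value| @postings.fetch([key, value], Set.new) }
    return Set.new if sets.empty?

    sets.sort_by(&:size).reduce { |result, set| result & set }
  end
end

index = TagIndex.new
index.add(1, host: 'web-01', region: 'us-east')
index.add(2, host: 'web-02', region: 'us-east')
index.add(3, host: 'web-01', region: 'eu-west')

p index.lookup(region: 'us-east', host: 'web-01').to_a # [1]
p index.lookup(region: 'us-east').to_a.sort            # [1, 2]
p index.lookup(host: 'web-09').to_a                    # []
```

Only the series IDs that survive the intersection need their data blocks read, so a selective tag filter shrinks the work before any values are touched.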

Downsampled data provides faster queries for historical analysis. Rather than querying raw one-second points over a 30-day month (about 2.6 million points per series), query pre-aggregated hourly data (720 points). This cuts the data scanned by a factor of 3,600, usually with little accuracy loss for trend analysis.

Cardinality Management

High cardinality (many unique combinations of tag values) challenges time-series databases. Each unique series requires index entries and metadata storage. As a rough guide, many deployments handle millions of unique series comfortably but degrade at tens of millions, though the threshold depends on the engine and the available memory.

Applications should avoid unbounded cardinality sources. User IDs, session tokens, or UUIDs as tags create new series indefinitely. Instead, use bounded sets like server names, service types, or geographical regions.

# High cardinality: unbounded tag values
# Creates new series for every user and request
point = InfluxDB2::Point.new(name: 'request')
  .add_tag('user_id', user.id)          # Bad: millions of values
  .add_tag('request_id', request.uuid)   # Bad: infinite values
  .add_field('duration_ms', 45)

# Low cardinality: bounded tag values
# Reuses existing series
point = InfluxDB2::Point.new(name: 'request')
  .add_tag('endpoint', '/api/users')     # Good: limited values
  .add_tag('method', 'GET')              # Good: finite set
  .add_tag('status_code', '200')         # Good: small range
  .add_field('duration_ms', 45)
  .add_field('user_id', user.id)         # OK: in field, not tag

Fields handle high-cardinality values without index overhead. Store identifiers and unique values in fields rather than tags. Applications can filter on field values, though less efficiently than tag filtering.

Memory Management

Time-series databases cache recent data in memory for fast access. Memory consumption grows with active series count and retention of recent data. Write buffers, query caches, and series indexes all consume memory.

Monitor memory metrics to detect issues. Excessive memory usage often indicates high cardinality or retention of too much hot data. Reducing retention windows or implementing more aggressive downsampling alleviates memory pressure.

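# Monitor database memory usage through InfluxDB's Prometheus-format /metrics endpoint
require 'net/http'

# memory_limit_bytes is supplied by the caller; the endpoint does not report a limit
def check_influxdb_memory(base_url, memory_limit_bytes)
  body = Net::HTTP.get(URI("#{base_url}/metrics"))

  # go_memstats_sys_bytes is a standard Go runtime gauge (memory obtained from the OS)
  line = body.each_line.find { |l| l.start_with?('go_memstats_sys_bytes ') }
  return unless line

  memory_usage = line.split.last.to_f
  usage_pct = (memory_usage / memory_limit_bytes * 100).round(2)

  if usage_pct > 80
    Rails.logger.warn "InfluxDB memory high: #{usage_pct}%"
    # Consider reducing retention or increasing resources
  end
end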

Storage Optimization

Compression reduces storage requirements significantly. Time-series databases typically achieve 10-20x compression through specialized codecs. Ratios are best for regularly spaced timestamps and slowly changing values, and many engines recompress older blocks more aggressively during compaction, so long-retained data tends to cost less per point.

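Storage grows linearly with write rate and retention period. A metric collected every second with one-year retention generates about 31.5 million points. At an assumed 12 bytes per stored point (close to the 16 bytes of an uncompressed timestamp-value pair), this requires about 378 MB per metric; with 10-20x compression of the 16-byte pair, the figure drops to roughly 25-50 MB. Multiply by metric count to estimate storage needs.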
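The estimate is simple enough to script. In this sketch the function name `storage_bytes` and the per-point sizes are assumptions chosen for illustration: 16 bytes for an uncompressed timestamp-value pair, 12 bytes as a conservative figure, and 2 bytes for a well-compressed series.

```ruby
SECONDS_PER_YEAR = 365 * 24 * 60 * 60 # 31_536_000

# Points retained times bytes per point times number of metrics.
def storage_bytes(interval_seconds:, retention_seconds:, bytes_per_point:, metric_count: 1)
  points = retention_seconds / interval_seconds
  points * bytes_per_point * metric_count
end

[2, 12, 16].each do |bytes_per_point|
  total = storage_bytes(
    interval_seconds: 1,
    retention_seconds: SECONDS_PER_YEAR,
    bytes_per_point: bytes_per_point
  )
  puts format('%d bytes/point -> %.1f MB per metric per year', bytes_per_point, total / 1_000_000.0)
end
# 2 bytes/point -> 63.1 MB per metric per year
# 12 bytes/point -> 378.4 MB per metric per year
# 16 bytes/point -> 504.6 MB per metric per year
```

Passing `metric_count: 500` to the same function scales the result linearly, which makes the per-point size the most important number to measure on real data.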

Retention policies automatically delete old data, preventing unbounded storage growth. Applications should configure retention matching their analysis needs—keeping raw data for operational time frames and aggregated data for historical analysis.

Tools & Ecosystem

The time-series database ecosystem includes specialized databases, monitoring platforms, visualization tools, and supporting libraries. Selecting appropriate tools depends on use case requirements.

Database Options

InfluxDB provides a complete time-series platform with clustering, visualization, and alerting. Version 2.x introduced the Flux query language and combined time-series storage and scheduled tasks in a single platform. InfluxDB excels at moderate-cardinality workloads with strong consistency requirements. The OSS version is limited to single-node deployments; clustering requires the commercial version.

TimescaleDB extends PostgreSQL, combining time-series performance with relational capabilities. Applications can join time-series data with relational tables, use SQL for queries, and leverage existing PostgreSQL tools. TimescaleDB handles high-cardinality workloads better than InfluxDB but requires PostgreSQL administration knowledge.

Prometheus targets service monitoring and alerting. Its pull-based model and local storage suit monitoring architectures where Prometheus scrapes metrics from application endpoints. Prometheus excels at short-term metrics retention (weeks) but struggles with long-term storage. Many deployments pair Prometheus with long-term storage backends like Cortex or Thanos.

VictoriaMetrics provides a Prometheus-compatible database optimized for high cardinality and long retention. It supports both push and pull models, handles millions of active series, and achieves better compression than Prometheus. VictoriaMetrics suits large-scale monitoring deployments requiring long-term metric retention.

Ruby Libraries

Several gems simplify time-series database integration:

# influxdb-client: InfluxDB 2.x support
gem 'influxdb-client'

# timescaledb: Rails/ActiveRecord integration
gem 'timescaledb'

# prometheus-client: Application metrics
gem 'prometheus-client'

# graphite-api: Graphite protocol support
gem 'graphite-api'

The influxdb-client gem provides comprehensive InfluxDB 2.x support including Flux queries, write batching, and asynchronous operations. For InfluxDB 1.x, the older influxdb gem remains available but lacks features from the 2.x API.

The timescaledb gem adds TimescaleDB-specific ActiveRecord methods and migrations. It handles hypertable creation, continuous aggregates, and compression policies through Rails migrations.

Visualization Platforms

Grafana dominates time-series visualization, supporting dozens of data sources including InfluxDB, Prometheus, TimescaleDB, and Graphite. It provides dashboarding, alerting, and data exploration. Ruby applications typically write metrics to time-series databases that Grafana queries directly.

# Configure Grafana dashboard via API
require 'net/http'
require 'json'

def create_grafana_dashboard(title, panels)
  uri = URI("#{ENV['GRAFANA_URL']}/api/dashboards/db")
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{ENV['GRAFANA_API_KEY']}"
  request['Content-Type'] = 'application/json'
  
  dashboard = {
    dashboard: {
      title: title,
      panels: panels,
      schemaVersion: 16,
      version: 0
    },
    overwrite: false
  }
  
  request.body = dashboard.to_json
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  
  JSON.parse(response.body)
end

Kibana and Elastic stack provide alternative visualization for time-series data stored in Elasticsearch. This approach suits applications already using Elasticsearch for logging and search.

Data Collection Agents

Telegraf collects system and application metrics, writing to InfluxDB, Prometheus, or other outputs. It provides input plugins for monitoring CPU, memory, disk, network, and various services. Ruby applications either push metrics to Telegraf's StatsD listener or expose HTTP endpoints that Telegraf polls.

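# Output metrics in StatsD format for Telegraf
require 'statsd-instrument'

# statsd-instrument 2.x API; 3.x is configured through the STATSD_ADDR and
# STATSD_IMPLEMENTATION environment variables instead. Tags and histograms
# need the :datadog implementation, which Telegraf's statsd input accepts
# when datadog_extensions is enabled.
StatsD.backend = StatsD::Instrument::Backends::UDPBackend.new(
  'localhost:8125', :datadog
)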

class ApplicationController < ActionController::Base
  around_action :track_request
  
  private
  
  def track_request
    start = Time.now
    begin
      yield
    ensure
      duration = (Time.now - start) * 1000
      StatsD.histogram('request.duration', duration, 
        tags: ["endpoint:#{params[:controller]}.#{params[:action]}"])
      StatsD.increment('request.count',
        tags: ["status:#{response.status}"])
    end
  end
end

Prometheus exporters expose metrics from systems not directly instrumented. Exporters exist for databases, message queues, cloud services, and hardware monitoring. Ruby applications can create custom exporters using the prometheus-client gem.

Design Considerations

Selecting and implementing time-series databases requires evaluating trade-offs around consistency, availability, query capabilities, and operational complexity.

Database Selection Criteria

Write and query patterns determine database suitability. Applications with high write volumes (>10,000 points/second) benefit from databases optimized for ingest like InfluxDB or VictoriaMetrics. Applications prioritizing query flexibility might choose TimescaleDB for SQL support.

Cardinality expectations affect database choice. InfluxDB handles moderate cardinality (millions of series) efficiently but degrades with very high cardinality. VictoriaMetrics and M3DB better handle high cardinality workloads with billions of unique series.

Retention requirements influence storage architecture. Short-term retention (days to weeks) suits in-memory databases or local storage engines. Long-term retention (months to years) requires distributed storage with efficient compression and tiered storage options.

# Decision matrix implementation (capability figures are illustrative placeholders, not benchmarks)
class DatabaseSelector
  DATABASES = {
    influxdb: {
      max_write_rate: 100_000,
      max_cardinality: 10_000_000,
      sql_support: false,
      clustering: :commercial,
      ideal_retention: '90d'
    },
    timescaledb: {
      max_write_rate: 50_000,
      max_cardinality: 100_000_000,
      sql_support: true,
      clustering: :native,
      ideal_retention: '1y'
    },
    victoriametrics: {
      max_write_rate: 1_000_000,
      max_cardinality: 1_000_000_000,
      sql_support: false,
      clustering: :native,
      ideal_retention: '1y'
    }
  }
  
  def self.recommend(requirements)
    DATABASES.select do |name, capabilities|
      capabilities[:max_write_rate] >= requirements[:write_rate] &&
      capabilities[:max_cardinality] >= requirements[:cardinality]
    end.keys
  end
end

# Usage
requirements = {
  write_rate: 75_000,
  cardinality: 5_000_000,
  retention: '180d'
}

suitable = DatabaseSelector.recommend(requirements)
# => [:influxdb, :victoriametrics]
# (:timescaledb is excluded: its max_write_rate of 50_000 is below the required 75_000)

Consistency vs Availability Trade-offs

Time-series workloads typically tolerate eventual consistency. Monitoring data arriving a few seconds late rarely affects analysis. This tolerance enables architectures prioritizing availability—accepting writes even during network partitions or node failures.

Some time-series databases sacrifice consistency for availability. Writes succeed immediately without waiting for replication. Queries might return slightly different results from different nodes during network partitions. This model suits monitoring and metrics collection where missing a few data points matters less than maintaining write availability.

Applications requiring strong consistency should choose databases providing tunable consistency levels. TimescaleDB inherits PostgreSQL's consistency model, ensuring writes persist before acknowledging. InfluxDB Enterprise allows consistency level configuration per write.

Push vs Pull Models

Time-series data collection follows push or pull patterns. Push models have applications send metrics to a central database. Pull models have the database scrape metrics from application endpoints.

Push models suit distributed applications or environments with dynamic scaling. Applications know when they generate metrics and push immediately without waiting for scraping. Network firewalls rarely block outbound pushes. The downside: applications need database connection management and retry logic.

Pull models centralize configuration in the metrics system. Prometheus scrapes configured targets at regular intervals, controlling collection frequency and handling service discovery. Applications expose simple HTTP endpoints without managing connections. The downside: requires network access from scraper to all applications.

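# Push model: application sends metrics
require 'influxdb-client'

class MetricsPusher
  def initialize
    # Bucket and org are set on the client so write calls can omit them.
    # INFLUXDB_BUCKET and INFLUXDB_ORG are assumed environment variable names.
    @client = InfluxDB2::Client.new(ENV['INFLUXDB_URL'], ENV['TOKEN'],
      bucket: ENV['INFLUXDB_BUCKET'], org: ENV['INFLUXDB_ORG'])
    @write_api = @client.create_write_api
  end
  
  def record(name, value, tags = {})
    point = InfluxDB2::Point.new(name: name)
    tags.each { |k, v| point.add_tag(k.to_s, v.to_s) }
    point.add_field('value', value)
    
    @write_api.write(data: point)
  end
end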

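# Pull model: Prometheus scrapes endpoint
require 'prometheus/client'

class MetricsExporter
  def initialize
    @registry = Prometheus::Client.registry
    # Registering the same metric name twice raises, so create one exporter per process
    @counter = @registry.counter(:requests_total, 
      docstring: 'Total requests',
      labels: [:status])
  end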
  
  def record(status)
    @counter.increment(labels: { status: status })
    # Prometheus will scrape /metrics endpoint periodically
  end
end
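The retry logic that push-based applications need can be sketched without any database client. In this illustration the `RetryingPusher` class, its `sender` callable (assumed to raise on failure), and the backoff parameters are all hypothetical; a real client may already provide comparable retry options.

```ruby
# Hypothetical sketch: buffer points and retry failed pushes with exponential backoff.
class RetryingPusher
  attr_reader :buffer

  def initialize(sender, max_attempts: 3, base_delay: 0.1, sleeper: ->(s) { sleep(s) })
    @sender = sender            # callable that raises on failure (assumed interface)
    @max_attempts = max_attempts
    @base_delay = base_delay
    @sleeper = sleeper          # injectable so the delay can be skipped in tests
    @buffer = []
  end

  def record(point)
    @buffer << point
  end

  # Returns true when the buffered points were delivered.
  # On repeated failure the points stay buffered and false is returned.
  def flush
    return true if @buffer.empty?

    attempts = 0
    begin
      attempts += 1
      @sender.call(@buffer.dup)
      @buffer.clear
      true
    rescue StandardError
      if attempts < @max_attempts
        @sleeper.call(@base_delay * (2**(attempts - 1)))
        retry
      end
      false
    end
  end
end

# Usage: a sender that fails twice before succeeding.
attempts_seen = 0
flaky_sender = lambda do |_points|
  attempts_seen += 1
  raise IOError, 'connection refused' if attempts_seen < 3
end

pusher = RetryingPusher.new(flaky_sender, sleeper: ->(_seconds) {})
pusher.record({ name: 'cpu', value: 45.2 })
pusher.flush # delivered on the third attempt
```

Keeping failed points in the buffer lets a later flush retry them, at the cost of unbounded memory growth if the database stays down; a production version would cap the buffer size.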

Schema Design Patterns

Time-series schema design focuses on tag selection and field organization. Tags should have bounded cardinality and identify dimensions for filtering and grouping. Fields contain measurements and high-cardinality identifiers.

Wide schemas store many measurements in a single series, reducing series count but potentially wasting storage for sparse data. Narrow schemas split measurements into separate series, increasing series count but storing only present values.

# Wide schema: multiple fields per series
# Advantage: fewer series, simpler queries
# Disadvantage: sparse data wastes storage
point = InfluxDB2::Point.new(name: 'server_metrics')
  .add_tag('host', 'web-01')
  .add_field('cpu_percent', 45.2)
  .add_field('memory_mb', 2048)
  .add_field('disk_gb', 125)
  .add_field('network_mbps', 15.3)

# Narrow schema: one field per series
# Advantage: efficient storage for sparse data
# Disadvantage: more series, complex queries
values = { 'cpu_percent' => 45.2, 'memory_mb' => 2048, 'disk_gb' => 125, 'network_mbps' => 15.3 }
points = values.map do |metric, value|
  InfluxDB2::Point.new(name: metric)
    .add_tag('host', 'web-01')
    .add_field('value', value)
end

Applications should avoid storing metadata that changes frequently as tags. Server IP addresses might seem like good tags but create new series when servers change IPs. Store such identifiers in fields or separate metadata systems.
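A rough way to see why a frequently changing tag is costly is to estimate the upper bound on series count as the product of distinct values per tag. The `max_series` helper and the tag counts below are hypothetical illustration values; real series counts are usually lower because not every combination occurs.

```ruby
# Upper bound on series count: product of distinct values across all tags.
def max_series(tag_cardinalities)
  tag_cardinalities.values.reduce(1) { |product, count| product * count }
end

stable_tags = { host: 200, region: 4, service: 25 }

max_series(stable_tags)                  # => 20000
max_series(stable_tags.merge(ip: 5_000)) # => 100000000
```

Adding a single tag with 5,000 observed values multiplies the bound by 5,000, which is why such identifiers belong in fields or an external metadata store.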

Reference

Database Comparison

| Database | Write Model | Query Language | Clustering | Best For |
| --- | --- | --- | --- | --- |
| InfluxDB | Push | Flux, InfluxQL | Commercial | Moderate cardinality, strong consistency |
| TimescaleDB | Push | SQL | Native | High cardinality, relational features |
| Prometheus | Pull | PromQL | Federation | Service monitoring, short retention |
| VictoriaMetrics | Push/Pull | PromQL, MetricsQL | Native | High cardinality, long retention |
| Graphite | Push | Functions | Carbon-relay | Legacy systems, simple metrics |
| M3DB | Push | PromQL | Native | Extreme scale, distributed |

Ruby Client Libraries

| Gem | Database | Features | Use Case |
| --- | --- | --- | --- |
| influxdb-client | InfluxDB 2.x | Batching, Flux queries, async | Modern InfluxDB deployments |
| influxdb | InfluxDB 1.x | Basic writes, InfluxQL | Legacy InfluxDB systems |
| timescaledb | TimescaleDB | Hypertables, continuous aggregates | PostgreSQL-based systems |
| prometheus-client | Prometheus | Metrics exposure, types | Application instrumentation |
| graphite-api | Graphite | Metric formatting, sending | Graphite integration |

Write Performance Factors

| Factor | Impact | Optimization Strategy |
| --- | --- | --- |
| Batch size | 10-100x throughput | Batch 1000-10000 points per write |
| Point size | Memory and network | Minimize tag count, avoid large field values |
| Cardinality | Index overhead | Use bounded tag sets, limit series count |
| Timestamp precision | Storage size | Use milliseconds unless microseconds needed |
| Compression | Disk I/O | Enable native compression, tune levels |
| Retention | Write amplification | Configure appropriate downsampling policies |
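The batching strategy can be illustrated with a small size-triggered buffer. The `BatchWriter` class and its block-based sink are hypothetical; the sink stands in for whatever call actually sends a batch to the database.

```ruby
# Hypothetical sketch: collect points and hand them to a sink in fixed-size batches.
class BatchWriter
  attr_reader :pending

  def initialize(batch_size: 5_000, &sink)
    @batch_size = batch_size
    @sink = sink      # block receiving an Array of points (assumed interface)
    @pending = []
  end

  def add(point)
    @pending << point
    flush if @pending.size >= @batch_size
  end

  # Sends whatever is buffered, including a final partial batch.
  def flush
    return if @pending.empty?

    batch = @pending
    @pending = []
    @sink.call(batch)
  end
end

batch_sizes = []
writer = BatchWriter.new(batch_size: 1_000) { |batch| batch_sizes << batch.size }
2_500.times { |i| writer.add({ measurement: 'cpu', value: i }) }
writer.flush

batch_sizes # => [1000, 1000, 500]
```

Here 2,500 points result in three write calls instead of 2,500. A production version would also flush on a timer so that a slow trickle of points is not held indefinitely.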

Query Optimization Techniques

| Technique | Benefit | Implementation |
| --- | --- | --- |
| Time range limits | Reduce scan size | Always specify start/end times |
| Tag filtering | Series reduction | Filter on indexed tags first |
| Downsampled queries | 10-100x faster | Query aggregated data for historical analysis |
| Result limits | Memory control | Limit result rows, paginate large sets |
| Cached queries | Sub-second response | Cache results for repeated queries |
| Continuous aggregates | Real-time performance | Pre-compute common aggregations |

Common Tag Design Patterns

| Pattern | Tags | Fields | Cardinality |
| --- | --- | --- | --- |
| Infrastructure | host, region, cluster | cpu, memory, disk | Low (hundreds) |
| Application | service, endpoint, method | duration_ms, count | Medium (thousands) |
| IoT sensors | device_type, location | temperature, humidity | High (millions) |
| Financial | symbol, exchange, order_type | price, volume | Very high (billions) |

Retention Policy Configuration

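| Retention | Use Case | Storage Impact | Query Performance |
| --- | --- | --- | --- |
| Raw: 7 days | Recent operational analysis | High write rate, full resolution | Fast for recent queries |
| Hourly: 90 days | Short-term trends | 1/3600 of raw data | Good for hourly analysis |
| Daily: 1 year | Historical reporting | 1/86400 of raw data | Excellent for long-term trends |
| Monthly: indefinite | Long-term archives | 1/2592000 of raw data | Sufficient for annual reports |

Storage ratios assume raw data sampled once per second and one aggregated point per interval, with a month counted as 30 days.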
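The rollup behind each retention tier can be sketched in plain Ruby. The `downsample` helper below is a hypothetical illustration that averages raw points into fixed buckets; in practice the database's own continuous aggregates or downsampling tasks would do this work.

```ruby
# Roll raw [unix_seconds, value] pairs up into one mean value per bucket.
def downsample(points, bucket_seconds)
  points
    .group_by { |timestamp, _value| timestamp - (timestamp % bucket_seconds) }
    .map { |bucket_start, rows| [bucket_start, rows.sum { |_t, v| v } / rows.size.to_f] }
    .sort_by(&:first)
end

raw = [[0, 10.0], [1800, 20.0], [3600, 30.0], [5400, 50.0]]

downsample(raw, 3600) # => [[0, 15.0], [3600, 40.0]]
```

Each bucket keeps a single point regardless of how many raw points fell into it, which is where the storage reduction of the hourly, daily, and monthly tiers comes from.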

Flux Query Language Basics

| Operation | Description | Example |
| --- | --- | --- |
| from | Specify data source | from(bucket: "metrics") |
| range | Time window | range(start: -1h) |
| filter | Series selection | filter(fn: (r) => r.host == "web-01") |
| aggregateWindow | Time-based grouping | aggregateWindow(every: 5m, fn: mean) |
| group | Group by tags | group(columns: ["host"]) |
| map | Transform values | map(fn: (r) => ({r with scaled: r._value * 100})) |
| join | Combine streams | join(tables: {a: stream1, b: stream2}, on: ["host"]) |
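These operations are chained with Flux's pipe-forward operator. The sketch below composes such a pipeline as a string from Ruby; `flux_query` is a hypothetical helper, and it interpolates values without escaping, so it is only suitable for trusted input.

```ruby
# Hypothetical helper: compose a Flux pipeline from a bucket, a time range,
# optional equality filters, and an optional aggregation window.
def flux_query(bucket:, start:, filters: {}, every: nil, fn: 'mean')
  stages = [%(from(bucket: "#{bucket}")), "range(start: #{start})"]
  filters.each do |column, value|
    stages << %(filter(fn: (r) => r.#{column} == "#{value}"))
  end
  stages << "aggregateWindow(every: #{every}, fn: #{fn})" if every
  stages.join("\n  |> ")
end

puts flux_query(bucket: 'metrics', start: '-1h',
                filters: { host: 'web-01' }, every: '5m')
# from(bucket: "metrics")
#   |> range(start: -1h)
#   |> filter(fn: (r) => r.host == "web-01")
#   |> aggregateWindow(every: 5m, fn: mean)
```

Placing range and filter before aggregateWindow keeps the scanned data small before any aggregation runs.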

TimescaleDB Functions

| Function | Purpose | Example |
| --- | --- | --- |
| time_bucket | Time-based grouping | time_bucket('5 minutes', time) |
| first | First value in group | first(temperature, time) |
| last | Last value in group | last(temperature, time) |
| locf | Last observation carried forward (used with time_bucket_gapfill) | locf(avg(reading)) |
| interpolate | Linear interpolation (used with time_bucket_gapfill) | interpolate(avg(reading)) |
| time_bucket_gapfill | Fill missing time buckets | time_bucket_gapfill('1 hour', time) |
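The combined behavior of gap filling and last observation carried forward can be emulated in plain Ruby to show what these functions return. The `gapfill_locf` helper is a hypothetical illustration of the semantics, not TimescaleDB's implementation.

```ruby
# Emit one row per bucket between start and finish (exclusive),
# carrying the last observed value forward into empty buckets.
def gapfill_locf(buckets, start:, finish:, step:)
  last_seen = nil
  (start...finish).step(step).map do |bucket_start|
    last_seen = buckets[bucket_start] if buckets.key?(bucket_start)
    [bucket_start, last_seen]
  end
end

hourly_readings = { 0 => 21.5, 7200 => 22.1 } # bucket start (seconds) => value

gapfill_locf(hourly_readings, start: 0, finish: 14_400, step: 3600)
# => [[0, 21.5], [3600, 21.5], [7200, 22.1], [10800, 22.1]]
```

Buckets before the first observation stay nil, since there is no earlier value to carry forward.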

Monitoring Metrics

| Metric | Normal Range | Action Threshold |
| --- | --- | --- |
| Write throughput | Varies by hardware | >80% of rated capacity |
| Query latency p95 | <100ms for recent data | >1 second |
| Memory usage | 50-70% | >85% |
| Series cardinality | Depends on database | Check database limits |
| Disk usage growth | Linear with retention | >90% capacity |
| Query queue depth | 0-10 | >100 queued queries |
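The numeric thresholds above can be turned into a simple check. In this sketch the `alerts_for` helper and the metric key names are hypothetical; the limits are taken from the table.

```ruby
# Return the metrics whose current value crosses its action threshold.
def alerts_for(stats)
  checks = {
    query_latency_p95_s: ->(value) { value > 1.0 },
    memory_usage_pct:    ->(value) { value > 85 },
    disk_usage_pct:      ->(value) { value > 90 },
    query_queue_depth:   ->(value) { value > 100 }
  }

  checks.select { |metric, breached| stats.key?(metric) && breached.call(stats[metric]) }.keys
end

current = { query_latency_p95_s: 0.08, memory_usage_pct: 91, query_queue_depth: 4 }

alerts_for(current) # => [:memory_usage_pct]
```

Metrics missing from the input are skipped rather than treated as breaches, so a partial stats sample does not trigger false alerts.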