CrackedRuby - Time-Series Databases

Overview

Time-series databases store and retrieve data points indexed by time. Unlike traditional relational databases optimized for random access and complex joins, time-series databases optimize for write-heavy workloads where data arrives chronologically and queries typically scan temporal ranges. These databases handle metrics, events, and measurements that accumulate over time, such as server metrics, sensor readings, financial tick data, and application logs.

The primary distinction between time-series databases and general-purpose databases lies in data access patterns. Time-series workloads exhibit sequential writes, range-based queries, and frequent aggregations over time windows. A monitoring system might write thousands of metric points per second but query data only for specific time ranges with aggregations like averages or percentiles. Traditional databases struggle with this pattern due to index overhead and lack of time-aware optimizations.

Time-series databases emerged from the need to handle increasing volumes of temporal data in monitoring, IoT, and analytics applications. Early solutions repurposed relational databases with specialized schemas, but dedicated time-series databases provide native support for temporal operations, automatic data retention policies, and compression algorithms designed for sequential numeric data.

# Traditional approach: storing metrics in relational database
class Metric < ActiveRecord::Base
  # Writes require index updates, slow at scale
  # Queries lack time-aware optimizations
end

# Time-series approach: optimized for temporal patterns
influx_client = InfluxDB2::Client.new(
  'http://localhost:8086', ENV['INFLUXDB_TOKEN'],
  bucket: 'metrics', org: 'my_org'
)
write_api = influx_client.create_write_api

# The server assigns the timestamp when a point omits one
point = InfluxDB2::Point.new(name: 'cpu_usage').add_field('value', 45.2)
write_api.write(data: point)

The core architectural difference manifests in storage layout. Time-series databases organize data by time, often using specialized storage engines that compress sequential values and eliminate redundant timestamps through delta encoding. For regular numeric data this approach often reduces storage requirements by 10-20x compared to row-oriented storage, and time-based partitioning accelerates temporal queries.

Key Principles

Time-series databases operate on several fundamental principles that distinguish them from other database types. Understanding these principles clarifies design decisions and usage patterns.

Time as Primary Index

Every data point in a time-series database associates with a timestamp. The database treats time as the primary organizational axis, not as just another column. Storage engines partition data temporally, creating segments for specific time ranges. This organization enables efficient writes, as new data appends to the most recent segment without affecting older data. Queries specifying time ranges directly access relevant segments without scanning the entire dataset.
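
The segment idea can be sketched in plain Ruby. This is a toy in-memory model, not how any particular engine stores data; the SegmentedStore class and its one-hour segment size are assumptions made for illustration.

```ruby
# Hypothetical in-memory store that partitions points into hourly segments.
class SegmentedStore
  SEGMENT_SECONDS = 3600

  attr_reader :segments

  def initialize
    # segment start (epoch seconds) => array of [timestamp, value]
    @segments = Hash.new { |hash, key| hash[key] = [] }
  end

  # A new point lands in the segment that covers its timestamp.
  def write(timestamp, value)
    @segments[segment_start(timestamp)] << [timestamp, value]
  end

  # Only segments overlapping [from, to] are visited; the rest are skipped.
  def range(from, to)
    keys = (segment_start(from)..segment_start(to)).step(SEGMENT_SECONDS)
    keys.flat_map do |key|
      next [] unless @segments.key?(key)

      @segments[key].select { |timestamp, _| timestamp.between?(from, to) }
    end
  end

  private

  def segment_start(timestamp)
    timestamp - (timestamp % SEGMENT_SECONDS)
  end
end

store = SegmentedStore.new
store.write(0, 1.0)
store.write(3_700, 2.0)
store.write(7_300, 3.0)
hits = store.range(3_600, 7_199)
```

The range query above only touches the segment starting at 3600, which is the property that lets real engines skip old data files entirely.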

Immutable Data Model

Time-series databases assume data represents historical facts that do not change. A temperature reading at 10:00 AM remains constant regardless of subsequent events. This immutability enables aggressive caching, simplified replication, and efficient compression. Most engines do not support efficient in-place updates; depending on the database, a correction means rewriting the point for the same series and timestamp or deleting a time range and writing it again.

Tags and Fields

Time-series databases separate metadata from measurements. Tags identify the data series through key-value pairs (server name, region, sensor ID), while fields contain actual measurements. Tags serve as dimensions for grouping and filtering, indexed for query performance. Fields store numeric values or strings representing the measurements themselves.

# Tags identify what generated the data
tags = {
  host: 'web-server-01',
  region: 'us-east',
  environment: 'production'
}

# Fields contain the measurements
fields = {
  cpu_usage: 45.2,
  memory_mb: 2048,
  request_count: 1523
}

# Timestamp indicates when
timestamp = Time.now.to_i

Tags create the series cardinality—the number of unique combinations of tag values. High cardinality (millions of unique series) challenges time-series databases differently than high data volume. A database might handle billions of points efficiently but struggle with millions of distinct series due to index overhead.
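
A small sketch of how cardinality can be estimated before data reaches the database. Both helper names are hypothetical, and the worst-case figure assumes every tag value can occur with every other.

```ruby
# Count distinct series: one series per unique combination of tag values.
def series_cardinality(points)
  points.map { |point| point[:tags].sort }.uniq.size
end

# Upper bound when every tag value can combine with every other.
def worst_case_cardinality(distinct_values_per_tag)
  distinct_values_per_tag.values.reduce(1, :*)
end

points = [
  { tags: { host: 'web-01', region: 'us-east' } },
  { tags: { region: 'us-east', host: 'web-01' } }, # same series, different key order
  { tags: { host: 'web-02', region: 'us-east' } }
]

observed = series_cardinality(points)
upper_bound = worst_case_cardinality(host: 50, region: 4, environment: 3)
```

Here 50 hosts, 4 regions, and 3 environments give an upper bound of 600 series; adding a tag with a million distinct values would multiply that bound by a million.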

Write Optimization

Time-series databases prioritize write performance since data typically arrives in real-time streams. They achieve high write throughput through several mechanisms: in-memory buffers that batch writes before persisting to disk, append-only storage files that avoid random I/O, and relaxed consistency models that defer replication or indexing.

The write path typically buffers incoming points in memory, sorted by time, until reaching a threshold for flushing to disk. This batching amortizes disk I/O costs across many points. Some databases accept data out of temporal order within a window, buffering and sorting before writing to maintain temporal locality on disk.
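
The buffering step can be sketched as follows. WriteBuffer is a hypothetical class that keeps flushed batches in memory instead of writing files, so it shows only the batching and sorting logic.

```ruby
# Hypothetical write buffer: collects points, then flushes them in time order.
class WriteBuffer
  attr_reader :flushed_batches

  def initialize(flush_threshold:)
    @flush_threshold = flush_threshold
    @pending = []
    @flushed_batches = [] # stands in for files written to disk
  end

  def write(timestamp, value)
    @pending << [timestamp, value]
    flush if @pending.size >= @flush_threshold
  end

  def flush
    return if @pending.empty?

    # Sort by timestamp so out-of-order arrivals are stored in temporal order.
    @flushed_batches << @pending.sort_by(&:first)
    @pending = []
  end
end

buffer = WriteBuffer.new(flush_threshold: 3)
buffer.write(30, 'c')
buffer.write(10, 'a')
buffer.write(20, 'b') # third write reaches the threshold and triggers a flush
```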

Downsampling and Retention

Raw data accumulates rapidly—a single metric collected every second generates 86,400 points daily. Time-series databases provide automatic downsampling to reduce storage requirements for historical data. Downsampling aggregates high-resolution data into lower-resolution summaries, storing 5-minute averages instead of individual second-level points for older data.

Retention policies automatically delete data exceeding specified ages. A database might retain raw data for 7 days, hourly aggregates for 90 days, and daily summaries indefinitely. These policies typically run in the background, for example by dropping whole expired partitions or during compaction, so they reclaim storage with little impact on write or query performance.

# Define retention policy with downsampling
retention_config = {
  duration: '7d',           # Keep raw data for 7 days
  shard_duration: '1h',     # Partition by hour
  replication: 1,
  downsampling: [
    { duration: '90d', aggregation: 'mean', interval: '1h' },
    { duration: 'INF', aggregation: 'mean', interval: '1d' }
  ]
}
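
What a downsampling job and a retention pass compute can be sketched in a few lines. The function names and the [timestamp, value] array layout are assumptions for illustration; real databases run these steps inside the storage engine.

```ruby
# Aggregate [timestamp, value] pairs into mean values per fixed interval.
def downsample(points, interval_seconds)
  points
    .group_by { |timestamp, _| timestamp - (timestamp % interval_seconds) }
    .map { |bucket, rows| [bucket, rows.sum { |_, value| value } / rows.size.to_f] }
    .sort_by(&:first)
end

# Drop points older than max_age_seconds relative to now.
def apply_retention(points, now, max_age_seconds)
  points.reject { |timestamp, _| now - timestamp > max_age_seconds }
end

raw = [[0, 10.0], [60, 20.0], [299, 30.0], [300, 40.0]]
five_minute_means = downsample(raw, 300)
kept = apply_retention([[0, 1.0], [500, 2.0]], 600, 300)
```

The first three points fall into the bucket starting at 0 and the last into the bucket starting at 300, so four raw points become two stored rows.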

Compression

Sequential numeric data exhibits patterns that compression algorithms exploit. Delta encoding stores differences between consecutive values rather than absolute values, reducing storage when values change gradually. Run-length encoding compresses repeated values. Specialized codecs such as Facebook's Gorilla combine delta-of-delta timestamp encoding with XOR-based float compression; its authors report an average of about 1.37 bytes per point, with minimal CPU overhead.

Compression operates transparently during writes. The database compresses data blocks before writing to disk, decompressing during reads. Query engines operate on compressed data where possible, reducing I/O.
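
A minimal sketch of delta encoding and run-length encoding on a timestamp column. This is deliberately simpler than Gorilla or any production codec; it only shows why regular timestamps compress well.

```ruby
# Store the first value, then the difference between each pair of neighbours.
def delta_encode(values)
  return [] if values.empty?

  [values.first] + values.each_cons(2).map { |previous, current| current - previous }
end

# Rebuild the original values with a running sum.
def delta_decode(deltas)
  sum = 0
  deltas.map { |delta| sum += delta }
end

# Collapse runs of equal values into [value, run_length] pairs.
def run_length_encode(values)
  values.chunk_while { |a, b| a == b }.map { |run| [run.first, run.size] }
end

timestamps = [1000, 1010, 1020, 1030, 1041]
deltas = delta_encode(timestamps)
runs = run_length_encode(deltas.drop(1))
```

Five timestamps collapse to a starting value plus two runs, because a fixed collection interval makes most deltas identical.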

Implementation Approaches

Implementing time-series storage requires choosing between several architectural approaches, each with distinct trade-offs for specific workloads.

Specialized Time-Series Databases

Dedicated time-series databases like InfluxDB, TimescaleDB, and Prometheus optimize all components for temporal workloads. These systems provide native time-series data types, query languages with temporal functions, and storage engines designed for sequential access patterns.

InfluxDB uses a custom storage engine (TSM) that organizes data into time-sharded files with columnar compression. Each shard covers a specific time range, and the query engine knows to access only relevant shards for temporal queries. The line protocol for data ingestion provides high write throughput with minimal parsing overhead.
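
For orientation, a line protocol record has the shape measurement,tag_set field_set timestamp. The sketch below builds one by hand; the helper names are hypothetical, the escaping covers only the common cases, and real applications would normally let the client gem serialize points.

```ruby
# Escape characters with special meaning in tag keys, tag values, and field keys.
def escape_key(text)
  text.to_s.gsub(/([,= ])/) { "\\#{Regexp.last_match(1)}" }
end

# Measurement names only need commas and spaces escaped.
def escape_measurement(text)
  text.to_s.gsub(/([, ])/) { "\\#{Regexp.last_match(1)}" }
end

# Integers carry an "i" suffix; strings are double-quoted.
def format_field(value)
  case value
  when Integer then "#{value}i"
  when Float, true, false then value.to_s
  else
    escaped = value.to_s.gsub(/(["\\])/) { "\\#{Regexp.last_match(1)}" }
    "\"#{escaped}\""
  end
end

def to_line_protocol(measurement, tags, fields, timestamp_ns)
  # Tags are sorted by key, which InfluxDB recommends for ingest performance.
  tag_part = tags.sort.map { |key, value| ",#{escape_key(key)}=#{escape_key(value)}" }.join
  field_part = fields.map { |key, value| "#{escape_key(key)}=#{format_field(value)}" }.join(',')
  "#{escape_measurement(measurement)}#{tag_part} #{field_part} #{timestamp_ns}"
end

line = to_line_protocol(
  'cpu',
  { region: 'us east', host: 'web-01' },
  { usage: 45.2, count: 3, state: 'ok' },
  1_700_000_000_000_000_000
)
```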

TimescaleDB extends PostgreSQL with time-series optimizations while retaining SQL compatibility and PostgreSQL features. It automatically partitions tables into chunks based on time ranges, creating a hypertable that appears as a single table but distributes data across many partitions. This approach combines time-series performance with relational database capabilities.

Prometheus targets monitoring and alerting, pulling metrics from instrumented applications. Its local storage uses TSDB blocks with aggressive compression, while the query language (PromQL) provides temporal aggregation and alerting rules. Prometheus assumes eventual consistency and prioritizes availability over durability.

Relational Database with Time-Series Schema

Traditional relational databases can store time-series data through careful schema design. Wide tables with timestamp columns, proper indexing, and partitioning provide acceptable performance for moderate workloads. This approach leverages existing database infrastructure and operational knowledge.

Partitioning divides tables by time ranges, typically daily or weekly partitions. Queries restricted to recent data access only relevant partitions, avoiding full table scans. Older partitions can be archived or dropped without affecting active data. Index strategies focus on time-range queries, often using BRIN indexes in PostgreSQL that summarize ranges rather than indexing every value.

class CreateMetricsTable < ActiveRecord::Migration[7.0]
  def up
    create_table :metrics, id: false do |t|
      t.bigint :id, null: false
      t.datetime :timestamp, null: false, precision: 6
      t.string :metric_name, null: false
      t.jsonb :tags, default: {}
      t.float :value, null: false
    end

    # BRIN summarizes timestamp ranges per block; GIN indexes tag lookups.
    # This table is not partitioned: declarative partitioning would need
    # PARTITION BY RANGE (timestamp) in a raw CREATE TABLE statement.
    execute <<~SQL
      CREATE INDEX idx_metrics_time ON metrics USING BRIN (timestamp);
      CREATE INDEX idx_metrics_tags ON metrics USING GIN (tags);
      ALTER TABLE metrics ADD PRIMARY KEY (id, timestamp);
    SQL
  end

  def down
    drop_table :metrics
  end
end

The relational approach generally works when write volumes stay below roughly 10,000 points per second and query patterns align with SQL capabilities. It struggles with high-cardinality tag sets and lacks native compression for temporal data.

Distributed Time-Series Architectures

Large-scale deployments distribute time-series data across clusters. Systems like M3DB or distributed InfluxDB provide horizontal scalability through sharding and replication. Each node stores a subset of series, determined by consistent hashing of series identifiers.

Distributed architectures introduce complexity in coordinating queries across nodes, maintaining data consistency during replication, and rebalancing data when nodes join or leave. Query coordinators fan out requests to relevant nodes and aggregate results. The CAP theorem applies—systems typically prioritize availability and partition tolerance over strong consistency for time-series workloads.

Sharding strategies affect both write and query performance. Range-based sharding by time sends all current writes to the node holding the newest range, creating a write hot spot, and concentrates queries on the nodes holding the requested ranges. Series-based sharding spreads both reads and writes across nodes, but a query spanning many series must fan out to every node that holds one of them.
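
Series-based placement by consistent hashing can be sketched with a small hash ring. HashRing, the node names, and the choice of 64 virtual points per node are assumptions for illustration, not the scheme of any specific database.

```ruby
require 'digest'

# Hypothetical consistent-hash ring that maps series keys to nodes.
class HashRing
  def initialize(nodes, replicas: 64)
    @ring = {}
    nodes.each do |node|
      # Several virtual points per node smooth out the distribution.
      replicas.times { |i| @ring[hash_of("#{node}-#{i}")] = node }
    end
    @sorted_keys = @ring.keys.sort
  end

  def node_for(series_key)
    target = hash_of(series_key)
    # First ring position at or after the hash; wrap around if there is none.
    key = @sorted_keys.bsearch { |candidate| candidate >= target } || @sorted_keys.first
    @ring[key]
  end

  private

  def hash_of(text)
    Digest::MD5.hexdigest(text)[0, 8].to_i(16)
  end
end

ring = HashRing.new(%w[node-a node-b node-c])
owner = ring.node_for('cpu,host=web-01')

# Adding a node only moves the series that now hash to the new node.
bigger = HashRing.new(%w[node-a node-b node-c node-d])
keys = (1..500).map { |i| "cpu,host=web-#{i}" }
moved = keys.reject { |key| ring.node_for(key) == bigger.node_for(key) }
```

The useful property is in the last lines: when a node joins, every series that changes owner moves to the new node, so existing nodes never exchange data with each other.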

Hybrid Approaches

Some systems combine approaches. OpenTSDB layers on HBase, providing time-series semantics over a distributed key-value store. VictoriaMetrics accepts InfluxDB line protocol for ingestion and exposes a Prometheus-compatible query API, with custom storage optimized for high cardinality and long-term retention.

Cloud-managed services like Amazon Timestream or Azure Time Series Insights abstract infrastructure complexity, providing time-series capabilities without managing databases. These services charge based on writes, storage, and queries, suitable for applications preferring operational simplicity over cost optimization.

Ruby Implementation

Ruby applications interact with time-series databases through client libraries and Ruby-specific patterns. Several gems provide idiomatic interfaces to popular time-series databases.

InfluxDB Integration

The influxdb-client gem provides Ruby access to InfluxDB 2.x. It supports synchronous and batched writes as well as Flux queries; in batching mode a background worker flushes buffered points.

require 'influxdb-client'

# Initialize client with connection parameters
client = InfluxDB2::Client.new(
  'http://localhost:8086',
  ENV['INFLUXDB_TOKEN'],
  bucket: 'system_metrics',
  org: 'my_org',
  precision: InfluxDB2::WritePrecision::NANOSECOND
)

# Create write API with batching
write_api = client.create_write_api(
  write_options: InfluxDB2::WriteOptions.new(
    write_type: InfluxDB2::WriteType::BATCHING,  # default is synchronous
    batch_size: 1000,
    flush_interval: 10_000,  # milliseconds
    retry_interval: 5_000
  )
)

# Write points with tags and fields
def record_metric(write_api, host, metric_name, value)
  point = InfluxDB2::Point.new(name: metric_name)
    .add_tag('host', host)
    .add_tag('environment', Rails.env)
    .add_field('value', value)
    .time(Time.now, InfluxDB2::WritePrecision::MILLISECOND)
  
  write_api.write(data: point)
end

# Query with Flux language
query_api = client.create_query_api
flux_query = <<-FLUX
  from(bucket: "system_metrics")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu_usage")
    |> filter(fn: (r) => r.host == "web-01")
    |> aggregateWindow(every: 5m, fn: mean)
FLUX

result = query_api.query(query: flux_query)
# query returns a Hash of table index => FluxTable
result.each_value do |table|
  table.records.each do |record|
    puts "#{record.time}: #{record.value}"
  end
end

client.close!

The gem handles connection management, automatic retries on failures, and background flushing of buffered writes. Applications should reuse client instances rather than creating new connections for each operation.

TimescaleDB with ActiveRecord

TimescaleDB extends PostgreSQL, making it accessible through standard Ruby database libraries. The timescaledb gem provides ActiveRecord integration with hypertable management.

# Migration creating hypertable
class CreateSensorReadings < ActiveRecord::Migration[7.0]
  def up
    create_table :sensor_readings, id: false do |t|
      t.uuid :id, null: false # UUID ids, matching SecureRandom.uuid in the usage example
      t.datetime :time, null: false, precision: 6
      t.string :sensor_id, null: false
      t.float :temperature
      t.float :humidity
      t.integer :battery_level
    end
    
    execute "SELECT create_hypertable('sensor_readings', 'time');"
    execute "ALTER TABLE sensor_readings ADD PRIMARY KEY (id, time);"
    
    add_index :sensor_readings, [:sensor_id, :time]
  end
  
  def down
    drop_table :sensor_readings
  end
end

# Model with time-series queries
class SensorReading < ApplicationRecord
  self.primary_key = [:id, :time]
  
  scope :recent, -> { where('time > ?', 1.hour.ago) }
  scope :for_sensor, ->(sensor_id) { where(sensor_id: sensor_id) }
  
  def self.hourly_average(sensor_id, start_time, end_time)
    select(
      "time_bucket('1 hour', time) AS hour",
      "AVG(temperature) as avg_temp",
      "AVG(humidity) as avg_humidity"
    )
    .where(sensor_id: sensor_id)
    .where(time: start_time..end_time)
    .group("hour")
    .order("hour")
  end
  
  def self.continuous_aggregate(name, query)
    # name and query are interpolated into SQL; pass trusted values only
    connection.execute <<-SQL
      CREATE MATERIALIZED VIEW #{name}
      WITH (timescaledb.continuous) AS
      #{query}
    SQL
  end
end

# Usage
SensorReading.create!(
  id: SecureRandom.uuid,
  time: Time.current,
  sensor_id: 'temp-sensor-01',
  temperature: 22.5,
  humidity: 65.0,
  battery_level: 87
)

averages = SensorReading.hourly_average(
  'temp-sensor-01',
  24.hours.ago,
  Time.current
)

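TimescaleDB-specific functions like time_bucket integrate with ActiveRecord through raw SQL or Arel. The array-valued primary_key declaration relies on the composite primary key support added in Rails 7.1. The key is composite because a hypertable's primary key must include the partitioning column, hence (id, time).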

Prometheus Client

Applications expose metrics for Prometheus scraping using the prometheus-client gem. This approach inverts the typical client-server relationship—applications provide HTTP endpoints that Prometheus polls.

require 'prometheus/client'
require 'prometheus/middleware/exporter'

# Initialize registry
prometheus = Prometheus::Client.registry

# Define metrics as constants so controllers can reach them
HTTP_REQUESTS = prometheus.counter(
  :http_requests_total,
  docstring: 'Total HTTP requests',
  labels: [:method, :path, :status]
)

REQUEST_DURATION = prometheus.histogram(
  :http_request_duration_seconds,
  docstring: 'HTTP request duration',
  labels: [:method, :path],
  buckets: [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5]
)

# Instrument application
class ApplicationController < ActionController::Base
  around_action :track_metrics

  private

  def track_metrics
    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status = 500 # assume failure until the action completes
    begin
      yield
      status = response.status
    ensure
      duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

      labels = {
        method: request.method,
        path: request.path, # raw paths can be high-cardinality; prefer route patterns
        status: status
      }

      HTTP_REQUESTS.increment(labels: labels)
      REQUEST_DURATION.observe(duration, labels: labels.except(:status))
    end
  end
end

# Serve /metrics through the exporter middleware (for example in an initializer)
Rails.application.config.middleware.use Prometheus::Middleware::Exporter

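The prometheus-client gem provides thread-safe metric collection suitable for multi-threaded Ruby servers. Metrics accumulate in-process, and the exporter serializes them in Prometheus text format when scraped. On multi-process servers each worker process keeps its own values unless a shared data store, such as the gem's DirectFileStore, is configured.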
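
To make the scrape payload concrete, the sketch below renders one counter in the text exposition format by hand. The render_counter helper is hypothetical and covers only counters; the gem's exporter does this for every registered metric.

```ruby
# Escape backslashes, double quotes, and newlines inside label values.
def escape_label_value(value)
  value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }.gsub("\n") { '\\n' }
end

# Render one counter: HELP line, TYPE line, then one line per label set.
def render_counter(name, docstring, samples)
  lines = ["# HELP #{name} #{docstring}", "# TYPE #{name} counter"]
  samples.each do |labels, value|
    label_text = labels.map { |key, val| "#{key}=\"#{escape_label_value(val)}\"" }.join(',')
    lines << "#{name}{#{label_text}} #{value.to_f}"
  end
  lines.join("\n") + "\n"
end

payload = render_counter(
  'http_requests_total',
  'Total HTTP requests',
  { { method: 'GET', status: '200' } => 12 }
)
puts payload
```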

Background Processing for Writes

High-frequency metric collection should avoid blocking application threads. Background processing with Sidekiq or similar frameworks batches metric writes efficiently.

require 'time'

class MetricsWriter
  include Sidekiq::Worker

  sidekiq_options queue: :metrics, retry: 3

  def perform(batch)
    # INFLUXDB_BUCKET and INFLUXDB_ORG are assumed environment variables
    client = InfluxDB2::Client.new(
      ENV['INFLUXDB_URL'],
      ENV['INFLUXDB_TOKEN'],
      bucket: ENV['INFLUXDB_BUCKET'],
      org: ENV['INFLUXDB_ORG']
    )

    write_api = client.create_write_api

    points = batch.map do |metric|
      InfluxDB2::Point.new(name: metric['name'])
        .add_tag('source', metric['source'])
        .add_field('value', metric['value'])
        .time(Time.parse(metric['timestamp']),
              InfluxDB2::WritePrecision::MILLISECOND)
    end

    write_api.write(data: points)
  ensure
    # Close the connection even when the write raises
    client&.close!
  end
end

METRICS_BUFFER = Concurrent::Array.new

# Application code
class MetricsCollector
  def self.record(name, value, tags = {})
    # String keys keep the payload JSON-safe for Sidekiq
    metric = {
      'name' => name,
      'value' => value,
      'source' => tags[:source] || 'app',
      'timestamp' => Time.now.iso8601(3)
    }

    METRICS_BUFFER << metric

    if METRICS_BUFFER.size >= 100
      batch = METRICS_BUFFER.shift(100)
      MetricsWriter.perform_async(batch) unless batch.empty?
    end
  end
end
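Thread-safe data structures like Concurrent::Array (from the concurrent-ruby gem) make individual operations safe across threads. The size check and the shift are still separate calls, so two threads can race and one may enqueue a short batch; that is harmless here but worth knowing. Batching reduces network overhead and database load.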

Performance Considerations

Time-series database performance depends on write patterns, query characteristics, cardinality, and retention policies. Understanding these factors guides optimization.

Write Performance

Time-series databases achieve high write throughput through batching and sequential disk writes. Individual point writes incur significant overhead from network round-trips and transaction processing. Batching amortizes these costs across multiple points.

# Inefficient: individual writes
1000.times do |i|
  point = InfluxDB2::Point.new(name: 'temperature')
    .add_field('value', 20 + rand(10))
  write_api.write(data: point)
end
# Result: one HTTP request per point; throughput is bounded by round-trip latency

# Efficient: batched writes
points = 1000.times.map do |i|
  InfluxDB2::Point.new(name: 'temperature')
    .add_field('value', 20 + rand(10))
end
write_api.write(data: points)
# Result: one HTTP request carries all 1,000 points

Batch sizes between 1,000 and 10,000 points are a common starting range that gives good throughput without excessive memory usage. Larger batches improve write throughput but increase latency and memory consumption. Applications should tune batch sizes based on data arrival rates and acceptable latency.

Write amplification occurs when storage engines reorganize data during compaction. Frequently writing small batches creates many small storage files that require merging. Configure appropriate flush intervals to balance write latency against compaction overhead.

Query Optimization

Time-series queries perform best when restricted to specific time ranges and series. Query patterns that scan all series or long time ranges require careful optimization.

Time-range selection dramatically affects query performance. A query scanning one hour of data might execute in milliseconds, while scanning one year could take minutes. Applications should limit query ranges and page through results for large temporal spans.
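
Paging through a long span can be as simple as splitting it into fixed windows and issuing one bounded query per window. The time_windows helper below is a hypothetical sketch; the window size is an assumption to tune per workload.

```ruby
# Split [start_time, end_time) into consecutive windows of at most window_seconds.
def time_windows(start_time, end_time, window_seconds)
  windows = []
  cursor = start_time
  while cursor < end_time
    window_end = [cursor + window_seconds, end_time].min
    windows << [cursor, window_end]
    cursor = window_end
  end
  windows
end

start = Time.utc(2024, 1, 1, 0, 0, 0)
finish = Time.utc(2024, 1, 1, 2, 30, 0)
windows = time_windows(start, finish, 3600)

windows.each do |from, to|
  # Each pair would become the start/stop bounds of one query.
  puts "#{from.iso8601} -> #{to.iso8601}" if from.respond_to?(:iso8601)
end
```

The last window is shorter than the others because it is clipped to the end of the requested span.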

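# Slow: effectively unbounded time range
# (Flux rejects from() without range(), so start: 0 scans all stored data)
flux_query = <<-FLUX
  from(bucket: "metrics")
    |> range(start: 0)
    |> filter(fn: (r) => r._measurement == "cpu")
    |> mean()
FLUX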

# Fast: bounded time range
flux_query = <<-FLUX
  from(bucket: "metrics")
    |> range(start: -1h)
    |> filter(fn: (r) => r._measurement == "cpu")
    |> mean()
FLUX

Tag filtering reduces the series scanned during queries. Queries should filter on indexed tags before applying computations. The order of filters affects performance in some databases—place highly selective filters first.

Downsampled data provides faster queries for historical analysis. Rather than querying raw one-second data points over a month, query pre-aggregated hourly data. Each hourly row replaces up to 3,600 raw points, so the query reads far fewer points, usually with minimal accuracy loss for trend analysis.

Cardinality Management

High cardinality (many unique combinations of tag values) challenges time-series databases. Each unique series requires index entries and metadata storage. As a rough guide, many databases handle millions of unique series comfortably but degrade as counts reach tens of millions; the exact limit depends on the engine and available memory.

Applications should avoid unbounded cardinality sources. User IDs, session tokens, or UUIDs as tags create new series indefinitely. Instead, use bounded sets like server names, service types, or geographical regions.

# High cardinality: unbounded tag values
# Creates new series for every user and request
point = InfluxDB2::Point.new(name: 'request')
  .add_tag('user_id', user.id)          # Bad: millions of values
  .add_tag('request_id', request.uuid)   # Bad: infinite values
  .add_field('duration_ms', 45)

# Low cardinality: bounded tag values
# Reuses existing series
point = InfluxDB2::Point.new(name: 'request')
  .add_tag('endpoint', '/api/users')     # Good: limited values
  .add_tag('method', 'GET')              # Good: finite set
  .add_tag('status_code', '200')         # Good: small range
  .add_field('duration_ms', 45)
  .add_field('user_id', user.id)         # OK: in field, not tag

Fields handle high-cardinality values without index overhead. Store identifiers and unique values in fields rather than tags. Applications can filter on field values, though less efficiently than tag filtering.
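
One way to enforce this rule in application code is an allowlist of tag keys, with everything else routed to fields. The constant and helper below are hypothetical names chosen for this sketch.

```ruby
# Only keys with a bounded set of values may become tags.
ALLOWED_TAGS = %i[endpoint method status_code].freeze

# Split an attribute hash into tags (allowlisted) and fields (everything else).
def split_tags_and_fields(attributes)
  tags, fields = attributes.partition { |key, _| ALLOWED_TAGS.include?(key) }
  { tags: tags.to_h, fields: fields.to_h }
end

split = split_tags_and_fields(
  endpoint: '/api/users',
  method: 'GET',
  user_id: 42,
  request_id: 'abc-123',
  duration_ms: 45
)
```

Unbounded identifiers such as user_id and request_id end up as fields, so they never create new series.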

Memory Management

Time-series databases cache recent data in memory for fast access. Memory consumption grows with active series count and retention of recent data. Write buffers, query caches, and series indexes all consume memory.

Monitor memory metrics to detect issues. Excessive memory usage often indicates high cardinality or retention of too much hot data. Reducing retention windows or implementing more aggressive downsampling alleviates memory pressure.

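# Monitor database memory usage via the Prometheus-format /metrics endpoint
require 'net/http'

# memory_limit_bytes is a caller-supplied budget (assumption); InfluxDB does not report one
def check_influxdb_memory(base_url, memory_limit_bytes)
  body = Net::HTTP.get(URI("#{base_url}/metrics"))

  # go_memstats_sys_bytes is the Go runtime gauge for memory obtained from the OS
  line = body.lines.find { |l| l.start_with?('go_memstats_sys_bytes ') }
  return unless line

  memory_usage = line.split.last.to_f
  usage_pct = (memory_usage / memory_limit_bytes * 100).round(2)

  if usage_pct > 80
    Rails.logger.warn "InfluxDB memory high: #{usage_pct}%"
    # Consider reducing retention or increasing resources
  end
end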

Storage Optimization

Compression reduces storage requirements significantly. Time-series databases typically achieve 10-20x compression through specialized codecs. Compression ratios improve with longer retention—more data provides better compression patterns.

Storage grows linearly with write rate and retention period. A metric collected every second with one-year retention generates about 31.5 million points (86,400 × 365 = 31,536,000). Assuming 12 bytes per stored point, this requires roughly 378 MB per metric; well-compressed numeric series can need far less. Multiply by metric count to estimate storage needs.
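
The same estimate as a small calculator. The function name and parameters are assumptions for this sketch, and bytes_per_point should come from measurements of your own data.

```ruby
SECONDS_PER_DAY = 86_400

# Estimate stored bytes from collection interval, retention, and point size.
def estimated_storage_bytes(interval_seconds:, retention_days:, bytes_per_point:, metric_count: 1)
  points_per_metric = (SECONDS_PER_DAY / interval_seconds) * retention_days
  points_per_metric * bytes_per_point * metric_count
end

one_metric = estimated_storage_bytes(
  interval_seconds: 1, retention_days: 365, bytes_per_point: 12
)
fleet = estimated_storage_bytes(
  interval_seconds: 1, retention_days: 365, bytes_per_point: 12, metric_count: 500
)

puts "one metric: #{one_metric / 1_000_000} MB"
puts "500 metrics: #{fleet / 1_000_000_000} GB"
```

With the figures from the text, one metric comes to 378,432,000 bytes and 500 such metrics to about 189 GB.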

Retention policies automatically delete old data, preventing unbounded storage growth. Applications should configure retention matching their analysis needs—keeping raw data for operational time frames and aggregated data for historical analysis.

Tools & Ecosystem

The time-series database ecosystem includes specialized databases, monitoring platforms, visualization tools, and supporting libraries. Selecting appropriate tools depends on use case requirements.

Database Options

InfluxDB provides a complete time-series platform with visualization, alerting, and scheduled tasks. Version 2.x introduced the Flux query language and unified time-series storage and tasks in a single platform. InfluxDB excels at moderate cardinality workloads with strong consistency requirements. The OSS version is limited to single-node deployments; clustering requires the commercial version.

TimescaleDB extends PostgreSQL, combining time-series performance with relational capabilities. Applications can join time-series data with relational tables, use SQL for queries, and leverage existing PostgreSQL tools. TimescaleDB handles high-cardinality workloads better than InfluxDB but requires PostgreSQL administration knowledge.

Prometheus targets service monitoring and alerting. Its pull-based model and local storage suit monitoring architectures where Prometheus scrapes metrics from application endpoints. Prometheus excels at short-term metrics retention (weeks) but struggles with long-term storage. Many deployments pair Prometheus with long-term storage backends like Cortex or Thanos.

VictoriaMetrics provides a Prometheus-compatible database optimized for high cardinality and long retention. It supports both push and pull models, handles millions of active series, and achieves better compression than Prometheus. VictoriaMetrics suits large-scale monitoring deployments requiring long-term metric retention.

Ruby Libraries

Several gems simplify time-series database integration:

# influxdb-client: InfluxDB 2.x support
gem 'influxdb-client'

# timescaledb: Rails/ActiveRecord integration
gem 'timescaledb'

# prometheus-client: Application metrics
gem 'prometheus-client'

# graphite-api: Graphite protocol support
gem 'graphite-api'

The influxdb-client gem provides comprehensive InfluxDB 2.x support including Flux queries and write batching. For InfluxDB 1.x, the older influxdb gem remains available but lacks features from the 2.x API.

The timescaledb gem adds TimescaleDB-specific ActiveRecord methods and migrations. It handles hypertable creation, continuous aggregates, and compression policies through Rails migrations.

Visualization Platforms

Grafana dominates time-series visualization, supporting dozens of data sources including InfluxDB, Prometheus, TimescaleDB, and Graphite. It provides dashboarding, alerting, and data exploration. Ruby applications typically write metrics to time-series databases that Grafana queries directly.

# Configure Grafana dashboard via API
require 'net/http'
require 'json'

def create_grafana_dashboard(title, panels)
  uri = URI("#{ENV['GRAFANA_URL']}/api/dashboards/db")
  request = Net::HTTP::Post.new(uri)
  request['Authorization'] = "Bearer #{ENV['GRAFANA_API_KEY']}"
  request['Content-Type'] = 'application/json'
  
  dashboard = {
    dashboard: {
      title: title,
      panels: panels,
      schemaVersion: 16,
      version: 0
    },
    overwrite: false
  }
  
  request.body = dashboard.to_json
  response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  
  JSON.parse(response.body)
end

Kibana and Elastic stack provide alternative visualization for time-series data stored in Elasticsearch. This approach suits applications already using Elasticsearch for logging and search.

Data Collection Agents

Telegraf collects system and application metrics and writes them to InfluxDB, Prometheus, or other outputs. It provides input plugins for monitoring CPU, memory, disk, network, and various services. Ruby applications either push metrics to Telegraf's StatsD listener or expose HTTP endpoints that Telegraf polls.

# Output metrics in StatsD format for Telegraf
# (backend API from statsd-instrument 2.x; tags need the :datadog wire format)
require 'statsd-instrument'

StatsD.backend = StatsD::Instrument::Backends::UDPBackend.new(
  'localhost:8125', :datadog
)

class ApplicationController < ActionController::Base
  around_action :track_request
  
  private
  
  def track_request
    start = Time.now
    begin
      yield
    ensure
      duration = (Time.now - start) * 1000
      StatsD.histogram('request.duration', duration, 
        tags: ["endpoint:#{params[:controller]}.#{params[:action]}"])
      StatsD.increment('request.count',
        tags: ["status:#{response.status}"])
    end
  end
end

Prometheus exporters expose metrics from systems not directly instrumented. Exporters exist for databases, message queues, cloud services, and hardware monitoring. Ruby applications can create custom exporters using the prometheus-client gem.

Design Considerations

Selecting and implementing time-series databases requires evaluating trade-offs around consistency, availability, query capabilities, and operational complexity.

Database Selection Criteria

Write and query patterns determine database suitability. Applications with high write volumes (>10,000 points/second) benefit from databases optimized for ingest like InfluxDB or VictoriaMetrics. Applications prioritizing query flexibility might choose TimescaleDB for SQL support.

Cardinality expectations affect database choice. InfluxDB handles moderate cardinality (millions of series) efficiently but degrades with very high cardinality. VictoriaMetrics and M3DB are designed to handle substantially higher cardinality.

Retention requirements influence storage architecture. Short-term retention (days to weeks) suits in-memory databases or local storage engines. Long-term retention (months to years) requires distributed storage with efficient compression and tiered storage options.

# Decision matrix implementation (thresholds are illustrative, not benchmarks)
class DatabaseSelector
  DATABASES = {
    influxdb: {
      max_write_rate: 100_000,
      max_cardinality: 10_000_000,
      sql_support: false,
      clustering: :commercial,
      ideal_retention: '90d'
    },
    timescaledb: {
      max_write_rate: 50_000,
      max_cardinality: 100_000_000,
      sql_support: true,
      clustering: :native,
      ideal_retention: '1y'
    },
    victoriametrics: {
      max_write_rate: 1_000_000,
      max_cardinality: 1_000_000_000,
      sql_support: false,
      clustering: :native,
      ideal_retention: '1y'
    }
  }
  
  def self.recommend(requirements)
    DATABASES.select do |name, capabilities|
      capabilities[:max_write_rate] >= requirements[:write_rate] &&
      capabilities[:max_cardinality] >= requirements[:cardinality]
    end.keys
  end
end

# Usage
requirements = {
  write_rate: 75_000,
  cardinality: 5_000_000,
  retention: '180d'
}

suitable = DatabaseSelector.recommend(requirements)
# => [:influxdb, :victoriametrics]
# (timescaledb is excluded: its 50_000 max_write_rate is below the required 75_000)

Consistency vs Availability Trade-offs

Time-series workloads typically tolerate eventual consistency. Monitoring data arriving a few seconds late rarely affects analysis. This tolerance enables architectures prioritizing availability—accepting writes even during network partitions or node failures.

Some time-series databases sacrifice consistency for availability. Writes succeed immediately without waiting for replication. Queries might return slightly different results from different nodes during network partitions. This model suits monitoring and metrics collection where missing a few data points matters less than maintaining write availability.

Applications requiring strong consistency should choose databases providing tunable consistency levels. TimescaleDB inherits PostgreSQL's consistency model, ensuring writes persist before acknowledging. InfluxDB Enterprise allows consistency level configuration per write.

Push vs Pull Models

Time-series data collection follows push or pull patterns. Push models have applications send metrics to a central database. Pull models have the database scrape metrics from application endpoints.

Push models suit distributed applications or environments with dynamic scaling. Applications know when they generate metrics and push immediately without waiting for scraping. Network firewalls rarely block outbound pushes. The downside: applications need database connection management and retry logic.

Pull models centralize configuration in the metrics system. Prometheus scrapes configured targets at regular intervals, controlling collection frequency and handling service discovery. Applications expose simple HTTP endpoints without managing connections. The downside: the scraper needs network access to every application.

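# Push model: application sends metrics
require 'influxdb-client'

class MetricsPusher
  def initialize
    # The write needs a destination bucket and org. The INFLUXDB_BUCKET and
    # INFLUXDB_ORG variable names are assumptions for this example.
    @client = InfluxDB2::Client.new(ENV['INFLUXDB_URL'], ENV['TOKEN'],
                                    bucket: ENV['INFLUXDB_BUCKET'],
                                    org: ENV['INFLUXDB_ORG'])
    @write_api = @client.create_write_api
  end
  
  def record(name, value, tags = {})
    point = InfluxDB2::Point.new(name: name)
    # Tags are stored as strings, so convert keys and values explicitly
    tags.each { |k, v| point.add_tag(k.to_s, v.to_s) }
    point.add_field('value', value)
    
    @write_api.write(data: point)
  end
end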

# Pull model: Prometheus scrapes endpoint
require 'prometheus/client'

class MetricsExporter
  def initialize
    @registry = Prometheus::Client.registry
    # Registering the same metric name twice raises an error,
    # so create one exporter per process.
    @counter = @registry.counter(:requests_total, 
      docstring: 'Total requests',
      labels: [:status])
  end
  
  def record(status)
    @counter.increment(labels: { status: status })
    # Prometheus will scrape /metrics endpoint periodically
  end
end
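The retry logic that push-model applications need can be sketched as a small wrapper with exponential backoff. The helper name `with_retries` and its parameters are hypothetical, and the `sleeper` argument exists only so the delay can be swapped out in tests.

```ruby
# Hypothetical wrapper: runs the block and retries on failure, doubling
# the delay after each failed attempt. Re-raises once attempts run out.
def with_retries(max_attempts: 3, base_delay: 0.5, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue StandardError
    raise if attempt >= max_attempts
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage with the pusher above (assumes a MetricsPusher instance named pusher):
# with_retries { pusher.record('requests', 1, status: '200') }
```

A production version would typically rescue only network-related errors and cap the total delay, so that a failing metrics backend cannot stall the application.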

Schema Design Patterns

Time-series schema design focuses on tag selection and field organization. Tags should have bounded cardinality and identify dimensions for filtering and grouping. Fields contain measurements and high-cardinality identifiers.

Wide schemas store many measurements in a single series, reducing series count but potentially wasting storage for sparse data. Narrow schemas split measurements into separate series, increasing series count but storing only present values.

# Wide schema: multiple fields per series
# Advantage: fewer series, simpler queries
# Disadvantage: sparse data wastes storage
point = InfluxDB2::Point.new(name: 'server_metrics')
  .add_tag('host', 'web-01')
  .add_field('cpu_percent', 45.2)
  .add_field('memory_mb', 2048)
  .add_field('disk_gb', 125)
  .add_field('network_mbps', 15.3)

# Narrow schema: one field per series
# Advantage: efficient storage for sparse data
# Disadvantage: more series, complex queries
values = { 'cpu_percent' => 45.2, 'memory_mb' => 2048,
           'disk_gb' => 125, 'network_mbps' => 15.3 }

# Build one point per metric and keep them for writing
points = values.map do |metric, value|
  InfluxDB2::Point.new(name: metric)
    .add_tag('host', 'web-01')
    .add_field('value', value)
end

Applications should avoid storing metadata that changes frequently as tags. Server IP addresses might seem like good tags but create new series when servers change IPs. Store such identifiers in fields or separate metadata systems.
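The effect of an unbounded tag on series count can be estimated by multiplying the number of distinct values per tag. The sketch below is a worst-case upper bound (real data rarely contains every combination), and both the helper name `estimated_series` and the counts are made up for illustration.

```ruby
# Hypothetical estimate: the worst-case series count is the product of
# the number of distinct values of each tag.
def estimated_series(tag_cardinalities)
  tag_cardinalities.values.reduce(1) { |product, count| product * count }
end

stable_tags = { host: 200, region: 5 }
with_ip_tag = stable_tags.merge(ip_address: 10_000)

estimated_series(stable_tags) # 1_000 series at most
estimated_series(with_ip_tag) # 10_000_000 series at most
```

Moving the IP address from a tag to a field keeps the series count at the lower figure while still recording the value.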

Reference

Database Comparison

Database	Write Model	Query Language	Clustering	Best For
InfluxDB	Push	Flux, InfluxQL	Commercial	Moderate cardinality, strong consistency
TimescaleDB	Push	SQL	Native	High cardinality, relational features
Prometheus	Pull	PromQL	Federation	Service monitoring, short retention
VictoriaMetrics	Push/Pull	PromQL, MetricsQL	Native	High cardinality, long retention
Graphite	Push	Functions	Carbon-relay	Legacy systems, simple metrics
M3DB	Push	PromQL	Native	Extreme scale, distributed

Ruby Client Libraries

Gem	Database	Features	Use Case
influxdb-client	InfluxDB 2.x	Batching, Flux queries, async	Modern InfluxDB deployments
influxdb	InfluxDB 1.x	Basic writes, InfluxQL	Legacy InfluxDB systems
timescaledb	TimescaleDB	Hypertables, continuous aggregates	PostgreSQL-based systems
prometheus-client	Prometheus	Metrics exposure, types	Application instrumentation
graphite-api	Graphite	Metric formatting, sending	Graphite integration

Write Performance Factors

Factor	Impact	Optimization Strategy
Batch size	10-100x throughput	Batch 1000-10000 points per write
Point size	Memory and network	Minimize tag count, avoid large field values
Cardinality	Index overhead	Use bounded tag sets, limit series count
Timestamp precision	Storage size	Use milliseconds unless microseconds needed
Compression	Disk I/O	Enable native compression, tune levels
Retention	Write amplification	Configure appropriate downsampling policies
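The batching strategy from the table can be sketched as a small buffer that collects points and hands them to a writer in groups. The `BatchBuffer` class is hypothetical and not part of any client gem; it is also not thread-safe as written.

```ruby
# Hypothetical buffer: collects points and passes them to the given block
# once batch_size points have accumulated.
class BatchBuffer
  def initialize(batch_size: 5_000, &flush)
    @batch_size = batch_size
    @flush = flush
    @points = []
  end

  def add(point)
    @points << point
    flush! if @points.size >= @batch_size
  end

  # Sends whatever is buffered; call this on shutdown so no points are lost.
  def flush!
    return if @points.empty?

    batch = @points
    @points = []
    @flush.call(batch)
  end
end
```

With the InfluxDB client used elsewhere in this article, the block could call `write_api.write(data: batch)`, assuming the write API accepts an array of points. A timer-based flush is usually added as well so that a slow trickle of points does not sit in memory indefinitely.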

Query Optimization Techniques

Technique	Benefit	Implementation
Time range limits	Reduce scan size	Always specify start/end times
Tag filtering	Series reduction	Filter on indexed tags first
Downsampled queries	10-100x faster	Query aggregated data for historical analysis
Result limits	Memory control	Limit result rows, paginate large sets
Cached queries	Sub-second response	Cache results for repeated queries
Continuous aggregates	Real-time performance	Pre-compute common aggregations

Common Tag Design Patterns

Pattern	Tags	Fields	Cardinality
Infrastructure	host, region, cluster	cpu, memory, disk	Low (hundreds)
Application	service, endpoint, method	duration_ms, count	Medium (thousands)
IoT sensors	device_type, location	temperature, humidity	High (millions)
Financial	symbol, exchange, order_type	price, volume	Very high (billions)

Retention Policy Configuration

Retention	Use Case	Storage Impact	Query Performance
Raw: 7 days	Recent operational analysis	High write rate, full resolution	Fast for recent queries
Hourly: 90 days	Short-term trends	1/3600 of raw data (at 1-second raw resolution)	Good for hourly analysis
Daily: 1 year	Historical reporting	1/86400 of raw data (at 1-second raw resolution)	Excellent for long-term trends
Monthly: indefinite	Long-term archives	1/2592000 of raw data (at 1-second raw resolution, 30-day month)	Sufficient for annual reports
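The storage fractions in this table follow from dividing the raw sampling interval by the rollup interval. A minimal sketch, assuming one aggregated point per rollup bucket per series and a 30-day month; the helper name `downsample_ratio` is hypothetical.

```ruby
# Hypothetical helper: fraction of raw points that remain after rolling
# data up into buckets of the given size.
def downsample_ratio(raw_interval_seconds, bucket)
  bucket_seconds = { hourly: 3_600, daily: 86_400, monthly: 30 * 86_400 }
  Rational(raw_interval_seconds, bucket_seconds.fetch(bucket))
end

downsample_ratio(1, :hourly)   # (1/3600): one point per second rolled up hourly
downsample_ratio(10, :hourly)  # (1/360): a 10-second raw interval saves less
downsample_ratio(1, :monthly)  # (1/2592000)
```

The savings shrink as the raw interval grows, so the ratios in the table apply only to data collected once per second.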

Flux Query Language Basics

Operation	Description	Example
from	Specify data source	from(bucket: "metrics")
range	Time window	range(start: -1h)
filter	Series selection	filter(fn: (r) => r.host == "web-01")
aggregateWindow	Time-based grouping	aggregateWindow(every: 5m, fn: mean)
group	Group by tags	group(columns: ["host"])
map	Transform values	map(fn: (r) => ({r with scaled: r._value * 100}))
join	Combine streams	join(tables: {a: stream1, b: stream2}, on: ["host"])
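In practice these operations are chained with Flux's pipe-forward operator (`|>`). The sketch below builds such a pipeline as a string in Ruby; the helper name `flux_mean_query` and its defaults are assumptions for illustration, and the bucket and host values come from the table examples.

```ruby
# Hypothetical helper: composes a Flux pipeline that bounds the time range
# first, then filters by tag, then aggregates into fixed windows.
# Values are interpolated directly, so pass only trusted input.
def flux_mean_query(bucket:, host:, start: '-1h', window: '5m')
  <<~FLUX
    from(bucket: "#{bucket}")
      |> range(start: #{start})
      |> filter(fn: (r) => r.host == "#{host}")
      |> aggregateWindow(every: #{window}, fn: mean)
  FLUX
end

puts flux_mean_query(bucket: 'metrics', host: 'web-01')
```

The resulting string could then be passed to the client's query API. Placing `range` and `filter` before `aggregateWindow` keeps the aggregation working on the smallest possible set of rows.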

TimescaleDB Functions

Function	Purpose	Example
time_bucket	Time-based grouping	time_bucket('5 minutes', time)
first	First value in group	first(temperature, time)
last	Last value in group	last(temperature, time)
locf	Last observation carried forward (used with time_bucket_gapfill)	locf(avg(reading))
interpolate	Linear interpolation (used with time_bucket_gapfill)	interpolate(avg(reading))
time_bucket_gapfill	Fill missing time buckets	time_bucket_gapfill('1 hour', time)
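The two gap-filling strategies differ in what they put into empty buckets. The plain-Ruby sketch below illustrates the semantics only, with `nil` standing in for a bucket that has no data; it is not how TimescaleDB implements these functions, and both method names are reused here purely for illustration.

```ruby
# Carries the most recent known value forward into empty buckets.
# Leading gaps stay nil because there is nothing to carry forward yet.
def locf(values)
  last = nil
  values.map { |v| v.nil? ? last : (last = v) }
end

# Fills each empty bucket on the straight line between its nearest known
# neighbours. Gaps at either end stay nil.
def interpolate(values)
  values.each_with_index.map do |v, i|
    next v unless v.nil?

    lo = (i - 1).downto(0).find { |j| !values[j].nil? }
    hi = ((i + 1)...values.size).find { |j| !values[j].nil? }
    next nil if lo.nil? || hi.nil?

    values[lo] + (values[hi] - values[lo]) * (i - lo) / (hi - lo).to_f
  end
end

readings = [10, nil, nil, 40]
locf(readings)         # [10, 10, 10, 40]
interpolate(readings)  # [10, 20.0, 30.0, 40]
```

Carrying values forward suits state-like measurements such as a thermostat setpoint, while interpolation suits quantities that change smoothly between samples.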

Monitoring Metrics

Metric	Normal Range	Action Threshold
Write throughput	Varies by hardware	>80% of rated capacity
Query latency p95	<100ms for recent data	>1 second
Memory usage	50-70%	>85%
Series cardinality	Depends on database	Check database limits
Disk usage growth	Linear with retention	>90% capacity
Query queue depth	0-10	>100 queued queries
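Several of these thresholds can be expressed as a simple check over a sample of collected values. This is a sketch: the method name `triggered_alerts` and the sample keys are hypothetical, and only the fixed numeric thresholds from the table (query latency, memory, disk, and queue depth) are covered.

```ruby
# Hypothetical check: returns the names of metrics in the sample that
# exceed the action thresholds listed in the table.
def triggered_alerts(sample)
  checks = {
    query_latency_p95_ms: ->(v) { v > 1_000 }, # more than 1 second
    memory_percent:       ->(v) { v > 85 },
    disk_percent:         ->(v) { v > 90 },
    query_queue_depth:    ->(v) { v > 100 }
  }
  checks.select { |name, exceeded| sample.key?(name) && exceeded.call(sample[name]) }.keys
end

sample = { query_latency_p95_ms: 1_500, memory_percent: 60, disk_percent: 95 }
triggered_alerts(sample) # [:query_latency_p95_ms, :disk_percent]
```

Metrics missing from the sample are skipped rather than treated as healthy or failing, which keeps the check usable when only some values are collected.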
