CrackedRuby

GC Configuration

Comprehensive guide to configuring Ruby's garbage collector for optimal memory management and performance tuning.

Overview

Ruby's garbage collector automatically manages memory allocation and deallocation, reclaiming objects no longer referenced by the program. Ruby 3.4 provides extensive configuration options through environment variables, runtime methods, and tuning parameters that control GC behavior, frequency, and algorithms.

Ruby's collector is generational, built on the observation that most objects die young. Objects are either young or old: a new object starts young and is promoted to the old generation once it survives three collections. Minor collections mark only young objects and run frequently, while major collections mark the whole heap and run less often. This focuses collection effort on the objects most likely to be garbage.
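
The split between minor and major collections can be observed directly. The following is a minimal sketch, assuming CRuby, where GC.stat exposes :minor_gc_count and :major_gc_count; the helper name gc_counts_during is hypothetical. A request for a minor collection may still be upgraded to a major one by the runtime, so the printed counts can vary.

```ruby
# Count how many minor and major collections a block triggers.
# gc_counts_during is a hypothetical helper name used for illustration.
def gc_counts_during
  before = GC.stat
  yield
  after = GC.stat
  {
    minor: after[:minor_gc_count] - before[:minor_gc_count],
    major: after[:major_gc_count] - before[:major_gc_count]
  }
end

minor_run = gc_counts_during { GC.start(full_mark: false) }  # request a minor GC
major_run = gc_counts_during { GC.start }                    # full_mark defaults to true

puts "Minor request: #{minor_run}"
puts "Major request: #{major_run}"
```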

Ruby 3.4 exposes three tuning interfaces: environment variables read once at process startup, runtime configuration through GC module methods, and monitoring through GC.stat. The default collector combines generational collection with incremental mark-and-sweep and optional heap compaction.

# Check current GC configuration
puts GC.stat
# => {:count=>23, :heap_allocated_pages=>156, :heap_sorted_length=>156, ...}

# Get specific GC parameters
puts GC.stat(:count)
# => 23

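Key configuration areas include heap sizing, collection thresholds, growth factors, and generation management. Environment variables such as RUBY_GC_HEAP_GROWTH_FACTOR and RUBY_GC_MALLOC_LIMIT control allocation behavior from startup, while runtime methods provide dynamic adjustment. Note that RUBY_GC_HEAP_INIT_SLOTS, common in older tuning guides, is deprecated and has had no effect since Ruby 3.3; the per-size-pool variables RUBY_GC_HEAP_0_INIT_SLOTS through RUBY_GC_HEAP_4_INIT_SLOTS replace it.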

# Runtime GC configuration
GC.stress = true  # Force GC on every allocation (debug only)
GC.auto_compact = true  # Enable automatic heap compaction
GC.compact  # Manual heap compaction

The collector maintains detailed statistics accessible through GC.stat, including allocation counts, collection frequencies, heap page usage, and timing measurements. These metrics enable precise performance analysis and configuration optimization.
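
For the timing measurements mentioned above, a minimal sketch follows. It assumes Ruby 3.1 or later, where GC.total_time reports cumulative GC time in nanoseconds; the helper name gc_time_ms and the workload are illustrative only.

```ruby
# Return the milliseconds spent in GC while the block runs.
# GC.total_time reports nanoseconds and requires Ruby 3.1 or later.
# gc_time_ms is a hypothetical helper name.
def gc_time_ms
  before = GC.total_time
  yield
  (GC.total_time - before) / 1_000_000.0
end

elapsed = gc_time_ms do
  200_000.times { |i| "string_#{i}" }  # short-lived allocations
  GC.start                             # make sure at least one cycle runs
end

puts "Time in GC: #{elapsed.round(2)} ms"
```

Measuring a delta rather than reading the absolute value keeps the figure specific to the code under test.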

Basic Usage

GC configuration typically occurs through environment variables set before Ruby process startup. These variables control fundamental parameters like initial heap size, growth factors, and collection thresholds that determine collector behavior throughout program execution.

The most commonly configured parameters control heap initialization and growth. RUBY_GC_HEAP_0_INIT_SLOTS through RUBY_GC_HEAP_4_INIT_SLOTS set the initial number of object slots in each size pool (they replace RUBY_GC_HEAP_INIT_SLOTS, which is a no-op since Ruby 3.3), while RUBY_GC_HEAP_GROWTH_FACTOR determines how aggressively the heap expands when more memory is needed.

# Environment variable configuration
export RUBY_GC_HEAP_0_INIT_SLOTS=10000  # smallest size pool; pools 1-4 have their own variables
export RUBY_GC_HEAP_GROWTH_FACTOR=1.8
export RUBY_GC_HEAP_GROWTH_MAX_SLOTS=5000
ruby application.rb

Memory allocation limits control when garbage collection triggers. RUBY_GC_MALLOC_LIMIT sets the threshold for malloc-ed memory that triggers collection, while RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR controls how this limit increases after each collection cycle.

# Memory limit configuration
export RUBY_GC_MALLOC_LIMIT=16777216  # 16MB initial limit
export RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR=2.0
export RUBY_GC_MALLOC_LIMIT_MAX=33554432  # 32MB maximum limit
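
Because these variables are read when the interpreter starts, one way to confirm a setting is to launch a child Ruby process with the variable set and ask it for its current malloc threshold. This is a sketch under that assumption; child_malloc_limit is a hypothetical helper, and the reported value may differ from the configured one if collections have already adjusted the limit.

```ruby
require 'open3'
require 'rbconfig'

# Run a child interpreter with the given GC environment and return the
# malloc limit it reports. child_malloc_limit is a hypothetical helper.
def child_malloc_limit(env)
  script = 'puts GC.stat(:malloc_increase_bytes_limit)'
  output, status = Open3.capture2(env, RbConfig.ruby, '--disable-gems', '-e', script)
  raise "child Ruby exited with status #{status.exitstatus}" unless status.success?

  Integer(output.strip)
end

limit = child_malloc_limit('RUBY_GC_MALLOC_LIMIT' => (32 * 1024 * 1024).to_s)
puts "Child malloc limit: #{limit} bytes"
```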

Generation-specific parameters control when major collections run. RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR sets the trigger: a major collection starts once the number of old objects exceeds this factor times the old-object count recorded just after the previous major collection. The value therefore shifts the balance between minor and major collection frequency.
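
The relationship between the old-object count and its limit can be inspected at runtime. A minimal sketch, assuming CRuby's :old_objects and :old_objects_limit statistics; the ratio printed is only an approximation of the configured factor.

```ruby
GC.start  # a major collection recalculates the old-object limit

stats = GC.stat
old_objects = stats[:old_objects]        # objects currently in the old generation
old_limit   = stats[:old_objects_limit]  # count that triggers the next major GC

puts "Old objects:      #{old_objects}"
puts "Old-object limit: #{old_limit}"
puts "Effective factor: #{(old_limit.to_f / old_objects).round(2)}" if old_objects > 0
```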

# Inspect heap state at runtime
GC.stat(:heap_allocated_pages)  # Current heap page count
GC.stat(:heap_live_slots)       # Active object slots
GC.stat(:heap_free_slots)       # Available object slots

# Trigger collection manually
GC.start  # Full garbage collection cycle
GC.start(full_mark: false)  # Minor collection only

Monitoring GC activity during development helps identify appropriate configuration values. The GC.stat method returns comprehensive statistics that reveal collection patterns, memory usage trends, and performance characteristics specific to your application's allocation behavior.

# Basic GC monitoring
before_stats = GC.stat
# ... application code that allocates objects ...
after_stats = GC.stat

collections = after_stats[:count] - before_stats[:count]
puts "GC cycles: #{collections}"
puts "Heap pages: #{after_stats[:heap_allocated_pages]}"

Performance & Memory

GC performance directly affects application throughput and latency characteristics. Proper configuration reduces collection frequency, minimizes pause times, and optimizes memory utilization patterns. Ruby 3.4 provides extensive tuning parameters for different application profiles and deployment scenarios.

Heap sizing parameters significantly impact performance by controlling collection frequency and memory overhead. Applications with high allocation rates benefit from larger initial heap sizes that reduce early collection pressure, while long-running processes require careful growth factor tuning to prevent excessive memory consumption.

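# Performance measurement during GC tuning
require 'benchmark'
require 'rbconfig'

# Test different heap configurations
configurations = [
  { init_slots: 10000, growth_factor: 1.5 },
  { init_slots: 50000, growth_factor: 1.8 },
  { init_slots: 100000, growth_factor: 2.0 }
]

# Allocation-heavy workload, run inside each child process
workload = <<~'RUBY'
  10000.times { Array.new(1000) { |i| "string_#{i}" } }
  puts "GC count: #{GC.stat(:count)}"
RUBY

configurations.each do |config|
  # GC variables are read once at startup, so changing ENV in a running
  # process has no effect; start a fresh interpreter per configuration.
  # RUBY_GC_HEAP_0_INIT_SLOTS replaces RUBY_GC_HEAP_INIT_SLOTS (no-op since Ruby 3.3).
  env = {
    'RUBY_GC_HEAP_0_INIT_SLOTS' => config[:init_slots].to_s,
    'RUBY_GC_HEAP_GROWTH_FACTOR' => config[:growth_factor].to_s
  }

  time = Benchmark.measure { system(env, RbConfig.ruby, '-e', workload) }

  puts "Config #{config}: #{time.real.round(2)}s (includes interpreter startup)"
end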

Memory allocation patterns affect GC performance through fragmentation and collection efficiency. Applications creating many short-lived objects benefit from aggressive minor collection settings, while applications with long-lived data structures require optimized major collection parameters.

Malloc limit configuration controls when external memory pressure triggers collection cycles. Applications using C extensions or large string operations need higher malloc limits to prevent excessive collection frequency, but limits too high can delay necessary cleanup cycles.

# High-throughput application configuration
export RUBY_GC_HEAP_0_INIT_SLOTS=100000  # replaces RUBY_GC_HEAP_INIT_SLOTS (no-op since Ruby 3.3)
export RUBY_GC_HEAP_GROWTH_FACTOR=1.6
export RUBY_GC_MALLOC_LIMIT=33554432  # 32MB
export RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR=1.8
export RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=10

Generational collection parameters balance minor and major collection cycles. Higher RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR values delay expensive major collections but allow more garbage accumulation in older generations. Lower values increase major collection frequency but ensure more thorough cleanup.

# Memory usage analysis
def analyze_memory_usage
  GC.start  # Clean slate
  initial_memory = `ps -o rss= -p #{Process.pid}`.to_i
  
  # Allocation workload
  data = []
  10000.times do |i|
    data << { id: i, content: "data_#{i}" * 100 }
  end
  
  peak_memory = `ps -o rss= -p #{Process.pid}`.to_i
  gc_stats = GC.stat
  
  puts "Memory growth: #{peak_memory - initial_memory} KB"
  puts "GC collections: #{gc_stats[:count]}"
  puts "Heap pages: #{gc_stats[:heap_allocated_pages]}"
  puts "Live objects: #{gc_stats[:heap_live_slots]}"
end

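Heap compaction reduces memory fragmentation and can improve cache locality. Automatic compaction is off by default; once enabled with GC.auto_compact = true, it runs during major collection cycles. Manual compaction through GC.compact gives control over timing and frequency for applications with specific performance requirements. Compaction is not available on every platform, and GC.compact raises NotImplementedError where it is unsupported.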

# Compaction performance analysis
require 'benchmark'

def measure_compaction_impact
  # Create fragmented heap state
  objects = 100000.times.map { |i| "string_#{i}" }
  objects.each_with_index { |obj, i| objects[i] = nil if i.even? }
  
  GC.start  # Remove deallocated objects
  
  before_compact = GC.stat
  compact_time = Benchmark.measure { GC.compact }
  after_compact = GC.stat
  
  puts "Compaction time: #{compact_time.real}s"
  puts "Pages before: #{before_compact[:heap_allocated_pages]}"
  puts "Pages after: #{after_compact[:heap_allocated_pages]}"
  puts "Freed pages: #{before_compact[:heap_allocated_pages] - after_compact[:heap_allocated_pages]}"
end
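
Since compaction depends on platform support, a guarded call avoids crashing on interpreters that lack it. The following is a minimal sketch; compact_if_supported is a hypothetical helper name, and it assumes that unsupported builds either do not respond to GC.compact or raise NotImplementedError.

```ruby
# Compact the heap only where the platform supports it.
# Returns true when compaction ran, false when it is unavailable.
# compact_if_supported is a hypothetical helper name.
def compact_if_supported
  return false unless GC.respond_to?(:compact)

  GC.compact
  true
rescue NotImplementedError
  false
end

compaction_ran = compact_if_supported
puts "Compaction ran: #{compaction_ran}"
```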

Production Patterns

Production GC configuration requires balancing memory efficiency, response time consistency, and resource utilization across varying load patterns. Successful deployments monitor GC metrics continuously and adjust parameters based on observed application behavior under real traffic conditions.

Application servers benefit from optimized heap sizing that accommodates request processing patterns without frequent collection cycles. Web applications typically show allocation spikes during request processing followed by cleanup opportunities between requests, suggesting configuration approaches that defer collection timing strategically.

# Production GC monitoring middleware
class GCMonitoringMiddleware
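  def initialize(app)
    @app = app
  end

  def call(env)
    gc_before = GC.stat
    # A monotonic clock is unaffected by system clock adjustments
    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)

    response = @app.call(env)

    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time
    gc_after = GC.stat

    track_gc_metrics(gc_before, gc_after, elapsed)
    response
  end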
  
  private
  
  def track_gc_metrics(before, after, request_time)
    collections = after[:count] - before[:count]
    if collections > 0
      puts "GC during request: #{collections} cycles, #{request_time}s total"
      puts "Heap growth: #{after[:heap_allocated_pages] - before[:heap_allocated_pages]} pages"
    end
  end
end

Container environments require careful memory limit alignment with GC configuration. Applications running in containers with memory restrictions need heap sizing that prevents out-of-memory conditions while maintaining collection efficiency within available resources.

# Container-optimized configuration
# For 512MB container limit
export RUBY_GC_HEAP_0_INIT_SLOTS=40000  # replaces RUBY_GC_HEAP_INIT_SLOTS (no-op since Ruby 3.3)
export RUBY_GC_HEAP_GROWTH_FACTOR=1.4
export RUBY_GC_HEAP_GROWTH_MAX_SLOTS=8000
export RUBY_GC_MALLOC_LIMIT=67108864  # 64MB
export RUBY_GC_MALLOC_LIMIT_MAX=134217728  # 128MB
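
To see how these numbers relate to the container limit, a rough back-of-the-envelope calculation helps. The sketch below assumes a 40-byte slot, which matches the smallest slot size on typical 64-bit CRuby builds but is an assumption here; estimated_slot_bytes is a hypothetical helper, and memory allocated outside object slots is not counted.

```ruby
# Estimate the bytes reserved by a number of object slots.
# slot_size: 40 is an assumption (smallest slot on typical 64-bit builds).
# estimated_slot_bytes is a hypothetical helper name.
def estimated_slot_bytes(slots, slot_size: 40)
  slots * slot_size
end

container_bytes = 512 * 1024 * 1024  # 512MB container limit
malloc_max      = 134_217_728        # RUBY_GC_MALLOC_LIMIT_MAX from the example
init_bytes      = estimated_slot_bytes(40_000)

puts "Initial slots: #{init_bytes} bytes"
puts "Share of container: #{(init_bytes * 100.0 / container_bytes).round(2)}%"
puts "Malloc limit max as share of container: #{(malloc_max * 100.0 / container_bytes).round(1)}%"
```

Under these assumptions the initial slots occupy a small fraction of the container, so the malloc limits and growth settings tend to matter more for staying under the memory ceiling.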

Background job processors often handle large data volumes with different allocation patterns than request-response applications. These systems benefit from higher malloc limits and delayed major collection cycles that accommodate batch processing workloads without interrupting job execution.

# Background job GC optimization
class OptimizedJobProcessor
  def self.configure_gc
    # Reduce collection frequency during job processing.
    # With GC disabled, memory grows until the block finishes, so use this
    # only for jobs with a bounded allocation volume.
    GC.disable

    yield
  ensure
    # Cleanup after job completion, even if the job raised
    GC.enable
    GC.start(full_mark: true)
    GC.compact if should_compact?
  end

  def self.should_compact?
    stats = GC.stat
    # Share of heap slots that are currently free
    free_ratio = stats[:heap_free_slots].to_f / stats[:heap_available_slots]
    free_ratio > 0.3  # Compact if more than 30% of slots are free
  end
  
  def process_job(job)
    self.class.configure_gc do
      # Heavy allocation during job processing
      job.execute
    end
  end
end

Load balancer configurations must account for GC pause times affecting request routing decisions. Applications with aggressive GC settings may experience periodic response time spikes during collection cycles, requiring load balancer timeout adjustments and health check modifications.

Database-intensive applications often exhibit allocation patterns dominated by result set processing and ORM object creation. These applications benefit from generational tuning that rapidly collects temporary query objects while preserving long-lived connection and configuration data.

# Database application GC telemetry
class DatabaseGCProfiler
  def self.profile_query_execution
    # Snapshot GC statistics and resident memory before the queries run
    gc_before = GC.stat
    memory_before = `ps -o rss= -p #{Process.pid}`.to_i
    
    yield  # Database operations
    
    gc_after = GC.stat
    memory_after = `ps -o rss= -p #{Process.pid}`.to_i
    
    report_gc_impact(gc_before, gc_after, memory_before, memory_after)
  end
  
  def self.report_gc_impact(gc_before, gc_after, mem_before, mem_after)
    collections = gc_after[:count] - gc_before[:count]
    memory_growth = mem_after - mem_before
    
    puts "Query batch results:"
    puts "  GC cycles: #{collections}"
    puts "  Memory delta: #{memory_growth} KB"
    puts "  Objects created: #{gc_after[:total_allocated_objects] - gc_before[:total_allocated_objects]}"
    puts "  Heap pages: #{gc_after[:heap_allocated_pages]} (#{gc_after[:heap_allocated_pages] - gc_before[:heap_allocated_pages]} new)"
  end
end

Common Pitfalls

GC misconfiguration frequently causes performance degradation that manifests as inconsistent response times, memory bloat, or excessive CPU usage. Understanding common configuration mistakes prevents deployment issues and guides troubleshooting approaches when applications exhibit unexpected GC-related behavior.

Over-aggressive heap growth factors cause memory consumption to increase rapidly beyond application requirements. Applications configured with growth factors above 2.0 often experience memory pressure in production environments, particularly in containerized deployments where memory limits terminate processes unexpectedly.

# Problematic configuration detection
def check_heap_growth_config
  growth_factor = ENV['RUBY_GC_HEAP_GROWTH_FACTOR'].to_f
  max_slots = ENV['RUBY_GC_HEAP_GROWTH_MAX_SLOTS'].to_i
  
  if growth_factor > 2.0
    warn "Heap growth factor #{growth_factor} may cause memory bloat"
  end
  
  if max_slots == 0
    warn "No per-step growth cap set - each expansion can add the full growth-factor increment"
  end
  
  # Monitor actual growth during runtime
  initial_pages = GC.stat[:heap_allocated_pages]
  sleep(30)  # Application running time
  current_pages = GC.stat[:heap_allocated_pages]
  
  growth_rate = (current_pages - initial_pages).to_f / initial_pages
  warn "Heap growing at #{(growth_rate * 100).round(1)}% per 30 seconds" if growth_rate > 0.1
end
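
The effect of the growth factor and the per-step cap can be illustrated with a small projection. This is a simplified model and an assumption, not CRuby's exact page-based logic: each expansion multiplies the slot count by the growth factor, and a non-zero cap limits how many slots one expansion may add. project_heap_growth is a hypothetical helper.

```ruby
# Simplified model of heap growth (an assumption, not CRuby's exact logic):
# each expansion multiplies the slot count by growth_factor, and a non-zero
# max_slots caps how many slots a single expansion may add.
def project_heap_growth(initial_slots, growth_factor, max_slots, expansions)
  sizes = [initial_slots]
  expansions.times do
    current = sizes.last
    target = (current * growth_factor).round
    target = [target, current + max_slots].min if max_slots > 0
    sizes << target
  end
  sizes
end

p project_heap_growth(10_000, 2.0, 0, 3)      # => [10000, 20000, 40000, 80000]
p project_heap_growth(10_000, 2.0, 5_000, 3)  # => [10000, 15000, 20000, 25000]
```

In this model the uncapped heap grows geometrically, while the capped heap grows linearly, which is why a growth cap is often paired with a high growth factor in memory-constrained deployments.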

Malloc limit misalignment with application memory usage patterns causes frequent unnecessary collection cycles. Applications using large string operations, JSON parsing, or binary data processing require malloc limits proportional to their largest single-operation memory requirements.

Setting malloc limits too low triggers collection cycles that interrupt processing workflows, while limits too high delay cleanup of genuinely unused memory. Applications processing variable-sized data benefit from profiling actual malloc usage patterns before configuration.

# Malloc pressure analysis
require 'json'

def analyze_malloc_pressure
  initial_limit = GC.stat[:malloc_increase_bytes_limit]
  initial_usage = GC.stat[:malloc_increase_bytes]
  initial_count = GC.stat(:count)
  
  # Simulate typical workload
  data_processing_workload
  
  final_limit = GC.stat[:malloc_increase_bytes_limit]
  final_usage = GC.stat[:malloc_increase_bytes]
  collections = GC.stat(:count) - initial_count  # cycles during the workload only
  
  puts "Malloc limit changes: #{initial_limit} -> #{final_limit}"
  puts "Usage pattern: #{initial_usage} -> #{final_usage}"
  puts "Collections triggered: #{collections}"
  
  if final_limit > initial_limit * 4
    warn "Malloc limit increased significantly - consider higher initial limit"
  end
end

def data_processing_workload
  # Typical problematic pattern
  large_strings = []
  100.times do |i|
    json_data = { id: i, content: "x" * 10000 }.to_json
    parsed_data = JSON.parse(json_data)
    large_strings << parsed_data['content']
  end
end

Generational collection imbalances occur when RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR settings conflict with application object lifetime patterns. Applications retaining many objects across collection cycles need higher factors to prevent excessive major collection frequency, while applications with rapid object turnover benefit from lower factors.

# Generation balance analysis
def analyze_generational_balance
  GC.start  # Reset to known state
  
  before_stats = GC.stat
  create_mixed_lifetime_objects
  after_stats = GC.stat
  
  minor_collections = after_stats[:minor_gc_count] - before_stats[:minor_gc_count]
  major_collections = after_stats[:major_gc_count] - before_stats[:major_gc_count]
  
  ratio = minor_collections.to_f / [major_collections, 1].max
  
  puts "Minor/Major collection ratio: #{ratio.round(2)}"
  puts "Old object percentage: #{(after_stats[:old_objects] * 100.0 / after_stats[:heap_live_slots]).round(1)}%"
  
  if ratio < 3
    warn "Too many major collections - consider increasing OLDOBJECT_LIMIT_FACTOR"
  elsif ratio > 20
    warn "Too few major collections - old generation may be accumulating garbage"
  end
end

def create_mixed_lifetime_objects
  long_lived = []  # Survives multiple GC cycles
  
  50.times do |batch|
    # Short-lived objects
    temp_data = 1000.times.map { |i| "batch_#{batch}_item_#{i}" }
    
    # Some objects survive
    long_lived << temp_data.sample(10) if batch.even?
    
    # Force minor collection
    GC.start(full_mark: false) if batch % 10 == 0
  end
end

Heap compaction timing issues arise when applications perform manual compaction during high-traffic periods or fail to compact sufficiently fragmented heaps. Compaction pauses affect request processing, requiring careful scheduling around application load patterns.

# Compaction timing strategy
class SmartCompactionManager
  def self.should_compact_now?
    stats = GC.stat
    
    # Check fragmentation level: share of heap slots that are free
    fragmentation = stats[:heap_free_slots].to_f / stats[:heap_available_slots]
    return false if fragmentation < 0.2
    
    # Check application load
    return false if high_traffic_period?
    
    # Check recent compaction history
    return false if compacted_recently?
    
    true
  end
  
  def self.high_traffic_period?
    # Application-specific logic
    hour = Time.now.hour
    (9..17).include?(hour)  # Business hours
  end
  
  def self.compacted_recently?
    @last_compaction ||= Time.at(0)
    Time.now - @last_compaction < 300  # 5 minutes
  end
  
  def self.perform_smart_compaction
    return unless should_compact_now?
    
    puts "Starting heap compaction at #{Time.now}"
    start_time = Time.now
    
    GC.compact
    
    compaction_time = Time.now - start_time
    @last_compaction = Time.now
    
    puts "Compaction completed in #{compaction_time}s"
    
    if compaction_time > 0.1  # 100ms threshold
      warn "Long compaction pause: #{compaction_time}s - consider adjusting timing"
    end
  end
end

Reference

Environment Variables

| Variable | Default | Description | Recommended Range |
|---|---|---|---|
| RUBY_GC_HEAP_INIT_SLOTS | 10000 | Initial object slots in heap (deprecated; no-op since Ruby 3.3, replaced by RUBY_GC_HEAP_0_INIT_SLOTS through RUBY_GC_HEAP_4_INIT_SLOTS) | 10000-100000 |
| RUBY_GC_HEAP_GROWTH_FACTOR | 1.8 | Heap expansion multiplier | 1.2-2.0 |
| RUBY_GC_HEAP_GROWTH_MAX_SLOTS | 0 (no cap) | Maximum slots added per growth step | 1000-10000 |
| RUBY_GC_MALLOC_LIMIT | 16777216 | Initial malloc threshold (bytes) | 16MB-128MB |
| RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR | 1.4 | Malloc limit growth multiplier | 1.2-2.0 |
| RUBY_GC_MALLOC_LIMIT_MAX | 33554432 | Maximum malloc limit (bytes) | 32MB-512MB |
| RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR | 2.0 | Old-object growth multiple that triggers a major GC | 2-20 |

GC Module Methods

| Method | Parameters | Returns | Description |
|---|---|---|---|
| GC.start | full_mark: true | nil | Trigger garbage collection |
| GC.enable | None | true/false | Enable automatic GC |
| GC.disable | None | true/false | Disable automatic GC |
| GC.stress | true/false | true/false | Get/set GC stress mode |
| GC.compact | None | Hash | Compact heap and return stats |
| GC.auto_compact | true/false | true/false | Get/set auto-compaction |
| GC.stat | key (Symbol) | Integer/Hash | Get GC statistics |
| GC.count | None | Integer | Total GC cycle count |

Key Statistics

| Statistic | Description | Significance |
|---|---|---|
| :count | Total GC cycles | Overall collection frequency |
| :minor_gc_count | Minor collection cycles | Young generation cleanup |
| :major_gc_count | Major collection cycles | Full heap cleanup |
| :heap_allocated_pages | Total heap pages | Memory footprint |
| :heap_live_slots | Active object slots | Current object count |
| :heap_free_slots | Available object slots | Allocation headroom |
| :malloc_increase_bytes | Bytes allocated through malloc since the last GC | Off-heap memory pressure (strings, arrays, C extensions) |
| :malloc_increase_bytes_limit | Current malloc threshold | Collection trigger point |
| :total_allocated_objects | Objects created over the process lifetime | Allocation pressure |
| :old_objects | Objects in old generation | Long-lived object count |
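
The statistics in this table can be collected into a single snapshot for logging or comparison. A minimal sketch, assuming CRuby, where each of these keys is accepted by GC.stat; KEY_STATS and gc_snapshot are hypothetical names.

```ruby
# Keys taken from the Key Statistics table; GC.stat(key) raises
# ArgumentError for a key the running interpreter does not know.
# KEY_STATS and gc_snapshot are hypothetical names.
KEY_STATS = %i[
  count minor_gc_count major_gc_count heap_allocated_pages
  heap_live_slots heap_free_slots malloc_increase_bytes
  malloc_increase_bytes_limit total_allocated_objects old_objects
].freeze

def gc_snapshot(keys = KEY_STATS)
  keys.to_h { |key| [key, GC.stat(key)] }
end

snapshot = gc_snapshot
snapshot.each { |key, value| puts format('%-28s %d', key, value) }
```

Taking two snapshots and subtracting them value by value gives the per-interval deltas that most tuning decisions rely on.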

Collection Timing

| Phase | Typical Duration | Impact | Optimization |
|---|---|---|---|
| Mark | 1-10ms | Low | Reduce live object count |
| Sweep | 1-5ms | Low | Minimize fragmentation |
| Compact | 5-50ms | High | Control timing and frequency |
| Major GC | 10-100ms | High | Tune old object limits |

Memory Configuration Profiles

| Application Type | Init Slots | Growth Factor | Malloc Limit | Old Object Factor |
|---|---|---|---|---|
| Web Application | 40000 | 1.6 | 32MB | 5 |
| Background Jobs | 100000 | 1.4 | 64MB | 10 |
| Data Processing | 50000 | 1.8 | 128MB | 3 |
| Microservice | 20000 | 1.5 | 16MB | 4 |
| Long-running Service | 80000 | 1.3 | 96MB | 8 |

Troubleshooting Decision Tree

| Symptom | Check | Configuration Adjustment |
|---|---|---|
| Frequent GC cycles | GC.stat(:count) | Increase heap init slots |
| Memory growth | Heap allocated pages | Reduce growth factor |
| Long pause times | Major GC frequency | Increase old object factor |
| High malloc pressure | malloc_increase_bytes | Increase malloc limits |
| Fragmentation | Free/allocated slot ratio | Enable auto-compaction |