GC Statistics

Comprehensive guide to Ruby's garbage collection statistics API for performance monitoring and memory analysis.

Overview

Ruby provides garbage collection statistics through the GC module and ObjectSpace module to monitor memory usage and garbage collector performance. The garbage collector automatically manages memory by removing unused objects, and the statistics API exposes detailed metrics about this process including collection counts, execution time, and memory allocation patterns.

The primary interface is GC.stat, which returns a hash of statistics about garbage collection activity. Ruby's collector is generational with two generations: new objects start out young, and objects that survive several collection cycles are promoted to the old generation, which is scanned only during major collections.

# Basic GC statistics
stats = GC.stat
puts stats[:count]           # Total GC runs
puts stats[:heap_live_slots] # Currently used object slots
puts stats[:time]            # Time spent in GC (milliseconds, Ruby 3.1+)

The ObjectSpace module provides complementary statistics about object allocation and memory usage patterns. These methods work together to provide insight into application memory behavior and garbage collection efficiency.

# Object allocation tracking
ObjectSpace.count_objects
# => {:TOTAL=>50742, :FREE=>3218, :T_OBJECT=>1435, :T_CLASS=>1033, ...}

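# Memory size of specific objects (memsize_of is provided by the objspace extension)
require 'objspace'
ObjectSpace.memsize_of("hello world")
# => 40 (typical on 64-bit builds; the exact value varies by Ruby version and platform)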

Ruby's garbage collection uses a mark-and-sweep algorithm with generational collection. Statistics track both minor collections (young generation only) and major collections (all generations). Understanding these metrics helps identify memory leaks, optimize allocation patterns, and tune garbage collection behavior for better application performance.
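
As a rough illustration of the minor/major distinction, the sketch below forces one collection of each kind and reads the counters back. GC.start(full_mark: false) only requests a minor collection, and the interpreter may still escalate it to a major one, so the example prints deltas instead of assuming exact values.

```ruby
# Compare GC counters before and after forced collections.
before = GC.stat

GC.start(full_mark: false) # request a minor collection (may be escalated)
after_minor = GC.stat

GC.start                   # full_mark defaults to true: a major collection
after_major = GC.stat

minor_delta = after_minor[:minor_gc_count] - before[:minor_gc_count]
major_delta = after_major[:major_gc_count] - after_minor[:major_gc_count]

puts "Minor GCs after the minor request: +#{minor_delta}"
puts "Major GCs after the full request:  +#{major_delta}"
puts "Objects currently in the old generation: #{after_major[:old_objects]}"
```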

Basic Usage

The GC.stat method without arguments returns a hash containing all available statistics. Specific metrics can be retrieved by passing a symbol key, which is more efficient when monitoring individual values.

# Get all statistics
all_stats = GC.stat

# Get specific statistic  
live_objects = GC.stat(:heap_live_slots)
total_collections = GC.stat(:count)

# Several statistics at once: GC.stat accepts only one key, so slice the full hash
heap_stats = GC.stat.slice(:heap_allocated_pages, :heap_live_slots)
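
When statistics are polled frequently, GC.stat can also fill a hash supplied by the caller, which avoids allocating a new Hash on every sample. The following is a minimal sketch; STAT_BUFFER and sample_gc are names invented for this illustration.

```ruby
# Reuse one hash across samples so that polling GC.stat does not itself
# allocate a fresh Hash each time.
STAT_BUFFER = {}

def sample_gc
  GC.stat(STAT_BUFFER) # fills STAT_BUFFER in place
  STAT_BUFFER[:heap_live_slots]
end

3.times { puts "live slots: #{sample_gc}" }
```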

The GC.latest_gc_info method provides information about the most recent garbage collection cycle, including what triggered it and whether it was a major collection. It does not report the duration of the cycle; use GC.stat(:time) or GC.total_time for timing.

# Information about last GC cycle
gc_info = GC.latest_gc_info
puts "Reason: #{gc_info[:gc_by]}"                    # :newobj, :malloc, :method, etc.
puts "Major GC cause: #{gc_info[:major_by].inspect}" # nil for a minor collection
puts "Immediate sweep: #{gc_info[:immediate_sweep]}" # true if sweeping was not deferred
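
To see how the trigger is reported, the short sketch below forces a collection from Ruby code and then asks for individual keys, which avoids building the full hash.

```ruby
GC.start # an explicit collection requested from Ruby code

trigger = GC.latest_gc_info(:gc_by)
major_reason = GC.latest_gc_info(:major_by)

puts "Triggered by: #{trigger}" # :method for a collection started via GC.start
puts "Major GC reason: #{major_reason.inspect}"
```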

Object allocation tracking requires enabling allocation tracing before it becomes available. This adds overhead but provides detailed information about where objects are created in the code.

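# Allocation tracing is provided by the objspace extension
require 'objspace'

# Enable allocation tracing
ObjectSpace.trace_object_allocations_start

# Run code to analyze
1000.times { "temporary string" }
arrays = Array.new(100) { [1, 2, 3] }

# Get allocation information while the trace data is still recorded
puts ObjectSpace.allocation_sourcefile(arrays.first)
puts ObjectSpace.allocation_sourceline(arrays.first)

# Stop tracing and discard the recorded data
ObjectSpace.trace_object_allocations_stop
ObjectSpace.trace_object_allocations_clear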
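
The objspace extension also offers a block form, ObjectSpace.trace_object_allocations, which stops tracing when the block exits. The sketch below reads the recorded details inside the block; the variable names are illustrative only.

```ruby
require 'objspace'

source_file = nil
source_line = nil

ObjectSpace.trace_object_allocations do
  obj = Object.new

  # Query the allocation details while tracing is active
  source_file = ObjectSpace.allocation_sourcefile(obj)
  source_line = ObjectSpace.allocation_sourceline(obj)
  puts "allocated at #{source_file}:#{source_line}"
  puts "class path: #{ObjectSpace.allocation_class_path(obj).inspect}"
  puts "method id:  #{ObjectSpace.allocation_method_id(obj).inspect}"
end
```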

The ObjectSpace.each_object method iterates over live objects of specific classes, useful for finding memory leaks or analyzing object distribution.

# Count objects by class
string_count = ObjectSpace.each_object(String).count
array_count = ObjectSpace.each_object(Array).count

# Find large arrays
large_arrays = ObjectSpace.each_object(Array).select { |arr| arr.size > 1000 }
puts "Found #{large_arrays.size} large arrays"
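
A simple way to use each_object for leak hunting is to compare the number of live instances of one class before and after a piece of work. The sketch below assumes a hypothetical Widget class and a helper named live_instances; neither comes from a library.

```ruby
# Widget is a hypothetical class used only for this illustration.
class Widget; end

def live_instances(klass)
  GC.start # collect most unreachable instances first, so the count reflects retained objects
  ObjectSpace.each_object(klass).count
end

before = live_instances(Widget)
cache = Array.new(50) { Widget.new } # simulated leak: references kept in a long-lived array
after = live_instances(Widget)

puts "Widget instances retained: #{after - before}"
```

Because Ruby scans the machine stack conservatively, a few unreachable objects can survive a collection, so small differences between runs are normal.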

Regular monitoring typically focuses on heap growth, collection frequency, and allocation rates to identify performance issues before they impact application responsiveness.

Performance & Memory

Garbage collection performance directly impacts application responsiveness, especially in web applications where GC pauses add latency to requests and, in extreme cases, contribute to timeouts. Key metrics for performance analysis include collection frequency, pause duration, and heap growth patterns.

class GCBenchmark
  def self.measure_gc_impact
    # Disable GC to measure baseline
    GC.disable

    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    start_stats = GC.stat
    start_gc_ns = GC.total_time # nanoseconds spent in GC so far (Ruby 3.1+)

    result = yield

    baseline_time = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_time

    # Re-enable GC and force collection to measure the deferred cleanup cost
    GC.enable
    GC.start

    end_stats = GC.stat
    gc_time = (GC.total_time - start_gc_ns) / 1_000_000.0 # convert to milliseconds

    {
      result: result,
      baseline_time: baseline_time,
      gc_time: gc_time,
      collections: end_stats[:count] - start_stats[:count],
      objects_allocated: end_stats[:total_allocated_objects] - start_stats[:total_allocated_objects]
    }
  ensure
    # Never leave GC disabled if the block raises
    GC.enable
  end
end

# Benchmark object creation patterns
results = GCBenchmark.measure_gc_impact do
  10_000.times { |i| "string_#{i}" }
end

puts "Baseline: #{results[:baseline_time]}s"
puts "GC Time: #{results[:gc_time]}ms" 
puts "Collections: #{results[:collections]}"
puts "Objects: #{results[:objects_allocated]}"
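
For quick checks, the allocation count alone is often enough. The sketch below wraps the :total_allocated_objects counter in a small helper; allocations_during is a name made up for this example, and because the counter is process-wide, allocations from other threads are included.

```ruby
# Count objects allocated while the block runs, using the ever-increasing
# :total_allocated_objects counter.
def allocations_during
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

count = allocations_during { 1_000.times { |i| "string_#{i}" } }
puts "Objects allocated: #{count}"
```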

Memory usage patterns can be analyzed by tracking heap metrics over time. Growing heaps indicate potential memory leaks, while frequent collections suggest excessive object allocation.

class HeapAnalyzer
  def initialize
    @snapshots = []
  end
  
  def take_snapshot(label = nil)
    stats = GC.stat
    @snapshots << {
      label: label,
      timestamp: Time.now,
      heap_live_slots: stats[:heap_live_slots],
      heap_free_slots: stats[:heap_free_slots], 
      heap_final_slots: stats[:heap_final_slots],
      total_allocated: stats[:total_allocated_objects],
      minor_gc_count: stats[:minor_gc_count],
      major_gc_count: stats[:major_gc_count]
    }
  end
  
  def analyze_growth
    return "Insufficient data" if @snapshots.size < 2
    
    first = @snapshots.first
    last = @snapshots.last
    duration = last[:timestamp] - first[:timestamp]
    
    live_growth = last[:heap_live_slots] - first[:heap_live_slots]
    allocation_rate = (last[:total_allocated] - first[:total_allocated]) / duration
    gc_rate = (last[:minor_gc_count] - first[:minor_gc_count]) / duration
    
    {
      duration: duration,
      live_object_growth: live_growth,
      allocations_per_second: allocation_rate,
      minor_gcs_per_second: gc_rate,
      heap_efficiency: last[:heap_live_slots].to_f / (last[:heap_live_slots] + last[:heap_free_slots])
    }
  end
end

analyzer = HeapAnalyzer.new
analyzer.take_snapshot("start")

# Simulate application work
1000.times do
  data = Array.new(100) { rand(1000) }
  processed = data.map(&:to_s).join(",")
end

analyzer.take_snapshot("end")
analysis = analyzer.analyze_growth

puts "Heap efficiency: #{(analysis[:heap_efficiency] * 100).round(1)}%"
puts "Allocation rate: #{analysis[:allocations_per_second].round} objects/sec"

Memory profiling focuses on identifying allocation hotspots and object lifetime patterns. Objects that survive multiple GC cycles consume more resources and may indicate memory leaks.

# Profile object allocation locations (requires the objspace extension)
require 'objspace'
ObjectSpace.trace_object_allocations_start

# Code to profile
data_processor = Class.new do
  def process_batch(items)
    items.map { |item| transform_item(item) }
  end
  
  def transform_item(item)
    {
      id: item[:id],
      processed_at: Time.now.to_s,
      data: item.fetch(:data, {}).transform_values(&:to_s)
    }
  end
end

processor = data_processor.new
1000.times do |i|
  batch = Array.new(10) { { id: i, data: { value: rand(100) } } }
  processor.process_batch(batch)
end

# Group allocations by source location while the trace data is still recorded
allocation_summary = Hash.new(0)
ObjectSpace.each_object do |obj|
  file = ObjectSpace.allocation_sourcefile(obj)
  next unless file

  location = "#{File.basename(file)}:#{ObjectSpace.allocation_sourceline(obj)}"
  allocation_summary[location] += 1
end

# Stop tracing and discard the recorded data
ObjectSpace.trace_object_allocations_stop
ObjectSpace.trace_object_allocations_clear

allocation_summary.sort_by { |_, count| -count }.first(10).each do |location, count|
  puts "#{location}: #{count} objects"
end
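
Allocation counts show where objects come from, but not how much memory they hold. As a complementary sketch, ObjectSpace.count_objects_size from the objspace extension reports bytes per internal object type; the report format here is only one possible presentation.

```ruby
require 'objspace'

sizes = ObjectSpace.count_objects_size # bytes per internal type, plus a :TOTAL entry
total = sizes[:TOTAL]

# Print the five types that account for the most memory
sizes.reject { |type, _| type == :TOTAL }
     .sort_by { |_, bytes| -bytes }
     .first(5)
     .each do |type, bytes|
  puts format("%-10s %10d bytes (%.1f%%)", type, bytes, 100.0 * bytes / total)
end
```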

Production Patterns

Production applications require continuous GC monitoring to maintain performance and identify issues before they impact users. Monitoring strategies typically involve logging key metrics and setting up alerts for abnormal patterns.

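class ProductionGCMonitor
  # logger is assumed to be a structured logger whose methods accept
  # (message, payload); the standard library Logger takes a single message.
  def initialize(logger:, alert_thresholds: {})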
    @logger = logger
    @thresholds = {
      max_heap_growth_mb: 50,
      max_gc_time_ms: 100,
      max_gc_frequency: 10
    }.merge(alert_thresholds)
    @baseline_stats = nil
    @monitoring_window = 300 # 5 minutes
  end
  
  def start_monitoring
    @baseline_stats = capture_stats
    @logger.info "GC monitoring started", @baseline_stats
    
    Thread.new do
      loop do
        sleep @monitoring_window
        check_gc_health
      end
    end
  end
  
  private
  
  def capture_stats
    gc_stats = GC.stat
    object_stats = ObjectSpace.count_objects
    
    {
      timestamp: Time.now,
      heap_size_mb: (gc_stats[:heap_live_slots] * 40) / 1024.0 / 1024.0, # Rough estimate: assumes 40-byte slots, ignores malloc'd memory
      total_objects: object_stats[:TOTAL],
      free_objects: object_stats[:FREE],
      gc_count: gc_stats[:count],
      gc_time_ms: GC.total_time / 1_000_000.0, # GC.total_time reports nanoseconds (Ruby 3.1+)
      major_gc_count: gc_stats[:major_gc_count],
      allocated_pages: gc_stats[:heap_allocated_pages]
    }
  end
  
  def check_gc_health
    current_stats = capture_stats
    return unless @baseline_stats
    
    duration = current_stats[:timestamp] - @baseline_stats[:timestamp]
    heap_growth = current_stats[:heap_size_mb] - @baseline_stats[:heap_size_mb]
    gc_time_growth = current_stats[:gc_time_ms] - @baseline_stats[:gc_time_ms]
    gc_frequency = (current_stats[:gc_count] - @baseline_stats[:gc_count]) / (duration / 60.0)
    
    @logger.info "GC health check", {
      heap_growth_mb: heap_growth.round(2),
      gc_time_growth_ms: gc_time_growth.round(2), 
      gc_frequency_per_minute: gc_frequency.round(2),
      current_heap_mb: current_stats[:heap_size_mb].round(2)
    }
    
    check_alerts(heap_growth, gc_time_growth, gc_frequency)
    @baseline_stats = current_stats
  end
  
  def check_alerts(heap_growth, gc_time_growth, gc_frequency)
    alerts = []
    
    alerts << "High heap growth: #{heap_growth.round(1)}MB" if heap_growth > @thresholds[:max_heap_growth_mb]
    alerts << "High GC time: #{gc_time_growth.round(1)}ms" if gc_time_growth > @thresholds[:max_gc_time_ms] 
    alerts << "High GC frequency: #{gc_frequency.round(1)}/min" if gc_frequency > @thresholds[:max_gc_frequency]
    
    if alerts.any?
      @logger.error "GC performance alerts", { alerts: alerts }
      # Trigger external alerting system
      send_alerts(alerts)
    end
  end
  
  def send_alerts(alerts)
    # Integration with monitoring systems like Datadog, New Relic, etc.
    alerts.each { |alert| puts "ALERT: #{alert}" }
  end
end
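
The monitor above calls its logger with a message plus a payload hash, which the standard library Logger does not accept. One possible bridge is a thin adapter like the sketch below; PayloadLogger is a hypothetical class written for this example, not part of any library, and it serializes the payload as a single JSON line.

```ruby
require 'json'
require 'logger'

# Hypothetical adapter: accepts (message, payload) and forwards one JSON
# line to a standard Logger.
class PayloadLogger
  def initialize(logger)
    @logger = logger
  end

  %i[debug info warn error].each do |level|
    define_method(level) do |message, payload = {}|
      @logger.public_send(level, JSON.generate({ message: message }.merge(payload)))
    end
  end
end

logger = PayloadLogger.new(Logger.new($stdout))
logger.info "GC snapshot", { gc_count: GC.count, live_slots: GC.stat(:heap_live_slots) }
```

An instance of this adapter could then be passed as the logger: argument; applications that already use a structured logging library would not need it.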

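Web applications benefit from request-level GC tracking to identify endpoints that cause performance issues. Middleware can capture GC metrics before and after request processing. Because GC.stat is process-wide, the deltas measured on a multi-threaded server also include allocations and collections caused by requests running concurrently.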

class GCRequestMiddleware
  def initialize(app, logger:, sample_rate: 0.1)
    @app = app
    @logger = logger  
    @sample_rate = sample_rate
  end
  
  def call(env)
    return @app.call(env) unless should_sample?
    
    start_stats = capture_request_stats
    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    
    status, headers, response = @app.call(env)
    
    end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    end_stats = capture_request_stats
    
    log_request_metrics(env, start_stats, end_stats, end_time - start_time)
    
    [status, headers, response]
  end
  
  private
  
  def should_sample?
    rand < @sample_rate
  end
  
  def capture_request_stats
    gc_stat = GC.stat
    {
      live_objects: gc_stat[:heap_live_slots],
      total_allocated: gc_stat[:total_allocated_objects],
      gc_count: gc_stat[:count],
      gc_time: GC.total_time # nanoseconds (Ruby 3.1+)
    }
  end
  
  def log_request_metrics(env, start_stats, end_stats, duration)
    allocated = end_stats[:total_allocated] - start_stats[:total_allocated]
    gc_runs = end_stats[:gc_count] - start_stats[:gc_count]
    gc_time = (end_stats[:gc_time] - start_stats[:gc_time]) / 1_000_000.0
    
    @logger.info "Request GC metrics", {
      path: env['PATH_INFO'],
      method: env['REQUEST_METHOD'],
      duration_ms: (duration * 1000).round(2),
      objects_allocated: allocated,
      gc_runs: gc_runs,
      gc_time_ms: gc_time.round(2),
      objects_per_ms: allocated / (duration * 1000)
    }
  end
end

Background job processing requires different monitoring strategies since jobs may process large amounts of data and have different performance characteristics than web requests.

class BackgroundJobGCProfiler
  def self.profile_job(job_class, job_args)
    # Force GC first so that every baseline below is taken from the same clean state
    3.times { GC.start }
    clean_baseline = GC.stat
    gc_time_start = GC.total_time # nanoseconds (Ruby 3.1+)
    live_objects_start = live_object_count
    
    start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    
    result = job_class.new.perform(*job_args)
    
    end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    gc_stats_end = GC.stat
    live_objects_end = live_object_count
    
    {
      job_class: job_class.name,
      duration: end_time - start_time,
      result: result,
      objects_allocated: gc_stats_end[:total_allocated_objects] - clean_baseline[:total_allocated_objects],
      net_objects: live_objects_end - live_objects_start,
      gc_runs: gc_stats_end[:count] - clean_baseline[:count],
      gc_time_ms: (GC.total_time - gc_time_start) / 1_000_000.0,
      heap_growth: gc_stats_end[:heap_live_slots] - clean_baseline[:heap_live_slots]
    }
  end

  # :TOTAL counts every heap slot, including free ones, so subtract :FREE
  def self.live_object_count
    counts = ObjectSpace.count_objects
    counts[:TOTAL] - counts[:FREE]
  end
end

Reference

GC Module Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| GC.stat | key (Symbol, optional) | Integer or Hash | Returns a specific statistic or the hash of all statistics |
| GC.stat(hash) | hash (Hash) | Hash | Populates the given hash with all statistics and returns it |
| GC.latest_gc_info | key (Symbol, optional) | Hash or value | Information about the most recent GC cycle |
| GC.count | None | Integer | Total number of GC runs (same as GC.stat(:count)) |
| GC.start | full_mark:, immediate_mark:, immediate_sweep: (Boolean, all default to true) | nil | Forces garbage collection |
| GC.enable | None | Boolean | Enables GC; returns true if GC was previously disabled |
| GC.disable | None | Boolean | Disables GC; returns true if GC was already disabled |
| GC.stress | None | Boolean or Integer | Returns the current GC stress mode state |
| GC.stress= | flag (Boolean or Integer) | Boolean or Integer | Sets GC stress mode (GC at every allocation opportunity) |

Key GC.stat Metrics

The available keys vary between Ruby versions (for example, :time was added in Ruby 3.1); GC.stat.keys lists what the running interpreter supports.

| Statistic | Type | Description |
| --- | --- | --- |
| :count | Integer | Total garbage collections performed |
| :time | Integer | Total time spent in GC (milliseconds) |
| :heap_allocated_pages | Integer | Number of heap pages allocated |
| :heap_sorted_length | Integer | Number of pages that fit in the buffer holding references to all pages |
| :heap_allocatable_pages | Integer | Number of pages that can be allocated without GC |
| :heap_available_slots | Integer | Total object slots available |
| :heap_live_slots | Integer | Object slots currently in use |
| :heap_free_slots | Integer | Object slots available for new objects |
| :heap_final_slots | Integer | Object slots with finalizers pending |
| :heap_marked_slots | Integer | Object slots marked during last GC |
| :heap_eden_pages | Integer | Pages that contain at least one live slot |
| :heap_tomb_pages | Integer | Pages that contain no live slots |
| :total_allocated_objects | Integer | Total objects allocated since startup |
| :total_freed_objects | Integer | Total objects freed since startup |
| :minor_gc_count | Integer | Minor (young generation) GC count |
| :major_gc_count | Integer | Major (full heap) GC count |
| :remembered_wb_unprotected_objects | Integer | Objects without write barrier protection |
| :remembered_wb_unprotected_objects_limit | Integer | Limit on such objects before a major GC is triggered |
| :old_objects | Integer | Objects in old generation |
| :old_objects_limit | Integer | Old generation limit before major GC |
| :oldmalloc_increase_bytes | Integer | Increase in malloc'd memory by old objects (bytes) |
| :oldmalloc_increase_bytes_limit | Integer | malloc increase limit (bytes) for major GC |

ObjectSpace Methods

Only count_objects and each_object are available without loading anything; the other methods listed here need require 'objspace'.

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| ObjectSpace.count_objects | result_hash (Hash, optional) | Hash | Count of objects by type |
| ObjectSpace.each_object | klass (Class, optional) | Enumerator, or Integer when a block is given | Iterates over live objects of class |
| ObjectSpace.memsize_of | obj (Object) | Integer | Memory size of object in bytes |
| ObjectSpace.memsize_of_all | klass (Class, optional) | Integer | Total memory size of objects |
| ObjectSpace.trace_object_allocations_start | None | nil | Enables allocation location tracking |
| ObjectSpace.trace_object_allocations_stop | None | nil | Disables allocation tracking |
| ObjectSpace.trace_object_allocations_clear | None | nil | Clears allocation tracking data |
| ObjectSpace.allocation_sourcefile | obj (Object) | String or nil | Source file where object was allocated |
| ObjectSpace.allocation_sourceline | obj (Object) | Integer or nil | Source line where object was allocated |
| ObjectSpace.allocation_class_path | obj (Object) | String or nil | Class context during allocation |
| ObjectSpace.allocation_method_id | obj (Object) | Symbol or nil | Method context during allocation |
| ObjectSpace.dump | obj (Object), output: | String by default | JSON dump of object |
| ObjectSpace.dump_all | output: (:file, :stdout, :string, or an IO) | File, String, or nil depending on output: | Dumps all objects as JSON |

GC.latest_gc_info Keys

| Key | Type | Description |
| --- | --- | --- |
| :gc_by | Symbol | What triggered GC (:newobj, :malloc, :method, :capi, :stress) |
| :major_by | Symbol or nil | Reason for a major GC (for example :nofree, :oldgen, :shady, :force); nil for a minor GC |
| :have_finalizer | Boolean | Whether objects with finalizers were processed |
| :immediate_sweep | Boolean | Whether sweep phase ran immediately |
| :state | Symbol | GC state during collection (:none, :marking, or :sweeping) |

Object Count Types

| Type Symbol | Description |
| --- | --- |
| :TOTAL | Total number of object slots |
| :FREE | Free object slots |
| :T_OBJECT | Basic objects |
| :T_CLASS | Class objects |
| :T_MODULE | Module objects |
| :T_FLOAT | Float objects |
| :T_STRING | String objects |
| :T_REGEXP | Regular expression objects |
| :T_ARRAY | Array objects |
| :T_HASH | Hash objects |
| :T_STRUCT | Struct objects |
| :T_BIGNUM | Large integer objects |
| :T_FILE | File objects |
| :T_DATA | C extension data objects |
| :T_MATCH | MatchData objects |
| :T_COMPLEX | Complex number objects |
| :T_RATIONAL | Rational number objects |
| :T_NIL | nil object |
| :T_TRUE | true object |
| :T_FALSE | false object |
| :T_SYMBOL | Symbol objects |
| :T_FIXNUM | Small integer objects (Ruby < 2.4) |
| :T_UNDEF | Undefined objects |
| :T_NODE | Parse tree nodes |
| :T_ICLASS | Internal class objects |
| :T_ZOMBIE | Zombie objects (freed but not swept) |