Overview
Ruby's garbage collection system runs automatically to reclaim memory from objects no longer referenced by the program. The GC module provides methods to manually control collection timing, gather statistics, and monitor memory usage patterns. Ruby uses a mark-and-sweep garbage collector with generational collection that divides objects into young and old generations.
The garbage collector identifies unreachable objects by marking all objects accessible from root references, then sweeping through memory to free unmarked objects. Ruby's GC operates in phases: marking, sweeping, and compacting (in newer versions). Each phase impacts application performance differently.
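Details of the most recent collection are available from GC.latest_gc_info, which reports what triggered the run and whether it was promoted to a major collection:

```ruby
# Inspect the most recent collection
GC.start
info = GC.latest_gc_info
puts info[:gc_by]     # what triggered the run; :method after an explicit GC.start
puts info[:major_by]  # reason the run became a major GC, or nil for a minor run
puts info[:state]     # collector state; :none when no collection is in progress
```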
# Basic GC information
GC.count # => 42 (number of GC runs)
GC.stat[:count] # => 42 (same information via stats)
GC.stat[:heap_allocated_pages] # => 245
Ruby automatically triggers garbage collection when memory allocation reaches certain thresholds. The collector runs more frequently for young generation objects and less frequently for long-lived objects. This generational approach improves performance since most objects become garbage quickly.
# Check current GC status
GC.stat.select { |k, v| k.to_s.include?('count') }
# => {:count=>42, :major_gc_count=>12, :minor_gc_count=>30}
# Memory usage information
GC.stat[:heap_live_slots] # => 125432
GC.stat[:heap_free_slots] # => 8765
GC.stat[:total_allocated_objects] # => 1456789
The GC module exposes methods for manual collection, enabling/disabling automatic collection, and retrieving detailed statistics about memory usage and collection frequency. These controls allow developers to optimize performance-critical sections and gather memory profiling data.
Basic Usage
Manual garbage collection control starts with GC.start, which immediately triggers a full collection cycle. This method blocks execution until collection completes, making it suitable for controlled environments but problematic in request-response cycles.
# Force garbage collection
objects_before = ObjectSpace.count_objects[:T_STRING]
1000.times { "temporary string #{rand}" }
objects_after = ObjectSpace.count_objects[:T_STRING]
puts "Created #{objects_after - objects_before} strings"
GC.start
objects_final = ObjectSpace.count_objects[:T_STRING]
puts "Collected #{objects_after - objects_final} strings"
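GC.start also takes keyword arguments: full_mark: false restricts the run to the young generation, and immediate_sweep: false lets sweeping happen lazily after marking. A short sketch:

```ruby
before_minor = GC.stat(:minor_gc_count)
GC.start(full_mark: false)   # minor collection: marks only the young generation
puts "Minor GCs run: #{GC.stat(:minor_gc_count) - before_minor}"

GC.start(full_mark: true, immediate_sweep: true)  # the defaults: full, eager collection
```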
Disabling automatic garbage collection with GC.disable prevents Ruby from running collection cycles automatically. This creates memory pressure but eliminates GC pauses during critical operations. Memory allocation continues until manual collection or re-enabling automatic collection.
# Disable GC for performance-critical section
GC.disable
start_time = Time.now
# Memory-intensive operation without GC interruption
large_array = []
10_000.times do |i|
large_array << { id: i, data: "item_#{i}" * 100 }
end
processing_time = Time.now - start_time
memory_used = GC.stat[:heap_live_slots]
GC.enable
GC.start # Clean up accumulated garbage
puts "Processed without GC in #{processing_time}s"
puts "Memory slots used: #{memory_used}"
The GC.enable method reactivates automatic garbage collection after it has been disabled. Ruby then resumes evaluating whether collection is needed based on current memory pressure and allocation patterns. There is no GC.enable? predicate; instead, GC.enable and GC.disable each return true if collection was disabled before the call, which is how the current state can be observed.
# Observe GC state via return values (Ruby has no GC.enable? method)
was_disabled = GC.disable
puts "Was disabled before call: #{was_disabled}" # => false in a fresh process
# Process some data while GC is off
data = (1..5000).map { |n| n.to_s * 10 }
was_disabled = GC.enable
puts "Was disabled before call: #{was_disabled}" # => true
Accessing garbage collection statistics provides insight into memory usage patterns and collection frequency. The GC.stat method returns a hash with detailed information about heap usage, object counts, and collection timing.
# Comprehensive GC statistics
stats = GC.stat
puts "Total collections: #{stats[:count]}"
puts "Major collections: #{stats[:major_gc_count]}"
puts "Minor collections: #{stats[:minor_gc_count]}"
puts "Live objects: #{stats[:heap_live_slots]}"
puts "Free slots: #{stats[:heap_free_slots]}"
puts "Allocated objects: #{stats[:total_allocated_objects]}"
puts "Freed objects: #{stats[:total_freed_objects]}"
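GC.stat can also be called with a single Symbol key, returning just that Integer and skipping the hash allocation, or with a hash to fill in place so one buffer is reused across samples:

```ruby
# Single-key form: no intermediate hash
puts GC.stat(:count)
puts GC.stat(:heap_live_slots)

# Fill a caller-supplied hash; useful in tight sampling loops
buf = {}
GC.stat(buf)
puts buf[:count]
```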
Performance & Memory
Garbage collection timing significantly impacts application performance. Each collection cycle pauses execution while marking and sweeping memory, creating latency spikes in responsive applications. Understanding collection patterns helps optimize memory allocation strategies.
# Measure GC impact on performance
def benchmark_with_gc_stats
start_stats = GC.stat
start_time = Time.now
# Memory-intensive operation
arrays = []
1000.times do
arrays << Array.new(1000) { rand(1000) }
end
end_time = Time.now
end_stats = GC.stat
{
duration: end_time - start_time,
gc_runs: end_stats[:count] - start_stats[:count],
objects_allocated: end_stats[:total_allocated_objects] - start_stats[:total_allocated_objects],
objects_freed: end_stats[:total_freed_objects] - start_stats[:total_freed_objects]
}
end
results = benchmark_with_gc_stats
puts "Duration: #{results[:duration]}s"
puts "GC runs: #{results[:gc_runs]}"
puts "Objects allocated: #{results[:objects_allocated]}"
puts "Objects freed: #{results[:objects_freed]}"
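On Ruby 3.1 and later the VM measures collector time directly, so pause cost can be read instead of inferred: GC.total_time returns cumulative nanoseconds spent in GC, and GC.stat[:time] reports the same figure in milliseconds. A sketch, guarded so it degrades gracefully on older rubies:

```ruby
if GC.respond_to?(:total_time)   # Ruby 3.1+
  before = GC.total_time
  10.times { Array.new(50_000) { rand } }
  GC.start
  puts "GC time since baseline: #{(GC.total_time - before) / 1_000_000.0} ms"
  puts "Cumulative GC time: #{GC.stat[:time]} ms"
end
```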
Memory allocation patterns affect garbage collection frequency. Creating many short-lived objects triggers frequent minor collections, while long-lived objects accumulate in older generations and require major collections.
# Compare allocation strategies
def create_many_small_objects
GC.start # Clean slate
start_count = GC.count
10_000.times { "string_#{rand(1000)}" }
GC.count - start_count
end
def create_few_large_objects
GC.start # Clean slate
start_count = GC.count
100.times { "x" * 100_000 }
GC.count - start_count
end
small_gc_count = create_many_small_objects
large_gc_count = create_few_large_objects
puts "Small objects triggered #{small_gc_count} collections"
puts "Large objects triggered #{large_gc_count} collections"
Heap growth patterns indicate memory usage efficiency. Ruby allocates memory in pages, and excessive page allocation suggests memory pressure or inefficient object lifecycle management.
# Monitor heap growth during processing
def monitor_heap_growth
initial_pages = GC.stat[:heap_allocated_pages]
initial_slots = GC.stat[:heap_available_slots]
yield
final_pages = GC.stat[:heap_allocated_pages]
final_slots = GC.stat[:heap_available_slots]
{
pages_added: final_pages - initial_pages,
slots_added: final_slots - initial_slots,
pages_total: final_pages,
slots_total: final_slots
}
end
# Test with different workloads
small_growth = monitor_heap_growth { 1000.times { |i| i.to_s } }
large_growth = monitor_heap_growth { 1000.times { |i| "data" * 1000 } }
puts "Small strings: #{small_growth[:pages_added]} pages, #{small_growth[:slots_added]} slots"
puts "Large strings: #{large_growth[:pages_added]} pages, #{large_growth[:slots_added]} slots"
Object lifecycle optimization reduces garbage collection pressure by reusing objects instead of creating new instances. Pooling strategies and in-place modifications minimize allocation overhead.
# Compare object creation vs reuse
class StringProcessor
def initialize
@buffer = String.new
end
def process_with_reuse(data)
@buffer.clear
data.each do |item|
@buffer << item.to_s
@buffer << "\n"
end
@buffer.dup
end
def process_with_creation(data)
result = ""
data.each do |item|
result += item.to_s + "\n"
end
result
end
end
processor = StringProcessor.new
test_data = (1..1000).to_a
# Benchmark both approaches
reuse_stats = GC.stat
processor.process_with_reuse(test_data)
reuse_allocated = GC.stat[:total_allocated_objects] - reuse_stats[:total_allocated_objects]
creation_stats = GC.stat
processor.process_with_creation(test_data)
creation_allocated = GC.stat[:total_allocated_objects] - creation_stats[:total_allocated_objects]
puts "Reuse approach allocated: #{reuse_allocated} objects"
puts "Creation approach allocated: #{creation_allocated} objects"
puts "Reduction: #{((creation_allocated - reuse_allocated) / creation_allocated.to_f * 100).round(2)}%"
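Allocation can also be trimmed at the literal level, using only core Ruby: String.new(capacity:) preallocates a string's backing buffer so repeated << calls avoid incremental reallocations, and calling .freeze on a string literal (or using the frozen_string_literal magic comment) lets identical literals share a single object:

```ruby
# Preallocate the buffer for a string that will grow to roughly 20 KB
buffer = String.new(capacity: 20_000)
1000.times { |i| buffer << "row #{i}\n" }

# Frozen literals are deduplicated: both names point at the same object
a = "status".freeze
b = "status".freeze
puts a.equal?(b)   # => true (same object)
```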
Thread Safety & Concurrency
Ruby's garbage collector operates across all threads simultaneously, pausing execution in all threads during collection phases. This stop-the-world behavior affects multithreaded applications differently than single-threaded programs, requiring careful consideration of GC timing and thread coordination.
# Demonstrate GC impact on multiple threads
def create_worker_thread(name, work_size)
Thread.new do
puts "#{name} starting work"
start_time = Time.now
# Create objects that will become garbage
work_size.times do |i|
data = Array.new(100) { rand(1000) }
# Process data briefly then discard
data.sum if i % 100 == 0
end
duration = Time.now - start_time
puts "#{name} completed in #{duration}s"
end
end
# Start multiple threads with different workloads
threads = [
create_worker_thread("Heavy", 10_000),
create_worker_thread("Medium", 5_000),
create_worker_thread("Light", 1_000)
]
# Monitor GC while threads run
gc_monitor = Thread.new do
initial_count = GC.count
while threads.any?(&:alive?)
current_count = GC.count
if current_count > initial_count
puts "GC ran (all threads were paused during the cycle)"
initial_count = current_count
end
sleep 0.1
end
end
threads.each(&:join)
gc_monitor.kill
Manual garbage collection in multithreaded environments affects all threads simultaneously. Calling GC.start from any thread pauses execution across the entire process, making it unsuitable for background collection in responsive applications.
# Show cross-thread GC impact
shared_data = []
mutex = Mutex.new
gc_thread_active = true
# Background thread adding data
producer = Thread.new do
counter = 0
while gc_thread_active
mutex.synchronize do
shared_data << { id: counter, timestamp: Time.now }
counter += 1
end
sleep 0.01
end
end
# Foreground thread triggering GC
gc_controller = Thread.new do
sleep 1 # Let producer run
puts "Triggering GC - will pause producer"
start_size = nil
mutex.synchronize { start_size = shared_data.size }
GC.start # This pauses ALL threads
end_size = nil
mutex.synchronize { end_size = shared_data.size }
puts "Data added during GC: #{end_size - start_size} (typically 0 - no thread runs mid-collection)"
gc_thread_active = false
end
[producer, gc_controller].each(&:join)
Disabling garbage collection affects memory pressure across all threads. When one thread disables GC, memory allocation continues in all threads until collection is manually triggered or re-enabled from any thread.
# Memory pressure across threads with disabled GC
memory_data = {}
threads_finished = false
# Thread 1: Allocates large objects
allocator = Thread.new do
counter = 0
while !threads_finished
# Create large temporary objects
large_data = Array.new(10_000) { "data_#{counter}_#{rand(1000)}" }
counter += 1
sleep 0.05
end
puts "Allocator created #{counter} large arrays"
end
# Thread 2: Monitors memory usage
monitor = Thread.new do
GC.disable # GC state is process-wide: disabling it here affects every thread
while !threads_finished
stats = GC.stat
memory_data[Time.now] = {
live_slots: stats[:heap_live_slots],
free_slots: stats[:heap_free_slots],
pages: stats[:heap_allocated_pages]
}
sleep 0.1
end
GC.enable
GC.start # Clean up accumulated objects
puts "Memory monitoring complete"
end
sleep 2
threads_finished = true
[allocator, monitor].each(&:join)
# Show memory growth
sorted_times = memory_data.keys.sort
first_sample = memory_data[sorted_times.first]
last_sample = memory_data[sorted_times.last]
puts "Live slots grew from #{first_sample[:live_slots]} to #{last_sample[:live_slots]}"
puts "Pages grew from #{first_sample[:pages]} to #{last_sample[:pages]}"
Production Patterns
Production applications require garbage collection monitoring to identify memory leaks, optimize performance, and prevent out-of-memory conditions. Establishing baseline GC metrics helps detect abnormal memory usage patterns before they impact system stability.
# Production GC monitoring system
class GCMonitor
def initialize(alert_threshold: 100, sample_interval: 30)
@alert_threshold = alert_threshold
@sample_interval = sample_interval
@baseline_stats = nil
@alert_callbacks = []
end
def establish_baseline
# Take multiple samples to establish normal GC patterns
samples = []
5.times do
samples << GC.stat.dup
sleep @sample_interval / 5
end
@baseline_stats = {
avg_major_gc_count: samples.map { |s| s[:major_gc_count] }.sum / samples.size.to_f,
avg_minor_gc_count: samples.map { |s| s[:minor_gc_count] }.sum / samples.size.to_f,
avg_heap_pages: samples.map { |s| s[:heap_allocated_pages] }.sum / samples.size.to_f
}
puts "Baseline established: #{@baseline_stats}"
end
def monitor_continuously
Thread.new do
loop do
current_stats = GC.stat
check_for_anomalies(current_stats)
sleep @sample_interval
end
end
end
def on_alert(&callback)
@alert_callbacks << callback
end
private
def check_for_anomalies(stats)
return unless @baseline_stats
alerts = []
# Check for excessive major GC (counters are cumulative, so this flags growth past the baseline)
if stats[:major_gc_count] > @baseline_stats[:avg_major_gc_count] * 2
alerts << "High major GC count: #{stats[:major_gc_count]}"
end
# Check for heap growth
if stats[:heap_allocated_pages] > @baseline_stats[:avg_heap_pages] * 1.5
alerts << "Heap growth: #{stats[:heap_allocated_pages]} pages"
end
# Check for low free slots (memory pressure)
if stats[:heap_free_slots] < stats[:heap_live_slots] * 0.1
alerts << "Low free slots: #{stats[:heap_free_slots]}"
end
alerts.each { |alert| trigger_alert(alert, stats) }
end
def trigger_alert(message, stats)
@alert_callbacks.each { |callback| callback.call(message, stats) }
end
end
# Set up production monitoring
monitor = GCMonitor.new(sample_interval: 60)
monitor.on_alert do |message, stats|
puts "[ALERT] #{Time.now}: #{message}"
puts " Live objects: #{stats[:heap_live_slots]}"
puts " GC count: #{stats[:count]}"
# In production: send to logging system, metrics service, etc.
end
monitor.establish_baseline
monitoring_thread = monitor.monitor_continuously
# Simulate production load
load_simulation = Thread.new do
1000.times do |i|
# Simulate request processing
request_data = Array.new(rand(100..500)) { "request_#{i}_#{rand(1000)}" }
# Occasionally create memory pressure
if i % 100 == 0
large_response = "x" * 100_000
end
sleep 0.1
end
end
# Let monitoring run
load_simulation.join
sleep 10 # Allow final monitoring samples
Web applications benefit from strategic garbage collection timing between requests to minimize response latency. Middleware can trigger collection during idle periods or after memory-intensive operations.
# Rack middleware for GC optimization
class GCOptimizationMiddleware
def initialize(app, options = {})
@app = app
@gc_frequency = options[:gc_frequency] || 10
@memory_threshold = options[:memory_threshold] || 50_000
@request_count = 0
end
def call(env)
pre_request_stats = GC.stat
# Process the request
status, headers, response = @app.call(env)
post_request_stats = GC.stat
objects_allocated = post_request_stats[:total_allocated_objects] -
pre_request_stats[:total_allocated_objects]
# Decide whether to trigger GC
should_gc = should_trigger_gc?(objects_allocated)
if should_gc
GC.start
log_gc_decision(env, objects_allocated, true)
end
@request_count += 1
[status, headers, response]
end
private
def should_trigger_gc?(objects_allocated)
# Trigger GC based on multiple criteria
return true if objects_allocated > @memory_threshold
return true if @request_count % @gc_frequency == 0
return true if GC.stat[:heap_free_slots] < GC.stat[:heap_live_slots] * 0.2
false
end
def log_gc_decision(env, objects_allocated, triggered)
path = env['PATH_INFO']
method = env['REQUEST_METHOD']
if triggered
puts "[GC] Triggered after #{method} #{path} (#{objects_allocated} objects)"
end
end
end
# Usage in web application
# use GCOptimizationMiddleware, gc_frequency: 5, memory_threshold: 100_000
Database-heavy applications require careful GC management during bulk operations to prevent memory exhaustion while processing large result sets.
# Database batch processing with GC management
class BatchProcessor
def initialize(batch_size: 1000, gc_interval: 10)
@batch_size = batch_size
@gc_interval = gc_interval
end
def process_large_dataset(query)
batch_count = 0
total_processed = 0
gc_stats = { initial: GC.stat.dup }
# Simulate database cursor/streaming
simulate_database_results(query) do |batch|
process_batch(batch)
batch_count += 1
total_processed += batch.size
# Periodic GC to manage memory
if batch_count % @gc_interval == 0
before_gc = GC.stat[:heap_live_slots]
GC.start
after_gc = GC.stat[:heap_live_slots]
puts "Batch #{batch_count}: Processed #{total_processed} records"
puts " GC freed #{before_gc - after_gc} slots"
puts " Memory pages: #{GC.stat[:heap_allocated_pages]}"
end
end
gc_stats[:final] = GC.stat.dup
report_processing_stats(total_processed, gc_stats)
end
private
def simulate_database_results(query)
# Simulate large result set processing
total_records = 50_000
(0...total_records).each_slice(@batch_size) do |batch_ids|
# Simulate fetching batch from database
batch = batch_ids.map do |id|
{
id: id,
data: "record_data_#{id}" * rand(10..50),
metadata: { processed_at: Time.now, batch: id / @batch_size }
}
end
yield batch
end
end
def process_batch(batch)
# Simulate processing work that creates temporary objects
batch.each do |record|
# Transform data (creates intermediate objects)
processed = record[:data].upcase.split('_').join('-')
# Validate (creates temporary objects)
validation_result = processed.length > 10 && processed.include?('-')
# Store or transmit result (retain only necessary data)
store_processed_record(record[:id], validation_result)
end
end
def store_processed_record(id, valid)
# Simulate storing minimal data
@results ||= {}
@results[id] = valid
end
def report_processing_stats(total_processed, gc_stats)
initial = gc_stats[:initial]
final = gc_stats[:final]
puts "\nProcessing complete:"
puts " Records processed: #{total_processed}"
puts " GC runs: #{final[:count] - initial[:count]}"
puts " Objects allocated: #{final[:total_allocated_objects] - initial[:total_allocated_objects]}"
puts " Objects freed: #{final[:total_freed_objects] - initial[:total_freed_objects]}"
puts " Final heap pages: #{final[:heap_allocated_pages]}"
puts " Results stored: #{@results&.size || 0}"
end
end
# Process large dataset with controlled GC
processor = BatchProcessor.new(batch_size: 500, gc_interval: 5)
processor.process_large_dataset("SELECT * FROM large_table WHERE active = true")
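Long-running processes on Ruby 2.7+ can also defragment the heap: GC.compact moves live objects together, which can shrink the page count and improves copy-on-write sharing after fork. A hedged sketch, guarded because the API varies by Ruby version:

```ruby
if GC.respond_to?(:compact)        # Ruby 2.7+
  pages_before = GC.stat(:heap_allocated_pages)
  GC.compact                       # runs a full GC, then moves live objects together
  puts "Heap pages: #{pages_before} -> #{GC.stat(:heap_allocated_pages)}"
end

# Ruby 3.0+ can compact automatically during major collections:
GC.auto_compact = true if GC.respond_to?(:auto_compact=)
```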
Reference
Core Methods
Method | Parameters | Returns | Description |
---|---|---|---|
GC.start | full_mark: true, immediate_sweep: true | nil | Triggers garbage collection immediately |
GC.enable | none | true or false | Enables automatic GC; returns true if it was previously disabled |
GC.disable | none | true or false | Disables automatic GC; returns true if it was already disabled |
GC.count | none | Integer | Returns number of GC runs since process start |
GC.stat | hash = nil or key = nil | Hash or Integer | Returns all GC statistics, fills a given hash, or returns one value for a Symbol key |
GC Statistics Keys
Statistic | Type | Description |
---|---|---|
:count | Integer | Total garbage collections |
:major_gc_count | Integer | Major (full) garbage collections |
:minor_gc_count | Integer | Minor (generational) garbage collections |
:heap_allocated_pages | Integer | Total memory pages allocated |
:heap_available_slots | Integer | Total object slots available |
:heap_live_slots | Integer | Object slots currently in use |
:heap_free_slots | Integer | Object slots available for allocation |
:heap_final_slots | Integer | Object slots with finalizers still to run |
:total_allocated_objects | Integer | Objects allocated since process start |
:total_freed_objects | Integer | Objects freed by GC since process start |
Memory Metrics Reference
Metric | Calculation | Interpretation |
---|---|---|
Memory Utilization | heap_live_slots / heap_available_slots | Fraction of allocated slots in use |
Allocation Rate | total_allocated_objects / uptime | Objects allocated per second |
GC Frequency | count / uptime | Collections per second |
Collection Efficiency | total_freed_objects / total_allocated_objects | Fraction of objects eventually freed |
Heap Growth Rate | heap_allocated_pages over time | Memory expansion pattern |
GC Tuning Environment Variables
Variable | Values | Effect |
---|---|---|
RUBY_GC_HEAP_INIT_SLOTS | Integer | Initial object slots allocated |
RUBY_GC_HEAP_FREE_SLOTS | Integer | Minimum free slots maintained |
RUBY_GC_HEAP_GROWTH_FACTOR | Float | Heap expansion multiplier |
RUBY_GC_HEAP_GROWTH_MAX_SLOTS | Integer | Maximum slots added per expansion |
RUBY_GC_MALLOC_LIMIT | Integer | Malloc bytes that trigger GC |
RUBY_GC_MALLOC_LIMIT_MAX | Integer | Maximum malloc limit |
RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR | Float | Malloc limit growth rate |
ObjectSpace Integration
Method | Parameters | Returns | Description |
---|---|---|---|
ObjectSpace.count_objects | result_hash = nil | Hash | Object counts by type |
ObjectSpace.count_objects_size | result_hash = nil | Hash | Memory usage by object type (requires 'objspace') |
ObjectSpace.memsize_of(obj) | Object | Integer | Memory size of a specific object (requires 'objspace') |
ObjectSpace.memsize_of_all | klass = nil | Integer | Total memory usage, optionally limited to one class (requires 'objspace') |
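The size-related methods in this table come from the objspace extension in the standard library and need require 'objspace' before use:

```ruby
require 'objspace'

s = "x" * 1_000
puts ObjectSpace.memsize_of(s)                   # bytes held by this one string
puts ObjectSpace.count_objects_size[:T_STRING]   # bytes held by all strings
puts ObjectSpace.memsize_of_all(Hash)            # bytes held by all live Hashes
```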
Performance Patterns
Pattern | Code Example | Use Case |
---|---|---|
Batch GC | GC.disable; process_batch; GC.enable; GC.start | Eliminate GC pauses during critical work |
Memory Monitoring | before = GC.stat; work; after = GC.stat | Track allocation patterns |
Heap Preallocation | Set RUBY_GC_HEAP_INIT_SLOTS | Reduce early heap expansions |
Object Reuse | @buffer.clear; populate_buffer | Minimize allocation overhead |
Periodic Collection | GC.start if counter % interval == 0 | Control collection timing |
Common GC Statistics Combinations
# Memory pressure indicators
def memory_pressure_score
stats = GC.stat
live_ratio = stats[:heap_live_slots].to_f / stats[:heap_available_slots]
free_ratio = stats[:heap_free_slots].to_f / stats[:heap_available_slots]
pressure = (live_ratio * 0.7) + ((1.0 - free_ratio) * 0.3)
(pressure * 100).round(2)
end
# Allocation efficiency metrics
def allocation_efficiency
stats = GC.stat
return 0 if stats[:total_allocated_objects] == 0
freed_ratio = stats[:total_freed_objects].to_f / stats[:total_allocated_objects]
(freed_ratio * 100).round(2)
end
# GC overhead estimation
def gc_overhead_estimate
stats = GC.stat
# Rough estimate: each major GC ~5ms, minor GC ~1ms
major_time = stats[:major_gc_count] * 0.005
minor_time = stats[:minor_gc_count] * 0.001
{
major_gc_time: major_time,
minor_gc_time: minor_time,
total_gc_time: major_time + minor_time
}
end
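For measured rather than estimated overhead, the built-in GC::Profiler records each collection's pause time while it is enabled:

```ruby
# Replace the rough estimates above with measured pause times
GC::Profiler.enable
3.times do
  Array.new(200_000) { rand }   # churn to provoke collections
  GC.start
end
puts "Total GC time: #{GC::Profiler.total_time}s"
GC::Profiler.report             # per-collection table printed to $stdout
GC::Profiler.disable
```

Profiling adds bookkeeping overhead per collection, so enable it for diagnosis rather than leaving it on permanently in production.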