Overview
Ruby manages memory automatically through a mark-and-sweep garbage collector that reclaims unreachable objects. The collector is generational and incremental: it tracks object references and frees memory once objects can no longer be reached from the program's roots. Ruby's memory management operates transparently but provides interfaces for monitoring and tuning performance-critical applications.
The garbage collector divides objects into generations based on survival patterns. New objects start in the young generation; objects that survive several minor collections are promoted to the old generation, which is scanned only during the less frequent major collections. This generational approach optimizes collection frequency because most objects die young, so minor collections can skip the bulk of long-lived data.
# Object creation triggers memory allocation
user = User.new(name: "Alice")
data = Array.new(1000) { |i| "item_#{i}" }
# Objects become eligible for collection when unreferenced
user = nil
data = nil
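The generational split shows up in GC.stat, which reports minor and major collection counts separately. A minimal sketch; the exact numbers depend on the Ruby version and what the process has already allocated:
# Short-lived objects are reclaimed by frequent minor collections
minor_before = GC.stat(:minor_gc_count)
major_before = GC.stat(:major_gc_count)
100_000.times { Object.new }
puts "minor GCs: #{GC.stat(:minor_gc_count) - minor_before}"
puts "major GCs: #{GC.stat(:major_gc_count) - major_before}"
GC.latest_gc_info # => {:major_by=>nil, :gc_by=>:newobj, ...}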
ObjectSpace provides the primary interface for memory introspection. The module exposes object counting, memory statistics, and garbage collection controls. These tools help diagnose memory usage patterns and optimize allocation-intensive code.
# Count objects by class
ObjectSpace.count_objects
# => {:TOTAL=>47891, :FREE=>3421, :T_OBJECT=>1205, ...}
# Force garbage collection
GC.start
# Examine memory statistics
GC.stat
# => {:count=>23, :heap_allocated_pages=>117, :heap_sorted_length=>117, ...}
Ruby's memory model involves both stack and heap memory. Local variables and method parameters live on the stack, but they hold references to objects; immediate values such as small Integers, Symbols, true, false, and nil are encoded directly in the reference and never occupy heap slots. Objects created with .new (and most literals) are allocated on the heap, and only heap memory is managed by the garbage collector, which is why stack frames and immediates carry no collection cost.
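One way to see the difference is object_id: an immediate value has a single shared identity, while every heap-allocated object gets its own. The results below assume frozen string literals are not enabled.
# Immediates share one identity; heap objects do not
42.object_id == 42.object_id                  # => true  (immediate Integer)
:name.object_id == :name.object_id            # => true  (Symbols are interned)
"a".object_id == "a".object_id                # => false (two separate heap Strings)
Object.new.object_id == Object.new.object_id  # => false (distinct heap objects)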
Basic Usage
Garbage collection runs automatically, but applications can trigger a collection manually with GC.start. Manual collection helps control timing in performance-sensitive code sections. The collector provides statistics through GC.stat and object counting through ObjectSpace.count_objects.
# Manual garbage collection
GC.start
# Disable automatic collection temporarily
GC.disable
# ... performance-critical code
GC.enable
# GC.enable returns true if collection was previously disabled,
# so calling it also reports the collector's prior state
puts "GC was disabled: #{GC.enable}"
ObjectSpace.each_object iterates through all live objects, optionally restricted to a given class or module. This method helps identify memory leaks and analyze object distribution, but walking the entire heap is expensive and best reserved for debugging rather than hot code paths.
# Count string objects
string_count = 0
ObjectSpace.each_object(String) { |str| string_count += 1 }
puts "Live strings: #{string_count}"
# Find large arrays
large_arrays = []
ObjectSpace.each_object(Array) do |arr|
  large_arrays << arr if arr.size > 1000
end
Finalizers registered with ObjectSpace.define_finalizer attach cleanup code to objects. A finalizer runs after its object has been reclaimed, so it must not reference the object itself; any external resources it needs, such as file handles or network connections, have to be captured separately when the finalizer is created.
class FileManager
  def initialize(filename)
    @file = File.open(filename)
    # Register finalizer for cleanup
    ObjectSpace.define_finalizer(self, self.class.finalizer(@file))
  end

  def self.finalizer(file)
    proc { file.close unless file.closed? }
  end
end
Memory profiling uses the objspace extension to track allocation patterns. ObjectSpace.trace_object_allocations_start records the source file, line, and class for each subsequent allocation, which helps diagnose memory growth. Tracing adds overhead but provides detailed allocation information.
require 'objspace'

# Enable allocation tracing
ObjectSpace.trace_object_allocations_start

# Run code under analysis, keeping references so allocations can be inspected
samples = Array.new(1000) { "string_#{rand(100)}" }

# Examine where a traced object was allocated
ObjectSpace.allocation_sourcefile(samples.first) # => "filename.rb"
ObjectSpace.allocation_sourceline(samples.first) # => line where the string was created

ObjectSpace.trace_object_allocations_stop
Performance & Memory
Memory allocation performance varies significantly between object types and sizes. String and array allocation dominates most application profiles. Small objects allocate faster than large objects due to memory pool management. Understanding allocation patterns helps optimize critical code paths.
Every object occupies a fixed-size slot on the Ruby heap (historically 40 bytes; recent versions use several slot sizes). Data that does not fit in the slot, such as long string buffers or large array backing stores, is allocated with malloc and referenced from the slot. Arrays and hashes grow their capacity in steps to reduce reallocations, and operations like String#dup can share a buffer copy-on-write until one copy is modified.
# Benchmark different allocation patterns
require 'benchmark'

Benchmark.bm(20) do |x|
  x.report("small objects:") do
    100_000.times { Object.new }
  end

  x.report("large arrays:") do
    1000.times { Array.new(10_000, 0) }
  end

  x.report("string allocation:") do
    100_000.times { |i| "string_#{i}" }
  end
end
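A rough way to see the slot-versus-malloc split is ObjectSpace.memsize_of from the objspace extension; the exact byte counts vary by Ruby version and platform.
require 'objspace'

# Small objects fit in a single heap slot; large data spills into malloc'd memory
ObjectSpace.memsize_of(Object.new)   # => ~40 bytes (one slot)
ObjectSpace.memsize_of("x" * 10_000) # => roughly 10,000 bytes plus slot overhead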
Garbage collection frequency depends on heap growth and allocation rate: the collector triggers when free slots run low or when malloc'd memory exceeds its limit. Tuning RUBY_GC_HEAP_GROWTH_FACTOR and RUBY_GC_HEAP_GROWTH_MAX_SLOTS controls how aggressively the heap grows between collections. Lower factors reduce memory usage but increase collection frequency.
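These variables are read at process startup, so they are set in the environment rather than from Ruby code. A small sketch that checks the configured value and watches the heap grow under sustained allocation:
# Inspect the configured growth factor (unset means the 1.8 default)
puts ENV.fetch("RUBY_GC_HEAP_GROWTH_FACTOR", "unset (default 1.8)")

# Watch heap pages grow as live objects accumulate
pages_before = GC.stat(:heap_allocated_pages)
objects = Array.new(500_000) { Object.new } # keep references so the heap must expand
pages_after = GC.stat(:heap_allocated_pages)
puts "heap pages: #{pages_before} -> #{pages_after}"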
Memory-intensive applications benefit from preallocating the heap with RUBY_GC_HEAP_INIT_SLOTS and from measuring collection cost with GC::Profiler. The profiler tracks collection timing and frequency across program execution; applications can use this data to optimize allocation patterns and reduce collection overhead.
# Profile garbage collection
GC::Profiler.enable
GC::Profiler.clear
# Run memory-intensive code
data = Array.new(100_000) { |i| { id: i, value: "item_#{i}" } }
processed = data.map { |item| item[:value].upcase }
# Analyze collection profile
report = GC::Profiler.result
puts report
# Examine specific metrics
GC::Profiler.total_time # => 0.0123 seconds
Object pooling reduces allocation overhead for frequently created objects. Pools maintain pre-allocated objects for reuse instead of creating new instances. This technique works best for objects with expensive initialization or high allocation rates.
require 'socket'

class ConnectionPool
  def initialize(size = 10)
    @pool = []
    @mutex = Mutex.new
    size.times { @pool << create_connection }
  end

  def checkout
    @mutex.synchronize do
      @pool.pop || create_connection
    end
  end

  def checkin(connection)
    @mutex.synchronize do
      @pool.push(connection) if @pool.size < 20
    end
  end

  private

  def create_connection
    # Expensive connection creation
    { socket: TCPSocket.new("localhost", 8080), created_at: Time.now }
  end
end
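A brief usage sketch for the pool above; the localhost:8080 endpoint is only the placeholder from create_connection.
pool = ConnectionPool.new(5)

conn = pool.checkout
begin
  conn[:socket].write("ping\n") # use the pooled connection
ensure
  pool.checkin(conn)            # always return it, even if an error occurs
end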
Thread Safety & Concurrency
Ruby's Global VM Lock (GVL, often called the GIL) ensures only one thread executes Ruby code at a time. Garbage collection is stop-the-world: before it proceeds, every thread must reach a safe point, which creates brief synchronization pauses across the entire application.
Collection uses a tri-color marking algorithm run incrementally: objects are marked white (unvisited), gray (marked but not yet scanned), or black (marked and scanned), and marking work is split into small steps interleaved with program execution. This reduces the length of individual pauses but requires write barriers to keep the marking state consistent.
# Demonstrate GC coordination across threads
threads = []
mutex = Mutex.new
gc_count = 0

5.times do |i|
  threads << Thread.new do
    1000.times do
      # Allocate objects in each thread
      Array.new(100) { "thread_#{i}_data" }

      # Track GC occurrences
      current_count = GC.count
      mutex.synchronize do
        if current_count > gc_count
          gc_count = current_count
          puts "GC ##{gc_count} occurred during thread #{i}"
        end
      end
    end
  end
end

threads.each(&:join)
Because only one thread holds the GVL at a time, object allocation itself cannot race between threads; contention shows up as threads waiting for the lock rather than as corrupted heap state. Collection still requires coordination: every thread must be paused at a safe point before marking and sweeping run, so allocation-heavy threads directly influence how often all threads pause.
Finalizers execute at unpredictable times on whichever thread happens to trigger collection (or at process exit), so they must handle concurrent access correctly. Finalizers should avoid touching shared state, or protect it with appropriate synchronization primitives.
require 'securerandom'

class ThreadSafeResource
  @@cleanup_mutex = Mutex.new
  @@resources = []

  def initialize
    @resource_id = SecureRandom.uuid
    @@cleanup_mutex.synchronize do
      @@resources << @resource_id
    end
    ObjectSpace.define_finalizer(self, self.class.finalizer(@resource_id))
  end

  def self.finalizer(resource_id)
    proc do
      @@cleanup_mutex.synchronize do
        @@resources.delete(resource_id)
        # Cleanup external resources safely
      end
    end
  end

  def self.active_resources
    @@cleanup_mutex.synchronize { @@resources.dup }
  end
end
Common Pitfalls
Memory leaks occur when objects remain reachable through unintended references. Global variables, class variables, and constants keep everything they reference alive for the life of the process. Circular references, by contrast, are not a problem for Ruby's mark-and-sweep collector: a cycle is collected once nothing outside it references it.
# Problematic global reference
$cached_data = []

def process_request(data)
  # Accidentally accumulates data forever
  $cached_data << data
  # ... process data
end

# Better approach with size limits
class RequestCache
  def initialize(max_size = 1000)
    @cache = []
    @max_size = max_size
  end

  def add(data)
    @cache << data
    @cache.shift if @cache.size > @max_size
  end
end
String concatenation with + or += builds a brand-new string on every operation, so repeated concatenation in loops generates significant garbage. Appending with <<, which mutates the receiver in place, or building an array of pieces and joining them once, reduces allocations.
# Inefficient string building
result = ""
1000.times { |i| result += "item_#{i}," }
# More efficient approaches
result = []
1000.times { |i| result << "item_#{i}" }
final_string = result.join(",")
# Or using string buffer
result = String.new
1000.times { |i| result << "item_#{i}," }
Hash and array growth triggers internal reallocation and copying. Pre-allocating with Array.new(size) or String.new(capacity: n) reduces reallocations when the final size is roughly known. For debugging growth patterns, ObjectSpace.memsize_of from the objspace extension reports how much memory an individual object currently occupies.
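A short sketch of the pre-sizing idea; the memsize_of figures are approximate and version dependent.
require 'objspace'

# Pre-size buffers whose final size is roughly known
buffer = String.new(capacity: 16_384) # capacity hint avoids repeated buffer resizes
rows   = Array.new(10_000)            # backing store allocated up front

1_000.times { |i| buffer << "row #{i}\n" }

ObjectSpace.memsize_of(buffer) # => approximate bytes held by the string
ObjectSpace.memsize_of(rows)   # => approximate bytes held by the array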
Finalizers create subtle ordering dependencies and resource management issues. A finalizer must not reference the object it is attached to (doing so keeps the object alive forever) and runs at an unpredictable time. External resource cleanup should use explicit close methods rather than relying on finalization.
# Problematic finalizer usage
class BadFileHandler
  def initialize(filename)
    @filename = filename
    @file = File.open(filename)
    # The block captures self (through @file), so the object can never be
    # collected and the finalizer never runs
    ObjectSpace.define_finalizer(self) do
      @file.close # This won't work!
    end
  end
end
# Better explicit resource management
class GoodFileHandler
  def initialize(filename)
    @file = File.open(filename)
  end

  def close
    @file.close unless @file.closed?
    @file = nil
  end

  def self.open(filename)
    handler = new(filename)
    if block_given?
      begin
        yield handler
      ensure
        handler.close
      end
    else
      handler
    end
  end
end
Storing every object yielded by ObjectSpace.each_object keeps those objects alive, so long-running scans that accumulate results can cause memory buildup. Recording derived data such as sizes or class names instead of the objects themselves, and reporting in chunks, keeps memory flat during the scan.
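A small sketch of that approach, accumulating byte counts rather than the strings themselves:
# Record derived data (sizes), so the scan itself retains no enumerated objects
total_bytes = 0
count = 0

ObjectSpace.each_object(String) do |str|
  total_bytes += str.bytesize
  count += 1
  puts "#{count} strings scanned, #{total_bytes} bytes so far" if (count % 50_000).zero?
end

puts "#{count} live strings, #{total_bytes} bytes total"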
Closure captures retain references to enclosing scopes, potentially preventing garbage collection of large objects. Proc and lambda objects capture the entire binding, not just referenced variables. Explicit niling of unused variables helps release references.
# Closure retains large dataset unnecessarily
large_dataset = Array.new(1_000_000) { rand }
small_value = large_dataset.first
processor = proc { puts "Processing: #{small_value}" }
# large_dataset remains referenced through closure
# Better to explicitly clear
large_dataset = nil
processor = proc { puts "Processing: #{small_value}" }
Reference
Garbage Collection Methods
Method | Parameters | Returns | Description
---|---|---|---
GC.start | full_mark: true, immediate_sweep: true | nil | Initiates a garbage collection cycle
GC.enable | None | true/false | Enables automatic garbage collection
GC.disable | None | true/false | Disables automatic garbage collection
GC.count | None | Integer | Returns total number of GC runs
GC.stat | key = nil | Hash/Integer | Returns GC statistics hash or a specific value
ObjectSpace Methods
Method | Parameters | Returns | Description
---|---|---|---
ObjectSpace.count_objects | result_hash = nil | Hash | Counts objects by type
ObjectSpace.each_object | class = nil, &block | Integer | Iterates through live objects
ObjectSpace.define_finalizer | obj, callable | Array | Registers object finalizer
ObjectSpace.undefine_finalizer | obj | obj | Removes object finalizer
ObjectSpace.garbage_collect | full_mark: true, immediate_sweep: true | nil | Triggers garbage collection
Allocation Tracing Methods
Method | Parameters | Returns | Description
---|---|---|---
ObjectSpace.trace_object_allocations_start | None | nil | Enables allocation tracing
ObjectSpace.trace_object_allocations_stop | None | nil | Disables allocation tracing
ObjectSpace.allocation_sourcefile | object | String/nil | Returns allocation source file
ObjectSpace.allocation_sourceline | object | Integer/nil | Returns allocation source line
ObjectSpace.allocation_class_path | object | String/nil | Returns allocation class path
GC Statistics Keys
Key | Type | Description
---|---|---
:count | Integer | Total garbage collections performed
:heap_allocated_pages | Integer | Number of heap pages allocated
:heap_sorted_length | Integer | Length of the sorted heap page list (page capacity)
:heap_allocatable_pages | Integer | Number of heap pages available for allocation
:heap_available_slots | Integer | Number of heap slots available for objects
:heap_live_slots | Integer | Number of heap slots containing live objects
:heap_free_slots | Integer | Number of heap slots available for allocation
:heap_final_slots | Integer | Number of heap slots pending finalization
:old_objects | Integer | Number of old generation objects
:old_objects_limit | Integer | Threshold for old generation collection
:oldmalloc_increase_bytes | Integer | Bytes allocated outside the Ruby heap since the last major GC
:oldmalloc_increase_bytes_limit | Integer | Threshold for malloc-triggered major collection
Object Count Types
Type | Description
---|---
:TOTAL | Total number of allocated objects
:FREE | Number of free object slots
:T_OBJECT | Basic Ruby objects
:T_CLASS | Class objects
:T_MODULE | Module objects
:T_FLOAT | Float objects
:T_STRING | String objects
:T_REGEXP | Regular expression objects
:T_ARRAY | Array objects
:T_HASH | Hash objects
:T_FILE | File objects
:T_DATA | Extension data objects
:T_MATCH | MatchData objects
:T_COMPLEX | Complex number objects
:T_RATIONAL | Rational number objects
Environment Variables
Variable | Default | Description
---|---|---
RUBY_GC_HEAP_INIT_SLOTS | 10000 | Initial heap slot count
RUBY_GC_HEAP_FREE_SLOTS | 4096 | Minimum free slots maintained after GC
RUBY_GC_HEAP_GROWTH_FACTOR | 1.8 | Heap growth multiplier
RUBY_GC_HEAP_GROWTH_MAX_SLOTS | 0 | Maximum slots added per growth (0 = unlimited)
RUBY_GC_MALLOC_LIMIT | 16MB | Malloc limit before collection
RUBY_GC_MALLOC_LIMIT_MAX | 32MB | Maximum malloc limit
RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR | 1.4 | Malloc limit growth factor
RUBY_GC_OLDMALLOC_LIMIT | 16MB | Old malloc limit before major collection
RUBY_GC_OLDMALLOC_LIMIT_MAX | 128MB | Maximum old malloc limit