CrackedRuby

Heap Compaction

Ruby's heap compaction feature reduces memory fragmentation by relocating objects to create contiguous memory spaces.


Overview

Heap compaction addresses memory fragmentation in Ruby's garbage collector by moving objects to create larger contiguous blocks of free memory. Ruby's mark-and-sweep garbage collector can leave gaps between allocated objects over time, leading to fragmented heap pages that cannot be returned to the operating system even when total memory usage is low.

The compaction process works by identifying moveable objects during garbage collection and relocating them to eliminate gaps. Ruby updates all references to moved objects automatically, maintaining program correctness while improving memory layout. This process requires cooperation between the garbage collector and the Ruby virtual machine to track and update object references.

Ruby implements compaction through several key components. The GC module provides the primary interface with methods like GC.compact and GC.verify_compaction_references. The garbage collector maintains internal structures to track object movement and reference updates. The GC.auto_compact setting controls whether compaction also runs automatically during major collections.

# Check current compaction support
GC.respond_to?(:compact)
# => true

# Trigger manual compaction
GC.compact
# => {...}  # Returns a Hash of per-type statistics about compaction

# Enable automatic compaction (Ruby 3.0+)
GC.auto_compact = true

Compaction pays off mainly for long-lived objects. Objects that survive many collections tend to end up scattered across pages that are otherwise mostly empty, and packing them together creates dense regions of memory. Short-lived objects are reclaimed quickly and contribute little to long-term fragmentation.

The process maintains object identity and behavior while changing physical memory locations. Ruby handles reference updates transparently, including updates to constants, global variables, local variables, and object instance variables. This transparency means most Ruby code works unchanged with compaction enabled.
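A minimal sketch of that transparency, assuming a Ruby build where GC.compact is available. The helper names `compact_if_supported` and `identity_check` are illustrative and not part of Ruby. The sketch holds references in a hash, compacts, and then checks that the same variables still reach the same logical objects.

```ruby
# Illustrative helper (not part of Ruby): compacts when the platform
# supports it and returns nil otherwise.
def compact_if_supported
  return nil unless GC.respond_to?(:compact)

  GC.compact
rescue NotImplementedError
  nil
end

# Illustrative check: references held before compaction should still
# reach the same objects, with the same contents, afterwards.
def identity_check
  registry = { greeting: +"hello", numbers: [1, 2, 3] }
  refs_before = registry.values
  ids_before = refs_before.map(&:object_id)

  compact_if_supported

  {
    same_objects: registry.values.zip(refs_before).all? { |now, was| now.equal?(was) },
    same_object_ids: registry.values.map(&:object_id) == ids_before,
    same_contents: registry == { greeting: "hello", numbers: [1, 2, 3] }
  }
end

p identity_check
```

Every value in the returned hash should be true, because Ruby rewrites the references for any object it moves.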

Basic Usage

Manual compaction provides direct control over when memory defragmentation occurs. The GC.compact method triggers an immediate compaction cycle and returns detailed statistics about the operation. This approach works well for applications that can predict optimal compaction timing.

# Perform compaction and examine results
stats = GC.compact
# Each entry is a Hash of per-type counts, e.g. { T_STRING: 12, ... }
puts "Objects considered: #{stats[:considered].values.sum}"
puts "Objects moved: #{stats[:moved].values.sum}"

# Compaction during low-activity periods
def compact_during_maintenance
  puts "Starting compaction..."
  result = GC.compact
  puts "Moved #{result[:moved].values.sum} objects"
  result
end
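Because the statistics are grouped by internal object type, a small helper can collapse them into totals. This is a sketch only: `compaction_totals` is a hypothetical name, and the sample hash is hand-written to mirror the shape of the returned data rather than taken from a real run.

```ruby
# Hypothetical helper: collapses per-type compaction statistics into one
# total per top-level key. Non-Hash values are passed through unchanged.
def compaction_totals(stats)
  return {} unless stats.is_a?(Hash)

  stats.each_with_object({}) do |(key, value), totals|
    totals[key] = value.is_a?(Hash) ? value.values.sum : value
  end
end

# Hand-written sample in the same shape as the compaction statistics
sample = { considered: { T_STRING: 3, T_ARRAY: 2 }, moved: { T_STRING: 1 } }
p compaction_totals(sample)
```

In an application the argument would be the return value of GC.compact instead of the sample hash.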

Automatic compaction lets the garbage collector compact the heap without explicit application control. When GC.auto_compact is enabled (Ruby 3.0 and later), compaction runs as part of every major garbage collection. There is no separate setting for compaction frequency or trigger conditions, and enabling it makes major collections slower.

# Enable automatic compaction
GC.auto_compact = true

# Check current setting
puts "Auto compaction: #{GC.auto_compact}"

# Disable automatic compaction
GC.auto_compact = false

The GC.latest_compact_info method (Ruby 3.0 and later) returns statistics from the most recent compaction, whether triggered manually or automatically. The hash groups counts by internal object type under :considered and :moved, which helps applications judge compaction effectiveness. It does not record when the compaction ran.

# Get information about last compaction
info = GC.latest_compact_info
moved = info[:moved].values.sum
if moved > 0
  puts "Objects considered: #{info[:considered].values.sum}"
  puts "Objects moved: #{moved}"
  puts "Moved by type: #{info[:moved]}"
else
  puts "No objects have been moved by compaction yet"
end

Verification methods help ensure compaction correctness during development and testing. The GC.verify_compaction_references method runs a full compaction and then checks that no object still refers to a moved object. A stale reference crashes the process instead of raising an exception, so it cannot be rescued. This verification runs slowly and should only be used in development.

# Verify reference consistency (development only)
# A stale reference aborts the process rather than raising, so reaching
# the puts below means verification passed.
GC.verify_compaction_references
puts "All references verified successfully"

Performance & Memory

Heap compaction trades CPU time for memory efficiency improvements. The compaction process requires additional work during garbage collection cycles but can significantly reduce memory fragmentation and improve cache locality. Applications must evaluate whether the memory benefits justify the processing overhead.

Memory fragmentation occurs when allocated objects leave gaps in heap pages after some objects get garbage collected. These gaps prevent Ruby from releasing entire pages back to the operating system, leading to higher resident memory usage than necessary. Compaction eliminates these gaps by moving surviving objects together.

# Measure memory usage before and after compaction
def measure_compaction_impact
  GC.start # Clean up before measurement
  before_rss = `ps -o rss= -p #{Process.pid}`.to_i
  
  stats = GC.compact
  
  after_rss = `ps -o rss= -p #{Process.pid}`.to_i
  
  puts "RSS before: #{before_rss} KB"
  puts "RSS after: #{after_rss} KB"
  puts "Memory saved: #{before_rss - after_rss} KB"
  puts "Objects moved: #{stats[:moved].values.sum}"
end
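RSS is a coarse, process-wide signal. As a complementary view, the sketch below estimates how densely the heap's slots are occupied, using the :heap_live_slots and :heap_available_slots keys of GC.stat. The helper name `slot_occupancy` is an assumption for illustration, and the ratio is only a rough proxy for fragmentation.

```ruby
# Hypothetical helper: fraction of available heap slots holding live objects.
# Accepts a stat hash so it can be exercised with fixed numbers.
def slot_occupancy(stat = GC.stat)
  available = stat[:heap_available_slots].to_i
  return nil if available.zero?

  stat[:heap_live_slots].to_f / available
end

puts "Current slot occupancy: #{slot_occupancy.inspect}"
```

A low ratio alongside a high page count suggests that live objects are spread thinly across pages, which is the situation compaction is meant to improve.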

The compaction overhead varies based on heap size and object connectivity. Applications with many interconnected objects require more work to update references during compaction. Dense object graphs with frequent cross-references create more update work than simple object hierarchies.

Cache locality improvements from compaction can boost application performance by reducing memory access costs. When related objects are physically close in memory, the processor can load them into cache more efficiently. This benefit particularly helps workloads that traverse object graphs or process related data structures.

# Benchmark object traversal before and after compaction
require 'benchmark'

class Node
  attr_accessor :value, :next
  
  def initialize(value)
    @value = value
  end
end

# Create a linked list
head = Node.new(1)
current = head
(2..10_000).each do |i|
  current.next = Node.new(i)
  current = current.next
end

def traverse_list(head)
  count = 0
  current = head
  while current
    count += current.value
    current = current.next
  end
  count
end

# Benchmark before compaction
time_before = Benchmark.realtime { 100.times { traverse_list(head) } }

GC.compact

# Benchmark after compaction  
time_after = Benchmark.realtime { 100.times { traverse_list(head) } }

puts "Time before compaction: #{time_before.round(4)}s"
puts "Time after compaction: #{time_after.round(4)}s"
puts "Performance change: #{((time_after - time_before) / time_before * 100).round(2)}%"

Automatic compaction frequency affects both memory usage and performance overhead. More frequent compaction keeps fragmentation low but increases garbage collection time. Less frequent compaction reduces CPU overhead but allows more fragmentation to accumulate between compaction cycles.

With GC.auto_compact enabled, Ruby does not schedule compaction separately: it compacts as part of every major collection, so compaction frequency follows the major GC rate, which depends on heap size and allocation rate. Applications with steady allocation patterns get fairly predictable compaction intervals. Applications with bursty allocation patterns may prefer to leave auto compaction off and call GC.compact manually at chosen points.
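The link between automatic compaction and major collections can be observed with a short sketch. It assumes Ruby 3.0 or later on a platform that supports compaction and returns nil otherwise. The method name `major_gc_with_auto_compact` is illustrative.

```ruby
# Illustrative sketch: enable auto compaction, request a major GC, and
# report how many major collections ran plus which statistics are available.
def major_gc_with_auto_compact
  return nil unless GC.respond_to?(:auto_compact=) && GC.respond_to?(:latest_compact_info)

  previous = GC.auto_compact
  GC.auto_compact = true
  majors_before = GC.stat(:major_gc_count)

  GC.start(full_mark: true, immediate_sweep: true) # request a major collection

  {
    major_gcs_run: GC.stat(:major_gc_count) - majors_before,
    compact_info_keys: GC.latest_compact_info.keys
  }
rescue NotImplementedError
  nil # compaction is not supported on this platform
ensure
  # Restore the caller's setting if it was read successfully
  GC.auto_compact = previous unless previous.nil?
end

p major_gc_with_auto_compact
```

Tracking :major_gc_count over time in production gives a reasonable estimate of how often automatic compaction is running.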

Production Patterns

Production deployments require careful compaction planning to balance memory optimization with application responsiveness. High-traffic applications cannot afford lengthy garbage collection pauses, while memory-constrained environments need aggressive fragmentation control. The optimal approach depends on specific application characteristics and infrastructure constraints.

Scheduled compaction during maintenance windows provides memory optimization without impacting user-facing operations. Applications can trigger compaction during low-traffic periods or planned maintenance intervals. This pattern works well for applications with predictable traffic patterns and defined maintenance schedules.

# Schedule compaction during off-peak hours
class CompactionScheduler
  COMPACTION_HOUR = 2 # 2 AM local time

  def should_compact?
    Time.now.hour == COMPACTION_HOUR && low_traffic_period?
  end

  def perform_scheduled_compaction
    return unless should_compact?

    Rails.logger.info "Starting scheduled heap compaction"
    stats = GC.compact
    moved = stats[:moved].values.sum # :moved maps object types to counts
    Rails.logger.info "Compaction completed: #{moved} objects moved"

    # Report compaction results
    report_memory_impact(stats)
  end

  private

  def low_traffic_period?
    # Check request rate or other metrics.
    # current_requests_per_minute is an application-specific placeholder.
    current_requests_per_minute < 100
  end

  def report_memory_impact(stats)
    # Send metrics to monitoring system (StatsClient is a placeholder)
    StatsClient.gauge('heap_compaction.objects_considered', stats[:considered].values.sum)
    StatsClient.gauge('heap_compaction.objects_moved', stats[:moved].values.sum)
  end
end

Memory threshold-based compaction triggers optimization when fragmentation exceeds acceptable levels. Applications monitor heap statistics and trigger compaction when fragmentation metrics indicate potential memory waste. This approach adapts to changing application behavior without fixed schedules.

class FragmentationMonitor
  def initialize(threshold_mb: 50)
    @threshold_bytes = threshold_mb * 1024 * 1024
  end
  
  def fragmentation_level
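    stat = GC.stat
    # :heap_tomb_pages is missing on some Ruby versions, so default it to 0
    heap_pages = stat[:heap_eden_pages] + stat.fetch(:heap_tomb_pages, 0)
    # Page size is an internal constant, not a GC.stat key
    heap_size = heap_pages * GC::INTERNAL_CONSTANTS[:HEAP_PAGE_SIZE]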
    
    live_objects = stat[:heap_live_slots]
    estimated_live_size = live_objects * average_object_size
    
    heap_size - estimated_live_size
  end
  
  def should_compact?
    fragmentation_level > @threshold_bytes
  end
  
  def auto_compact_if_needed
    if should_compact?
      Rails.logger.info "Fragmentation threshold exceeded, compacting heap"
      stats = GC.compact
      log_compaction_results(stats)
    end
  end
  
  private
  
  def average_object_size
    # Estimate based on application characteristics
    64 # bytes per object average
  end
  
  def log_compaction_results(stats)
    fragmentation_after = fragmentation_level
    Rails.logger.info "Compaction reduced fragmentation to #{fragmentation_after / 1024 / 1024} MB"
  end
end

Container environments benefit from compaction strategies that work within memory limits and orchestration constraints. Kubernetes and Docker deployments need compaction approaches that consider container lifecycle and resource constraints. Memory-limited containers require aggressive compaction to stay within limits.

class ContainerCompactionStrategy
  def initialize
    @memory_limit = detect_container_memory_limit
    @warning_threshold = @memory_limit * 0.8
    @critical_threshold = @memory_limit * 0.9
  end
  
  def monitor_and_compact
    current_usage = current_memory_usage
    
    case
    when current_usage > @critical_threshold
      emergency_compact
    when current_usage > @warning_threshold
      gentle_compact
    end
  end
  
  private
  
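  def detect_container_memory_limit
    default = 2 * 1024 * 1024 * 1024 # 2GB default
    # cgroup v2 exposes memory.max ("max" means unlimited);
    # cgroup v1 exposes memory/memory.limit_in_bytes
    path = ['/sys/fs/cgroup/memory.max',
            '/sys/fs/cgroup/memory/memory.limit_in_bytes'].find { |p| File.readable?(p) }
    return default unless path

    limit_bytes = File.read(path).to_i # "max".to_i is 0
    # Handle unlimited case
    limit_bytes.zero? || limit_bytes > (1 << 62) ? default : limit_bytes
  rescue SystemCallError
    default
  end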
  
  def current_memory_usage
    `ps -o rss= -p #{Process.pid}`.to_i * 1024 # Convert KB to bytes
  end
  
  def emergency_compact
    Rails.logger.warn "Emergency compaction triggered"
    GC.compact
    GC.start # Follow up with full GC
  end
  
  def gentle_compact
    return unless GC.auto_compact == false
    
    Rails.logger.info "Gentle compaction triggered"  
    GC.compact
  end
end

Common Pitfalls

Object movement can break code that relies on raw memory addresses. Ruby maintains logical object identity, and since Ruby 2.7 object_id values are no longer derived from addresses, so they stay stable when objects move. Keying lookups by object_id is still fragile: the table does not keep the tracked objects alive, and on Ruby versions before 2.7 the IDs were address-based. Holding the object reference itself is the more robust choice.

# FRAGILE: Storing object IDs for later comparison
class ObjectTracker
  def initialize
    @tracked_objects = {}
  end
  
  def track(obj, name)
    @tracked_objects[obj.object_id] = name
  end
  
  def find_name(obj)
    # object_id is stable across compaction on Ruby 2.7+, but this table
    # does not keep obj alive and depends on how IDs are generated
    @tracked_objects[obj.object_id]
  end
end

# CORRECT: Using object references directly
class ObjectTracker
  def initialize
    # compare_by_identity keys entries by the object itself, not by #hash/#eql?
    @tracked_objects = {}.compare_by_identity
  end
  
  def track(obj, name)
    @tracked_objects[obj] = name
  end
  
  def find_name(obj)
    # This works correctly - uses object identity, not address
    @tracked_objects[obj]
  end
end

Native extensions and C code may not handle object movement correctly. Extensions that store Ruby object pointers directly can access invalid memory after compaction moves objects. Properly written extensions use Ruby's C API functions that handle object movement automatically.

# Check for compaction compatibility issues
def test_extension_compatibility
  # Create objects that extension might reference
  objects = Array.new(1000) { "test string #{rand}" }
  
  # Call extension methods
  results_before = objects.map { |obj| SomeExtension.process(obj) }
  
  # Trigger compaction and verify references; newer Ruby versions
  # deprecate double_heap in favor of expand_heap: true
  GC.verify_compaction_references(double_heap: true)
  
  # Test extension again - should produce same results
  results_after = objects.map { |obj| SomeExtension.process(obj) }
  
  if results_before == results_after
    puts "Extension appears compaction-safe"
  else
    puts "WARNING: Extension may not handle compaction correctly"
  end
rescue => error
  puts "Compaction compatibility test failed: #{error.message}"
end

Performance degradation can occur when compaction overhead exceeds memory benefits. Applications with short-lived objects or already-good cache locality may experience slower performance with compaction enabled. The CPU cost of moving objects and updating references may outweigh fragmentation benefits.

# Monitor compaction performance impact
class CompactionProfiler
  def initialize
    @baseline_performance = nil
    @compaction_enabled = false
  end
  
  def establish_baseline
    GC.auto_compact = false
    @baseline_performance = measure_application_performance
    @compaction_enabled = false
  end
  
  def test_compaction_impact
    GC.auto_compact = true
    @compaction_enabled = true
    
    sleep(60) # Let compaction occur naturally
    
    performance_with_compaction = measure_application_performance
    
    impact = calculate_performance_impact(performance_with_compaction)
    
    if impact > 10 # More than 10% slower (positive impact means more elapsed time)
      puts "WARNING: Compaction causing significant performance degradation"
      GC.auto_compact = false
    else
      puts "Compaction performance impact: #{impact.round(2)}%"
    end
  end
  
  private
  
  def measure_application_performance
    # Measure key application metrics
    start_time = Time.now
    
    # Simulate typical application workload
    1000.times do
      # Your application's typical operations
      perform_typical_work
    end
    
    Time.now - start_time
  end
  
  def calculate_performance_impact(current_performance)
    return 0 unless @baseline_performance
    
    ((current_performance - @baseline_performance) / @baseline_performance * 100)
  end
  
  def perform_typical_work
    # Replace with actual application operations
    data = Array.new(100) { rand(1000) }
    data.sort.reverse
  end
end

Memory usage can temporarily increase during compaction. The garbage collector may allocate additional memory for tracking object movements and maintaining reference maps. Applications running close to memory limits might exceed thresholds during compaction operations.
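One way to guard against this is to skip compaction when the process is already near its limit. The sketch below is a hedged illustration: `safe_to_compact?` is a hypothetical helper, and the 10% headroom default is an arbitrary assumption to be tuned per deployment, not a value taken from Ruby.

```ruby
# Hypothetical guard: only allow compaction when enough headroom remains.
# headroom_ratio is the fraction of the limit that must still be free.
def safe_to_compact?(current_bytes:, limit_bytes:, headroom_ratio: 0.10)
  return true if limit_bytes.nil? || limit_bytes <= 0 # no known limit

  (limit_bytes - current_bytes) >= limit_bytes * headroom_ratio
end

limit = 1024 * 1024 * 1024 # assumed 1 GB limit for the example
puts safe_to_compact?(current_bytes: 800 * 1024 * 1024, limit_bytes: limit)
puts safe_to_compact?(current_bytes: 1000 * 1024 * 1024, limit_bytes: limit)
```

The caller supplies both numbers, so the same check works whether the limit comes from a cgroup file, an environment variable, or configuration.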

Compaction timing conflicts with application-critical operations can cause service disruptions. Automatic compaction that triggers during high-load periods increases garbage collection pause times when applications can least afford delays. Manual compaction timing becomes crucial for maintaining responsiveness.

# Prevent compaction during critical operations
class CriticalSectionManager
  def initialize
    @critical_section = false
    @pending_compaction = false
  end
  
  def critical_section
    @critical_section = true
    original_auto_compact = GC.auto_compact
    GC.auto_compact = false
    
    yield
  ensure
    @critical_section = false
    
    # Restore auto compaction
    GC.auto_compact = original_auto_compact
    
    # Run pending manual compaction if needed
    if @pending_compaction
      @pending_compaction = false
      GC.compact
    end
  end
  
  def compact_unless_critical
    if @critical_section
      @pending_compaction = true
    else
      GC.compact
    end
  end
end

Error Handling & Debugging

Compaction-related errors often manifest as segmentation faults or unexpected behavior in native extensions. These errors occur when C extensions store raw pointers to Ruby objects without using proper Ruby C API functions. The errors typically appear after compaction moves objects to new memory locations.

# Test for compaction-related crashes
def test_compaction_safety
  # Force compaction to happen frequently for testing
  original_auto_compact = GC.auto_compact
  GC.auto_compact = true
  
  # Run typical application operations
  begin
    stress_test_application
    puts "Compaction stress test passed"
  rescue SystemExit, SignalException
    puts "Application crashed during compaction test"
    raise
  rescue => error
    puts "Error during compaction test: #{error.message}"
    puts error.backtrace.first(5)
  ensure
    GC.auto_compact = original_auto_compact
  end
end

def stress_test_application
  1000.times do |i|
    # Create various object types
    strings = Array.new(100) { "test string #{i}" }
    hashes = Array.new(50) { {key: "value#{i}"} }
    
    # Trigger compaction periodically
    GC.compact if i % 100 == 0
    
    # Call methods that might use native extensions
    strings.each(&:upcase)
    hashes.each(&:to_json) if defined?(JSON)
  end
end

Reference verification catches compaction bugs during development by checking that all object references remain valid after compaction. The GC.verify_compaction_references method performs exhaustive validation but runs too slowly for production use.

# Comprehensive reference verification for development
def verify_compaction_safety(double_heap: true)
  puts "Running compaction reference verification..."
  
  begin
    # Double heap size to ensure objects can move
    if double_heap
      GC.verify_compaction_references(double_heap: true)
    else
      GC.verify_compaction_references
    end
    
    puts "✓ All references verified successfully"
    true
  rescue => error
    puts "✗ Reference verification failed:"
    puts "  Error: #{error.message}"
    puts "  Type: #{error.class}"
    
    # Try to provide more context
    if error.message.include?("object")
      puts "  This suggests an object reference was not updated correctly"
    end
    
    false
  end
end

# Run verification in test environment
if Rails.env.test? || Rails.env.development?
  verify_compaction_safety
end

Memory corruption detection requires careful monitoring of application behavior before and after compaction operations. Corruption often appears as data inconsistencies, unexpected object states, or crashes that occur some time after compaction completes.

# Monitor for post-compaction corruption
class CorruptionDetector
  def initialize
    @object_checksums = {}
  end
  
  def snapshot_objects(objects)
    @object_checksums.clear
    
    objects.each do |obj|
      checksum = calculate_checksum(obj)
      @object_checksums[obj] = checksum
    end
    
    puts "Captured checksums for #{@object_checksums.size} objects"
  end
  
  def verify_objects_after_compaction
    corrupted = []
    
    @object_checksums.each do |obj, expected_checksum|
      current_checksum = calculate_checksum(obj)
      
      if current_checksum != expected_checksum
        corrupted << {
          object: obj,
          expected: expected_checksum,
          actual: current_checksum
        }
      end
    end
    
    if corrupted.any?
      puts "CORRUPTION DETECTED in #{corrupted.size} objects:"
      corrupted.each do |corruption|
        puts "  Object: #{corruption[:object].inspect}"
        puts "  Expected checksum: #{corruption[:expected]}"
        puts "  Actual checksum: #{corruption[:actual]}"
      end
      false
    else
      puts "No corruption detected"
      true
    end
  end
  
  private
  
  def calculate_checksum(obj)
    # Create a reproducible checksum based on object content
    case obj
    when String
      obj.hash
    when Array
      obj.map(&:hash).hash
    when Hash
      obj.to_a.sort.hash
    else
      obj.instance_variables.map { |var| [var, obj.instance_variable_get(var)] }.hash
    end
  rescue
    # Fallback for objects that can't be introspected
    obj.class.name.hash
  end
end

Debugging compaction issues requires systematic approaches to isolate problematic code. Applications should test compaction behavior in development environments and use verification tools to catch problems before production deployment.

# Systematic compaction debugging
class CompactionDebugger
  def debug_compaction_issues
    puts "=== Compaction Debugging Session ==="
    
    # Step 1: Test basic compaction functionality
    test_basic_compaction
    
    # Step 2: Test with verification enabled
    test_with_verification
    
    # Step 3: Test incremental operations
    test_incremental_compaction
    
    # Step 4: Test with typical workload
    test_workload_compaction
  end
  
  private
  
  def test_basic_compaction
    puts "\n1. Testing basic compaction..."
    
    begin
      stats = GC.compact
      puts "✓ Basic compaction successful: #{stats[:moved].values.sum} objects moved"
    rescue => error
      puts "✗ Basic compaction failed: #{error.message}"
      return false
    end
    
    true
  end
  
  def test_with_verification
    puts "\n2. Testing compaction with verification..."
    
    begin
      GC.verify_compaction_references
      puts "✓ Compaction verification passed"
    rescue => error
      puts "✗ Compaction verification failed: #{error.message}"
      diagnose_verification_failure(error)
      return false
    end
    
    true
  end
  
  def test_incremental_compaction
    puts "\n3. Testing incremental compaction..."
    
    # Create objects gradually and compact periodically
    objects = []
    
    10.times do |i|
      objects.concat(Array.new(100) { "object #{i}" })
      
      begin
        GC.compact
        puts "  Compaction #{i + 1}/10 successful"
      rescue => error
        puts "  ✗ Compaction #{i + 1}/10 failed: #{error.message}"
        return false
      end
    end
    
    puts "✓ Incremental compaction test passed"
    true
  end
  
  def test_workload_compaction
    puts "\n4. Testing compaction under workload..."
    
    begin
      # Simulate application workload while compacting
      thread = Thread.new do
        100.times do
          simulate_workload
          sleep(0.01)
        end
      end
      
      # Compact during workload
      5.times do
        GC.compact
        sleep(0.1)
      end
      
      thread.join
      puts "✓ Workload compaction test passed"
    rescue => error
      puts "✗ Workload compaction test failed: #{error.message}"
      return false
    end
    
    true
  end
  
  def simulate_workload
    # Replace with actual application operations
    data = Hash.new { |h, k| h[k] = [] }
    100.times { |i| data[i % 10] << "item #{i}" }
    data.values.flatten.sort
  end
  
  def diagnose_verification_failure(error)
    puts "  Diagnosis suggestions:"
    puts "  - Check for native extensions that store raw object pointers"
    puts "  - Look for code that caches raw object addresses (object_id itself is stable on Ruby 2.7+)"
    puts "  - Review C extensions for proper use of Ruby C API"
  end
end

Reference

Core Methods

Method | Parameters | Returns | Description
GC.compact | None | Hash | Performs a full GC with compaction and returns per-type statistics
GC.auto_compact | None | Boolean | Gets current automatic compaction setting (Ruby 3.0+)
GC.auto_compact= | enabled (Boolean) | Boolean | Enables or disables automatic compaction (Ruby 3.0+)
GC.latest_compact_info | None | Hash | Returns statistics from the most recent compaction (Ruby 3.0+)
GC.verify_compaction_references | **opts (Hash) | Hash (nil on Ruby 2.7) | Compacts the heap and verifies all references (development only)

Compaction Statistics

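Key | Type | Description
:considered | Hash | Objects considered for compaction, counted per internal type (for example :T_STRING)
:moved | Hash | Objects actually moved during compaction, counted per internal type
:moved_up | Hash | Objects moved to a larger slot size, per type (newer Ruby versions only)
:moved_down | Hash | Objects moved to a smaller slot size, per type (newer Ruby versions only)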

Verification Options

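Option | Type | Default | Description
:toward | Symbol | nil | Pass :empty to move objects toward empty heap pages
:double_heap | Boolean | false | Doubles heap size to ensure objects can move (deprecated on newer Ruby versions in favor of :expand_heap)
:expand_heap | Boolean | false | Expands heap to create movement space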

Configuration Constants

Constant | Value | Description
GC::INTERNAL_CONSTANTS[:HEAP_PAGE_SIZE] | Integer | Size of each heap page in bytes
GC::INTERNAL_CONSTANTS[:HEAP_PAGE_OBJ_LIMIT] | Integer | Maximum objects per heap page
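
A short sketch of reading these constants alongside a live page count. It assumes CRuby, where GC::INTERNAL_CONSTANTS is defined. The method name `heap_layout` is illustrative, and the values are nil on implementations that do not expose these keys.

```ruby
# Illustrative helper: gathers page geometry from the internal constants
# and the current eden page count from GC.stat.
def heap_layout
  constants = defined?(GC::INTERNAL_CONSTANTS) ? GC::INTERNAL_CONSTANTS : {}

  {
    page_size_bytes: constants[:HEAP_PAGE_SIZE],
    slots_per_page: constants[:HEAP_PAGE_OBJ_LIMIT],
    eden_pages: GC.stat(:heap_eden_pages)
  }
end

p heap_layout
```

Multiplying page_size_bytes by eden_pages gives an approximate size for the Ruby object heap, excluding memory that objects allocate outside their slots.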

GC Statistics Related to Compaction

# Key GC.stat values for monitoring compaction effectiveness
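stat = GC.stat

# Heap structure (:heap_tomb_pages is missing on some Ruby versions)
heap_pages = stat[:heap_eden_pages] + stat.fetch(:heap_tomb_pages, 0)
live_objects = stat[:heap_live_slots]
free_slots = stat[:heap_free_slots]

# Memory calculations (page geometry comes from internal constants, not GC.stat)
page_size = GC::INTERNAL_CONSTANTS[:HEAP_PAGE_SIZE]
slots_per_page = GC::INTERNAL_CONSTANTS[:HEAP_PAGE_OBJ_LIMIT]
total_heap_size = heap_pages * page_size
# Pages in use beyond the minimum needed to hold all live objects
fragmentation_estimate = heap_pages - (live_objects / slots_per_page)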

Error Types

Error Class | Cause | Resolution
NoMemoryError | Insufficient memory during compaction | Reduce memory usage or disable compaction
SystemStackError | Deep object graphs during reference updates | Increase stack size or reduce object depth
ArgumentError | Invalid verification options | Check option names and types
NotImplementedError | Compaction is not supported on the current platform | Check GC.respond_to?(:compact) before calling

Performance Monitoring

# Monitor compaction frequency and effectiveness
def compaction_metrics
  info = GC.latest_compact_info
  stat = GC.stat
  
  {
    objects_considered: info ? info[:considered].values.sum : 0,
    objects_moved: info ? info[:moved].values.sum : 0,
    heap_pages: stat[:heap_eden_pages] + stat.fetch(:heap_tomb_pages, 0),
    live_objects: stat[:heap_live_slots],
    auto_compact_enabled: GC.auto_compact
  }
end

Compatibility Matrix

Ruby Version | Compaction Support | Auto Compaction | Verification
3.0+ | ✓ Full | ✓ Available | ✓ Available
2.7 | ✓ GC.compact | ✗ Not available | ✓ Available
2.6 and earlier | ✗ Not available | ✗ Not available | ✗ Not available
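
Support also varies by platform, so checking for the methods at runtime is usually more reliable than comparing version strings. A minimal sketch follows. The method name `compaction_features` is illustrative.

```ruby
# Illustrative feature probe: reports which compaction APIs this Ruby exposes.
def compaction_features
  {
    manual_compaction: GC.respond_to?(:compact),
    auto_compaction: GC.respond_to?(:auto_compact=),
    latest_info: GC.respond_to?(:latest_compact_info),
    verification: GC.respond_to?(:verify_compaction_references)
  }
end

compaction_features.each do |feature, available|
  puts "#{feature}: #{available ? 'available' : 'not available'}"
end
```

Code that enables compaction can consult this hash once at boot and skip the unsupported calls.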

Best Practices Summary

# Production-ready compaction configuration
class ProductionCompactionConfig
  def self.configure
    # Enable auto compaction for most applications
    GC.auto_compact = true
    
    # Monitor effectiveness
    at_exit { log_compaction_stats }
  end
  
  def self.log_compaction_stats
    info = GC.latest_compact_info
    if info
      Rails.logger.info "Final compaction stats: #{info[:moved].values.sum} objects moved"
    end
  end
end