CrackedRuby logo

CrackedRuby

ruby-prof

A guide to profiling Ruby applications using ruby-prof for performance analysis and optimization.

Performance Optimization Profiling Tools
7.3.1

Overview

Ruby-prof measures application performance by collecting execution statistics during program runtime. The profiler tracks method call frequencies, execution times, memory allocations, and call graphs to identify performance bottlenecks. Ruby-prof operates through multiple profiling modes, each capturing different performance metrics.

The core profiling interface centers around RubyProf.profile blocks that wrap code sections requiring measurement. The profiler collects data during execution and returns RubyProf::Profile objects containing detailed performance statistics.

require 'ruby-prof'

# Basic profiling block
result = RubyProf.profile do
  1000.times { "string" + "concatenation" }
end

# Access profiling data
result.threads.each do |thread_id, methods|
  methods.each do |method_info|
    puts "#{method_info.full_name}: #{method_info.total_time}"
  end
end

Ruby-prof supports four distinct profiling modes through the RubyProf::MeasureMode constants. WALL_TIME measures real elapsed time including I/O operations and system calls. PROCESS_TIME tracks CPU time consumed by the current process, excluding I/O wait time. ALLOCATIONS counts object allocations during execution. MEMORY measures total memory consumption in bytes.

# Configure profiler mode
RubyProf.measure_mode = RubyProf::PROCESS_TIME

result = RubyProf.profile do
  File.read('/large/file.txt')
  complex_calculation
end

The profiler generates output through printer classes that format results for different consumption methods. RubyProf::FlatPrinter produces tabular summaries, RubyProf::GraphPrinter creates call graph visualizations, and RubyProf::CallTreePrinter outputs data compatible with external analysis tools.

Basic Usage

Profiling begins by wrapping target code with RubyProf.profile blocks. The profiler activates during block execution and deactivates upon completion, returning collected performance data.

require 'ruby-prof'

def slow_method
  sleep(0.1)
  expensive_calculation
end

def expensive_calculation
  (1..1000).map { |i| Math.sqrt(i) }.sum
end

# Profile the slow method
result = RubyProf.profile do
  slow_method
end

# Print flat profile to stdout
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)

The RubyProf.start and RubyProf.stop methods provide granular control over profiling sessions. This approach allows profiling across multiple method calls or complex control flows where block syntax proves insufficient.

# Manual profiling control
RubyProf.start

process_user_request
handle_database_operations
render_response_template

result = RubyProf.stop

# Generate call graph
printer = RubyProf::GraphPrinter.new(result)
printer.print(File.open('profile_output.txt', 'w'))

Profiling specific threads requires thread filtering through the :include_threads and :exclude_threads options. Thread filtering reduces noise in multi-threaded applications by focusing on relevant execution paths.

# Profile main thread only
main_thread = Thread.current
result = RubyProf.profile(include_threads: [main_thread]) do
  Thread.new { background_task }.join
  main_processing_logic
end

Method filtering excludes standard library methods or framework code from profiling results. The :exclude_common_methods option removes frequently called methods that rarely indicate performance problems.

# Exclude common Ruby methods
result = RubyProf.profile(exclude_common_methods: true) do
  data = [1, 2, 3, 4, 5]
  data.map(&:to_s).join(',')
end

Multiple printer formats accommodate different analysis requirements. Flat printers show method execution summaries, graph printers reveal call relationships, and call tree printers integrate with external profiling tools.

result = RubyProf.profile { complex_application_logic }

# Multiple output formats
RubyProf::FlatPrinter.new(result).print(File.open('flat.txt', 'w'))
RubyProf::GraphPrinter.new(result).print(File.open('graph.txt', 'w'))
RubyProf::CallTreePrinter.new(result).print(File.open('callgrind.out', 'w'))

Performance & Memory

Profiling overhead varies significantly based on measurement mode and profiled code characteristics. WALL_TIME mode imposes minimal overhead, typically adding 1-3% execution time to profiled code. PROCESS_TIME measurements increase overhead to 5-10% due to additional system calls. ALLOCATIONS and MEMORY modes create substantial overhead, often doubling or tripling execution time.

# Measure profiling overhead
iterations = 10000

# Baseline timing
start_time = Time.now
iterations.times { Math.sqrt(rand) }
baseline = Time.now - start_time

# Profiled timing
start_time = Time.now
RubyProf.profile(measure_mode: RubyProf::WALL_TIME) do
  iterations.times { Math.sqrt(rand) }
end
profiled = Time.now - start_time

overhead = ((profiled - baseline) / baseline * 100).round(2)
puts "Profiling overhead: #{overhead}%"

Memory profiling requires careful interpretation since Ruby's garbage collector affects allocation patterns. The MEMORY mode measures total allocated bytes, including objects later collected by GC. Memory measurements represent allocation pressure rather than current memory usage.

# Memory allocation profiling
RubyProf.measure_mode = RubyProf::MEMORY

result = RubyProf.profile do
  strings = []
  1000.times { |i| strings << "string_#{i}" * 100 }
  strings.clear  # Objects still counted despite clearing
end

# Find memory-intensive methods
methods = result.threads.values.flatten
memory_hogs = methods.select { |m| m.total_time > 50000 }
memory_hogs.each do |method|
  puts "#{method.full_name}: #{method.total_time} bytes"
end

Profiling I/O-bound operations reveals different performance characteristics depending on measurement mode. WALL_TIME captures total elapsed time including I/O wait, while PROCESS_TIME measures only CPU consumption.

# Compare I/O profiling modes
file_path = '/tmp/large_test_file.txt'
File.write(file_path, 'x' * 1_000_000)

# Wall time includes I/O wait
wall_result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME) do
  10.times { File.read(file_path) }
end

# Process time excludes I/O wait  
cpu_result = RubyProf.profile(measure_mode: RubyProf::PROCESS_TIME) do
  10.times { File.read(file_path) }
end

wall_time = wall_result.threads.values.flatten.sum(&:total_time)
cpu_time = cpu_result.threads.values.flatten.sum(&:total_time)

puts "Wall time: #{wall_time.round(4)}s"
puts "CPU time: #{cpu_time.round(4)}s" 
puts "I/O wait ratio: #{((wall_time - cpu_time) / wall_time * 100).round(1)}%"

Large-scale profiling sessions consume significant memory storing call graph data. Profiling sessions tracking thousands of method calls can require hundreds of megabytes of memory. Method filtering and thread filtering reduce memory consumption in complex applications.

# Memory-efficient profiling for large applications
result = RubyProf.profile(
  exclude_common_methods: true,
  exclude_threads: background_threads
) do
  large_scale_data_processing
end

# Analyze memory usage of profiling session
profile_size = ObjectSpace.memsize_of_all(RubyProf::Profile)
puts "Profile data size: #{profile_size / 1024}KB"

Advanced Usage

Custom profiling configurations combine multiple options to focus measurements on specific performance aspects. The :track_allocations option enables detailed allocation tracking, while :merge_fibers consolidates fiber-based concurrency results.

# Advanced profiling configuration
result = RubyProf.profile(
  measure_mode: RubyProf::ALLOCATIONS,
  track_allocations: true,
  merge_fibers: false,
  exclude_common_methods: true
) do
  fiber_based_processing
  concurrent_data_transformation
end

# Analyze fiber-specific allocations
result.threads.each do |thread_id, methods|
  fiber_methods = methods.select { |m| m.full_name.include?('Fiber') }
  total_allocations = fiber_methods.sum(&:total_time)
  puts "Thread #{thread_id}: #{total_allocations} allocations"
end

Recursive profiling captures nested profiling sessions within larger profiling contexts. This technique isolates performance measurements for specific subsystems while maintaining overall application profiling.

class PerformanceAnalyzer
  def self.profile_subsystem(name)
    if RubyProf.running?
      # Nested profiling within existing session
      start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      result = yield
      end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      puts "#{name}: #{(end_time - start_time).round(4)}s"
      result
    else
      # Start new profiling session
      RubyProf.profile { yield }
    end
  end
end

# Usage in nested contexts
main_result = RubyProf.profile do
  data_loading = PerformanceAnalyzer.profile_subsystem("Data Loading") do
    load_user_data
  end
  
  processing = PerformanceAnalyzer.profile_subsystem("Processing") do
    process_business_logic(data_loading)
  end
  
  rendering = PerformanceAnalyzer.profile_subsystem("Rendering") do
    render_output_format(processing)
  end
end

Method exclusion patterns filter profiling results using regular expressions or callable objects. Complex filtering logic removes framework overhead and focuses measurements on application-specific code.

# Advanced method filtering
framework_pattern = /^(ActiveRecord|ActionController|Rails)::/
gem_pattern = /gems\//

custom_filter = proc do |method|
  # Exclude framework methods and gem code
  method.source_file.match?(gem_pattern) ||
  method.full_name.match?(framework_pattern) ||
  method.total_time < 0.001  # Exclude fast methods
end

result = RubyProf.profile(exclude_methods: custom_filter) do
  rails_controller_action
end

Profiling hooks integrate measurement collection with application lifecycle events. Hooks enable automatic profiling activation during specific request types or background job processing.

class AutoProfiler
  def self.profile_on_condition(&block)
    should_profile = ENV['RAILS_ENV'] == 'development' || 
                    Thread.current[:profile_request]
    
    if should_profile
      result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME, &block)
      save_profile_data(result)
      result
    else
      block.call
    end
  end
  
  private
  
  def self.save_profile_data(result)
    filename = "profile_#{Time.now.to_f}.callgrind"
    File.open(filename, 'w') do |file|
      RubyProf::CallTreePrinter.new(result).print(file)
    end
  end
end

# Conditional profiling wrapper
AutoProfiler.profile_on_condition do
  expensive_business_operation
end

Production Patterns

Production profiling requires careful consideration of performance overhead and data collection strategies. Sampling approaches reduce profiling impact by measuring subsets of requests or operations rather than comprehensive profiling.

class ProductionProfiler
  class << self
    def sample_profile(sample_rate: 0.01, &block)
      if rand < sample_rate
        profile_id = SecureRandom.hex(8)
        result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME, &block)
        store_profile_async(profile_id, result)
        result
      else
        block.call
      end
    end
    
    private
    
    def store_profile_async(profile_id, result)
      Thread.new do
        begin
          compressed_data = compress_profile_data(result)
          ProfileStorage.store(profile_id, compressed_data)
        rescue => error
          Rails.logger.error("Profile storage failed: #{error}")
        end
      end
    end
    
    def compress_profile_data(result)
      output = StringIO.new
      RubyProf::CallTreePrinter.new(result).print(output)
      Zlib::Deflate.deflate(output.string)
    end
  end
end

Request-based profiling in web applications targets specific endpoints or user actions showing performance degradation. Middleware integration enables automatic profiling activation based on request parameters or response times.

class ProfilingMiddleware
  def initialize(app, options = {})
    @app = app
    @slow_threshold = options[:slow_threshold] || 1.0
    @profile_param = options[:profile_param] || 'profile'
  end
  
  def call(env)
    request = Rack::Request.new(env)
    should_profile = request.params[@profile_param] || 
                    env['HTTP_X_PROFILE_REQUEST']
    
    if should_profile
      profile_request(env)
    else
      monitor_request(env)
    end
  end
  
  private
  
  def profile_request(env)
    result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME) do
      @app.call(env)
    end
    
    response = result.first
    save_profile_for_request(env, result.last)
    response
  end
  
  def monitor_request(env)
    start_time = Time.now
    response = @app.call(env)
    duration = Time.now - start_time
    
    if duration > @slow_threshold
      Rails.logger.warn("Slow request: #{env['REQUEST_URI']} (#{duration.round(3)}s)")
    end
    
    response
  end
end

Background job profiling captures performance characteristics of asynchronous processing systems. Job-specific profiling data helps identify bottlenecks in queue processing and worker performance.

module JobProfiler
  def self.included(base)
    base.extend(ClassMethods)
    base.around_perform :profile_job_execution
  end
  
  module ClassMethods
    def profile_jobs(options = {})
      @profiling_options = {
        measure_mode: RubyProf::WALL_TIME,
        exclude_common_methods: true
      }.merge(options)
    end
  end
  
  private
  
  def profile_job_execution
    if should_profile_job?
      result = RubyProf.profile(self.class.instance_variable_get(:@profiling_options) || {}) do
        yield
      end
      
      store_job_profile(result)
    else
      yield
    end
  end
  
  def should_profile_job?
    Rails.env.development? || 
    ENV['PROFILE_JOBS'] == 'true' ||
    self.class.name.in?(ENV['PROFILE_JOB_CLASSES']&.split(',') || [])
  end
  
  def store_job_profile(result)
    profile_key = "#{self.class.name}_#{job_id}_#{Time.now.to_i}"
    ProfileCache.set(profile_key, serialize_profile(result), expires_in: 1.hour)
  end
end

# Usage in job classes
class DataProcessingJob < ApplicationJob
  include JobProfiler
  profile_jobs measure_mode: RubyProf::ALLOCATIONS
  
  def perform(dataset_id)
    dataset = Dataset.find(dataset_id)
    process_dataset(dataset)
  end
end

Common Pitfalls

Profiling accuracy degrades when measurement overhead becomes comparable to measured operation duration. Short-duration methods show inflated execution times due to profiler instrumentation costs, leading to misleading performance analysis.

# Problematic: Profiling fast operations
result = RubyProf.profile do
  1000.times { 2 + 2 }  # Each operation takes nanoseconds
end

# The profiling overhead dominates actual execution time
methods = result.threads.values.flatten
arithmetic_method = methods.find { |m| m.full_name.include?('+') }
puts "Reported time: #{arithmetic_method&.total_time}s"  # Artificially high

# Better: Profile meaningful workloads
result = RubyProf.profile do
  complex_mathematical_computation
  file_processing_operations
  network_communication
end

Thread safety issues emerge when profiling multi-threaded applications without proper thread filtering. Concurrent profiling sessions interfere with each other, producing inconsistent results or raising threading exceptions.

# Problematic: Concurrent profiling without coordination
threads = 5.times.map do |i|
  Thread.new do
    # Multiple threads starting profilers simultaneously
    result = RubyProf.profile do
      thread_specific_work(i)
    end
    process_results(result)
  end
end
threads.each(&:join)  # May raise threading errors

# Correct: Coordinate profiling across threads
profiling_mutex = Mutex.new
results = Queue.new

threads = 5.times.map do |i|
  Thread.new do
    profiling_mutex.synchronize do
      result = RubyProf.profile do
        thread_specific_work(i)
      end
      results << result
    end
  end
end
threads.each(&:join)

Memory profiling misinterpretation occurs when developers confuse allocation measurements with current memory usage. Ruby's garbage collector complicates memory analysis since allocated objects may be collected before profiling completion.

# Misleading memory analysis
result = RubyProf.profile(measure_mode: RubyProf::MEMORY) do
  large_array = Array.new(1_000_000) { |i| "string_#{i}" }
  
  # Force garbage collection
  large_array = nil
  GC.start
  
  # Memory measurement still includes collected objects
end

# Memory reported includes GC'd objects
total_memory = result.threads.values.flatten.sum(&:total_time)
puts "Allocated memory: #{total_memory} bytes"  # Doesn't reflect current usage

# Better: Combine with ObjectSpace analysis
GC.start  # Clean state
before_objects = ObjectSpace.count_objects

RubyProf.profile(measure_mode: RubyProf::MEMORY) do
  memory_intensive_operation
end

after_objects = ObjectSpace.count_objects
live_objects = after_objects[:TOTAL] - before_objects[:TOTAL]
puts "Live objects created: #{live_objects}"

Recursive method profiling creates confusing call graph interpretations when methods call themselves multiple times. The profiler aggregates recursive calls, making individual call costs difficult to analyze.

def fibonacci(n)
  return n if n <= 1
  fibonacci(n - 1) + fibonacci(n - 2)  # Recursive calls
end

# Recursive profiling confusion
result = RubyProf.profile do
  fibonacci(30)
end

fib_methods = result.threads.values.flatten.select { |m| m.full_name.include?('fibonacci') }
fib_methods.each do |method|
  # Total time includes ALL recursive calls aggregated
  puts "#{method.full_name}: calls=#{method.called} time=#{method.total_time}"
  # Difficult to determine per-call cost
end

# Better: Profile iterative equivalent or use call count analysis
def fibonacci_iterative(n)
  return n if n <= 1
  a, b = 0, 1
  (2..n).each { a, b = b, a + b }
  b
end

Reference

Core Methods

Method Parameters Returns Description
RubyProf.profile(**opts) { block } opts (Hash) RubyProf::Profile Profiles code block execution
RubyProf.start(**opts) opts (Hash) nil Begins profiling session
RubyProf.stop None RubyProf::Profile Ends profiling session
RubyProf.running? None Boolean Returns profiling session status
RubyProf.pause None nil Temporarily suspends profiling
RubyProf.resume None nil Resumes paused profiling session

Measurement Modes

Constant Description Overhead Use Case
RubyProf::WALL_TIME Real elapsed time including I/O Low (1-3%) General performance analysis
RubyProf::PROCESS_TIME CPU time excluding I/O wait Medium (5-10%) CPU-bound operation analysis
RubyProf::ALLOCATIONS Object allocation count High (50-100%) Memory allocation tracking
RubyProf::MEMORY Total allocated bytes High (50-100%) Memory usage analysis

Profiling Options

Option Type Default Description
:measure_mode Integer WALL_TIME Measurement type constant
:include_threads Array<Thread> All threads Threads to include in profiling
:exclude_threads Array<Thread> None Threads to exclude from profiling
:exclude_common_methods Boolean false Skip standard library methods
:exclude_methods Regexp|Proc None Custom method filtering
:track_allocations Boolean false Enable allocation site tracking
:merge_fibers Boolean true Consolidate fiber execution data

Printer Classes

Printer Output Format Use Case
RubyProf::FlatPrinter Tabular method summary Quick performance overview
RubyProf::GraphPrinter Call graph with hierarchy Understanding call relationships
RubyProf::CallTreePrinter Callgrind format External tool integration
RubyProf::CallStackPrinter HTML call stack Visual call analysis
RubyProf::DotPrinter GraphViz dot format Call graph visualization

Profile Object Methods

Method Returns Description
#threads Hash Thread ID to method data mapping
#eliminate_methods(methods) nil Remove methods from results
#dump_file(filename) nil Serialize profile to file
#load_file(filename) RubyProf::Profile Load serialized profile

Method Info Attributes

Attribute Type Description
#full_name String Complete method identifier
#called Integer Number of method invocations
#total_time Float Cumulative execution time
#self_time Float Exclusive method execution time
#wait_time Float Time spent waiting for resources
#children_time Float Time spent in called methods

Error Classes

Exception Description
RubyProf::Error Base profiling error
RubyProf::AlreadyRunningError Profiling session already active
RubyProf::NotRunningError No active profiling session