Overview
Ruby-prof measures application performance by collecting execution statistics during program runtime. The profiler tracks method call frequencies, execution times, memory allocations, and call graphs to identify performance bottlenecks. Ruby-prof operates through multiple profiling modes, each capturing different performance metrics.
The core profiling interface centers around RubyProf.profile blocks that wrap code sections requiring measurement. The profiler collects data during execution and returns RubyProf::Profile objects containing detailed performance statistics.
require 'ruby-prof'

# Basic profiling block
result = RubyProf.profile do
  1000.times { "string" + "concatenation" }
end

# Access profiling data
result.threads.each do |thread_id, methods|
  methods.each do |method_info|
    puts "#{method_info.full_name}: #{method_info.total_time}"
  end
end
Ruby-prof supports four distinct profiling modes, selected through measurement mode constants on the RubyProf module. WALL_TIME measures real elapsed time, including I/O operations and system calls. PROCESS_TIME tracks CPU time consumed by the current process, excluding I/O wait time. ALLOCATIONS counts object allocations during execution. MEMORY measures total allocated memory in bytes.
# Configure profiler mode
RubyProf.measure_mode = RubyProf::PROCESS_TIME
result = RubyProf.profile do
  File.read('/large/file.txt')
  complex_calculation
end
The profiler generates output through printer classes that format results for different consumption methods. RubyProf::FlatPrinter produces tabular summaries, RubyProf::GraphPrinter creates call graph visualizations, and RubyProf::CallTreePrinter outputs data compatible with external analysis tools.
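Printers take an IO object plus formatting options. A minimal sketch, assuming result holds a profile collected by one of the calls above; the min_percent option hides methods that account for less than the given percentage of total time.

# Print a flat summary to stdout, hiding methods under 2% of total time
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT, min_percent: 2)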
Basic Usage
Profiling begins by wrapping target code with RubyProf.profile blocks. The profiler activates during block execution and deactivates upon completion, returning collected performance data.
require 'ruby-prof'

def slow_method
  sleep(0.1)
  expensive_calculation
end

def expensive_calculation
  (1..1000).map { |i| Math.sqrt(i) }.sum
end

# Profile the slow method
result = RubyProf.profile do
  slow_method
end

# Print flat profile to stdout
printer = RubyProf::FlatPrinter.new(result)
printer.print(STDOUT)
The RubyProf.start and RubyProf.stop methods provide granular control over profiling sessions. This approach allows profiling across multiple method calls or complex control flows where block syntax proves insufficient.
# Manual profiling control
RubyProf.start
process_user_request
handle_database_operations
render_response_template
result = RubyProf.stop
# Generate call graph
printer = RubyProf::GraphPrinter.new(result)
File.open('profile_output.txt', 'w') { |file| printer.print(file) }
Profiling specific threads requires thread filtering through the :include_threads and :exclude_threads options. Thread filtering reduces noise in multi-threaded applications by focusing on relevant execution paths.
# Profile main thread only
main_thread = Thread.current
result = RubyProf.profile(include_threads: [main_thread]) do
  Thread.new { background_task }.join
  main_processing_logic
end
Method filtering excludes standard library methods or framework code from profiling results. The :exclude_common_methods option removes frequently called methods that rarely indicate performance problems.
# Exclude common Ruby methods
result = RubyProf.profile(exclude_common_methods: true) do
  data = [1, 2, 3, 4, 5]
  data.map(&:to_s).join(',')
end
Multiple printer formats accommodate different analysis requirements. Flat printers show method execution summaries, graph printers reveal call relationships, and call tree printers integrate with external profiling tools.
result = RubyProf.profile { complex_application_logic }
# Multiple output formats
File.open('flat.txt', 'w') { |file| RubyProf::FlatPrinter.new(result).print(file) }
File.open('graph.txt', 'w') { |file| RubyProf::GraphPrinter.new(result).print(file) }
File.open('callgrind.out', 'w') { |file| RubyProf::CallTreePrinter.new(result).print(file) }
Performance & Memory
Profiling overhead varies significantly based on measurement mode and profiled code characteristics. WALL_TIME mode imposes minimal overhead, typically adding 1-3% to execution time. PROCESS_TIME measurements increase overhead to 5-10% due to additional system calls. ALLOCATIONS and MEMORY modes create substantial overhead, often doubling or tripling execution time.
# Measure profiling overhead
iterations = 10000
# Baseline timing
start_time = Time.now
iterations.times { Math.sqrt(rand) }
baseline = Time.now - start_time
# Profiled timing
start_time = Time.now
RubyProf.profile(measure_mode: RubyProf::WALL_TIME) do
  iterations.times { Math.sqrt(rand) }
end
profiled = Time.now - start_time
overhead = ((profiled - baseline) / baseline * 100).round(2)
puts "Profiling overhead: #{overhead}%"
Memory profiling requires careful interpretation since Ruby's garbage collector affects allocation patterns. The MEMORY mode measures total allocated bytes, including objects later collected by GC. Memory measurements represent allocation pressure rather than current memory usage.
# Memory allocation profiling
RubyProf.measure_mode = RubyProf::MEMORY
result = RubyProf.profile do
  strings = []
  1000.times { |i| strings << "string_#{i}" * 100 }
  strings.clear # Objects still counted despite clearing
end

# Find memory-intensive methods
methods = result.threads.values.flatten
memory_hogs = methods.select { |m| m.total_time > 50000 }
memory_hogs.each do |method|
  puts "#{method.full_name}: #{method.total_time} bytes"
end
Profiling I/O-bound operations reveals different performance characteristics depending on measurement mode. WALL_TIME captures total elapsed time including I/O wait, while PROCESS_TIME measures only CPU consumption.
# Compare I/O profiling modes
file_path = '/tmp/large_test_file.txt'
File.write(file_path, 'x' * 1_000_000)
# Wall time includes I/O wait
wall_result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME) do
  10.times { File.read(file_path) }
end

# Process time excludes I/O wait
cpu_result = RubyProf.profile(measure_mode: RubyProf::PROCESS_TIME) do
  10.times { File.read(file_path) }
end
wall_time = wall_result.threads.values.flatten.sum(&:total_time)
cpu_time = cpu_result.threads.values.flatten.sum(&:total_time)
puts "Wall time: #{wall_time.round(4)}s"
puts "CPU time: #{cpu_time.round(4)}s"
puts "I/O wait ratio: #{((wall_time - cpu_time) / wall_time * 100).round(1)}%"
Large-scale profiling sessions consume significant memory storing call graph data. Profiling sessions tracking thousands of method calls can require hundreds of megabytes of memory. Method filtering and thread filtering reduce memory consumption in complex applications.
# Memory-efficient profiling for large applications
result = RubyProf.profile(
  exclude_common_methods: true,
  exclude_threads: background_threads
) do
  large_scale_data_processing
end

# Analyze memory usage of the profiling session (requires the objspace extension)
require 'objspace'
profile_size = ObjectSpace.memsize_of_all(RubyProf::Profile)
puts "Profile data size: #{profile_size / 1024}KB"
Advanced Usage
Custom profiling configurations combine multiple options to focus measurements on specific performance aspects. The :track_allocations option enables detailed allocation tracking, while :merge_fibers consolidates fiber-based concurrency results.
# Advanced profiling configuration
result = RubyProf.profile(
  measure_mode: RubyProf::ALLOCATIONS,
  track_allocations: true,
  merge_fibers: false,
  exclude_common_methods: true
) do
  fiber_based_processing
  concurrent_data_transformation
end

# Analyze fiber-specific allocations
result.threads.each do |thread_id, methods|
  fiber_methods = methods.select { |m| m.full_name.include?('Fiber') }
  total_allocations = fiber_methods.sum(&:total_time)
  puts "Thread #{thread_id}: #{total_allocations} allocations"
end
Recursive profiling captures nested profiling sessions within larger profiling contexts. This technique isolates performance measurements for specific subsystems while maintaining overall application profiling.
class PerformanceAnalyzer
  def self.profile_subsystem(name)
    if RubyProf.running?
      # Nested profiling within existing session: time the block without
      # starting a second profiler
      start_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      result = yield
      end_time = Process.clock_gettime(Process::CLOCK_MONOTONIC)
      puts "#{name}: #{(end_time - start_time).round(4)}s"
      result
    else
      # Start a new profiling session (returns the RubyProf::Profile object
      # rather than the block's return value)
      RubyProf.profile { yield }
    end
  end
end
# Usage in nested contexts
main_result = RubyProf.profile do
  data_loading = PerformanceAnalyzer.profile_subsystem("Data Loading") do
    load_user_data
  end

  processing = PerformanceAnalyzer.profile_subsystem("Processing") do
    process_business_logic(data_loading)
  end

  rendering = PerformanceAnalyzer.profile_subsystem("Rendering") do
    render_output_format(processing)
  end
end
Method exclusion patterns filter profiling results using regular expressions or callable objects. Complex filtering logic removes framework overhead and focuses measurements on application-specific code.
# Advanced method filtering
framework_pattern = /^(ActiveRecord|ActionController|Rails)::/
gem_pattern = /gems\//
custom_filter = proc do |method|
  # Exclude framework methods and gem code
  method.source_file.match?(gem_pattern) ||
    method.full_name.match?(framework_pattern) ||
    method.total_time < 0.001 # Exclude fast methods
end

result = RubyProf.profile(exclude_methods: custom_filter) do
  rails_controller_action
end
Profiling hooks integrate measurement collection with application lifecycle events. Hooks enable automatic profiling activation during specific request types or background job processing.
class AutoProfiler
  def self.profile_on_condition(&block)
    should_profile = ENV['RAILS_ENV'] == 'development' ||
                     Thread.current[:profile_request]

    if should_profile
      result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME, &block)
      save_profile_data(result)
      result
    else
      block.call
    end
  end

  # A bare private call does not hide class methods, so mark the helper
  # with private_class_method
  private_class_method def self.save_profile_data(result)
    filename = "profile_#{Time.now.to_f}.callgrind"
    File.open(filename, 'w') do |file|
      RubyProf::CallTreePrinter.new(result).print(file)
    end
  end
end
# Conditional profiling wrapper
AutoProfiler.profile_on_condition do
  expensive_business_operation
end
Production Patterns
Production profiling requires careful consideration of performance overhead and data collection strategies. Sampling approaches reduce profiling impact by measuring subsets of requests or operations rather than comprehensive profiling.
require 'securerandom'
require 'stringio'
require 'zlib'

class ProductionProfiler
  class << self
    def sample_profile(sample_rate: 0.01, &block)
      if rand < sample_rate
        profile_id = SecureRandom.hex(8)
        result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME, &block)
        store_profile_async(profile_id, result)
        result
      else
        block.call
      end
    end

    private

    def store_profile_async(profile_id, result)
      Thread.new do
        begin
          compressed_data = compress_profile_data(result)
          ProfileStorage.store(profile_id, compressed_data)
        rescue => error
          Rails.logger.error("Profile storage failed: #{error}")
        end
      end
    end

    def compress_profile_data(result)
      output = StringIO.new
      RubyProf::CallTreePrinter.new(result).print(output)
      Zlib::Deflate.deflate(output.string)
    end
  end
end
Request-based profiling in web applications targets specific endpoints or user actions showing performance degradation. Middleware integration enables automatic profiling activation based on request parameters or response times.
class ProfilingMiddleware
  def initialize(app, options = {})
    @app = app
    @slow_threshold = options[:slow_threshold] || 1.0
    @profile_param = options[:profile_param] || 'profile'
  end

  def call(env)
    request = Rack::Request.new(env)
    should_profile = request.params[@profile_param] ||
                     env['HTTP_X_PROFILE_REQUEST']

    if should_profile
      profile_request(env)
    else
      monitor_request(env)
    end
  end

  private

  def profile_request(env)
    # RubyProf.profile returns the profile data, not the block's return value,
    # so capture the Rack response inside the block
    response = nil
    result = RubyProf.profile(measure_mode: RubyProf::WALL_TIME) do
      response = @app.call(env)
    end

    save_profile_for_request(env, result)
    response
  end

  def monitor_request(env)
    start_time = Time.now
    response = @app.call(env)
    duration = Time.now - start_time

    if duration > @slow_threshold
      Rails.logger.warn("Slow request: #{env['REQUEST_URI']} (#{duration.round(3)}s)")
    end

    response
  end
end
Background job profiling captures performance characteristics of asynchronous processing systems. Job-specific profiling data helps identify bottlenecks in queue processing and worker performance.
module JobProfiler
  def self.included(base)
    base.extend(ClassMethods)
    base.around_perform :profile_job_execution
  end

  module ClassMethods
    def profile_jobs(options = {})
      @profiling_options = {
        measure_mode: RubyProf::WALL_TIME,
        exclude_common_methods: true
      }.merge(options)
    end
  end

  private

  def profile_job_execution
    if should_profile_job?
      # Splat the stored hash so it is passed as keyword options
      options = self.class.instance_variable_get(:@profiling_options) || {}
      result = RubyProf.profile(**options) do
        yield
      end
      store_job_profile(result)
    else
      yield
    end
  end

  def should_profile_job?
    Rails.env.development? ||
      ENV['PROFILE_JOBS'] == 'true' ||
      self.class.name.in?(ENV['PROFILE_JOB_CLASSES']&.split(',') || [])
  end

  def store_job_profile(result)
    profile_key = "#{self.class.name}_#{job_id}_#{Time.now.to_i}"
    ProfileCache.set(profile_key, serialize_profile(result), expires_in: 1.hour)
  end
end

# Usage in job classes
class DataProcessingJob < ApplicationJob
  include JobProfiler

  profile_jobs measure_mode: RubyProf::ALLOCATIONS

  def perform(dataset_id)
    dataset = Dataset.find(dataset_id)
    process_dataset(dataset)
  end
end
Common Pitfalls
Profiling accuracy degrades when measurement overhead becomes comparable to measured operation duration. Short-duration methods show inflated execution times due to profiler instrumentation costs, leading to misleading performance analysis.
# Problematic: Profiling fast operations
result = RubyProf.profile do
  1000.times { 2 + 2 } # Each operation takes nanoseconds
end

# The profiling overhead dominates actual execution time
methods = result.threads.values.flatten
arithmetic_method = methods.find { |m| m.full_name.include?('+') }
puts "Reported time: #{arithmetic_method&.total_time}s" # Artificially high

# Better: Profile meaningful workloads
result = RubyProf.profile do
  complex_mathematical_computation
  file_processing_operations
  network_communication
end
Thread safety issues emerge when profiling multi-threaded applications without proper thread filtering. Concurrent profiling sessions interfere with each other, producing inconsistent results or raising threading exceptions.
# Problematic: Concurrent profiling without coordination
threads = 5.times.map do |i|
  Thread.new do
    # Multiple threads starting profilers simultaneously
    result = RubyProf.profile do
      thread_specific_work(i)
    end
    process_results(result)
  end
end
threads.each(&:join) # May raise threading errors

# Correct: Coordinate profiling across threads
profiling_mutex = Mutex.new
results = Queue.new

threads = 5.times.map do |i|
  Thread.new do
    profiling_mutex.synchronize do
      result = RubyProf.profile do
        thread_specific_work(i)
      end
      results << result
    end
  end
end
threads.each(&:join)
Memory profiling misinterpretation occurs when developers confuse allocation measurements with current memory usage. Ruby's garbage collector complicates memory analysis since allocated objects may be collected before profiling completion.
# Misleading memory analysis
result = RubyProf.profile(measure_mode: RubyProf::MEMORY) do
  large_array = Array.new(1_000_000) { |i| "string_#{i}" }
  # Force garbage collection
  large_array = nil
  GC.start
  # Memory measurement still includes collected objects
end

# Memory reported includes GC'd objects
total_memory = result.threads.values.flatten.sum(&:total_time)
puts "Allocated memory: #{total_memory} bytes" # Doesn't reflect current usage

# Better: Combine with ObjectSpace analysis
GC.start # Clean state
before = ObjectSpace.count_objects

RubyProf.profile(measure_mode: RubyProf::MEMORY) do
  memory_intensive_operation
end

after = ObjectSpace.count_objects
# count_objects[:TOTAL] includes free heap slots, so subtract :FREE to estimate live objects
live_objects = (after[:TOTAL] - after[:FREE]) - (before[:TOTAL] - before[:FREE])
puts "Live objects created: #{live_objects}"
Recursive method profiling creates confusing call graph interpretations when methods call themselves multiple times. The profiler aggregates recursive calls, making individual call costs difficult to analyze.
def fibonacci(n)
  return n if n <= 1
  fibonacci(n - 1) + fibonacci(n - 2) # Recursive calls
end

# Recursive profiling confusion
result = RubyProf.profile do
  fibonacci(30)
end

fib_methods = result.threads.values.flatten.select { |m| m.full_name.include?('fibonacci') }
fib_methods.each do |method|
  # Total time includes ALL recursive calls aggregated
  puts "#{method.full_name}: calls=#{method.called} time=#{method.total_time}"
  # Difficult to determine per-call cost
end

# Better: Profile iterative equivalent or use call count analysis
def fibonacci_iterative(n)
  return n if n <= 1
  a, b = 0, 1
  (2..n).each { a, b = b, a + b }
  b
end
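For the call count analysis mentioned above, one rough approach is to divide a method's exclusive time by its call count. This is only an approximation, since self_time is aggregated across every recursion depth, but it gives a feel for per-call cost.

# Estimate average per-call cost from aggregated self time (approximation)
fib_methods.each do |method|
  next if method.called.zero?
  per_call = method.self_time / method.called
  puts "#{method.full_name}: ~#{(per_call * 1_000_000).round(2)} microseconds of self time per call"
end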
Reference
Core Methods
| Method | Parameters | Returns | Description |
|---|---|---|---|
| RubyProf.profile(**opts) { block } | opts (Hash) | RubyProf::Profile | Profiles code block execution |
| RubyProf.start(**opts) | opts (Hash) | nil | Begins profiling session |
| RubyProf.stop | None | RubyProf::Profile | Ends profiling session |
| RubyProf.running? | None | Boolean | Returns profiling session status |
| RubyProf.pause | None | nil | Temporarily suspends profiling |
| RubyProf.resume | None | nil | Resumes paused profiling session |
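A small sketch of session control with pause and resume, assuming a setup phase that should be excluded from the measurement; the setup and workload method names are placeholders.

# Exclude setup work from the measurement by pausing the session
RubyProf.start
RubyProf.pause
load_test_fixtures    # not measured (hypothetical setup step)
RubyProf.resume
run_measured_workload # measured (hypothetical workload)
result = RubyProf.stop

RubyProf::FlatPrinter.new(result).print(STDOUT)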
Measurement Modes
| Constant | Description | Overhead | Use Case |
|---|---|---|---|
| RubyProf::WALL_TIME | Real elapsed time including I/O | Low (1-3%) | General performance analysis |
| RubyProf::PROCESS_TIME | CPU time excluding I/O wait | Medium (5-10%) | CPU-bound operation analysis |
| RubyProf::ALLOCATIONS | Object allocation count | High (50-100%) | Memory allocation tracking |
| RubyProf::MEMORY | Total allocated bytes | High (50-100%) | Memory usage analysis |
Profiling Options
| Option | Type | Default | Description |
|---|---|---|---|
| :measure_mode | Integer | WALL_TIME | Measurement type constant |
| :include_threads | Array<Thread> | All threads | Threads to include in profiling |
| :exclude_threads | Array<Thread> | None | Threads to exclude from profiling |
| :exclude_common_methods | Boolean | false | Skip standard library methods |
| :exclude_methods | Regexp or Proc | None | Custom method filtering |
| :track_allocations | Boolean | false | Enable allocation site tracking |
| :merge_fibers | Boolean | true | Consolidate fiber execution data |
Printer Classes
| Printer | Output Format | Use Case |
|---|---|---|
| RubyProf::FlatPrinter | Tabular method summary | Quick performance overview |
| RubyProf::GraphPrinter | Call graph with hierarchy | Understanding call relationships |
| RubyProf::CallTreePrinter | Callgrind format | External tool integration |
| RubyProf::CallStackPrinter | HTML call stack | Visual call analysis |
| RubyProf::DotPrinter | GraphViz dot format | Call graph visualization |
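A short sketch of the visual printers, assuming result holds profile data collected earlier; the output filenames are arbitrary.

# Write an HTML call stack and a GraphViz dot file for visual analysis
File.open('call_stack.html', 'w') { |file| RubyProf::CallStackPrinter.new(result).print(file) }
File.open('profile.dot', 'w') { |file| RubyProf::DotPrinter.new(result).print(file) }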
Profile Object Methods
| Method | Returns | Description |
|---|---|---|
| #threads | Hash | Thread ID to method data mapping |
| #eliminate_methods(methods) | nil | Remove methods from results |
| #dump_file(filename) | nil | Serialize profile to file |
| #load_file(filename) | RubyProf::Profile | Load serialized profile |
Method Info Attributes
| Attribute | Type | Description |
|---|---|---|
| #full_name | String | Complete method identifier |
| #called | Integer | Number of method invocations |
| #total_time | Float | Cumulative execution time |
| #self_time | Float | Exclusive method execution time |
| #wait_time | Float | Time spent waiting for resources |
| #children_time | Float | Time spent in called methods |
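The attributes can be inspected together for each profiled method. As a rough sanity check, total_time is conventionally close to self_time plus wait_time plus children_time; this sketch assumes result holds a profile collected earlier.

# Inspect timing attributes for each profiled method
result.threads.values.flatten.each do |method|
  breakdown = method.self_time + method.wait_time + method.children_time
  puts format("%-40s calls=%d total=%.4f self=%.4f wait=%.4f children=%.4f (sum=%.4f)",
              method.full_name, method.called, method.total_time,
              method.self_time, method.wait_time, method.children_time, breakdown)
end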
Error Classes
| Exception | Description |
|---|---|
| RubyProf::Error | Base profiling error |
| RubyProf::AlreadyRunningError | Profiling session already active |
| RubyProf::NotRunningError | No active profiling session |
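Rather than rescuing these errors, a defensive pattern is to consult RubyProf.running? before starting or stopping a session; the code under measurement here is a placeholder.

# Avoid double-starting, or stopping a session that is not active
RubyProf.start unless RubyProf.running?
critical_code_path # hypothetical code under measurement
result = RubyProf.stop if RubyProf.running?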