Overview
Application monitoring in Ruby encompasses tracking application performance, detecting errors, measuring resource usage, and observing system health. Ruby provides several approaches through standard library modules, third-party gems, and custom instrumentation patterns.
The `Logger` class from Ruby's standard library forms the foundation for basic monitoring through structured logging. Ruby applications commonly integrate with external monitoring services through gems like `newrelic_rpm`, `scout_apm`, or `datadog`. Custom monitoring solutions build on Ruby's `TracePoint` API, method instrumentation patterns, and metric collection frameworks.
require 'logger'
require 'benchmark'
# Basic monitoring setup
logger = Logger.new('application.log')
logger.level = Logger::INFO
# Simple performance tracking
execution_time = Benchmark.measure { expensive_operation() }
logger.info "Operation completed in #{execution_time.real}s"
Ruby's metaprogramming capabilities enable method-level instrumentation without modifying existing code. The `Module#prepend` and `alias_method` patterns create monitoring wrappers around business logic.
module PerformanceMonitor
def self.prepended(base)
base.instance_methods(false).each do |method_name|
base.alias_method "#{method_name}_without_monitoring", method_name
base.define_method(method_name) do |*args, &block|
start_time = Time.now
result = send("#{method_name}_without_monitoring", *args, &block)
duration = Time.now - start_time
puts "#{method_name} executed in #{duration}s"
result
end
end
end
end
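Prepending the module wraps every instance method the class has defined at that point; methods added later are not monitored. Usage might look like this (`OrderService` is an illustrative class):

```ruby
class OrderService
  def process(order_id)
    sleep(0.05) # stand-in for real work
    "processed #{order_id}"
  end
end

# The prepended hook above aliases and rewraps each existing instance method.
OrderService.prepend(PerformanceMonitor)

OrderService.new.process(42)
# prints something like "process executed in 0.0501s" and returns "processed 42"
```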
Monitoring systems typically collect four types of data: metrics (numerical measurements), logs (structured events), traces (request paths), and errors (exception information). Ruby applications expose this data through instrumentation points, custom collectors, and integration with monitoring platforms.
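The distinction is easiest to see as concrete records; the field names below are illustrative rather than any particular vendor's schema:

```ruby
require 'time'

# Metric: a named numeric measurement, usually with tags
metric = { name: 'http.request.duration_ms', value: 182.4, tags: { route: '/api/users' } }

# Log: a structured event describing something that happened
log_event = { level: 'INFO', message: 'user signed in', user_id: 42, timestamp: Time.now.utc.iso8601 }

# Trace: one span in a request path, linked to others by IDs
span = { trace_id: 'abc123', span_id: 'def456', name: 'SELECT users', duration_ms: 3.1 }

# Error: an exception plus the context needed to debug it
error_record = { class: 'ActiveRecord::RecordNotFound', message: 'User 42 not found', backtrace: [] }
```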
Basic Usage
Ruby applications implement monitoring through direct instrumentation, automatic collection via gems, and custom metric gathering. The standard library provides basic tools while specialized gems offer comprehensive monitoring solutions.
require 'logger'
require 'json'
require 'time'
class ApplicationMonitor
def initialize(logger = Logger.new(STDOUT))
@logger = logger
@metrics = {}
@start_time = Time.now
end
def track_request(path, method, duration, status)
@logger.info({
timestamp: Time.now.iso8601,
path: path,
method: method,
duration_ms: (duration * 1000).round(2),
status: status,
memory_mb: memory_usage_mb
}.to_json)
end
def increment_counter(metric_name, value = 1)
@metrics[metric_name] ||= 0
@metrics[metric_name] += value
end
private
def memory_usage_mb
`ps -o rss= -p #{Process.pid}`.to_i / 1024.0
end
end
# Usage in application
monitor = ApplicationMonitor.new
monitor.track_request('/api/users', 'GET', 0.143, 200)
monitor.increment_counter('user_registrations')
Method-level monitoring wraps existing functionality to collect timing and error data. Capturing the original method with `instance_method` and rebinding it (or using `Module#prepend`) provides clean instrumentation without altering original method signatures.
module MethodMonitor
def self.included(base)
base.extend(ClassMethods)
end
module ClassMethods
def monitor_method(method_name, options = {})
original_method = instance_method(method_name)
define_method(method_name) do |*args, &block|
start_time = Time.now
begin
result = original_method.bind(self).call(*args, &block)
duration = Time.now - start_time
log_success(method_name, duration, options)
result
rescue => error
duration = Time.now - start_time
log_error(method_name, error, duration, options)
raise
end
end
end
end
private
def log_success(method, duration, options)
puts "#{method} completed successfully in #{duration}s"
end
def log_error(method, error, duration, options)
puts "#{method} failed with #{error.class}: #{error.message} (#{duration}s)"
end
end
class UserService
include MethodMonitor
def create_user(email, name)
# User creation logic
User.create(email: email, name: name)
end
monitor_method :create_user, alert_threshold: 2.0
end
External monitoring services integrate through initialization and automatic instrumentation. Most gems require minimal configuration while providing comprehensive data collection.
# Gemfile
gem 'newrelic_rpm'
gem 'scout_apm'
# config/newrelic.yml configuration automatically loads
# Custom instrumentation for business logic
class PaymentProcessor
include NewRelic::Agent::Instrumentation::ControllerInstrumentation
def process_payment(amount, card_token)
# Payment processing logic
NewRelic::Agent.record_metric('Custom/PaymentAmount', amount)
NewRelic::Agent.increment_metric('Custom/PaymentCount')
# Business logic here
result = charge_card(amount, card_token)
NewRelic::Agent.add_custom_attributes({
payment_method: 'credit_card',
amount_cents: amount,
success: result.success?
})
result
end
add_transaction_tracer :process_payment, category: :task
end
Health check endpoints provide application status information for load balancers and monitoring systems. These endpoints verify database connections, external service availability, and system resource levels.
require 'sinatra'
require 'json'
require 'time'
class HealthCheck < Sinatra::Base
get '/health' do
content_type :json
checks = {
database: check_database,
redis: check_redis,
disk_space: check_disk_space,
memory: check_memory_usage
}
http_status = checks.all? { |_, result| result[:status] == 'ok' } ? 200 : 503
status http_status # set the HTTP response code so load balancers see 503 on failure
{
  status: http_status == 200 ? 'healthy' : 'unhealthy',
  timestamp: Time.now.iso8601,
  checks: checks
}.to_json
end
private
def check_database
ActiveRecord::Base.connection.execute('SELECT 1')
{ status: 'ok', response_time_ms: 5 }
rescue => error
{ status: 'error', message: error.message }
end
def check_redis
Redis.current.ping
{ status: 'ok' }
rescue => error
{ status: 'error', message: error.message }
end
end
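The `check_disk_space` and `check_memory_usage` helpers referenced above are not shown; a minimal sketch for a Unix-like host, with arbitrary 90% and 1 GB thresholds, might look like this:

```ruby
class HealthCheck < Sinatra::Base
  private

  def check_disk_space
    # Capacity column of `df` for the root filesystem, e.g. "42%"
    used_percent = `df -P /`.lines.last.split[4].to_i
    used_percent < 90 ? { status: 'ok', used_percent: used_percent } : { status: 'error', used_percent: used_percent }
  rescue => error
    { status: 'error', message: error.message }
  end

  def check_memory_usage
    # Resident set size of the current process in megabytes
    rss_mb = `ps -o rss= -p #{Process.pid}`.to_i / 1024.0
    rss_mb < 1024 ? { status: 'ok', rss_mb: rss_mb.round(1) } : { status: 'error', rss_mb: rss_mb.round(1) }
  rescue => error
    { status: 'error', message: error.message }
  end
end
```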
Production Patterns
Production monitoring requires structured approaches to data collection, alerting, and performance optimization. Ruby applications in production implement monitoring through multiple layers: application-level instrumentation, infrastructure monitoring, and business metric tracking.
Centralized logging aggregates data from multiple application instances and services. Production applications structure log data as JSON for parsing by log aggregation systems like ELK stack or Splunk.
require 'logger'
require 'json'
require 'socket'
require 'time'
class ProductionLogger
def initialize(service_name, environment)
@service_name = service_name
@environment = environment
@logger = Logger.new(STDOUT)
@logger.formatter = method(:json_formatter)
end
def info(message, metadata = {})
log_data = base_log_data.merge(
level: 'INFO',
message: message,
metadata: metadata
)
@logger.info(log_data)
end
def error(message, exception = nil, metadata = {})
log_data = base_log_data.merge(
level: 'ERROR',
message: message,
metadata: metadata
)
if exception
log_data[:exception] = {
class: exception.class.name,
message: exception.message,
backtrace: exception.backtrace&.first(10)
}
end
@logger.error(log_data)
end
private
def base_log_data
{
timestamp: Time.now.utc.iso8601,
service: @service_name,
environment: @environment,
host: Socket.gethostname,
pid: Process.pid,
thread_id: Thread.current.object_id
}
end
def json_formatter(severity, datetime, progname, msg)
"#{msg.to_json}\n"
end
end
# Usage across application
$logger = ProductionLogger.new('user-service', ENV['RAILS_ENV'])
class UserController < ApplicationController
def create
user = User.create(user_params)
$logger.info('User created', { user_id: user.id, email: user.email })
render json: user
rescue => error
$logger.error('User creation failed', error, { params: user_params })
render json: { error: 'Creation failed' }, status: 422
end
end
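A small Rack middleware can feed the same structured logger with per-request timing, so every request emits one JSON line regardless of which controller handled it. A sketch (the middleware name is illustrative):

```ruby
class RequestLogging
  def initialize(app, logger)
    @app = app
    @logger = logger
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    duration_ms = ((Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000).round(2)

    @logger.info('request completed', {
      method: env['REQUEST_METHOD'],
      path: env['PATH_INFO'],
      status: status,
      duration_ms: duration_ms
    })

    [status, headers, body]
  end
end

# In a Rails app: config.middleware.use RequestLogging, $logger
```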
Metric collection systems gather quantitative data about application behavior, resource usage, and business operations. Production applications implement custom metric collectors that push data to time-series databases.
require 'socket'
class MetricsCollector
def initialize(statsd_host = 'localhost', statsd_port = 8125)
@statsd_host = statsd_host
@statsd_port = statsd_port
@socket = UDPSocket.new
end
def increment(metric_name, value = 1, tags = {})
send_metric("#{metric_name}:#{value}|c", tags)
end
def gauge(metric_name, value, tags = {})
send_metric("#{metric_name}:#{value}|g", tags)
end
def timing(metric_name, duration_ms, tags = {})
send_metric("#{metric_name}:#{duration_ms}|ms", tags)
end
def histogram(metric_name, value, tags = {})
send_metric("#{metric_name}:#{value}|h", tags)
end
def time_block(metric_name, tags = {})
start_time = Time.now
result = yield
duration_ms = ((Time.now - start_time) * 1000).round(2)
timing(metric_name, duration_ms, tags)
result
end
private
def send_metric(metric_data, tags)
tagged_metric = tags.empty? ? metric_data : "#{metric_data}|##{format_tags(tags)}"
@socket.send(tagged_metric, 0, @statsd_host, @statsd_port)
rescue => error
# Log metric sending errors but don't fail application logic
puts "Metric send failed: #{error.message}"
end
def format_tags(tags)
tags.map { |k, v| "#{k}:#{v}" }.join(',')
end
end
# Application integration
class PaymentService
def initialize
@metrics = MetricsCollector.new
end
def process_payment(amount, payment_method)
@metrics.increment('payment.attempted', 1, {
method: payment_method,
amount_range: amount_range(amount)
})
result = @metrics.time_block('payment.processing_time', { method: payment_method }) do
charge_payment(amount, payment_method)
end
if result.success?
@metrics.increment('payment.successful')
@metrics.histogram('payment.amount', amount)
else
@metrics.increment('payment.failed', 1, {
error_type: result.error_code,
method: payment_method
})
end
result
end
private
def amount_range(amount)
case amount
when 0..999 then 'small'
when 1000..9999 then 'medium'
else 'large'
end
end
end
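Process-level gauges such as memory, GC activity, and thread count are usually pushed on a timer rather than per request. A sketch of a background reporter built on the collector above (the 10-second interval and metric names are arbitrary):

```ruby
class RuntimeMetricsReporter
  def initialize(metrics, interval: 10)
    @metrics = metrics
    @interval = interval
  end

  def start
    Thread.new do
      loop do
        @metrics.gauge('process.rss_kb', `ps -o rss= -p #{Process.pid}`.to_i)
        @metrics.gauge('process.threads', Thread.list.count)
        @metrics.gauge('gc.count', GC.count)
        @metrics.gauge('gc.heap_live_slots', GC.stat[:heap_live_slots])
        sleep @interval
      end
    end
  end
end

# RuntimeMetricsReporter.new(MetricsCollector.new).start
```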
Application Performance Monitoring (APM) tools provide detailed transaction tracing and performance analysis. Production deployments configure APM agents to collect detailed timing data while minimizing performance overhead.
# config/initializers/monitoring.rb
require 'newrelic_rpm'
require 'scout_apm'
# NewRelic custom instrumentation
module CustomInstrumentation
def self.included(base)
base.extend(ClassMethods)
end
module ClassMethods
def trace_execution(method_name, category: :custom)
alias_method "#{method_name}_without_tracing", method_name
define_method(method_name) do |*args, &block|
NewRelic::Agent::MethodTracer.trace_execution_scoped(
"Custom/#{self.class.name}/#{method_name}"
) do
send("#{method_name}_without_tracing", *args, &block)
end
end
end
end
end
class ReportGenerator
include CustomInstrumentation
def generate_monthly_report(user_id, month)
# Report generation logic
data = collect_user_data(user_id, month)
report = build_report(data)
deliver_report(report, user_id)
end
def collect_user_data(user_id, month)
# Data collection logic
NewRelic::Agent.record_metric('Custom/ReportData/UserCount', 1)
User.find(user_id).monthly_data(month)
end
trace_execution :generate_monthly_report
trace_execution :collect_user_data
end
Circuit breaker patterns prevent cascading failures by monitoring service health and stopping requests to failing services. Production applications implement circuit breakers around external service calls.
class CircuitBreaker
STATES = [:closed, :open, :half_open].freeze
def initialize(failure_threshold: 5, timeout: 60, success_threshold: 3)
@failure_threshold = failure_threshold
@timeout = timeout
@success_threshold = success_threshold
@failure_count = 0
@success_count = 0
@last_failure_time = nil
@state = :closed
@mutex = Mutex.new
end
def call
@mutex.synchronize do
case @state
when :open
if Time.now - @last_failure_time > @timeout
@state = :half_open
@success_count = 0
else
raise CircuitBreakerOpenError, "Circuit breaker is OPEN"
end
when :half_open
# Allow limited requests through
when :closed
# Normal operation
end
end
begin
result = yield
on_success
result
rescue => error
on_failure
raise
end
end
def state
@state
end
def metrics
{
state: @state,
failure_count: @failure_count,
success_count: @success_count,
last_failure_time: @last_failure_time
}
end
private
def on_success
@mutex.synchronize do
@failure_count = 0
if @state == :half_open
@success_count += 1
if @success_count >= @success_threshold
@state = :closed
end
end
end
end
def on_failure
@mutex.synchronize do
@failure_count += 1
@last_failure_time = Time.now
if @failure_count >= @failure_threshold
@state = :open
end
end
end
end
class ExternalApiClient
def initialize
@circuit_breaker = CircuitBreaker.new(failure_threshold: 3, timeout: 30)
end
def fetch_user_data(user_id)
@circuit_breaker.call do
# External API call
response = HTTP.get("https://api.example.com/users/#{user_id}")
JSON.parse(response.body)
end
rescue CircuitBreakerOpenError => error
# Return cached data or default response
fetch_cached_user_data(user_id)
end
end
class CircuitBreakerOpenError < StandardError; end
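The breaker's `metrics` hash becomes most useful when exported, so dashboards show when a dependency has tripped. A sketch that pushes the state through the `MetricsCollector` from earlier (the 0/1/2 state encoding and metric names are arbitrary):

```ruby
class InstrumentedApiClient < ExternalApiClient
  STATE_VALUES = { closed: 0, half_open: 1, open: 2 }.freeze

  def initialize(metrics = MetricsCollector.new)
    super()
    @metrics = metrics
  end

  def fetch_user_data(user_id)
    result = super
    report_breaker_state
    result
  end

  private

  def report_breaker_state
    # Export the breaker state and failure count as gauges after each call.
    snapshot = @circuit_breaker.metrics
    @metrics.gauge('external_api.circuit_state', STATE_VALUES[snapshot[:state]])
    @metrics.gauge('external_api.failure_count', snapshot[:failure_count])
  end
end
```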
Error Handling & Debugging
Error monitoring captures, categorizes, and reports application exceptions with contextual information for debugging. Ruby applications implement error handling through structured exception capture, automatic error reporting, and custom error classification systems.
Exception tracking systems collect error data including stack traces, request context, user information, and environment details. Production applications integrate with error monitoring services while maintaining custom error handling logic.
require 'sentry-ruby'
require 'json'
require 'socket'
require 'time'
class ErrorHandler
def initialize(logger, sentry_dsn = nil)
@logger = logger
setup_sentry(sentry_dsn) if sentry_dsn
end
def handle_error(error, context = {})
error_data = build_error_data(error, context)
# Log error locally
@logger.error("Application Error", error_data)
# Report to external service
report_to_sentry(error, context) if sentry_configured?
# Custom error processing
process_error_by_type(error, context)
error_data
end
def wrap_execution(context = {})
yield
rescue => error
handle_error(error, context)
raise unless context[:suppress_reraise]
end
private
def build_error_data(error, context)
{
error_class: error.class.name,
error_message: error.message,
backtrace: error.backtrace&.first(20),
context: context,
timestamp: Time.now.utc.iso8601,
process_id: Process.pid,
thread_id: Thread.current.object_id,
environment: {
ruby_version: RUBY_VERSION,
rails_env: ENV['RAILS_ENV'],
hostname: Socket.gethostname
}
}
end
def setup_sentry(dsn)
Sentry.init do |config|
config.dsn = dsn
config.breadcrumbs_logger = [:active_support_logger, :http_logger]
config.traces_sample_rate = 0.1
end
end
def report_to_sentry(error, context)
Sentry.with_scope do |scope|
scope.set_tags(context[:tags]) if context[:tags]
scope.set_user(context[:user]) if context[:user]
scope.set_context("request", context[:request]) if context[:request]
Sentry.capture_exception(error)
end
end
def process_error_by_type(error, context)
case error
when ActiveRecord::RecordNotFound
# Handle record not found errors
increment_metric('errors.not_found')
when Net::OpenTimeout, Net::ReadTimeout
# Handle timeout errors
increment_metric('errors.timeout')
alert_operations_team(error, context) if context[:critical]
when JSON::ParserError
# Handle JSON parsing errors
increment_metric('errors.json_parse')
else
# Handle unexpected errors
increment_metric('errors.unexpected')
alert_operations_team(error, context)
end
end
def sentry_configured?
defined?(Sentry) && Sentry.configuration.dsn
end
def increment_metric(metric_name)
# Placeholder for metrics system
puts "Metric incremented: #{metric_name}"
end
def alert_operations_team(error, context)
# Placeholder for alerting system
puts "ALERT: #{error.class.name} - #{error.message}"
end
end
# Application integration
class ApplicationController < ActionController::Base
before_action :setup_error_context
rescue_from StandardError, with: :handle_application_error
private
def setup_error_context
@error_handler = ErrorHandler.new($logger, ENV['SENTRY_DSN'])
@error_context = {
user: { id: current_user&.id, email: current_user&.email },
request: {
path: request.path,
method: request.method,
params: request.parameters,
user_agent: request.user_agent,
ip_address: request.remote_ip
},
tags: {
controller: controller_name,
action: action_name,
environment: Rails.env
}
}
end
def handle_application_error(error)
error_data = @error_handler.handle_error(error, @error_context)
case error
when ActiveRecord::RecordNotFound
render json: { error: 'Resource not found' }, status: 404
when ActionController::ParameterMissing
render json: { error: 'Missing required parameters' }, status: 400
else
render json: { error: 'Internal server error' }, status: 500
end
end
end
Custom exception classes provide structured error handling with specific error codes and recovery strategies. Applications define exception hierarchies that enable precise error handling and reporting.
module ApplicationErrors
class BaseError < StandardError
attr_reader :error_code, :context, :severity
def initialize(message, error_code: nil, context: {}, severity: :error)
super(message)
@error_code = error_code || self.class.default_error_code
@context = context
@severity = severity
end
def self.default_error_code
name.demodulize.underscore.upcase
end
def to_h
{
error_class: self.class.name,
error_code: error_code,
message: message,
context: context,
severity: severity,
timestamp: Time.now.utc.iso8601
}
end
def retryable?
false
end
end
class ValidationError < BaseError
def severity
:warning
end
end
class ExternalServiceError < BaseError
def retryable?
true
end
end
class PaymentError < BaseError
attr_reader :payment_id, :amount
def initialize(message, payment_id:, amount:, **options)
@payment_id = payment_id
@amount = amount
super(message, **options)
end
def context
super.merge(payment_id: payment_id, amount: amount)
end
end
class RateLimitError < ExternalServiceError
attr_reader :retry_after
def initialize(message, retry_after:, **options)
@retry_after = retry_after
super(message, **options)
end
def context
super.merge(retry_after: retry_after)
end
end
end
# Usage in application services
class PaymentProcessor
include ApplicationErrors
def process_payment(amount, card_token)
validate_payment_amount(amount)
result = charge_card(amount, card_token)
unless result.success?
raise PaymentError.new(
"Payment processing failed: #{result.error_message}",
payment_id: result.payment_id,
amount: amount,
error_code: result.error_code,
context: { card_last_four: card_token.last_four }
)
end
result
rescue Net::OpenTimeout, Net::ReadTimeout => error
raise ExternalServiceError.new(
"Payment gateway timeout",
error_code: 'GATEWAY_TIMEOUT',
context: { amount: amount, gateway: 'stripe' }
)
rescue JSON::ParserError => error
raise ExternalServiceError.new(
"Invalid response from payment gateway",
error_code: 'GATEWAY_INVALID_RESPONSE',
context: { amount: amount, parse_error: error.message }
)
end
private
def validate_payment_amount(amount)
if amount <= 0
raise ValidationError.new(
"Payment amount must be positive",
error_code: 'INVALID_AMOUNT',
context: { amount: amount }
)
end
if amount > 100_000_00 # $100,000 in cents
raise ValidationError.new(
"Payment amount exceeds maximum limit",
error_code: 'AMOUNT_TOO_LARGE',
context: { amount: amount, limit: 100_000_00 }
)
end
end
end
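The `retryable?` flag on `BaseError` can drive a generic retry helper, retrying transient gateway failures while letting validation errors surface immediately. A sketch with an arbitrary attempt count and linear backoff:

```ruby
module Retryable
  def with_retries(max_attempts: 3, base_delay: 0.5)
    attempts = 0
    begin
      yield
    rescue ApplicationErrors::BaseError => error
      attempts += 1
      raise unless error.retryable? && attempts < max_attempts

      sleep(base_delay * attempts) # back off a little more on each attempt
      retry
    end
  end
end

# Usage:
#   include Retryable
#   with_retries { PaymentProcessor.new.process_payment(5_000, card_token) }
```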
Debugging tools provide runtime inspection and error analysis capabilities. Production applications implement debugging interfaces that expose application state without compromising security.
require 'securerandom'
require 'socket'
require 'time'
class DebugInspector
def initialize(application_name)
@application_name = application_name
@debug_data = {}
@mutex = Mutex.new
end
def capture_state(identifier, data)
@mutex.synchronize do
@debug_data[identifier] = {
data: sanitize_data(data),
timestamp: Time.now.utc.iso8601,
thread_id: Thread.current.object_id
}
end
end
def inspect_error(error, context = {})
error_info = {
error_class: error.class.name,
error_message: error.message,
backtrace: sanitize_backtrace(error.backtrace),
context: sanitize_data(context),
environment_info: gather_environment_info,
memory_info: gather_memory_info,
thread_info: gather_thread_info
}
capture_state("error_#{SecureRandom.hex(8)}", error_info)
error_info
end
def generate_debug_report
@mutex.synchronize do
{
application: @application_name,
generated_at: Time.now.utc.iso8601,
runtime_info: {
ruby_version: RUBY_VERSION,
platform: RUBY_PLATFORM,
engine: RUBY_ENGINE
},
captured_states: @debug_data,
system_metrics: gather_system_metrics
}
end
end
def clear_debug_data
@mutex.synchronize do
@debug_data.clear
end
end
private
def sanitize_data(data)
case data
when Hash
data.each_with_object({}) do |(key, value), sanitized|
sanitized_key = key.to_s.downcase
if sensitive_key?(sanitized_key)
sanitized[key] = '[FILTERED]'
else
sanitized[key] = sanitize_data(value)
end
end
when Array
data.map { |item| sanitize_data(item) }
when String
data.length > 1000 ? "#{data[0..997]}..." : data
else
data
end
end
def sensitive_key?(key)
%w[password token secret key authorization].any? { |sensitive| key.include?(sensitive) }
end
def sanitize_backtrace(backtrace)
return [] unless backtrace
backtrace.first(25).map do |line|
# Remove absolute paths for security
line.gsub(Dir.pwd, '[APP_ROOT]')
end
end
def gather_environment_info
{
hostname: Socket.gethostname,
process_id: Process.pid,
parent_process_id: Process.ppid,
user_id: Process.uid,
working_directory: Dir.pwd
}
end
def gather_memory_info
{
object_count: ObjectSpace.count_objects,
gc_stats: GC.stat,
process_memory_kb: `ps -o rss= -p #{Process.pid}`.to_i
}
rescue
{ error: 'Memory info unavailable' }
end
def gather_thread_info
{
thread_count: Thread.list.count,
main_thread_alive: Thread.main.alive?,
current_thread_priority: Thread.current.priority
}
end
def gather_system_metrics
  {
    load_average: (`uptime`.match(/load average: (.+)$/)[1] rescue 'unavailable'),
    disk_usage: (`df -h /`.lines.last.split rescue ['unavailable']),
    timestamp: Time.now.utc.iso8601
  }
end
end
# Integration with error handler
class ProductionErrorHandler < ErrorHandler
def initialize(logger, sentry_dsn = nil)
super(logger, sentry_dsn)
@debug_inspector = DebugInspector.new('production-app')
end
def handle_error(error, context = {})
# Capture debug state
debug_info = @debug_inspector.inspect_error(error, context)
# Enhanced context with debug info
enhanced_context = context.merge(debug_session_id: debug_info[:debug_session_id])
# Call parent error handling
super(error, enhanced_context)
end
def generate_incident_report(incident_id)
{
incident_id: incident_id,
debug_report: @debug_inspector.generate_debug_report,
generated_at: Time.now.utc.iso8601
}
end
end
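Background jobs are a natural place for `wrap_execution`, since exceptions raised there never reach a controller's `rescue_from`. A sketch of a Sidekiq-style worker (`ReportGenerationJob` and `Report` are illustrative names):

```ruby
class ReportGenerationJob
  def perform(report_id)
    handler = ProductionErrorHandler.new($logger, ENV['SENTRY_DSN'])

    # Any exception is logged, reported, and re-raised so the job can be retried.
    handler.wrap_execution(tags: { job: self.class.name }, request: { report_id: report_id }) do
      Report.find(report_id).generate! # illustrative domain call
    end
  end
end
```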
Performance & Memory
Performance monitoring tracks application response times, throughput, resource utilization, and bottleneck identification. Ruby applications implement performance measurement through timing instrumentation, profiling integration, and resource monitoring systems.
Timing measurement captures method execution duration, request processing time, and external service response times. Applications use Ruby's `Benchmark` module and custom timing collectors to gather performance data.
require 'benchmark'
require 'json'
require 'time'
class PerformanceMonitor
def initialize
@measurements = []
@mutex = Mutex.new
@gc_stats_before = nil
end
def measure_execution(operation_name, metadata = {})
gc_stats_before = GC.stat
memory_before = memory_usage_kb
result = nil
benchmark = Benchmark.measure do
result = yield
end
gc_stats_after = GC.stat
memory_after = memory_usage_kb
measurement = {
operation: operation_name,
duration_seconds: benchmark.real,
cpu_time_seconds: benchmark.total,
user_time_seconds: benchmark.utime,
system_time_seconds: benchmark.stime,
memory_delta_kb: memory_after - memory_before,
gc_runs: gc_stats_after[:count] - gc_stats_before[:count],
objects_allocated: gc_stats_after[:total_allocated_objects] - gc_stats_before[:total_allocated_objects],
timestamp: Time.now.utc.iso8601,
metadata: metadata
}
record_measurement(measurement)
result
end
def measure_memory_allocation(&block)
gc_disable = GC.disable
before_stats = GC.stat
memory_before = memory_usage_kb
result = yield
after_stats = GC.stat
memory_after = memory_usage_kb
{
result: result,
objects_allocated: after_stats[:total_allocated_objects] - before_stats[:total_allocated_objects],
memory_allocated_kb: memory_after - memory_before,
gc_disabled: gc_disable
}
ensure
GC.enable unless gc_disable
end
def benchmark_comparison(operations = {})
results = {}
operations.each do |name, operation|
durations = Array.new(5) do
  # measure_execution returns the block's result, so read the timing back
  # from the most recently recorded measurement
  measure_execution("benchmark_#{name}") { operation.call }
  @mutex.synchronize { @measurements.last[:duration_seconds] }
end
results[name] = {
min: durations.min,
max: durations.max,
avg: durations.sum / durations.length,
median: durations.sort[durations.length / 2],
iterations: durations.length
}
end
results
end
def performance_report(time_range = 3600)
cutoff_time = Time.now - time_range
recent_measurements = @measurements.select do |m|
Time.parse(m[:timestamp]) >= cutoff_time
end
operations = recent_measurements.group_by { |m| m[:operation] }
{
report_period: time_range,
measurement_count: recent_measurements.length,
operations: operations.transform_values do |measurements|
durations = measurements.map { |m| m[:duration_seconds] }
memory_deltas = measurements.map { |m| m[:memory_delta_kb] }
{
call_count: measurements.length,
avg_duration: durations.sum / durations.length,
p95_duration: percentile(durations, 95),
p99_duration: percentile(durations, 99),
max_duration: durations.max,
avg_memory_delta: memory_deltas.sum / memory_deltas.length,
total_objects_allocated: measurements.sum { |m| m[:objects_allocated] }
}
end
}
end
private
def record_measurement(measurement)
@mutex.synchronize do
@measurements << measurement
# Keep only recent measurements to prevent memory growth
@measurements = @measurements.last(10_000) if @measurements.length > 12_000
end
end
def memory_usage_kb
`ps -o rss= -p #{Process.pid}`.to_i
rescue
0
end
def percentile(values, percentile)
return 0 if values.empty?
sorted = values.sort
index = (percentile / 100.0 * sorted.length).ceil - 1
sorted[index]
end
end
# Application integration
class UserService
def initialize
@performance_monitor = PerformanceMonitor.new
end
def create_user_with_profile(user_data, profile_data)
@performance_monitor.measure_execution('user_creation', {
has_profile: !profile_data.empty?,
user_attributes_count: user_data.keys.length
}) do
ActiveRecord::Base.transaction do
user = create_user(user_data)
profile = create_profile(user, profile_data) unless profile_data.empty?
send_welcome_email(user)
{ user: user, profile: profile }
end
end
end
def bulk_import_users(users_data)
comparison_results = @performance_monitor.benchmark_comparison({
individual_creates: -> { users_data.each { |data| User.create(data) } },
bulk_insert: -> { User.insert_all(users_data) },
activerecord_import: -> { User.import(users_data.map { |data| User.new(data) }) }
})
puts "Bulk import performance comparison:"
comparison_results.each do |method, stats|
puts "#{method}: #{stats[:avg]}s average (#{stats[:iterations]} runs)"
end
# Report the fastest strategy. The benchmark above has already imported the
# data once per approach, so in practice this comparison belongs in a staging
# run; a real import would then execute only the winning strategy.
fastest_method = comparison_results.min_by { |_, stats| stats[:avg] }.first
puts "Fastest bulk import strategy: #{fastest_method}"
fastest_method
end
private
def create_user(user_data)
User.create!(user_data)
end
def create_profile(user, profile_data)
user.create_profile!(profile_data)
end
def send_welcome_email(user)
WelcomeMailer.welcome_email(user).deliver_now
end
end
Memory profiling identifies memory leaks, object allocation patterns, and garbage collection performance. Ruby applications use memory profiling tools and custom instrumentation to monitor memory usage.
require 'objspace'
require 'time'
class MemoryProfiler
def initialize
@snapshots = {}
@allocation_tracking = false
end
def take_snapshot(name)
ObjectSpace.trace_object_allocations_start unless @allocation_tracking
@allocation_tracking = true
snapshot = {
timestamp: Time.now.utc.iso8601,
object_counts: ObjectSpace.count_objects,
gc_stats: GC.stat,
memory_usage_kb: memory_usage_kb,
object_allocations: sample_object_allocations,
largest_objects: find_largest_objects
}
@snapshots[name] = snapshot
snapshot
end
def compare_snapshots(before_name, after_name)
before = @snapshots[before_name]
after = @snapshots[after_name]
return nil unless before && after
{
time_elapsed: Time.parse(after[:timestamp]) - Time.parse(before[:timestamp]),
memory_delta_kb: after[:memory_usage_kb] - before[:memory_usage_kb],
object_count_deltas: calculate_object_deltas(before[:object_counts], after[:object_counts]),
gc_stats_deltas: calculate_gc_deltas(before[:gc_stats], after[:gc_stats]),
new_allocations: after[:object_allocations] - before[:object_allocations]
}
end
def profile_memory_allocation(&block)
ObjectSpace.trace_object_allocations_start
before_snapshot = take_heap_snapshot
result = yield
after_snapshot = take_heap_snapshot
ObjectSpace.trace_object_allocations_stop # stop tracing; recorded allocation data is retained
{
result: result,
before_snapshot: before_snapshot,
after_snapshot: after_snapshot,
allocations_by_class: group_allocations_by_class,
allocations_by_location: group_allocations_by_location,
total_allocated: after_snapshot[:total_objects] - before_snapshot[:total_objects]
}
end
def identify_memory_leaks(threshold_snapshots = 5)
return [] if @snapshots.length < threshold_snapshots
sorted_snapshots = @snapshots.values.sort_by { |s| Time.parse(s[:timestamp]) }
memory_trend = sorted_snapshots.map { |s| s[:memory_usage_kb] }
# Calculate memory growth trend
growth_rate = calculate_growth_rate(memory_trend)
leaks = []
# Check for consistent memory growth
if growth_rate > 1000 # 1MB per snapshot threshold
leaks << {
type: :consistent_growth,
growth_rate_kb_per_snapshot: growth_rate,
severity: growth_rate > 10_000 ? :critical : :warning
}
end
# Check for object count increases
object_trends = calculate_object_trends(sorted_snapshots)
object_trends.each do |klass, growth|
if growth > 10000 # 10k objects threshold
leaks << {
type: :object_accumulation,
object_class: klass,
growth_count: growth,
severity: growth > 100_000 ? :critical : :warning
}
end
end
leaks
end
def generate_memory_report
return {} if @snapshots.empty?
latest = @snapshots.values.max_by { |s| Time.parse(s[:timestamp]) }
{
current_memory_kb: latest[:memory_usage_kb],
total_objects: latest[:object_counts][:TOTAL],
gc_statistics: latest[:gc_stats],
top_object_classes: latest[:object_counts]
.reject { |k, _| k == :TOTAL }
.sort_by { |_, count| -count }
.first(10)
.to_h,
snapshot_count: @snapshots.length,
memory_leaks: identify_memory_leaks,
recommendations: generate_recommendations(latest)
}
end
private
def memory_usage_kb
`ps -o rss= -p #{Process.pid}`.to_i
rescue
0
end
def sample_object_allocations(sample_size = 1000)
return 0 unless @allocation_tracking
all_objects = ObjectSpace.each_object.to_a.sample(sample_size)
all_objects.count { |obj| ObjectSpace.allocation_sourcefile(obj) }
end
def find_largest_objects(count = 10)
largest = []
ObjectSpace.each_object do |obj|
size = ObjectSpace.memsize_of(obj)
if largest.length < count || size > largest.last[:size]
largest << {
class: obj.class.name,
size: size,
object_id: obj.object_id
}
largest.sort_by! { |o| -o[:size] }
largest.pop if largest.length > count
end
end
largest
rescue
[]
end
def take_heap_snapshot
{
timestamp: Time.now.utc.iso8601,
total_objects: ObjectSpace.count_objects[:TOTAL],
memory_kb: memory_usage_kb,
gc_count: GC.count
}
end
def group_allocations_by_class
allocations = {}
ObjectSpace.each_object do |obj|
klass = obj.class.name
allocations[klass] ||= 0
allocations[klass] += 1
end
allocations.sort_by { |_, count| -count }.first(20).to_h
rescue
{}
end
def group_allocations_by_location
  # Group live objects by the file:line that allocated them; only meaningful
  # after trace_object_allocations_start has been enabled.
  locations = Hash.new(0)
  ObjectSpace.each_object do |obj|
    file = ObjectSpace.allocation_sourcefile(obj)
    next unless file
    locations["#{file}:#{ObjectSpace.allocation_sourceline(obj)}"] += 1
  end
  locations.sort_by { |_, count| -count }.first(20).to_h
rescue
  {}
end
def calculate_object_deltas(before, after)
deltas = {}
(before.keys + after.keys).uniq.each do |key|
before_count = before[key] || 0
after_count = after[key] || 0
delta = after_count - before_count
deltas[key] = delta if delta != 0
end
deltas
end
def calculate_gc_deltas(before, after)
  {
    count: after[:count] - before[:count],
    # GC.stat exposes cumulative GC time (milliseconds) under :time on Ruby 3.1+
    time_ms: after.fetch(:time, 0) - before.fetch(:time, 0),
    major_gc_count: after[:major_gc_count] - before[:major_gc_count],
    minor_gc_count: after[:minor_gc_count] - before[:minor_gc_count]
  }
end
def calculate_growth_rate(memory_values)
return 0 if memory_values.length < 2
total_growth = memory_values.last - memory_values.first
snapshots = memory_values.length - 1
total_growth.to_f / snapshots
end
def calculate_object_trends(snapshots)
return {} if snapshots.length < 2
first_counts = snapshots.first[:object_counts]
last_counts = snapshots.last[:object_counts]
trends = {}
(first_counts.keys + last_counts.keys).uniq.each do |klass|
first_count = first_counts[klass] || 0
last_count = last_counts[klass] || 0
growth = last_count - first_count
trends[klass] = growth if growth > 0
end
trends
end
def generate_recommendations(snapshot)
recommendations = []
# High memory usage
if snapshot[:memory_usage_kb] > 500_000 # 500MB
recommendations << "Consider reducing memory usage - current usage is #{snapshot[:memory_usage_kb] / 1024}MB"
end
# High object counts
high_count_classes = snapshot[:object_counts].select { |k, v| v > 100_000 && k != :TOTAL }
unless high_count_classes.empty?
recommendations << "High object counts detected: #{high_count_classes.keys.join(', ')}"
end
# GC pressure
if snapshot[:gc_stats][:minor_gc_count] > 1000
recommendations << "High GC pressure detected - consider object pooling or reducing allocations"
end
recommendations
end
end
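Snapshot comparison is the simplest way to use the profiler: capture one snapshot before and one after a suspect operation, then diff them. A sketch (`CsvImporter` is an illustrative workload):

```ruby
profiler = MemoryProfiler.new

profiler.take_snapshot(:before_import)
CsvImporter.new('users.csv').import!
profiler.take_snapshot(:after_import)

diff = profiler.compare_snapshots(:before_import, :after_import)
puts "Memory delta: #{diff[:memory_delta_kb]} KB"
puts "GC runs during import: #{diff[:gc_stats_deltas][:count]}"

# Taking snapshots on a timer (for example once a minute in a background
# thread) feeds identify_memory_leaks and generate_memory_report.
```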
Database query performance monitoring tracks SQL execution times, N+1 query detection, and database connection utilization. Applications instrument database operations to identify performance bottlenecks.
require 'digest'
require 'set'
class DatabasePerformanceMonitor
def initialize
@query_stats = {}
@slow_query_threshold = 1.0 # 1 second
@n_plus_one_detector = NPlusOneDetector.new
@connection_pool_monitor = ConnectionPoolMonitor.new
end
def monitor_query(sql, binds = [])
normalized_sql = normalize_sql(sql)
query_id = generate_query_id(normalized_sql)
start_time = Time.now
connection_info = capture_connection_info
result = yield
execution_time = Time.now - start_time
record_query_stats(query_id, normalized_sql, execution_time, binds.length, connection_info)
detect_n_plus_one(normalized_sql)
if execution_time > @slow_query_threshold
log_slow_query(normalized_sql, execution_time, binds)
end
result
end
def analyze_query_patterns(time_window = 3600)
cutoff = Time.now - time_window
recent_stats = @query_stats.select { |_, stats| stats[:last_executed] >= cutoff }
{
total_queries: recent_stats.sum { |_, stats| stats[:count] },
unique_queries: recent_stats.length,
slow_queries: recent_stats.count { |_, stats| stats[:avg_time] > @slow_query_threshold },
most_frequent: recent_stats.max_by { |_, stats| stats[:count] },
slowest_average: recent_stats.max_by { |_, stats| stats[:avg_time] },
n_plus_one_occurrences: @n_plus_one_detector.detected_patterns.length,
connection_pool_stats: @connection_pool_monitor.current_stats
}
end
def generate_optimization_suggestions
suggestions = []
# Identify frequently executed slow queries
slow_frequent_queries = @query_stats.select do |_, stats|
stats[:count] > 100 && stats[:avg_time] > 0.5
end
slow_frequent_queries.each do |query_id, stats|
suggestions << {
type: :slow_frequent_query,
query_pattern: stats[:sql_pattern],
count: stats[:count],
avg_time: stats[:avg_time],
recommendation: "Consider adding database indexes or optimizing this frequently executed slow query"
}
end
# Check for N+1 query patterns
@n_plus_one_detector.detected_patterns.each do |pattern|
suggestions << {
type: :n_plus_one,
pattern: pattern[:pattern],
occurrences: pattern[:count],
recommendation: "Use includes/joins to eager load associations and eliminate N+1 queries"
}
end
# Connection pool utilization
pool_stats = @connection_pool_monitor.current_stats
if pool_stats[:utilization] > 0.8
suggestions << {
type: :connection_pool_pressure,
utilization: pool_stats[:utilization],
recommendation: "Consider increasing connection pool size or optimizing long-running queries"
}
end
suggestions
end
private
def normalize_sql(sql)
# Remove literal values and normalize whitespace
sql.gsub(/\$\d+|\?|'[^']*'|\d+/, '?')
.gsub(/\s+/, ' ')
.strip
.upcase
end
def generate_query_id(normalized_sql)
Digest::MD5.hexdigest(normalized_sql)[0, 12]
end
def record_query_stats(query_id, sql, execution_time, bind_count, connection_info)
@query_stats[query_id] ||= {
sql_pattern: sql,
count: 0,
total_time: 0.0,
avg_time: 0.0,
min_time: Float::INFINITY,
max_time: 0.0,
bind_count: bind_count,
first_executed: Time.now,
last_executed: Time.now
}
stats = @query_stats[query_id]
stats[:count] += 1
stats[:total_time] += execution_time
stats[:avg_time] = stats[:total_time] / stats[:count]
stats[:min_time] = [stats[:min_time], execution_time].min
stats[:max_time] = [stats[:max_time], execution_time].max
stats[:last_executed] = Time.now
end
def capture_connection_info
pool = ActiveRecord::Base.connection_pool
busy = pool.connections.count(&:in_use?)
{
  pool_size: pool.size,
  active_connections: busy,
  available_connections: pool.size - busy
}
rescue
{ error: 'Unable to capture connection info' }
end
def detect_n_plus_one(sql)
@n_plus_one_detector.analyze_query(sql, caller_locations(3, 5))
end
def log_slow_query(sql, execution_time, binds)
puts "SLOW QUERY (#{execution_time.round(3)}s): #{sql}"
puts "Binds: #{binds.inspect}" unless binds.empty?
puts "Backtrace: #{caller_locations(3, 3).map(&:to_s).join("\n ")}"
end
end
class NPlusOneDetector
def initialize
@query_patterns = {}
@request_queries = []
@detection_threshold = 10
end
def analyze_query(sql, caller_info)
return unless sql.match?(/SELECT.*FROM.*WHERE.*=\s*\?/i)
pattern = extract_pattern(sql)
location = caller_info&.first&.to_s
@request_queries << {
pattern: pattern,
location: location,
timestamp: Time.now
}
# Check for repeated patterns
recent_queries = @request_queries.last(50)
pattern_count = recent_queries.count { |q| q[:pattern] == pattern }
if pattern_count >= @detection_threshold
record_n_plus_one(pattern, location, pattern_count)
end
end
def detected_patterns
@query_patterns.values
end
def reset_request_queries
@request_queries.clear
end
private
def extract_pattern(sql)
# Extract table name and general query structure
sql.gsub(/\s+/, ' ')
.gsub(/'[^']*'|\d+/, '?')
.strip
end
def record_n_plus_one(pattern, location, count)
@query_patterns[pattern] ||= {
pattern: pattern,
first_detected: Time.now,
locations: Set.new,
count: 0
}
@query_patterns[pattern][:locations] << location if location
@query_patterns[pattern][:count] += 1
@query_patterns[pattern][:last_detected] = Time.now
end
end
class ConnectionPoolMonitor
def current_stats
pool = ActiveRecord::Base.connection_pool
stats = pool.stat
{
  size: stats[:size],
  checked_out: stats[:busy],
  idle: stats[:idle],
  waiting: stats[:waiting],
  utilization: stats[:busy].to_f / stats[:size]
}
rescue => error
{ error: error.message }
end
end
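Wiring the monitor in means wrapping query execution in the `monitor_query` block. A sketch of direct usage and periodic reporting (the SQL and the 15-minute window are illustrative):

```ruby
db_monitor = DatabasePerformanceMonitor.new

sql = 'SELECT * FROM users WHERE active = true LIMIT 10'
users = db_monitor.monitor_query(sql) do
  ActiveRecord::Base.connection.exec_query(sql)
end

# Periodically, or at the end of a request:
summary = db_monitor.analyze_query_patterns(900)
puts "#{summary[:total_queries]} queries, #{summary[:slow_queries]} slow"
db_monitor.generate_optimization_suggestions.each { |s| puts s[:recommendation] }
```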
Reference
Core Classes and Modules
Class/Module | Purpose | Key Methods |
---|---|---|
`Logger` | Standard logging functionality | `#info`, `#error`, `#warn`, `#debug`, `#fatal` |
`Benchmark` | Performance measurement | `.measure`, `.realtime`, `.bm`, `.bmbm` |
`TracePoint` | Runtime tracing and instrumentation | `.new`, `#enable`, `#disable`, `#event`, `#defined_class` |
`ObjectSpace` | Object and memory inspection | `.count_objects`, `.each_object`, `.trace_object_allocations_start` |
`GC` | Garbage collector interface | `.stat`, `.count`, `.disable`, `.enable`, `.start` |
Monitoring Patterns
Pattern | Use Case | Implementation |
---|---|---|
Method Wrapping | Instrument existing methods | `Module#prepend`, `alias_method`, `define_method` |
Circuit Breaker | Prevent cascading failures | State machine with failure thresholds |
Health Checks | Service availability monitoring | HTTP endpoints with dependency checks |
Metric Collection | Quantitative data gathering | Custom collectors with time-series data |
Error Handling | Exception tracking and reporting | Structured exception capture with context |
Popular Monitoring Gems
Gem | Purpose | Configuration | Key Features |
---|---|---|---|
`newrelic_rpm` | Application Performance Monitoring | `config/newrelic.yml` | Transaction tracing, custom metrics, error tracking |
`scout_apm` | Lightweight APM | `config/scout_apm.yml` | Performance monitoring, memory tracking, N+1 detection |
`sentry-ruby` | Error tracking | `Sentry.init` block | Exception capture, breadcrumbs, release tracking |
`datadog` | Infrastructure monitoring | `Datadog.configure` | Metrics, traces, logs, APM integration |
`elastic-apm` | Elastic Stack APM | `config/elastic_apm.yml` | Distributed tracing, metrics, error tracking |
Logger Configuration
Level | Numeric Value | Usage | Output Includes |
---|---|---|---|
`FATAL` | 4 | System unusable | Process termination errors |
`ERROR` | 3 | Error conditions | Exceptions, failures |
`WARN` | 2 | Warning conditions | Deprecated usage, recoverable errors |
`INFO` | 1 | Informational | Request logs, business events |
`DEBUG` | 0 | Debug information | Variable states, execution flow |
Benchmark Methods
Method | Returns | Description |
---|---|---|
`Benchmark.measure` | `Benchmark::Tms` | Single operation timing |
`Benchmark.realtime` | `Float` | Wall clock time only |
`Benchmark.bm` | `Array` | Multiple operation comparison |
`Benchmark.bmbm` | `Array` | Rehearsal + measurement |
TracePoint Events
Event | Triggers | Available Data |
---|---|---|
`:call` | Method calls | `#defined_class`, `#method_id`, `#parameters` |
`:return` | Method returns | `#return_value`, `#defined_class`, `#method_id` |
`:c_call` | C method calls | `#defined_class`, `#method_id` |
`:raise` | Exception raising | `#raised_exception` |
`:line` | Line execution | `#lineno`, `#path` |
ObjectSpace Methods
Method | Returns | Description |
---|---|---|
`ObjectSpace.count_objects` | `Hash` | Object counts by internal type |
`ObjectSpace.count_objects_size` | `Hash` | Memory size by object type |
`ObjectSpace.memsize_of(obj)` | `Integer` | Memory size of a specific object |
`ObjectSpace.trace_object_allocations_start` | `nil` | Begin allocation tracking |
`ObjectSpace.allocation_sourcefile(obj)` | `String` | Source file where the object was allocated |
GC Statistics
Stat Key | Type | Description |
---|---|---|
`:count` | Integer | Total GC runs |
`:time` | Integer | Total GC time in milliseconds (Ruby 3.1+) |
`:heap_allocated_pages` | Integer | Allocated heap pages |
`:heap_available_slots` | Integer | Available object slots |
`:heap_live_slots` | Integer | Live objects count |
`:total_allocated_objects` | Integer | Total objects allocated |
`:major_gc_count` | Integer | Major GC runs |
`:minor_gc_count` | Integer | Minor GC runs |
Error Handling Patterns
# Basic error handling with context
begin
risky_operation
rescue SpecificError => error
logger.error("Operation failed", {
error: error.class.name,
message: error.message,
context: operation_context
})
handle_specific_error(error)
rescue StandardError => error
logger.error("Unexpected error", error_details(error))
raise
ensure
cleanup_resources
end
# Circuit breaker implementation
circuit_breaker = CircuitBreaker.new(
failure_threshold: 5,
timeout: 60,
success_threshold: 3
)
begin
result = circuit_breaker.call { external_service.call }
rescue CircuitBreakerOpenError
fallback_response
end
# Custom exception with structured data
class BusinessLogicError < StandardError
attr_reader :error_code, :details
def initialize(message, error_code:, details: {})
super(message)
@error_code = error_code
@details = details
end
def to_h
{
message: message,
error_code: error_code,
details: details,
backtrace: backtrace&.first(5)
}
end
end
Performance Monitoring Setup
# Method-level timing
class TimedMethods
def self.included(base)
base.extend(ClassMethods)
end
module ClassMethods
def time_method(method_name, options = {})
alias_method "#{method_name}_untimed", method_name
define_method(method_name) do |*args, &block|
start_time = Time.now
result = send("#{method_name}_untimed", *args, &block)
duration = Time.now - start_time
log_timing(method_name, duration, options)
result
end
end
end
end
# Memory allocation tracking
def track_allocations(&block)
ObjectSpace.trace_object_allocations_start
before_allocations = ObjectSpace.count_objects
result = yield
after_allocations = ObjectSpace.count_objects
ObjectSpace.trace_object_allocations_stop
{
result: result,
allocations: calculate_allocation_delta(before_allocations, after_allocations)
}
end
# Database query monitoring
ActiveSupport::Notifications.subscribe('sql.active_record') do |name, start, finish, id, payload|
duration = finish - start
if duration > 1.0 # Log slow queries
logger.warn("Slow Query", {
sql: payload[:sql],
duration: duration,
name: payload[:name]
})
end
end
Health Check Endpoint Pattern
class HealthController < ApplicationController
def show
checks = perform_health_checks
status = checks.all? { |_, check| check[:healthy] } ? :ok : :service_unavailable
render json: {
status: status == :ok ? 'healthy' : 'unhealthy',
checks: checks,
timestamp: Time.current.iso8601,
version: Rails.version
}, status: status
end
private
def perform_health_checks
{
database: check_database_connection,
redis: check_redis_connection,
external_api: check_external_services,
disk_space: check_disk_space
}
end
def check_database_connection
ActiveRecord::Base.connection.execute('SELECT 1')
{ healthy: true, response_time: measure_response_time }
rescue => error
{ healthy: false, error: error.message }
end
end