Overview
Memoization stores the result of expensive function calls and returns the cached result when the same inputs occur again. In Ruby, memoization is typically implemented with instance variables that cache computed values on first access, so a value is checked before it is computed and redundant calculations are avoided.
Ruby's memoization pattern uses the `||=` operator to assign values conditionally. This operator assigns a value only if the variable is nil or false. The pattern `@variable ||= expensive_computation` executes the computation once and returns the cached result on subsequent calls.
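Note that `a ||= b` expands to `a || (a = b)`, not `a = a || b`: when the variable already holds a truthy value, no assignment happens at all. A minimal illustration:

@value = nil
@value ||= 42 # assigns: @value was nil
@value ||= 99 # skipped: @value is already truthy
@value # => 42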
class Calculator
  def fibonacci(n)
    @fibonacci ||= {}
    @fibonacci[n] ||= n <= 1 ? n : fibonacci(n - 1) + fibonacci(n - 2)
  end
end

calc = Calculator.new
calc.fibonacci(10) # Computes and caches
# => 55
calc.fibonacci(10) # Returns cached value
# => 55
Instance variables persist for the object's lifetime, making them suitable for caching across method calls. Class variables and class instance variables extend memoization to class-level scope, sharing cached values across all instances.
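A minimal sketch of the class-level variant (the AppSettings class is hypothetical): a class instance variable lives on the class object itself, so every instance observes the same cache.

class AppSettings
  def self.defaults
    # @defaults here is a class instance variable: one cache per class
    @defaults ||= { locale: 'en', per_page: 25 }
  end
end

AppSettings.defaults.equal?(AppSettings.defaults) # => true, same cached hash

The next example returns to instance-level caching, where each object maintains its own cache.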
class DataProcessor
  def process_large_dataset
    @processed_data ||= expensive_database_query.map(&:upcase)
  end

  private

  def expensive_database_query
    # Simulates expensive operation
    (1..1000).map { |i| "Record #{i}" }
  end
end
Memoization works best for computationally expensive operations with consistent input-output relationships. Methods that access external resources, perform complex calculations, or transform large datasets benefit from caching results.
Basic Usage
The simplest memoization pattern caches method return values in instance variables. The `||=` operator checks whether the instance variable already holds a value and assigns the result of the computation when the variable is nil.
class WeatherService
  def current_temperature
    @temperature ||= fetch_from_api
  end

  private

  def fetch_from_api
    # Expensive API call
    sleep(0.5) # Simulates network delay
    72.5
  end
end

service = WeatherService.new
service.current_temperature # Takes 0.5 seconds
# => 72.5
service.current_temperature # Returns immediately
# => 72.5
Hash-based memoization handles multiple inputs by using hash keys to distinguish between different parameter combinations. This approach scales to methods accepting various arguments.
class MathOperations
  def initialize
    @squares = {}
    @powers = {}
  end

  def square(number)
    @squares[number] ||= number * number
  end

  def power(base, exponent)
    key = "#{base}_#{exponent}"
    @powers[key] ||= base ** exponent
  end
end

math = MathOperations.new
math.square(5)    # => 25
math.power(2, 10) # => 1024
math.square(5)    # Returns cached result
Methods with complex return values require careful consideration of what constitutes the same input. Arrays, hashes, and objects need consistent key generation to ensure proper cache hits.
class DataAnalyzer
  def initialize
    @analysis_cache = {}
  end

  def analyze_dataset(dataset_id, options = {})
    cache_key = generate_key(dataset_id, options)
    @analysis_cache[cache_key] ||= perform_analysis(dataset_id, options)
  end

  private

  def generate_key(dataset_id, options)
    # Sorting the options makes the key independent of insertion order
    "#{dataset_id}:#{options.sort.to_h}"
  end

  def perform_analysis(dataset_id, options)
    {
      mean: calculate_mean(dataset_id),
      variance: calculate_variance(dataset_id),
      filtered: options[:filter] || false
    }
  end

  def calculate_mean(dataset_id)
    # Placeholder for a real statistics computation
    42.0
  end

  def calculate_variance(dataset_id)
    # Placeholder for a real statistics computation
    7.5
  end
end
Class-level memoization shares cached values across all instances using class instance variables. This approach reduces memory usage when multiple instances need the same computed values.
require 'yaml'

class ConfigurationLoader
  class << self
    def database_config
      @database_config ||= YAML.load_file('config/database.yml')
    end

    def redis_config
      @redis_config ||= YAML.load_file('config/redis.yml')
    end
  end
end

# Shared across all instances
ConfigurationLoader.database_config
ConfigurationLoader.redis_config
Advanced Usage
Complex memoization scenarios require sophisticated caching strategies that handle multiple parameters, nested calls, and dynamic cache invalidation. Advanced patterns use metaprogramming to generate memoized methods automatically.
module Memoizable
  def memoize(method_name)
    original_method = instance_method(method_name)
    cache_var = "@#{method_name}_cache"

    define_method(method_name) do |*args|
      cache = instance_variable_get(cache_var) || instance_variable_set(cache_var, {})
      cache_key = args.empty? ? :no_args : args
      cache[cache_key] ||= original_method.bind(self).call(*args)
    end
  end
end

class ComplexCalculator
  extend Memoizable

  def fibonacci_sequence(length)
    (0...length).map { |i| fibonacci(i) }
  end
  memoize :fibonacci_sequence

  def prime_factors(number)
    return [] if number <= 1

    factors = []
    divisor = 2
    while divisor * divisor <= number
      while number % divisor == 0
        factors << divisor
        number /= divisor
      end
      divisor += 1
    end
    factors << number if number > 1
    factors
  end
  memoize :prime_factors

  private

  # Helper used by fibonacci_sequence (not memoized itself)
  def fibonacci(n)
    n <= 1 ? n : fibonacci(n - 1) + fibonacci(n - 2)
  end
end
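Usage mirrors the manual pattern; the generated wrapper keeps a per-method cache keyed by argument list:

calc = ComplexCalculator.new
calc.prime_factors(360)     # computed => [2, 2, 2, 3, 3, 5]
calc.prime_factors(360)     # served from @prime_factors_cache
calc.fibonacci_sequence(10) # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]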
Conditional memoization applies caching based on runtime conditions. This pattern prevents caching inappropriate values while maintaining performance benefits for suitable scenarios.
class AdaptiveCache
  def initialize
    @computation_cache = {}
  end

  def expensive_computation(input)
    # Only cache for large inputs
    if should_cache?(input)
      @computation_cache[input] ||= perform_computation(input)
    else
      perform_computation(input)
    end
  end

  private

  def should_cache?(input)
    input.respond_to?(:size) && input.size > 100
  end

  def perform_computation(input)
    # Expensive operation here
    input.map(&:upcase).sort.join
  end
end
Hierarchical memoization creates cache dependencies where invalidating parent caches cascades to dependent child caches. This approach maintains consistency across related computations.
class ReportGenerator
  def initialize
    @raw_data_cache = {}
    @processed_cache = {}
    @report_cache = {}
  end

  def generate_report(data_source, filters)
    cache_key = build_key(data_source, filters)
    @report_cache[cache_key] ||= begin
      processed = process_data(data_source, filters)
      format_report(processed)
    end
  end

  def invalidate_cache(data_source = nil)
    if data_source
      @raw_data_cache.delete(data_source)
      @processed_cache.delete_if { |key, _| key.start_with?(data_source.to_s) }
      @report_cache.delete_if { |key, _| key.include?(data_source.to_s) }
    else
      @raw_data_cache.clear
      @processed_cache.clear
      @report_cache.clear
    end
  end

  private

  def build_key(data_source, filters)
    "#{data_source}:#{filters.hash}"
  end

  def process_data(data_source, filters)
    cache_key = build_key(data_source, filters)
    @processed_cache[cache_key] ||= begin
      raw_data = fetch_raw_data(data_source)
      apply_filters(raw_data, filters)
    end
  end

  def fetch_raw_data(data_source)
    # DatabaseConnection stands in for a real data-access layer
    @raw_data_cache[data_source] ||= DatabaseConnection.fetch(data_source)
  end

  def apply_filters(raw_data, filters)
    # Placeholder filtering step
    raw_data
  end

  def format_report(processed)
    # Placeholder formatting step
    processed.to_s
  end
end
Time-based cache expiration combines memoization with automatic invalidation. This pattern ensures cached values remain fresh while maintaining performance benefits.
class TimedMemoization
  def initialize(ttl_seconds = 300)
    @cache = {}
    @timestamps = {}
    @ttl = ttl_seconds
  end

  def cached_method(key)
    return @cache[key] if valid_cache?(key)

    @cache[key] = expensive_operation(key)
    @timestamps[key] = Time.now
    @cache[key]
  end

  def clear_expired
    current_time = Time.now
    # Collect expired keys first to avoid mutating the hash while iterating
    expired = @timestamps.select { |_, timestamp| current_time - timestamp > @ttl }.keys
    expired.each do |key|
      @cache.delete(key)
      @timestamps.delete(key)
    end
  end

  private

  def valid_cache?(key)
    @cache.key?(key) &&
      @timestamps.key?(key) &&
      (Time.now - @timestamps[key]) <= @ttl
  end

  def expensive_operation(key)
    # Simulated expensive computation
    sleep(0.1)
    "Result for #{key} at #{Time.now}"
  end
end
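A brief usage sketch, with the 0.1-second sleep standing in for real work:

cache = TimedMemoization.new(300)
cache.cached_method(:report) # computes (~0.1s)
cache.cached_method(:report) # cached until the 300-second TTL elapses
cache.clear_expired          # evicts any entries older than the TTL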
Performance & Memory
Memoization trades memory for CPU cycles, storing computed results to avoid repeated calculations. The performance benefit depends on computation cost, access frequency, and memory overhead of cached values.
Benchmarking demonstrates memoization's impact on recursive algorithms. The Fibonacci sequence without memoization exhibits exponential time complexity, while the memoized version achieves linear performance.
require 'benchmark'

class FibonacciComparison
  def initialize
    @memo_cache = {}
  end

  # Unmemoized version - exponential complexity
  def fib_recursive(n)
    return n if n <= 1
    fib_recursive(n - 1) + fib_recursive(n - 2)
  end

  # Memoized version - linear complexity
  def fib_memoized(n)
    return n if n <= 1
    @memo_cache[n] ||= fib_memoized(n - 1) + fib_memoized(n - 2)
  end

  def benchmark_comparison
    Benchmark.bm(15) do |x|
      x.report("Recursive (30):") { fib_recursive(30) }
      x.report("Memoized (30):") { fib_memoized(30) }
      x.report("Memoized (100):") { fib_memoized(100) }
    end
  end
end

# Results show dramatic performance improvement:
# Recursive (30): ~0.2 seconds
# Memoized (30): ~0.0001 seconds
# Memoized (100): ~0.0002 seconds
Memory usage scales with the number of unique inputs and size of cached results. Large cache sizes can cause memory pressure and garbage collection overhead that negates performance gains.
class MemoryAwareMemoization
  def initialize(max_cache_size = 1000)
    @cache = {}
    @access_order = []
    @max_size = max_cache_size
    @hits = 0
    @misses = 0
  end

  def compute_with_limit(input)
    if @cache.key?(input)
      @hits += 1
      # Move to end (most recently used)
      @access_order.delete(input)
      @access_order << input
      return @cache[input]
    end

    @misses += 1
    result = expensive_computation(input)

    # Evict least recently used if at capacity
    if @cache.size >= @max_size
      oldest = @access_order.shift
      @cache.delete(oldest)
    end

    @cache[input] = result
    @access_order << input
    result
  end

  def cache_statistics
    {
      size: @cache.size,
      memory_usage: calculate_memory_usage,
      hit_ratio: @hits.to_f / (@hits + @misses)
    }
  end

  private

  def calculate_memory_usage
    # Approximate memory usage calculation
    @cache.sum { |k, v| k.to_s.bytesize + v.to_s.bytesize }
  end

  def expensive_computation(input)
    # Placeholder for real work
    input.to_s * 10
  end
end
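The eviction policy can be observed with a small capacity (inputs are arbitrary):

cache = MemoryAwareMemoization.new(2)
cache.compute_with_limit("a")
cache.compute_with_limit("b")
cache.compute_with_limit("a") # cache hit refreshes "a" as most recently used
cache.compute_with_limit("c") # evicts "b", the least recently used entry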
Cache warming improves initial response times by precomputing commonly accessed values. This strategy works well for predictable access patterns and bounded input ranges.
class OptimizedDataProcessor
  def initialize
    @cache = {}
    warm_cache
  end

  def process_request(request_type, parameters)
    cache_key = "#{request_type}:#{parameters.hash}"
    @cache[cache_key] ||= execute_processing(request_type, parameters)
  end

  private

  def warm_cache
    # Precompute common request types
    common_requests.each do |request_type, params_list|
      params_list.each do |params|
        process_request(request_type, params)
      end
    end
  end

  def common_requests
    {
      'user_stats' => [
        { period: '7d' }, { period: '30d' }, { period: '90d' }
      ],
      'system_health' => [
        { component: 'database' }, { component: 'cache' }
      ]
    }
  end

  def execute_processing(request_type, parameters)
    # Placeholder for the real processing pipeline
    "#{request_type} processed with #{parameters}"
  end
end
Profiling reveals memoization effectiveness by measuring cache hit rates and response time improvements. Monitoring cache performance guides optimization decisions.
class ProfiledMemoization
  def initialize
    @cache = {}
    @computation_times = {}
    @stats = { hits: 0, misses: 0, total_time_saved: 0.0 }
  end

  def memoized_operation(input)
    if @cache.key?(input)
      @stats[:hits] += 1
      # Each hit saves roughly the time the original computation took
      @stats[:total_time_saved] += @computation_times[input]
      @cache[input]
    else
      @stats[:misses] += 1
      start_time = Time.now
      result = expensive_operation(input)
      @computation_times[input] = Time.now - start_time
      @cache[input] = result
    end
  end

  def performance_report
    total_requests = @stats[:hits] + @stats[:misses]
    hit_rate = total_requests.zero? ? 0.0 : (@stats[:hits].to_f / total_requests * 100).round(2)
    {
      total_requests: total_requests,
      cache_hit_rate: "#{hit_rate}%",
      time_saved: "#{@stats[:total_time_saved].round(3)}s",
      cache_size: @cache.size
    }
  end

  private

  def expensive_operation(input)
    # Placeholder for real work
    sleep(0.05)
    "Result for #{input}"
  end
end
Thread Safety & Concurrency
Memoization in multi-threaded environments creates race conditions when multiple threads simultaneously access uninitialized cached values. Thread-safe memoization requires synchronization mechanisms to prevent concurrent modification of cache data structures.
The standard `||=` operator is not atomic across threads. Two threads checking `@cache[key]` simultaneously may both find nil and execute the expensive computation twice, with one result overwriting the other.
class UnsafeMemoization
  def initialize
    @cache = {}
  end

  def dangerous_method(input)
    @cache[input] ||= expensive_computation(input)
  end

  private

  def expensive_computation(input)
    # Simulate work and track concurrent access
    puts "Computing #{input} in thread #{Thread.current.object_id}"
    sleep(0.1)
    "Result for #{input}"
  end
end

# Demonstrates race condition
unsafe = UnsafeMemoization.new
threads = 5.times.map do
  Thread.new { unsafe.dangerous_method("test") }
end
threads.each(&:join)
# Output shows multiple computations for same input
Mutex-based synchronization ensures thread-safe memoization by serializing access to cache operations. This approach prevents race conditions but creates potential bottlenecks under high concurrency.
class ThreadSafeMemoization
  def initialize
    @cache = {}
    @mutex = Mutex.new
  end

  def safe_method(input)
    # Double-checked locking pattern
    return @cache[input] if @cache.key?(input)

    @mutex.synchronize do
      # Check again inside mutex to avoid duplicate computation
      @cache[input] ||= expensive_computation(input)
    end
  end

  def bulk_operation(inputs)
    results = {}
    # Batch synchronization for multiple operations
    @mutex.synchronize do
      inputs.each do |input|
        @cache[input] ||= expensive_computation(input)
        results[input] = @cache[input]
      end
    end
    results
  end

  private

  def expensive_computation(input)
    puts "Computing #{input} in thread #{Thread.current.object_id}"
    sleep(0.01)
    "Result for #{input}"
  end
end
Read-write locks optimize concurrent access by allowing multiple readers while maintaining exclusive write access. This pattern improves performance when cache reads significantly outnumber writes.
class ReadWriteMemoization
  def initialize
    @cache = {}
    @lock = ReadWriteLock.new # Custom implementation below, or use a gem
  end

  def concurrent_method(input)
    # Try read lock first (non-blocking for other readers)
    @lock.read_lock do
      return @cache[input] if @cache.key?(input)
    end

    # Need write lock for computation
    @lock.write_lock do
      # Double-check inside write lock
      @cache[input] ||= expensive_computation(input)
    end
  end

  def cache_size
    @lock.read_lock { @cache.size }
  end

  def clear_cache
    @lock.write_lock { @cache.clear }
  end

  private

  def expensive_computation(input)
    # Placeholder for real work
    "Result for #{input}"
  end
end

class ReadWriteLock
  def initialize
    @read_ready = ConditionVariable.new
    @write_ready = ConditionVariable.new
    @mutex = Mutex.new
    @reading = 0
    @writing = false
    @pending_writers = 0
  end

  def read_lock
    @mutex.synchronize do
      @read_ready.wait(@mutex) while @writing || @pending_writers > 0
      @reading += 1
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @reading -= 1
        @write_ready.signal if @reading == 0
      end
    end
  end

  def write_lock
    @mutex.synchronize do
      @pending_writers += 1
      @write_ready.wait(@mutex) while @reading > 0 || @writing
      @writing = true
      @pending_writers -= 1
    end
    begin
      yield
    ensure
      @mutex.synchronize do
        @writing = false
        @write_ready.signal
        @read_ready.broadcast
      end
    end
  end
end
Thread-local storage provides isolation by maintaining separate caches per thread. This approach eliminates synchronization overhead but increases memory usage and prevents cache sharing across threads.
class ThreadLocalMemoization
  def initialize
    @thread_key = :"cache_#{object_id}"
  end

  def thread_safe_method(input)
    # NOTE: Thread#[] is fiber-local storage; use #thread_variable_get/set
    # if this method may run across multiple fibers within one thread
    cache = Thread.current[@thread_key] ||= {}
    cache[input] ||= expensive_computation(input)
  end

  def global_cache_stats
    stats = { threads: 0, total_entries: 0, memory_usage: 0 }
    Thread.list.each do |thread|
      if thread[@thread_key]
        stats[:threads] += 1
        stats[:total_entries] += thread[@thread_key].size
        stats[:memory_usage] += estimate_cache_size(thread[@thread_key])
      end
    end
    stats
  end

  # No manual cleanup is needed: a finished thread disappears from
  # Thread.list and its fiber-local cache is garbage collected with it.

  private

  def estimate_cache_size(cache)
    cache.sum { |k, v| k.to_s.bytesize + v.to_s.bytesize }
  end

  def expensive_computation(input)
    puts "Computing #{input} in thread #{Thread.current.object_id}"
    "Result for #{input}"
  end
end
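A short demonstration that the caches are isolated per thread:

memo = ThreadLocalMemoization.new
threads = 2.times.map do
  Thread.new { memo.thread_safe_method("answer") }
end
threads.each(&:join)
# Prints two "Computing ..." lines: each thread fills its own cache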
Common Pitfalls
Memoization of nil or false return values breaks the `||=` pattern because these values are falsy in Ruby. The operator treats false and nil as absent, so the computation reruns on every call even after the result has been cached.
class ProblematicMemoization
  SENTINEL = Object.new.freeze

  def initialize
    @cache = {}
  end

  # Broken: will recompute if method returns false
  def might_return_false(input)
    @cache[input] ||= check_condition(input)
  end

  # Fixed: explicit hash key check
  def correctly_handles_false(input)
    if @cache.key?(input)
      @cache[input]
    else
      @cache[input] = check_condition(input)
    end
  end

  # Alternative fix: sentinel value pattern
  def sentinel_pattern(input)
    result = @cache.fetch(input, SENTINEL)
    if result.equal?(SENTINEL)
      @cache[input] = check_condition(input)
    else
      result
    end
  end

  private

  def check_condition(input)
    # Sometimes returns false legitimately
    input.even?
  end
end

# Demonstrates the problem (3.even? is false, a legitimate falsy result)
broken = ProblematicMemoization.new
puts broken.might_return_false(3) # false - computed
puts broken.might_return_false(3) # false - computed again!

fixed = ProblematicMemoization.new
puts fixed.correctly_handles_false(3) # false - computed
puts fixed.correctly_handles_false(3) # false - cached
Memoization of mutable objects creates unexpected aliasing behavior when cached objects are modified after storage. Changes to returned objects affect the cached value, potentially corrupting future method calls.
class MutableObjectProblem
  def initialize
    @cache = {}
  end

  # Dangerous: returns the mutable cached object itself
  def get_user_data(user_id)
    @cache[user_id] ||= fetch_user_data(user_id)
  end

  # Safer: returns a copy of the cached object
  # (dup is shallow - nested structures are still shared)
  def get_user_data_copy(user_id)
    cached = @cache[user_id] ||= fetch_user_data(user_id)
    cached.dup
  end

  # Best: freeze cached objects so mutation raises FrozenError
  # (freeze is also shallow - deep-freeze nested data if needed)
  def get_user_data_frozen(user_id)
    @cache[user_id] ||= fetch_user_data(user_id).freeze
  end

  private

  def fetch_user_data(user_id)
    { name: "User #{user_id}", settings: { theme: 'dark' } }
  end
end

service = MutableObjectProblem.new
user_data = service.get_user_data(123)
user_data[:name] = "Modified Name" # Corrupts cached value!

# Next call returns modified data
corrupted_data = service.get_user_data(123)
puts corrupted_data[:name] # => "Modified Name"
Parameter-dependent memoization fails when distinct object instances with identical content produce different hash keys. Ruby's default hash implementation for custom classes uses object identity, not content, leading to cache misses for equivalent objects.
class HashKeyProblem
  def initialize
    @cache = {}
  end

  # Problem: different instances with the same content miss the cache
  def process_request(request_object)
    @cache[request_object] ||= expensive_processing(request_object)
  end

  # Better: use content-based key generation
  def process_request_fixed(request_object)
    cache_key = generate_content_key(request_object)
    @cache[cache_key] ||= expensive_processing(request_object)
  end

  private

  def generate_content_key(request)
    case request
    when Hash, Array, String
      # Core collection classes already hash by content
      request.hash
    else
      # For custom objects, derive a key from meaningful content
      request.respond_to?(:to_h) ? request.to_h.hash : request.to_s.hash
    end
  end

  def expensive_processing(request)
    "Processed: #{request}"
  end
end

class RequestData
  attr_reader :id, :params

  def initialize(id, params)
    @id = id
    @params = params
  end

  # Defining hash and eql? from content lets equivalent instances
  # share cache entries; without them, identical content misses the cache
  def hash
    [id, params].hash
  end

  def eql?(other)
    other.is_a?(RequestData) && id == other.id && params == other.params
  end
end
Memory leaks occur when memoization caches grow unbounded, especially with dynamic input values. Long-running processes can exhaust memory if cache invalidation strategies are not implemented.
class MemoryLeakExample
  def initialize
    @unbounded_cache = {}
    @time_based_cache = {}
    @cache_timestamps = {}
  end

  # Dangerous: cache grows forever
  def leaky_method(dynamic_input)
    @unbounded_cache[dynamic_input] ||= process_data(dynamic_input)
  end

  # Better: implement cache size limits
  def size_limited_method(dynamic_input, max_size = 1000)
    if @unbounded_cache.size >= max_size
      # Ruby hashes preserve insertion order, so the first keys are
      # the oldest - a simple FIFO eviction
      keys_to_remove = @unbounded_cache.keys.first(max_size / 4)
      keys_to_remove.each { |key| @unbounded_cache.delete(key) }
    end
    @unbounded_cache[dynamic_input] ||= process_data(dynamic_input)
  end

  # Best: time-based expiration
  def expiring_method(dynamic_input, ttl_seconds = 3600)
    cleanup_expired_entries(ttl_seconds)

    if @time_based_cache.key?(dynamic_input)
      return @time_based_cache[dynamic_input]
    end

    result = process_data(dynamic_input)
    @time_based_cache[dynamic_input] = result
    @cache_timestamps[dynamic_input] = Time.now
    result
  end

  private

  def cleanup_expired_entries(ttl_seconds)
    current_time = Time.now
    expired_keys = @cache_timestamps.select do |_, timestamp|
      current_time - timestamp > ttl_seconds
    end.keys

    expired_keys.each do |key|
      @time_based_cache.delete(key)
      @cache_timestamps.delete(key)
    end
  end

  def process_data(input)
    "Processed: #{input}"
  end
end
Exception handling in memoized methods can cache error states, preventing recovery from transient failures. Failed computations should not be memoized to allow retry opportunities.
class ExceptionHandlingMemoization
  def initialize
    @cache = {}
    @error_cache = {}
  end

  # Wrong: rescuing inside the memoized expression caches the failure,
  # so transient errors are never retried
  def bad_error_handling(input)
    @cache[input] ||= begin
      risky_operation(input)
    rescue StandardError => e
      "Error: #{e.message}" # the error state is cached like a real result
    end
  end

  # Better: don't cache failed operations
  def good_error_handling(input)
    return @cache[input] if @cache.key?(input)

    # If risky_operation raises, nothing is cached and the next call retries
    @cache[input] = risky_operation(input)
  end

  # Advanced: cache exceptions with expiration
  def sophisticated_error_handling(input, error_ttl = 60)
    return @cache[input] if @cache.key?(input)

    # Check if we have a recent error for this input
    if @error_cache.key?(input)
      error_time, cached_error = @error_cache[input]
      if Time.now - error_time < error_ttl
        raise cached_error
      else
        @error_cache.delete(input)
      end
    end

    begin
      result = risky_operation(input)
      @cache[input] = result
      result
    rescue StandardError => e
      @error_cache[input] = [Time.now, e]
      raise
    end
  end

  private

  def risky_operation(input)
    # Simulates operation that might fail
    raise "Network error" if input.to_s.include?('fail')
    "Success: #{input}"
  end
end
Reference
Core Memoization Patterns
| Pattern | Syntax | Use Case | Considerations |
| --- | --- | --- | --- |
| `\|\|=` operator | `@var \|\|= computation` | Simple value caching | Fails with `nil`/`false` results |
| Hash check | `@cache.key?(key) ? @cache[key] : (@cache[key] = computation)` | Handles falsy values | More verbose syntax |
| `Hash#fetch` | `@cache.fetch(key) { @cache[key] = computation }` | Clean falsy handling | Block executed on miss |
| Conditional assignment | `@cache[key] = computation unless @cache.key?(key)` | Explicit cache check | Returns `nil` on hit |
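The Hash#fetch row warrants a runnable sketch, since it is the tersest pattern that still caches falsy results (the FeatureFlags class and lookup method are illustrative assumptions):

class FeatureFlags
  def initialize
    @cache = {}
  end

  def enabled?(flag)
    # The block runs only when the key is absent, so false is cached correctly
    @cache.fetch(flag) { @cache[flag] = lookup(flag) }
  end

  private

  def lookup(flag)
    flag == :beta # legitimately returns false for most flags
  end
end

flags = FeatureFlags.new
flags.enabled?(:dark_mode) # => false, computed once
flags.enabled?(:dark_mode) # => false, served from the cache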
Thread Safety Approaches
| Approach | Implementation | Performance | Memory Usage |
| --- | --- | --- | --- |
| Mutex synchronization | `@mutex.synchronize { @cache[key] \|\|= computation }` | Serialized access | Shared cache |
| Read-write locks | `@lock.read_lock { }` and `@lock.write_lock { }` | Concurrent reads | Shared cache |
| Thread-local storage | `Thread.current[:cache] \|\|= {}` | No contention | Per-thread caches |
| Atomic operations | `@cache.compute_if_absent(key) { computation }` (`Concurrent::Map`) | Lock-free (JRuby) | Shared cache |
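The atomic-operations row refers to Concurrent::Map from the concurrent-ruby gem (an external dependency, assumed to be installed); a minimal sketch:

require 'concurrent'

class SharedCache
  def initialize
    @cache = Concurrent::Map.new
  end

  def fetch(key)
    # compute_if_absent stores the block's value atomically when the key is absent
    @cache.compute_if_absent(key) { expensive_computation(key) }
  end

  private

  def expensive_computation(key)
    "Result for #{key}"
  end
end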
Cache Management Strategies
| Strategy | Implementation Pattern | Memory Impact | Performance Impact |
| --- | --- | --- | --- |
| Unbounded cache | `@cache[key] \|\|= computation` | Grows indefinitely | Fast access |
| Size-limited (LRU) | Track access order, evict oldest | Bounded growth | Moderate overhead |
| Time-based expiration | Store timestamps, clean up expired entries | Bounded with cleanup | Cleanup overhead |
| Reference counting | Track usage, evict unused | Dynamic sizing | Tracking overhead |
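The reference-counting strategy is the only one not demonstrated elsewhere in this section; a minimal sketch, with the class name and eviction threshold as assumptions:

class CountingCache
  def initialize(max_size = 100)
    @cache = {}
    @counts = Hash.new(0)
    @max_size = max_size
  end

  def fetch(key)
    @counts[key] += 1
    return @cache[key] if @cache.key?(key)

    evict_least_used if @cache.size >= @max_size
    @cache[key] = yield
  end

  private

  def evict_least_used
    # Drop the entry with the fewest recorded accesses
    victim = @cache.keys.min_by { |k| @counts[k] }
    @cache.delete(victim)
    @counts.delete(victim)
  end
end

cache = CountingCache.new(2)
cache.fetch(:a) { "A" }
cache.fetch(:a) { "A" } # second access raises the usage count of :a
cache.fetch(:b) { "B" }
cache.fetch(:c) { "C" } # evicts :b, the least-used entry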
Common Cache Key Strategies
# Simple keys
cache_key = method_parameter
# Multiple parameters
cache_key = [param1, param2, param3]
# Hash-based keys
cache_key = { param1: value1, param2: value2 }.hash
# String interpolation
cache_key = "#{prefix}:#{param1}:#{param2}"
# Content-based keys for objects
cache_key = object.respond_to?(:cache_key) ? object.cache_key : object.hash
Error Handling Patterns
| Pattern | Code Example | Behavior | Use Case |
| --- | --- | --- | --- |
| No error caching | `@cache[key] \|\|= (risky_op rescue nil)` | Always retry failures | Transient errors |
| Exception bypass | `return @cache[key] if cached; @cache[key] = risky_op` | Cache success only | Network operations |
| Error expiration | `check_error_age; cache_or_raise` | Temporary error cache | Rate limiting |
| Fallback values | `@cache[key] \|\|= (risky_op rescue default_value)` | Graceful degradation | Non-critical data |
Performance Optimization Guidelines
# Measure cache effectiveness
def cache_statistics
  {
    hit_rate: hits.to_f / (hits + misses),
    memory_usage: cache.sum { |k, v| k.to_s.bytesize + v.to_s.bytesize },
    size: cache.size,
    average_computation_time: total_computation_time / misses
  }
end

# Cache warming for predictable access patterns
def warm_cache(expected_inputs)
  expected_inputs.each { |input| memoized_method(input) }
end

# Batch operations for multiple cache operations
def batch_compute(inputs)
  uncached = inputs.reject { |input| @cache.key?(input) }
  computed = expensive_batch_operation(uncached)
  uncached.zip(computed).each { |input, result| @cache[input] = result }
  inputs.map { |input| @cache[input] }
end
Integration with Ruby Frameworks
| Framework | Integration Pattern | Configuration | Considerations |
| --- | --- | --- | --- |
| Rails | `ActiveSupport::Memoizable` | `memoize :method_name` | Deprecated in Rails 3+ |
| Rails (modern) | Custom concern | `extend Memoizable` | Manual implementation |
| Sinatra | Instance variables | `@cache \|\|= {}` | Per-request lifecycle |
| Sidekiq | Class-level cache | `@@cache \|\|= {}` | Shared within the worker process |
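For the modern Rails row, one hedged approach is to reuse the Memoizable module from the Advanced Usage section in any plain Ruby or Rails class (ReportPresenter is an illustrative assumption):

class ReportPresenter
  extend Memoizable # module defined in the Advanced Usage section

  def initialize(records)
    @records = records
  end

  def summary
    @records.map(&:to_s).join(", ")
  end
  memoize :summary
end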