Overview
Lazy evaluation defers computation until results are actually needed, contrasting with Ruby's default eager evaluation where operations execute immediately. Ruby implements lazy evaluation through the Enumerator::Lazy class, which wraps enumerable objects and chains operations without processing data until a terminal operation forces evaluation.
The primary mechanism involves converting any enumerable to a lazy enumerator using the lazy method, then chaining operations like map, select, and reject. These operations return new lazy enumerators rather than processing data immediately. Only when a terminal method like to_a, first, or each executes does Ruby perform the accumulated transformations.
# Eager evaluation - processes immediately
[1, 2, 3, 4, 5].map { |x| x * 2 }.select { |x| x > 4 }
# => [6, 8, 10]
# Lazy evaluation - defers processing
[1, 2, 3, 4, 5].lazy.map { |x| x * 2 }.select { |x| x > 4 }
# => #<Enumerator::Lazy: ...>
Lazy evaluation excels with large datasets, infinite sequences, and scenarios where early termination saves processing time. Ruby creates a pipeline of transformations that processes elements one at a time through the entire chain, rather than creating intermediate arrays between each operation.
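To make the one-element-at-a-time behavior concrete, the following sketch records the order in which each block runs. The trace_pipeline helper and its log format are illustrative names, not part of Ruby's API.

```ruby
# trace_pipeline is a hypothetical helper: it runs the same map/select chain
# on whatever source it is given and logs every block invocation in order.
def trace_pipeline(source)
  log = []
  result = source
    .map    { |x| log << "map #{x}"; x * 2 }
    .select { |x| log << "select #{x}"; x > 2 }
    .to_a
  [result, log]
end

eager_result, eager_log = trace_pipeline([1, 2, 3])
lazy_result,  lazy_log  = trace_pipeline([1, 2, 3].lazy)

p eager_log
# => ["map 1", "map 2", "map 3", "select 2", "select 4", "select 6"]
p lazy_log
# => ["map 1", "select 2", "map 2", "select 4", "map 3", "select 6"]
p eager_result == lazy_result
# => true
```

The eager chain finishes the whole map pass before select starts, while the lazy chain pushes each element through both stages before touching the next one.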
The Enumerator::Lazy class supports most enumerable methods including map, select, reject, take, drop, flat_map, and zip. Each method returns another lazy enumerator, maintaining the deferred execution pattern until a terminal operation triggers evaluation.
Basic Usage
Converting any enumerable to lazy evaluation requires calling the lazy method, which returns an Enumerator::Lazy object. This lazy enumerator supports method chaining while deferring actual computation.
# Creating a lazy enumerator
numbers = (1..1000).lazy
# => #<Enumerator::Lazy: 1..1000>
# Chaining transformations
result = numbers.map { |x| x * 2 }
                .select { |x| x.even? }
                .take(5)
# => #<Enumerator::Lazy: ...>
# Force evaluation with terminal operation
result.to_a
# => [2, 4, 6, 8, 10]
The take method proves particularly valuable with lazy evaluation, allowing extraction of specific numbers of results without processing the entire collection. This pattern works effectively with infinite ranges or large datasets.
# Processing only needed elements
fibonacci = Enumerator.new do |yielder|
  a, b = 1, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end
fibonacci.lazy
         .select { |n| n.even? }
         .take(5)
         .to_a
# => [2, 8, 34, 144, 610]
Multiple transformation methods chain together seamlessly, with each operation adding to the pipeline without executing immediately. The lazy enumerator maintains the sequence of operations internally and applies them during evaluation.
# Complex transformation chain
data = %w[apple banana cherry date elderberry]
result = data.lazy
             .map(&:upcase)
             .select { |word| word.length > 4 }
             .map { |word| word.reverse }
             .take(3)
             .to_a
# => ["ELPPA", "ANANAB", "YRREHC"]
Lazy enumerators work with any enumerable object, including arrays, hashes, ranges, and custom enumerators. The conversion preserves the original enumerable's characteristics while adding deferred execution capabilities.
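As a small illustration of the hash case, the sketch below filters and formats key/value pairs lazily. The expensive_items helper and the sample prices are made up for this example; the only assumption about Ruby is that a hash yields [key, value] pairs, which the blocks destructure.

```ruby
# expensive_items is a hypothetical helper, not a built-in method.
# It returns up to `limit` formatted entries whose price exceeds 2.0.
def expensive_items(prices, limit)
  prices.lazy
        .select { |_name, price| price > 2.0 }
        .map    { |name, price| "#{name}: #{price}" }
        .first(limit)
end

prices = { "apple" => 1.5, "cherry" => 4.0, "date" => 6.25, "fig" => 3.0 }
p expensive_items(prices, 2)
# => ["cherry: 4.0", "date: 6.25"]
```

Because first(2) is satisfied after the third pair, the "fig" entry is never examined.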
Advanced Usage
Complex lazy evaluation patterns emerge when combining multiple enumerators, creating custom lazy operations, and integrating with Ruby's broader enumerable ecosystem. The flat_map method flattens nested structures lazily, proving essential for processing hierarchical data without memory overhead.
# Lazy processing of nested data
nested_data = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]
result = nested_data.lazy
                    .flat_map(&:itself)
                    .map { |x| x ** 2 }
                    .select { |x| x > 10 }
                    .take(4)
                    .to_a
# => [16, 25, 36, 49]
The zip method combines multiple lazy enumerators, creating tuples from corresponding elements. This pattern works well for stepping through several data streams in lockstep without loading everything into memory.
# Combining multiple lazy enumerators
names = %w[alice bob charlie dave eve].lazy
ages = (20..30).lazy
scores = [85, 92, 78, 96, 88].lazy
combined = names.zip(ages, scores)
                .map { |name, age, score| { name: name, age: age, score: score } }
                .select { |person| person[:score] > 80 }
                .take(3)
                .to_a
# => [{:name=>"alice", :age=>20, :score=>85},
# {:name=>"bob", :age=>21, :score=>92},
# {:name=>"dave", :age=>23, :score=>96}]
Custom lazy operations are built with the Enumerator::Lazy.new constructor, which accepts a source enumerable and a block that defines the transformation logic. The block receives a yielder and each source value in turn; whatever it pushes onto the yielder becomes input for the next stage in the pipeline.
# Custom lazy operation
module LazyExtensions
  # Passes each element downstream with the given probability (0.0 to 1.0)
  def lazy_sample(probability)
    Enumerator::Lazy.new(self) do |yielder, value|
      yielder << value if rand < probability
    end
  end
end
# Mix into Enumerator::Lazy so the method is available in the middle of a chain
class Enumerator::Lazy
  include LazyExtensions
end
# Using custom lazy operation
large_dataset = (1..10000).to_a
sample = large_dataset.lazy
                      .lazy_sample(0.1)
                      .map { |x| x * 2 }
                      .select { |x| x > 1000 }
                      .take(10)
                      .to_a
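Because the sampling example above is random, its output cannot be shown. A deterministic variant may make the yielder mechanics easier to follow: the hypothetical lazy_integers helper below passes along only the values that parse as integers and silently skips the rest.

```ruby
# lazy_integers is an illustrative helper, not a built-in method.
def lazy_integers(source)
  Enumerator::Lazy.new(source) do |yielder, value|
    # Integer(..., exception: false) returns nil instead of raising (Ruby 2.6+)
    number = Integer(value, exception: false)
    yielder << number unless number.nil?
  end
end

tokens = %w[10 x 20 3.5 30 40]
p lazy_integers(tokens).map { |n| n + 1 }.first(3)
# => [11, 21, 31]
```

The block may push zero, one, or several values per input, which is what lets a single custom stage act as a filter, a transformation, or both.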
Lazy enumerators compose with regular enumerators and other Ruby objects implementing enumerable protocols. The cycle method creates infinite repeating sequences that work naturally with lazy evaluation patterns.
# Infinite cycling with lazy evaluation
colors = %w[red green blue].cycle.lazy
numbers = (1..Float::INFINITY).lazy
colored_numbers = numbers.zip(colors)
                         .map { |num, color| "#{color}_#{num}" }
                         .select { |item| item.include?('red') }
                         .take(5)
                         .to_a
# => ["red_1", "red_4", "red_7", "red_10", "red_13"]
Performance & Memory
Lazy evaluation provides significant memory advantages when processing large datasets by avoiding intermediate array creation. Each operation in a lazy chain processes elements individually rather than creating full intermediate collections, reducing memory footprint substantially.
# Timing comparison demonstration
require 'benchmark'
large_range = (1..1_000_000)
# Eager evaluation creates intermediate arrays
eager_time = Benchmark.realtime do
  large_range.map { |x| x * 2 }
             .select { |x| x > 1_000_000 }
             .first(10)
end
# Lazy evaluation processes elements individually
lazy_time = Benchmark.realtime do
  large_range.lazy
             .map { |x| x * 2 }
             .select { |x| x > 1_000_000 }
             .first(10)
end
puts "Eager: #{eager_time}s, Lazy: #{lazy_time}s"
# Results vary with Ruby version and workload. Here the first match is element 500,001,
# so the lazy chain still walks about half the range before it can stop.
The memory efficiency becomes particularly apparent with file processing, where lazy evaluation enables handling files larger than available RAM. Processing occurs line-by-line without loading entire files into memory.
# Memory-efficient file processing
# parse_log_entry is assumed to be defined elsewhere and to return a hash with a :timestamp key
def process_large_log(filename)
  File.foreach(filename).lazy
      .map(&:strip)
      .reject(&:empty?)
      .select { |line| line.include?('ERROR') }
      .map { |line| parse_log_entry(line) }
      .select { |entry| entry[:timestamp] > Time.now - 86400 }
      .take(100)
      .to_a
end
# Reads line by line, so memory use stays roughly constant even for gigabyte files
error_entries = process_large_log('/var/log/application.log')
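The same pattern can be tried without a real log file. The sketch below substitutes an in-memory StringIO for the file and drops the parsing step; first_errors and the sample log lines are invented for illustration.

```ruby
require 'stringio'

# first_errors is a hypothetical helper: it returns the first `limit`
# lines containing ERROR, reading no further than necessary.
def first_errors(io, limit)
  io.each_line.lazy
    .map(&:strip)
    .reject(&:empty?)
    .select { |line| line.include?('ERROR') }
    .first(limit)
end

log = StringIO.new(<<~LOG)
  INFO boot

  ERROR disk full
  INFO retry
  ERROR timeout
  INFO never read
  ERROR never read
LOG

p first_errors(log, 2)
# => ["ERROR disk full", "ERROR timeout"]
p log.eof?
# => false (the last two lines were never consumed)
```

Any object whose each_line returns an enumerator, such as an open File, can be passed in place of the StringIO.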
Early termination scenarios demonstrate lazy evaluation's computational efficiency. When operations like first, take, or detect can satisfy requirements without processing entire collections, lazy evaluation stops computation immediately.
# Early termination efficiency
# Simple trial-division primality test used by both versions
def prime?(n)
  n > 1 && (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
end
def find_prime_lazy(limit)
  (2..Float::INFINITY).lazy
                      .select { |n| prime?(n) }
                      .take(limit)
                      .to_a
end
def find_prime_eager(limit)
  (2..1_000_000).select { |n| prime?(n) }
                .first(limit)
end
# Lazy version stops at the 100th prime
# Eager version checks all numbers up to 1,000,000
primes = find_prime_lazy(100)
However, lazy evaluation introduces computational overhead for each element processed through the pipeline. When processing small collections entirely, eager evaluation often performs better due to reduced method call overhead and simpler execution paths.
# Performance crossover point
small_data = (1..100).to_a
large_data = (1..100_000).to_a
# Small data - eager is often faster because the lazy pipeline's overhead dominates
small_eager = small_data.select(&:even?).map(&:to_s).first(10)
small_lazy = small_data.lazy.select(&:even?).map(&:to_s).first(10)
# Large data - lazy is faster because it stops after the first 10 matches
large_eager = large_data.select(&:even?).map(&:to_s).first(10)
large_lazy = large_data.lazy.select(&:even?).map(&:to_s).first(10)
Lazy evaluation performs optimally with expensive operations, sparse filtering conditions, or scenarios requiring only partial results from large datasets. The performance benefits compound with pipeline depth and data size.
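One way to see the effect of partial results without relying on timings is to count block invocations. The count_map_calls helper below is a hypothetical measuring aid, not a benchmark.

```ruby
# count_map_calls is an illustrative helper: it reports how many times
# the map block ran while fetching the first three even squares.
def count_map_calls(source)
  calls = 0
  source.map { |x| calls += 1; x * x }
        .select(&:even?)
        .first(3)
  calls
end

p count_map_calls(1..10_000)        # eager: maps the whole range
# => 10000
p count_map_calls((1..10_000).lazy) # lazy: stops after 36, the third even square
# => 6
```

The gap between the two counts grows with the size of the source and with the cost of each block call, which is where lazy evaluation pays off most.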
Common Pitfalls
Lazy evaluation defers execution until terminal operations force evaluation, creating debugging challenges when operations don't behave as expected. The lazy enumerator itself doesn't execute transformations, making intermediate inspection difficult without triggering evaluation.
# Debugging lazy operations
data = [1, 2, 3, 4, 5]
lazy_result = data.lazy.map { |x| puts "Processing #{x}"; x * 2 }
# No output yet - transformations not executed
puts lazy_result.class
# => Enumerator::Lazy
# Output appears only during evaluation
lazy_result.to_a
# Processing 1
# Processing 2
# Processing 3
# Processing 4
# Processing 5
# => [2, 4, 6, 8, 10]
Side effects within lazy operations execute during evaluation rather than chain construction, potentially causing confusion about when effects occur. This behavior differs significantly from eager evaluation where side effects happen immediately.
# Side effect timing confusion
counter = 0
lazy_chain = (1..5).lazy.map do |x|
  counter += 1
  puts "Counter: #{counter}"
  x * 2
end
puts "Chain created, counter: #{counter}"
# => Chain created, counter: 0
# Side effects execute during evaluation
first_three = lazy_chain.take(3).to_a
# Counter: 1
# Counter: 2
# Counter: 3
puts "After taking 3, counter: #{counter}"
# => After taking 3, counter: 3
Multiple enumerations of the same lazy enumerator re-execute all transformations. Unlike an array, which already holds its computed elements, a lazy enumerator stores only the steps needed to produce them. This behavior can cause unexpected performance costs and repeated side effects.
# Multiple enumeration re-execution
expensive_operation = proc { |x| sleep(0.1); x ** 2 }
lazy_squares = (1..5).lazy.map(&expensive_operation)
# First enumeration takes ~0.5 seconds
first_pass = lazy_squares.to_a
# => [1, 4, 9, 16, 25]
# Second enumeration re-executes, another ~0.5 seconds
second_pass = lazy_squares.to_a
# => [1, 4, 9, 16, 25]
# Convert to array for caching if multiple enumerations needed
cached_squares = lazy_squares.to_a
third_pass = cached_squares # Instant
Infinite sequences combined with certain operations can create infinite loops if not properly terminated. Methods like to_a or each without limiting operations will attempt to process infinite sequences completely.
# Infinite sequence pitfall
infinite_numbers = (1..Float::INFINITY).lazy
# This will run forever
# infinite_numbers.to_a # DON'T DO THIS
# Always limit infinite sequences
safe_result = infinite_numbers.select(&:even?)
                              .take(10)
                              .to_a
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
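A subtler variant of this pitfall is a select whose condition eventually stops matching: on an infinite source, select keeps searching forever for the next match, while take_while ends the sequence at the first failure. A minimal sketch, with squares_below as an illustrative name:

```ruby
# squares_below is a hypothetical helper returning all perfect squares
# smaller than `limit`, drawn from an infinite lazy range.
def squares_below(limit)
  (1..Float::INFINITY).lazy
                      .map { |n| n * n }
                      .take_while { |square| square < limit }
                      .to_a
end

p squares_below(50)
# => [1, 4, 9, 16, 25, 36, 49]

# By contrast, replacing take_while with select would never return:
# (1..Float::INFINITY).lazy.map { |n| n * n }.select { |s| s < 50 }.to_a
```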
Lazy evaluation doesn't automatically parallelize operations despite the deferred execution model. Each element still processes sequentially through the transformation pipeline, and thread safety concerns apply when using shared state within transformation blocks.
# Sequential processing misconception
# heavy_computation and threshold are placeholders for application code
large_data = (1..1_000_000).lazy
# Still processes sequentially, not parallel
result = large_data.map { |x| heavy_computation(x) }
                   .select { |x| x > threshold }
                   .take(100)
                   .to_a
# For parallel processing, use a gem such as parallel
# require 'parallel'
# result = Parallel.map(large_data.first(10000)) { |x| heavy_computation(x) }
Reference
Core Methods
Method | Parameters | Returns | Description |
---|---|---|---|
lazy | None | Enumerator::Lazy | Converts enumerable to lazy enumerator |
map | &block | Enumerator::Lazy | Transforms each element lazily |
select | &block | Enumerator::Lazy | Filters elements matching condition |
reject | &block | Enumerator::Lazy | Filters elements not matching condition |
take | n (Integer) | Enumerator::Lazy | Takes first n elements |
drop | n (Integer) | Enumerator::Lazy | Skips first n elements |
flat_map | &block | Enumerator::Lazy | Maps and flattens results |
zip | *others | Enumerator::Lazy | Combines with other enumerators |
Transformation Methods
Method | Parameters | Returns | Description |
---|---|---|---|
collect | &block | Enumerator::Lazy | Alias for map |
find_all | &block | Enumerator::Lazy | Alias for select |
grep | pattern | Enumerator::Lazy | Selects elements matching pattern |
grep_v | pattern | Enumerator::Lazy | Selects elements not matching pattern |
uniq | &block (optional) | Enumerator::Lazy | Removes duplicate elements |
slice_before | pattern or &block | Enumerator::Lazy | Starts a new group before each matching element |
slice_after | pattern or &block | Enumerator::Lazy | Ends the current group after each matching element |
slice_when | &block | Enumerator::Lazy | Splits between adjacent elements when the block returns true |
Terminal Operations
Method | Parameters | Returns | Description |
---|---|---|---|
to_a | None | Array | Forces evaluation, returns array |
force | None | Array | Alias for to_a |
first | n (Integer, optional) | Object or Array | Returns first element(s) |
take | n (Integer) | Enumerator::Lazy | Not terminal on a lazy enumerator; follow with to_a or use first(n) to get an array |
each | &block | Object | Iterates through elements |
reduce | initial, &block | Object | Reduces elements to single value |
inject | initial, &block | Object | Alias for reduce |
Utility Methods
Method | Parameters | Returns | Description |
---|---|---|---|
is_a?(Enumerator::Lazy) | None | Boolean | Checks whether an object is a lazy enumerator (Ruby defines no lazy? method) |
size | None | Integer, Float::INFINITY, or nil | Returns size if it can be determined without iterating |
count | &block (optional) | Integer | Counts elements matching condition (forces evaluation) |
include? | object | Boolean | Checks if object is present (forces evaluation until found) |
member? | object | Boolean | Alias for include? |
Construction Patterns
# From enumerable
array.lazy # Convert array to lazy
range.lazy # Convert range to lazy
hash.lazy # Convert hash to lazy
# From enumerator
enum.lazy # Convert enumerator to lazy
Enumerator.new { }.lazy # Custom enumerator to lazy
# Custom lazy enumerator
Enumerator::Lazy.new(source) do |yielder, *values|
  # Custom transformation logic
  yielder << transformed_value
end
Performance Characteristics
Scenario | Memory Usage | CPU Usage | Recommendation |
---|---|---|---|
Small collections (< 1000) | Higher overhead | Higher overhead | Use eager evaluation |
Large collections (> 10000) | Constant | Optimized | Use lazy evaluation |
Infinite sequences | Constant | On-demand | Always use lazy |
Early termination | Minimal | Minimal | Use lazy with take/first |
Multiple iterations | Per iteration | Per iteration | Cache with to_a if needed |
File processing | Constant | Streaming | Use lazy evaluation |