Lazy Loading

A comprehensive guide to lazy evaluation patterns in Ruby using Enumerator::Lazy for memory-efficient data processing and deferred computation.

Overview

Lazy loading in Ruby refers to the deferred evaluation of enumerable operations until results are actually needed. Ruby implements lazy evaluation through the Enumerator::Lazy class, which wraps enumerable objects and postpones the execution of transformation methods like map, select, and reject until a terminal operation forces evaluation.

Ruby's lazy evaluation creates a processing pipeline where each transformation method returns another lazy enumerator rather than immediately processing all elements. This approach prevents intermediate arrays from being created and allows working with potentially infinite sequences or large datasets without consuming excessive memory.

The lazy evaluation mechanism works by building a chain of operations that execute only when values are requested. When a terminal method like to_a, first, or each is called, Ruby processes elements through the entire pipeline one at a time, applying all transformations to each element before moving to the next.

# Standard eager evaluation - creates intermediate arrays
numbers = (1..1_000_000).to_a
squares = numbers.map { |n| n * n }
evens = squares.select { |n| n.even? }
first_five = evens.first(5)

# Lazy evaluation - no intermediate arrays created
lazy_result = (1..1_000_000).lazy
  .map { |n| n * n }
  .select { |n| n.even? }
  .first(5)
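To make the one-element-at-a-time ordering concrete, the following minimal sketch logs when each stage sees each element. The variable names `eager_log` and `lazy_log` are illustrative only; the point is the interleaving, not the values.

```ruby
# Record the order in which each stage handles each element.
eager_log = []
(1..3).map    { |n| eager_log << "map #{n}"; n * n }
      .select { |n| eager_log << "select #{n}"; n.even? }

lazy_log = []
(1..3).lazy
      .map    { |n| lazy_log << "map #{n}"; n * n }
      .select { |n| lazy_log << "select #{n}"; n.even? }
      .to_a

eager_log # => ["map 1", "map 2", "map 3", "select 1", "select 4", "select 9"]
lazy_log  # => ["map 1", "select 1", "map 2", "select 4", "map 3", "select 9"]
```

The eager version finishes the whole map pass before select starts, while the lazy version pushes each element through both stages before touching the next one.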

Ruby provides lazy evaluation for most enumerable methods including map, select, reject, take, drop, and zip. The lazy enumerator maintains the original enumerable's interface while deferring computation until absolutely necessary.

Basic Usage

Creating a lazy enumerator requires calling the lazy method on any enumerable object. This returns an Enumerator::Lazy instance that responds to the same methods as the original enumerable but executes them lazily.

# Convert any enumerable to lazy
array = [1, 2, 3, 4, 5]
lazy_enum = array.lazy

range = (1..Float::INFINITY)
lazy_range = range.lazy

# Method chaining with lazy evaluation
result = (1..100).lazy
  .map { |n| n * 2 }
  .select { |n| n > 50 }
  .take(10)
  .to_a
# => [52, 54, 56, 58, 60, 62, 64, 66, 68, 70]

Transformation methods called on a lazy enumerator return another lazy enumerator instead of a concrete collection. The computation chain builds up until a terminal operation forces evaluation.

# Building a processing pipeline
pipeline = (1..1000).lazy
  .map { |n| puts "Processing #{n}"; n * n }
  .select { |n| n.even? }
  .reject { |n| n < 100 }

# No output yet - computation is deferred
puts "Pipeline created"

# Forces evaluation - now you'll see "Processing" output
first_three = pipeline.first(3)
# Processing 1, Processing 2, ... Processing 14 (stops once three matches are found)
# => [100, 144, 196]

Terminal operations are methods that produce concrete results, such as to_a, first, each, reduce, count, and find. These methods force the lazy enumerator to begin processing elements through the pipeline. Note that take and drop stay lazy on an Enumerator::Lazy, so they still need a terminal call such as to_a.

# Different terminal operations
lazy_nums = (1..20).lazy.map { |n| n * n }.select { |n| n.odd? }

lazy_nums.first(3)           # => [1, 9, 25]
lazy_nums.take(5).to_a       # => [1, 9, 25, 49, 81]
lazy_nums.find { |n| n > 50 } # => 81
lazy_nums.count              # => 10
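The difference between `take` and `first` is easy to miss, so here is a short sketch on an infinite source. It assumes nothing beyond core Ruby; the variable names are illustrative.

```ruby
squares = (1..Float::INFINITY).lazy.map { |n| n * n }

taken = squares.take(3)          # still lazy: nothing has been computed yet
taken_class  = taken.class       # => Enumerator::Lazy
taken_array  = taken.to_a        # => [1, 4, 9]
taken_forced = taken.force       # => [1, 4, 9] (force does the same job as to_a here)

first_three = squares.first(3)   # => [1, 4, 9] (first evaluates right away and returns an Array)
first_one   = squares.first      # => 1
```

Use `take(n)` when more lazy steps should follow, and `first(n)` when the Array is wanted immediately.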

Lazy evaluation works particularly well with infinite sequences or very large datasets where you only need a subset of results. Ruby processes elements one at a time through the entire pipeline rather than creating intermediate collections.

Advanced Usage

Lazy enumerators support complex method chaining and composition patterns. Multiple lazy enumerators can be combined, and custom lazy operations can be defined using Enumerator::Lazy.new with a block that yields processed values.

# Custom lazy operation
def lazy_debug(enum, prefix)
  Enumerator::Lazy.new(enum) do |yielder, value|
    puts "#{prefix}: #{value}"
    yielder << value
  end
end

# Compose custom operations with built-ins
result = (1..10).lazy
  .then { |enum| lazy_debug(enum, "Input") }
  .map { |n| n * n }
  .then { |enum| lazy_debug(enum, "Squared") }
  .select { |n| n.even? }
  .first(3)
# Input: 1, Squared: 1, Input: 2, Squared: 4, ... Input: 6, Squared: 36
# => [4, 16, 36]

Lazy enumerators can be nested and combined with regular enumerables. With flat_map, any lazy enumerator or array returned by the block is flattened into the outer pipeline, so laziness carries through the nested structure.

# Lazy processing of nested structures
nested_data = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]

flattened_lazy = nested_data.lazy
  .flat_map { |sub_array| sub_array.lazy }
  .map { |n| n * n }
  .select { |n| n.odd? }
  .take(4)
  .to_a
# => [1, 9, 25, 49]

Calling zip on a lazy enumerator pairs up elements from several sequences on demand. This allows processing multiple infinite or large sequences in step without building intermediate arrays.

# Lazy zipping of sequences
fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

squares = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n * n
    n += 1
  end
end

paired_lazy = fibonacci.lazy.zip(squares.lazy)
  .map { |fib, square| fib + square }
  .select { |sum| sum.even? }
  .first(5)
# => [10, 18, 28, 102, 134]

Lazy enumerators maintain their lazy properties when passed to methods that accept enumerables. This allows building reusable processing components that work efficiently with both finite and infinite sequences.

# Reusable lazy processing functions
def process_numbers(enum)
  enum.lazy
    .map { |n| n * n }
    .select { |n| n > 10 }
    .map { |n| Math.sqrt(n) }
end

def filter_primes(enum)
  enum.lazy.select do |n|
    next false if n < 2
    (2..Math.sqrt(n)).none? { |i| n % i == 0 }
  end
end

# Compose processing functions
infinite_numbers = (1..Float::INFINITY)
result = filter_primes(process_numbers(infinite_numbers)).first(5)
# => [5.0, 7.0, 11.0, 13.0, 17.0]
# (2 and 3 never reach filter_primes because process_numbers keeps only n * n > 10)

Performance & Memory

Lazy evaluation provides significant memory benefits when processing large datasets or infinite sequences by eliminating intermediate array creation. Instead of storing entire transformed collections in memory, lazy enumerators process elements one at a time through the pipeline.

# Memory comparison: eager vs lazy (requires the benchmark-memory gem)
require 'benchmark/memory'

# Eager evaluation - creates multiple intermediate arrays
Benchmark.memory do |x|
  x.report("eager") do
    (1..1_000_000).to_a
      .map { |n| n * n }
      .select { |n| n.even? }
      .first(100)
  end
  
  x.report("lazy") do
    (1..1_000_000).lazy
      .map { |n| n * n }
      .select { |n| n.even? }
      .first(100)
  end
end
# Eager creates multiple 1M element arrays; lazy processes 200 elements total

The performance characteristics of lazy evaluation depend on the ratio of elements processed to elements in the source collection. Lazy evaluation excels when you need only a small fraction of the final results, but can be slower when processing entire collections due to method call overhead.

# Benchmark: lazy vs eager for different scenarios
require 'benchmark'

data = (1..100_000).to_a

Benchmark.bm(15) do |x|
  # Scenario 1: Process all elements
  x.report("eager_all") { data.map(&:to_s).select { |s| s.length > 2 }.count }
  x.report("lazy_all") { data.lazy.map(&:to_s).select { |s| s.length > 2 }.count }
  
  # Scenario 2: Process subset
  x.report("eager_subset") { data.map(&:to_s).select { |s| s.length > 2 }.first(10) }
  x.report("lazy_subset") { data.lazy.map(&:to_s).select { |s| s.length > 2 }.first(10) }
end

Lazy evaluation particularly benefits scenarios involving expensive computations where early termination is likely. The lazy pipeline stops processing once the terminal condition is met, avoiding unnecessary computation on remaining elements.

# Expensive computation with early termination
def expensive_operation(n)
  # Simulate CPU-intensive work
  sleep(0.01)
  n * n * n
end

# Lazy evaluation stops after finding first match
start_time = Time.now
result = (1..1000).lazy
  .map { |n| expensive_operation(n) }
  .find { |cube| cube > 1000 }
puts "Found #{result} in #{Time.now - start_time} seconds"
# Stops after 11 iterations (11**3 = 1331 is the first cube above 1000) instead of processing all 1000
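Timing with sleep can be noisy, so a call counter gives a deterministic view of the same early-termination effect. This is a minimal sketch; the `cube` lambda and the counter variables are illustrative stand-ins for the expensive operation.

```ruby
calls = 0
cube = ->(n) { calls += 1; n**3 }   # counts how often the "expensive" work runs

lazy_match = (1..1000).lazy.map { |n| cube.call(n) }.find { |c| c > 1000 }
lazy_calls = calls

calls = 0
eager_match = (1..1000).map { |n| cube.call(n) }.find { |c| c > 1000 }
eager_calls = calls

lazy_match   # => 1331
lazy_calls   # => 11
eager_calls  # => 1000
```

Both versions return the same value, but the eager pipeline pays for all 1000 calls before find gets to look at the first result.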

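Memory usage patterns differ significantly between eager and lazy evaluation. Eager evaluation creates memory spikes while intermediate collections are built, whereas a lazy pipeline of stateless steps such as map and select keeps memory usage roughly constant regardless of source collection size. Stateful steps such as uniq still have to remember the elements they have seen.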

# Memory usage tracking
class MemoryMonitor
  def self.current_usage
    `ps -o pid,rss -p #{Process.pid}`.split.last.to_i
  end
  
  def self.track_memory
    before = current_usage
    yield
    after = current_usage
    puts "Memory delta: #{after - before} KB"
  end
end

# Compare memory patterns
puts "Eager evaluation:"
MemoryMonitor.track_memory do
  (1..500_000).map { |n| n * 2 }.select { |n| n.even? }.first(100)
end

puts "Lazy evaluation:"
MemoryMonitor.track_memory do
  (1..500_000).lazy.map { |n| n * 2 }.select { |n| n.even? }.first(100)
end
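Process-level RSS deltas are affected by garbage collection timing, so a complementary check is to measure the intermediate array directly. The sketch below assumes CRuby and uses `ObjectSpace.memsize_of` from the standard `objspace` library, whose numbers are implementation-specific estimates rather than exact figures.

```ruby
require 'objspace'

# Eager: the intermediate array produced by map really exists and can be measured.
intermediate = (1..500_000).map { |n| n * 2 }
intermediate_bytes = ObjectSpace.memsize_of(intermediate)

# Lazy: the same pipeline never builds that array; only the 100 results are materialized.
lazy_result = (1..500_000).lazy.map { |n| n * 2 }.select(&:even?).first(100)
result_bytes = ObjectSpace.memsize_of(lazy_result)

puts "intermediate array: #{intermediate_bytes} bytes"
puts "lazy result array:  #{result_bytes} bytes"
```

The exact byte counts vary by Ruby version and platform; what matters is that the eager intermediate grows with the source size while the lazy result does not.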

Common Pitfalls

Lazy enumerators can exhibit unexpected behavior when side effects are involved in the processing blocks. Since computation is deferred, side effects occur only when evaluation is forced, and they may not occur in the expected order or frequency.

# Pitfall: Side effects in lazy blocks
counter = 0
lazy_enum = (1..10).lazy.map do |n|
  counter += 1  # Side effect
  puts "Processing #{n}"
  n * 2
end

puts "Counter: #{counter}"  # => 0 (no processing yet)

# Force partial evaluation
first_three = lazy_enum.first(3)
puts "Counter: #{counter}"  # => 3 (only processed 3 elements)

# Reusing the same lazy enumerator
next_two = lazy_enum.drop(3).first(2)
puts "Counter: #{counter}"  # => 8 (restarts from the beginning: 5 more elements processed to skip 3 and take 2)

Multiple terminal operations on the same lazy enumerator cause re-evaluation from the beginning. Each terminal operation creates a fresh evaluation cycle, which can lead to repeated computation and side effects.

# Pitfall: Multiple evaluations
expensive_lazy = (1..100).lazy.map do |n|
  puts "Expensive computation for #{n}"
  sleep(0.01)
  n * n
end

# Each operation re-evaluates from the beginning
first_five = expensive_lazy.first(5)    # Computes 1-5
next_five = expensive_lazy.drop(5).first(5)  # Computes 1-10 (not just 6-10)

# Solution: Force evaluation once and reuse
cached_result = expensive_lazy.first(10)
first_five = cached_result.first(5)
next_five = cached_result.drop(5)

Lazy enumerators that depend on mutable state can produce inconsistent results if the underlying state changes between evaluations. The deferred nature means the state at evaluation time determines the results, not the state when the lazy chain was created.

# Pitfall: Mutable state dependency
multiplier = 2
lazy_multiplied = (1..5).lazy.map { |n| n * multiplier }

first_result = lazy_multiplied.to_a
# => [2, 4, 6, 8, 10]

multiplier = 10  # Change state

second_result = lazy_multiplied.to_a
# => [10, 20, 30, 40, 50] (different result!)

# Solution: Capture the current value in a variable that is never reassigned afterwards
captured_multiplier = multiplier
safe_lazy = (1..5).lazy.map { |n| n * captured_multiplier }
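One way to make the capture harder to break by accident is to build the chain inside a method, so the block closes over a method-local parameter instead of a variable the caller can reassign. This is a sketch under that assumption; `multiplied_by` is a hypothetical helper, not part of Ruby.

```ruby
# Hypothetical helper: `factor` is local to this method call, so reassigning the
# caller's variable later cannot change what the block sees.
def multiplied_by(enum, factor)
  enum.lazy.map { |n| n * factor }
end

multiplier = 2
chain = multiplied_by(1..5, multiplier)
multiplier = 10                 # does not affect the chain

chain_result = chain.to_a       # => [2, 4, 6, 8, 10]
```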

Infinite sequences with lazy evaluation can cause infinite loops if terminal operations don't limit the number of elements processed. Methods like count, to_a, or each without bounds will never complete on infinite sequences.

# Pitfall: Unbounded operations on infinite sequences
# (2.step(by: 2) yields Integers; stepping a range that ends in Float::INFINITY yields Floats)
infinite_evens = 2.step(by: 2).lazy

# These will run forever:
# infinite_evens.count
# infinite_evens.to_a
# infinite_evens.each { |n| puts n }

# Safe operations with bounds:
infinite_evens.first(10)           # => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
infinite_evens.take(5).to_a        # => [2, 4, 6, 8, 10]
infinite_evens.find { |n| n > 100 } # => 102

# Safe iteration with break condition
infinite_evens.each do |n|
  puts n
  break if n > 20
end
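Besides `first`, `take`, and `find`, a condition-based bound such as `take_while` also keeps an infinite source safe. A minimal sketch, using `2.step(by: 2)` as an assumed way to produce an endless run of even Integers:

```ruby
evens = 2.step(by: 2).lazy   # 2, 4, 6, ... with no upper limit

# take_while stops pulling from the source at the first element that fails the test.
up_to_twenty = evens.take_while { |n| n <= 20 }.to_a
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

# A bound at the end of the chain limits how far the whole pipeline runs.
big_squares = evens.map { |n| n * n }.select { |n| n > 100 }.first(3)
# => [144, 196, 256]
```

Be careful that the bound can actually be reached: a `select` that never matches on an infinite source will still loop forever, even with `first(3)` after it.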

Exception handling in lazy chains can be tricky because exceptions occur during evaluation, not during chain construction. An exception raised in the middle of a lazy chain propagates out of the terminal operation and discards any results gathered so far, so the rescue must either wrap the terminal call or live inside the block.

# Pitfall: Exception handling in lazy chains
risky_lazy = (1..10).lazy.map do |n|
  raise "Error!" if n == 5
  n * 2
end

# Exception occurs during evaluation, not construction
begin
  result = risky_lazy.first(7)  # Will raise when it hits n == 5
rescue => e
  puts "Caught: #{e.message}"
end

# Solution: Handle exceptions within the chain
safe_lazy = (1..10).lazy.map do |n|
  begin
    raise "Error!" if n == 5
    n * 2
  rescue
    nil  # or some default value
  end
end.reject(&:nil?)

safe_result = safe_lazy.first(7)
# => [2, 4, 6, 8, 12, 14, 16] (skipped the problematic element)
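On Ruby 2.7 or later, `filter_map` can fold the map and the nil-rejection into one step. The sketch below assumes that version; `risky_double` is a hypothetical helper standing in for any operation that may raise, and `skipped` is just an illustrative way to keep track of failures.

```ruby
# Hypothetical helper standing in for any operation that may raise.
def risky_double(n)
  raise ArgumentError, "cannot process #{n}" if n == 5
  n * 2
end

skipped = []
doubled = (1..10).lazy.filter_map do |n|
  risky_double(n)
rescue ArgumentError => e
  skipped << e.message
  nil                      # filter_map drops nil and false results
end

doubled_result = doubled.first(7)
# doubled_result => [2, 4, 6, 8, 12, 14, 16]
# skipped        => ["cannot process 5"]
```

Rescuing a specific exception class, rather than a bare rescue, avoids silently swallowing unrelated errors inside the pipeline.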

Reference

Core Classes and Methods

| Class | Description |
| --- | --- |
| Enumerator::Lazy | Main lazy evaluation wrapper class |
| Enumerator | Base enumerator class, can create infinite sequences |

Enumerable Methods with Lazy Support

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #lazy | None | Enumerator::Lazy | Converts enumerable to lazy enumerator |
| #map | &block | Enumerator::Lazy | Lazy transformation of elements |
| #select | &block | Enumerator::Lazy | Lazy filtering of elements |
| #reject | &block | Enumerator::Lazy | Lazy rejection of elements |
| #take | n (Integer) | Enumerator::Lazy | Lazy limit to first n elements |
| #drop | n (Integer) | Enumerator::Lazy | Lazy skip of first n elements |
| #take_while | &block | Enumerator::Lazy | Lazy take while condition true |
| #drop_while | &block | Enumerator::Lazy | Lazy drop while condition true |
| #zip | *enums | Enumerator::Lazy | Lazy zip with other enumerables |
| #flat_map | &block | Enumerator::Lazy | Lazy flatten and map combination |
| #uniq | &block (optional) | Enumerator::Lazy | Lazy removal of duplicates |
| #grep | pattern | Enumerator::Lazy | Lazy pattern matching |
| #slice_before | pattern or &block | Enumerator::Lazy | Lazy grouping before pattern |
| #slice_after | pattern or &block | Enumerator::Lazy | Lazy grouping after pattern |
| #slice_when | &block | Enumerator::Lazy | Lazy grouping when condition changes |
| #chunk | &block | Enumerator::Lazy | Lazy consecutive grouping |
| #chunk_while | &block | Enumerator::Lazy | Lazy grouping while condition true |

Terminal Operations

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #to_a | None | Array | Forces evaluation to array |
| #first | n (Integer, optional) | Object or Array | Forces evaluation of first n elements |
| #each | &block | self or Enumerator | Forces evaluation with iteration |
| #reduce | initial, &block or symbol | Object | Forces evaluation with reduction |
| #inject | initial, &block or symbol | Object | Alias for reduce |
| #count | &block (optional) | Integer | Forces full evaluation for count |
| #find | &block | Object or nil | Forces evaluation until match found |
| #detect | &block | Object or nil | Alias for find |
| #include? | obj | Boolean | Forces evaluation until object found |
| #member? | obj | Boolean | Alias for include? |
| #any? | &block (optional) | Boolean | Forces evaluation until a truthy result is found |
| #all? | &block (optional) | Boolean | Forces evaluation until a falsy result is found |
| #none? | &block (optional) | Boolean | Forces evaluation until a truthy result is found |
| #one? | &block (optional) | Boolean | Forces evaluation until a second truthy result is found |
| #min | &block (optional) | Object | Forces full evaluation for minimum |
| #max | &block (optional) | Object | Forces full evaluation for maximum |
| #minmax | &block (optional) | Array | Forces full evaluation for min and max |

Custom Lazy Enumerator Creation

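# Creating a custom lazy enumerator: the block receives a yielder plus each source value
source_enum = 1..5
custom_lazy = Enumerator::Lazy.new(source_enum) do |yielder, *values|
  # Process values and yield zero or more results per input
  yielder << values.first * 10
end
custom_lazy.to_a # => [10, 20, 30, 40, 50]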

# Example: Custom filtering operation
def lazy_filter_even(enum)
  Enumerator::Lazy.new(enum) do |yielder, value|
    yielder << value if value.even?
  end
end
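The yielder does not have to be called exactly once per input: calling it zero times filters, and calling it several times expands. The following sketch repeats the filter above so it runs standalone and adds a hypothetical `lazy_repeat_each` helper to show the expanding case.

```ruby
# Same filter as above, repeated here so the example is self-contained.
def lazy_filter_even(enum)
  Enumerator::Lazy.new(enum) do |yielder, value|
    yielder << value if value.even?
  end
end

# Hypothetical helper: emit every element `times` times.
def lazy_repeat_each(enum, times)
  Enumerator::Lazy.new(enum) do |yielder, value|
    times.times { yielder << value }
  end
end

evens    = lazy_filter_even(1..Float::INFINITY)
repeated = lazy_repeat_each(evens, 2).first(5)
# => [2, 2, 4, 4, 6]
```

Because both helpers return Enumerator::Lazy instances, they chain with built-in lazy methods and stop as soon as `first(5)` has enough elements.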

Performance Characteristics

| Operation Type | Memory Usage | CPU Usage | Best Use Case |
| --- | --- | --- | --- |
| Full processing | O(1) constant for intermediates | Higher per-element | Only when intermediate arrays would be too large; otherwise use eager |
| Subset processing | O(1) constant | Lower total | Large datasets, small results |
| Infinite sequences | O(1) constant | Proportional to results | Generating sequences |
| Early termination | O(1) constant | Stops early | Search operations |

Common Patterns

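# Pipeline pattern: source -> transformations -> filter -> terminal operation
result = (1..Float::INFINITY).lazy
  .map { |n| n * 3 }        # transformation 1
  .map { |n| n + 1 }        # transformation 2
  .select(&:even?)          # filter
  .first(3)                 # terminal operation
# => [4, 10, 16]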

# Generator pattern
def fibonacci_lazy
  Enumerator.new do |yielder|
    a, b = 0, 1
    loop do
      yielder << a
      a, b = b, a + b
    end
  end.lazy
end

# Composition pattern
def compose_lazy(*operations)
  ->(enum) { operations.reduce(enum.lazy) { |acc, op| op.call(acc) } }
end
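A short usage sketch for the composition pattern follows. It repeats `compose_lazy` so it runs on its own; the `square` and `only_even` lambdas are illustrative operations, each taking a lazy enumerator and returning another one.

```ruby
def compose_lazy(*operations)
  ->(enum) { operations.reduce(enum.lazy) { |acc, op| op.call(acc) } }
end

# Each operation maps a lazy enumerator to another lazy enumerator.
square    = ->(e) { e.map { |n| n * n } }
only_even = ->(e) { e.select(&:even?) }

even_squares = compose_lazy(square, only_even)

from_infinite = even_squares.call(1..Float::INFINITY).first(3)  # => [4, 16, 36]
from_array    = even_squares.call([1, 2, 3, 4]).to_a            # => [4, 16]
```

Since the composed lambda calls `lazy` on whatever it receives, the same pipeline works for arrays, ranges, and infinite enumerators alike.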