CrackedRuby - Lazy Loading

Overview

Lazy loading in Ruby refers to the deferred evaluation of enumerable operations until results are actually needed. Ruby implements lazy evaluation through the Enumerator::Lazy class, which wraps enumerable objects and postpones the execution of transformation methods like map, select, and reject until a terminal operation forces evaluation.

Ruby's lazy evaluation creates a processing pipeline where each transformation method returns another lazy enumerator rather than immediately processing all elements. This approach prevents intermediate arrays from being created and allows working with potentially infinite sequences or large datasets without consuming excessive memory.

The lazy evaluation mechanism works by building a chain of operations that execute only when values are requested. When a terminal method like to_a, first, or each is called, Ruby processes elements through the entire pipeline one at a time, applying all transformations to each element before moving to the next.

# Standard eager evaluation - creates intermediate arrays
numbers = (1..1_000_000).to_a
squares = numbers.map { |n| n * n }
evens = squares.select { |n| n.even? }
first_five = evens.first(5)

# Lazy evaluation - no intermediate arrays created
lazy_result = (1..1_000_000).lazy
  .map { |n| n * n }
  .select { |n| n.even? }
  .first(5)
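
The difference in evaluation order can be made visible by recording every block call. The sketch below is only an illustration: trace_pipeline is a hypothetical helper (not part of Ruby) that runs the same two-step pipeline either eagerly or lazily and returns the result together with a log of calls.

```ruby
# Hypothetical helper (not part of Ruby): runs the same pipeline eagerly or
# lazily and records the order in which the blocks are called.
def trace_pipeline(source, lazy:)
  log = []
  enum = lazy ? source.lazy : source
  result = enum
    .map    { |n| log << "map #{n}"; n * n }
    .select { |n| log << "select #{n}"; n.even? }
    .first(1)
  [result, log]
end

eager_result, eager_log = trace_pipeline(1..3, lazy: false)
lazy_result, lazy_log = trace_pipeline(1..3, lazy: true)

p eager_result # => [4]
p eager_log    # => ["map 1", "map 2", "map 3", "select 1", "select 4", "select 9"]
p lazy_result  # => [4]
p lazy_log     # => ["map 1", "select 1", "map 2", "select 4"]
```

The eager version finishes each stage for all elements before starting the next one, while the lazy version pushes one element through both stages and stops as soon as first(1) has its value.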

Ruby provides lazy evaluation for most enumerable methods including map, select, reject, take, drop, and zip. The lazy enumerator maintains the original enumerable's interface while deferring computation until absolutely necessary.

Basic Usage

Creating a lazy enumerator requires calling the lazy method on any enumerable object. This returns an Enumerator::Lazy instance that responds to the same methods as the original enumerable but executes them lazily.

# Convert any enumerable to lazy
array = [1, 2, 3, 4, 5]
lazy_enum = array.lazy

range = (1..Float::INFINITY)
lazy_range = range.lazy

# Method chaining with lazy evaluation
result = (1..100).lazy
  .map { |n| n * 2 }
  .select { |n| n > 50 }
  .take(10)
  .to_a
# => [52, 54, 56, 58, 60, 62, 64, 66, 68, 70]

Lazy enumerators respond to the standard Enumerable methods. Transformations such as map, select, and take return another lazy enumerator instead of a concrete collection, so the computation chain builds up until a terminal operation forces evaluation.

# Building a processing pipeline
pipeline = (1..1000).lazy
  .map { |n| puts "Processing #{n}"; n * n }
  .select { |n| n.even? }
  .reject { |n| n < 100 }

# No output yet - computation is deferred
puts "Pipeline created"

# Forces evaluation - now you'll see "Processing" output
first_three = pipeline.first(3)
# Processing 1, Processing 2, ... Processing 14
# => [100, 144, 196]

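Terminal operations are methods that produce concrete results, such as to_a, first, each, reduce, count, and find. These methods force the lazy enumerator to begin processing elements through the pipeline. Note that take is not terminal: on a lazy enumerator it returns another lazy enumerator and needs a following to_a or first to produce values.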

# Different terminal operations
lazy_nums = (1..20).lazy.map { |n| n * n }.select { |n| n.odd? }

lazy_nums.first(3)           # => [1, 9, 25]
lazy_nums.take(5).to_a       # => [1, 9, 25, 49, 81]
lazy_nums.find { |n| n > 50 } # => 81
lazy_nums.count              # => 10
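
A quick way to tell lazy methods from terminal ones is to look at the class of what they return. The following is a minimal sketch; lazy_squares is a helper name assumed for this example only.

```ruby
# Assumed helper for this sketch: an unbounded lazy sequence of squares.
def lazy_squares
  (1..Float::INFINITY).lazy.map { |n| n * n }
end

# Lazy methods return another Enumerator::Lazy and do no work yet.
p lazy_squares.class                 # => Enumerator::Lazy
p lazy_squares.select(&:even?).class # => Enumerator::Lazy
p lazy_squares.take(3).class         # => Enumerator::Lazy

# Terminal methods run the pipeline and return ordinary objects.
p lazy_squares.first(3)              # => [1, 4, 9]
p lazy_squares.take(3).to_a          # => [1, 4, 9]
p lazy_squares.find { |n| n > 20 }   # => 25
```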

Lazy evaluation works particularly well with infinite sequences or very large datasets where you only need a subset of results. Ruby processes elements one at a time through the entire pipeline rather than creating intermediate collections.

Advanced Usage

Lazy enumerators support complex method chaining and composition patterns. Multiple lazy enumerators can be combined, and custom lazy operations can be defined using Enumerator::Lazy.new with a block that yields processed values.

# Custom lazy operation
def lazy_debug(enum, prefix)
  Enumerator::Lazy.new(enum) do |yielder, value|
    puts "#{prefix}: #{value}"
    yielder << value
  end
end

# Compose custom operations with built-ins
result = (1..10).lazy
  .then { |enum| lazy_debug(enum, "Input") }
  .map { |n| n * n }
  .then { |enum| lazy_debug(enum, "Squared") }
  .select { |n| n.even? }
  .first(3)
# Input: 1, Squared: 1, Input: 2, Squared: 4, Input: 3, Squared: 9, Input: 4, Squared: 16...

Lazy enumerators can be nested and combined with regular enumerables. The lazy evaluation propagates through nested structures when appropriate methods are used.

# Lazy processing of nested structures
nested_data = [
  [1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]
]

flattened_lazy = nested_data.lazy
  .flat_map { |sub_array| sub_array.lazy }
  .map { |n| n * n }
  .select { |n| n.odd? }
  .take(4)
  .to_a
# => [1, 9, 25, 49]

The zip method on lazy enumerators pairs up elements from multiple sequences lazily. This allows processing multiple infinite or large sequences side by side without materializing any of them.

# Lazy zipping of sequences
fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

squares = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n * n
    n += 1
  end
end

paired_lazy = fibonacci.lazy.zip(squares.lazy)
  .map { |fib, square| fib + square }
  .select { |sum| sum.even? }
  .first(5)
# => [10, 18, 28, 102, 134]

Lazy enumerators maintain their lazy properties when passed to methods that accept enumerables. This allows building reusable processing components that work efficiently with both finite and infinite sequences.

# Reusable lazy processing functions
def process_numbers(enum)
  enum.lazy
    .map { |n| n * n }
    .select { |n| n > 10 }
    .map { |n| Math.sqrt(n) }
end

def filter_primes(enum)
  enum.lazy.select do |n|
    next false if n < 2
    (2..Math.sqrt(n)).none? { |i| n % i == 0 }
  end
end

# Compose processing functions
infinite_numbers = (1..Float::INFINITY)
result = filter_primes(process_numbers(infinite_numbers)).first(5)
# => [5.0, 7.0, 11.0, 13.0, 17.0] (3.0 is excluded because 3 * 3 is not > 10)

Performance & Memory

Lazy evaluation provides significant memory benefits when processing large datasets or infinite sequences by eliminating intermediate array creation. Instead of storing entire transformed collections in memory, lazy enumerators process elements one at a time through the pipeline.

# Memory comparison: eager vs lazy
# Requires the benchmark-memory gem (gem install benchmark-memory)
require 'benchmark/memory'

# Eager evaluation creates multiple intermediate arrays; lazy evaluation does not
Benchmark.memory do |x|
  x.report("eager") do
    (1..1_000_000).to_a
      .map { |n| n * n }
      .select { |n| n.even? }
      .first(100)
  end
  
  x.report("lazy") do
    (1..1_000_000).lazy
      .map { |n| n * n }
      .select { |n| n.even? }
      .first(100)  # first already returns an Array, no to_a needed
  end
end
# Eager creates multiple 1M element arrays; lazy processes 200 elements total

The performance characteristics of lazy evaluation depend on the ratio of elements processed to elements in the source collection. Lazy evaluation excels when you need only a small fraction of the final results, but can be slower when processing entire collections due to method call overhead.

# Benchmark: lazy vs eager for different scenarios
require 'benchmark'

data = (1..100_000).to_a

Benchmark.bm(15) do |x|
  # Scenario 1: Process all elements
  x.report("eager_all") { data.map(&:to_s).select { |s| s.length > 2 }.count }
  x.report("lazy_all") { data.lazy.map(&:to_s).select { |s| s.length > 2 }.count }
  
  # Scenario 2: Process subset
  x.report("eager_subset") { data.map(&:to_s).select { |s| s.length > 2 }.first(10) }
  x.report("lazy_subset") { data.lazy.map(&:to_s).select { |s| s.length > 2 }.first(10) }
end

Lazy evaluation particularly benefits scenarios involving expensive computations where early termination is likely. The lazy pipeline stops processing once the terminal condition is met, avoiding unnecessary computation on remaining elements.

# Expensive computation with early termination
def expensive_operation(n)
  # Simulate CPU-intensive work
  sleep(0.01)
  n * n * n
end

# Lazy evaluation stops after finding first match
start_time = Time.now
result = (1..1000).lazy
  .map { |n| expensive_operation(n) }
  .find { |cube| cube > 1000 }
puts "Found #{result} in #{Time.now - start_time} seconds"
# Stops after 11 iterations (11**3 = 1331 is the first cube above 1000) instead of processing all 1000
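
The saving can be checked without timing anything by counting how often the mapping block runs. This is a sketch under the same setup as above, with the sleep removed; cube_search_calls is a hypothetical helper introduced only for the comparison.

```ruby
# Hypothetical helper: counts how often the mapping block runs before `find`
# returns, for an eager and a lazy version of the same search.
def cube_search_calls(range, lazy:)
  calls = 0
  enum = lazy ? range.lazy : range
  found = enum
    .map { |n| calls += 1; n**3 }
    .find { |cube| cube > 1000 }
  [found, calls]
end

p cube_search_calls(1..1000, lazy: false) # => [1331, 1000]
p cube_search_calls(1..1000, lazy: true)  # => [1331, 11]
```

Both versions find the same value, but the eager one cubes all 1000 numbers before find even starts.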

Memory usage patterns differ significantly between eager and lazy evaluation. Eager evaluation creates memory spikes while intermediate collections are built, whereas a lazy pipeline of streaming operations such as map and select holds only the current element, so its memory usage stays roughly constant regardless of source collection size.

# Memory usage tracking
class MemoryMonitor
  def self.current_usage
    `ps -o pid,rss -p #{Process.pid}`.split.last.to_i
  end
  
  def self.track_memory
    before = current_usage
    yield
    after = current_usage
    puts "Memory delta: #{after - before} KB"
  end
end

# Compare memory patterns
puts "Eager evaluation:"
MemoryMonitor.track_memory do
  (1..500_000).map { |n| n * 2 }.select { |n| n.even? }.first(100)
end

puts "Lazy evaluation:"
MemoryMonitor.track_memory do
  (1..500_000).lazy.map { |n| n * 2 }.select { |n| n.even? }.first(100)
end

Common Pitfalls

Lazy enumerators can exhibit unexpected behavior when side effects are involved in the processing blocks. Since computation is deferred, side effects occur only when evaluation is forced, and they may not occur in the expected order or frequency.

# Pitfall: Side effects in lazy blocks
counter = 0
lazy_enum = (1..10).lazy.map do |n|
  counter += 1  # Side effect
  puts "Processing #{n}"
  n * 2
end

puts "Counter: #{counter}"  # => 0 (no processing yet)

# Force partial evaluation
first_three = lazy_enum.first(3)
puts "Counter: #{counter}"  # => 3 (only processed 3 elements)

# Reusing the same lazy enumerator
next_two = lazy_enum.drop(3).first(2)
puts "Counter: #{counter}"  # => 8 (restarts from the beginning: map runs again for elements 1-5)

Multiple terminal operations on the same lazy enumerator cause re-evaluation from the beginning. Each terminal operation creates a fresh evaluation cycle, which can lead to repeated computation and side effects.

# Pitfall: Multiple evaluations
expensive_lazy = (1..100).lazy.map do |n|
  puts "Expensive computation for #{n}"
  sleep(0.01)
  n * n
end

# Each operation re-evaluates from the beginning
first_five = expensive_lazy.first(5)    # Computes 1-5
next_five = expensive_lazy.drop(5).first(5)  # Computes 1-10 (not just 6-10)

# Solution: Force evaluation once and reuse
cached_result = expensive_lazy.first(10)
first_five = cached_result.first(5)
next_five = cached_result.drop(5)

Lazy enumerators that depend on mutable state can produce inconsistent results if the underlying state changes between evaluations. The deferred nature means the state at evaluation time determines the results, not the state when the lazy chain was created.

# Pitfall: Mutable state dependency
multiplier = 2
lazy_multiplied = (1..5).lazy.map { |n| n * multiplier }

first_result = lazy_multiplied.to_a
# => [2, 4, 6, 8, 10]

multiplier = 10  # Change state

second_result = lazy_multiplied.to_a
# => [10, 20, 30, 40, 50] (different result!)

# Solution: copy the value into a variable that is never reassigned afterwards
captured_multiplier = multiplier
safe_lazy = (1..5).lazy.map { |n| n * captured_multiplier }
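
Another way to pin the value, sketched here as one possible approach, is to build the chain inside a method so that the block closes over a method parameter instead of a caller's local variable. The name multiplied_by is hypothetical.

```ruby
# Hypothetical helper: `factor` is a method parameter, so reassigning
# variables in the caller's scope cannot reach the block.
def multiplied_by(factor, source)
  source.lazy.map { |n| n * factor }
end

multiplier = 2
doubled = multiplied_by(multiplier, 1..5)

multiplier = 10 # reassignment in the caller does not affect `doubled`

p doubled.to_a # => [2, 4, 6, 8, 10]
p doubled.to_a # => [2, 4, 6, 8, 10]
```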

Infinite sequences with lazy evaluation can cause infinite loops if terminal operations don't limit the number of elements processed. Methods like count, to_a, or each without bounds will never complete on infinite sequences.

# Pitfall: Unbounded operations on infinite sequences
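# Integer#step without a limit yields Integers forever; a Float::INFINITY
# bound would make step produce Floats (2.0, 4.0, ...) instead
infinite_evens = 2.step(by: 2).lazy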

# These will run forever:
# infinite_evens.count
# infinite_evens.to_a
# infinite_evens.each { |n| puts n }

# Safe operations with bounds:
infinite_evens.first(10)           # => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
infinite_evens.take(5).to_a        # => [2, 4, 6, 8, 10]
infinite_evens.find { |n| n > 100 } # => 102

# Safe iteration with break condition
infinite_evens.each do |n|
  puts n
  break if n > 20
end
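
When the bound is a condition on the values rather than a count, take_while is usually the safer tool than select. A minimal sketch, assuming an unbounded sequence of even integers built with Integer#step; unbounded_evens is a name chosen for this example:

```ruby
# Assumed helper for this sketch: 2, 4, 6, ... as an unbounded lazy sequence.
def unbounded_evens
  2.step(by: 2).lazy
end

# take_while ends the pipeline at the first element that fails the condition.
p unbounded_evens.take_while { |n| n <= 10 }.to_a
# => [2, 4, 6, 8, 10]

p unbounded_evens.map { |n| n * n }.take_while { |sq| sq < 200 }.to_a
# => [4, 16, 36, 64, 100, 144, 196]

# select keeps scanning for further matches, so this would never return:
# unbounded_evens.select { |n| n <= 10 }.to_a
```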

Exception handling in lazy chains can be tricky because exceptions occur during evaluation, not during chain construction. The rescue therefore has to wrap the terminal operation, and any side effects performed for elements before the failing one have already happened by the time the exception is raised.

# Pitfall: Exception handling in lazy chains
risky_lazy = (1..10).lazy.map do |n|
  raise "Error!" if n == 5
  n * 2
end

# Exception occurs during evaluation, not construction
begin
  result = risky_lazy.first(7)  # Will raise when it hits n == 5
rescue => e
  puts "Caught: #{e.message}"
end

# Solution: Handle exceptions within the chain
safe_lazy = (1..10).lazy.map do |n|
  begin
    raise "Error!" if n == 5
    n * 2
  rescue
    nil  # or some default value
  end
end.reject(&:nil?)

safe_result = safe_lazy.first(7)
# => [2, 4, 6, 8, 12, 14, 16] (skipped the problematic element)
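
Replacing failures with nil loses the information about what went wrong. One alternative, shown here only as a sketch, is to tag each result so that errors travel through the chain as data. The helper name safely_doubled and the [:ok, value] / [:error, message] convention are assumptions made for this example.

```ruby
# Hypothetical helper: wraps each result as [:ok, value] or [:error, message]
# instead of letting the exception escape the lazy chain.
def safely_doubled(source)
  source.lazy.map do |n|
    begin
      raise ArgumentError, "cannot process #{n}" if n == 5
      [:ok, n * 2]
    rescue ArgumentError => e
      [:error, e.message]
    end
  end
end

results = safely_doubled(1..6).to_a
succeeded, failed = results.partition { |status, _| status == :ok }

p succeeded.map(&:last) # => [2, 4, 6, 8, 12]
p failed.map(&:last)    # => ["cannot process 5"]
```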

Reference

Core Classes and Methods

Class	Description
`Enumerator::Lazy`	Main lazy evaluation wrapper class
`Enumerator`	Base enumerator class, can create infinite sequences

Enumerable Methods with Lazy Support

Method	Parameters	Returns	Description
`#lazy`	None	`Enumerator::Lazy`	Converts enumerable to lazy enumerator
`#map`	`&block`	`Enumerator::Lazy`	Lazy transformation of elements
`#select`	`&block`	`Enumerator::Lazy`	Lazy filtering of elements
`#reject`	`&block`	`Enumerator::Lazy`	Lazy rejection of elements
`#take`	`n` (Integer)	`Enumerator::Lazy`	Lazy limit to first n elements
`#drop`	`n` (Integer)	`Enumerator::Lazy`	Lazy skip of first n elements
`#take_while`	`&block`	`Enumerator::Lazy`	Lazy take while condition true
`#drop_while`	`&block`	`Enumerator::Lazy`	Lazy drop while condition true
`#zip`	`*enums`	`Enumerator::Lazy`	Lazy zip with other enumerables
`#flat_map`	`&block`	`Enumerator::Lazy`	Lazy flatten and map combination
`#uniq`	`&block` (optional)	`Enumerator::Lazy`	Lazy removal of duplicates
`#grep`	`pattern`	`Enumerator::Lazy`	Lazy pattern matching
`#slice_before`	`pattern` or `&block`	`Enumerator::Lazy`	Lazy grouping before pattern
`#slice_after`	`pattern` or `&block`	`Enumerator::Lazy`	Lazy grouping after pattern
`#slice_when`	`&block`	`Enumerator::Lazy`	Lazy split between adjacent elements where block returns true
`#chunk`	`&block`	`Enumerator::Lazy`	Lazy consecutive grouping
`#chunk_while`	`&block`	`Enumerator::Lazy`	Lazy grouping while condition true

Terminal Operations

Method	Parameters	Returns	Description
`#to_a`	None	`Array`	Forces evaluation to array
`#first`	`n` (Integer, optional)	`Object` or `Array`	Forces evaluation of first n elements
`#each`	`&block`	`self` or `Enumerator`	Forces evaluation with iteration
`#reduce`	`initial`, `&block` or `symbol`	`Object`	Forces evaluation with reduction
`#inject`	`initial`, `&block` or `symbol`	`Object`	Alias for reduce
`#count`	`&block` (optional)	`Integer`	Forces full evaluation for count
`#find`	`&block`	`Object`	Forces evaluation until match found
`#detect`	`&block`	`Object`	Alias for find
`#include?`	`obj`	`Boolean`	Forces evaluation until object found
`#member?`	`obj`	`Boolean`	Alias for include?
`#any?`	`&block` (optional)	`Boolean`	Forces evaluation until condition met
`#all?`	`&block` (optional)	`Boolean`	Forces evaluation until a falsy result is found
`#none?`	`&block` (optional)	`Boolean`	Forces evaluation until condition met
`#one?`	`&block` (optional)	`Boolean`	Forces evaluation until a second match is found
`#min`	`&block` (optional)	`Object`	Forces full evaluation for minimum
`#max`	`&block` (optional)	`Object`	Forces full evaluation for maximum
`#minmax`	`&block` (optional)	`Array`	Forces full evaluation for min and max

Custom Lazy Enumerator Creation

# Creating custom lazy enumerator (source_enum and processed_value are placeholders)
custom_lazy = Enumerator::Lazy.new(source_enum) do |yielder, *values|
  # Process values and yield results
  yielder << processed_value
end

# Example: Custom filtering operation
def lazy_filter_even(enum)
  Enumerator::Lazy.new(enum) do |yielder, value|
    yielder << value if value.even?
  end
end
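
The block passed to Enumerator::Lazy.new may push zero, one, or several values per input element. The sketch below illustrates the "several" case and shows that the result chains with built-in lazy methods; lazy_with_negation is a hypothetical helper.

```ruby
# Hypothetical helper: emits every value followed by its negation, showing
# that the block may push more than one result per input element.
def lazy_with_negation(enum)
  Enumerator::Lazy.new(enum) do |yielder, value|
    yielder << value
    yielder << -value
  end
end

p lazy_with_negation(1..Float::INFINITY).first(5)
# => [1, -1, 2, -2, 3]

p lazy_with_negation([1, 2]).map { |n| n * 10 }.to_a
# => [10, -10, 20, -20]
```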

Performance Characteristics

Operation Type	Memory Usage	CPU Usage	Best Use Case
Full processing	O(1) constant	Higher per-element	Rarely; eager is usually faster unless intermediate arrays would be too large
Subset processing	O(1) constant	Lower total	Large datasets, small results
Infinite sequences	O(1) constant	Proportional to results	Generating sequences
Early termination	O(1) constant	Stops early	Search operations

Common Patterns

# Pipeline pattern
result = source.lazy
  .transformation1
  .transformation2
  .filter
  .terminal_operation

# Generator pattern
def fibonacci_lazy
  Enumerator.new do |yielder|
    a, b = 0, 1
    loop do
      yielder << a
      a, b = b, a + b
    end
  end.lazy
end

# Composition pattern
def compose_lazy(*operations)
  ->(enum) { operations.reduce(enum.lazy) { |acc, op| op.call(acc) } }
end
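
A possible use of the composition pattern is sketched below. compose_lazy is repeated so the example runs on its own; the lambdas square and only_even are names assumed for illustration, each taking a lazy enumerator and returning another one.

```ruby
def compose_lazy(*operations)
  ->(enum) { operations.reduce(enum.lazy) { |acc, op| op.call(acc) } }
end

# Each step receives a lazy enumerator and returns a new one.
square    = ->(lazy) { lazy.map { |n| n * n } }
only_even = ->(lazy) { lazy.select(&:even?) }

even_squares = compose_lazy(square, only_even)

# The composed pipeline works on infinite and finite sources alike.
p even_squares.call(1..Float::INFINITY).first(3) # => [4, 16, 36]
p even_squares.call([3, 4, 5, 6]).to_a           # => [16, 36]
```

Because the composed lambda returns a lazy enumerator, the caller still decides how much of the source is consumed.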