
CrackedRuby

Enumerator Class

A comprehensive guide to Ruby's Enumerator class, covering external iteration, lazy evaluation, and advanced enumeration patterns.

Core Modules · Enumerable Module
3.2.6

Overview

Enumerator provides external iteration capabilities in Ruby, allowing methods to return enumerable objects when called without blocks. Ruby creates Enumerator instances automatically when enumerable methods like #each, #map, or #select receive no block parameter. These objects encapsulate iteration state and provide fine-grained control over enumeration processes.

The Enumerator class serves as Ruby's primary mechanism for implementing external iterators. Unlike internal iteration where blocks control the iteration process, external iteration gives the caller explicit control over when and how elements are retrieved. This approach proves essential for implementing lazy evaluation, infinite sequences, and complex iteration patterns.

Ruby's Enumerator implementation supports both finite and infinite sequences. Finite enumerators exhaust their source collection, while infinite enumerators can generate values indefinitely through custom logic or mathematical sequences. The class integrates seamlessly with Ruby's existing enumerable ecosystem while extending capabilities beyond traditional collection iteration.

# Basic enumerator creation from array
arr = [1, 2, 3, 4, 5]
enum = arr.each
enum.next  # => 1
enum.next  # => 2

# Enumerator from method without block
numbers = (1..10).map
numbers.class  # => Enumerator

# Custom enumerator with Enumerator.new
fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

fibonacci.take(8)  # => [0, 1, 1, 2, 3, 5, 8, 13]

Enumerator objects maintain internal state between calls to #next, making them stateful iterators. This statefulness enables complex iteration patterns but requires careful handling to avoid unexpected behavior when reusing enumerators across different contexts.
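A minimal demonstration of that shared state: two helpers (hypothetical names) pulling from the same enumerator share a single position, so the second continues where the first stopped.

```ruby
# Two call sites, one shared iteration position
def first_item(enum) = enum.next
def second_item(enum) = enum.next

letters = %w[a b c].each
first_item(letters)   # => "a"
second_item(letters)  # => "b" - continues, does not restart
```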

Basic Usage

Creating enumerators follows several standard patterns in Ruby. The most common approach involves calling enumerable methods without providing blocks. Ruby automatically returns Enumerator objects in these situations, allowing subsequent chaining or manual iteration control.

# Standard enumerator creation patterns
array = %w[apple banana cherry date elderberry]

# From enumerable methods
each_enum = array.each
map_enum = array.map
select_enum = array.select

# All return Enumerator objects
each_enum.class    # => Enumerator
map_enum.class     # => Enumerator  
select_enum.class  # => Enumerator

The #next method provides primary access to enumerator elements. Each call advances the internal position and returns the subsequent value. When the enumerator exhausts its source, #next raises StopIteration exceptions rather than returning nil, maintaining consistency with iteration completion semantics.

colors = ['red', 'green', 'blue'].each

colors.next    # => "red"
colors.next    # => "green" 
colors.next    # => "blue"
colors.next    # raises StopIteration

The #peek method examines the next element without advancing the enumerator's position. This proves valuable when implementing lookahead logic or conditional processing based on upcoming values. Peek operations preserve enumerator state while providing element access.

numbers = (1..5).each

numbers.peek   # => 1
numbers.next   # => 1
numbers.peek   # => 2
numbers.next   # => 2

Enumerator rewinding through #rewind resets internal position to the beginning. This enables multiple iterations over the same enumerator without creating new objects. Rewind operations affect all subsequent #next and #peek calls until the enumerator reaches completion again.

# Rewind demonstration
letters = %w[x y z].each

letters.next   # => "x"
letters.next   # => "y" 
letters.rewind
letters.next   # => "x"  # Back to beginning

Custom enumerators through Enumerator.new accept blocks that define iteration behavior. These blocks receive yielder objects supporting the << operator for emitting values. Custom enumerators enable infinite sequences, computed values, and specialized iteration patterns not possible with standard collection enumeration.

# Custom countdown enumerator
countdown = Enumerator.new do |yielder|
  n = 10
  while n > 0
    yielder << n
    n -= 1
  end
end

countdown.to_a  # => [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

Advanced Usage

Lazy enumerators represent one of Ruby's most sophisticated enumeration features. The #lazy method transforms regular enumerators into lazy variants that defer computation until explicitly requested. This approach enables efficient processing of large datasets and infinite sequences without immediate memory allocation.

# Lazy enumeration with infinite range
infinite_numbers = (1..Float::INFINITY).lazy
squares = infinite_numbers.map { |n| n * n }
even_squares = squares.select(&:even?)

# No computation occurs until take() is called
result = even_squares.take(5).to_a
# => [4, 16, 36, 64, 100]

Lazy evaluation chains multiple operations without creating intermediate arrays. Traditional enumeration methods like #map and #select create new arrays at each step, consuming memory proportional to collection size. Lazy enumerators defer this allocation, processing elements on-demand through the entire chain.

# Memory-efficient processing of large datasets
large_data = (1..1_000_000).lazy
                            .map { |n| n * n }
                            .select { |n| n % 3 == 0 }
                            .map { |n| n.to_s }
                            .take(10)
                            .to_a

# Only processes 10 elements despite million-element source
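One way to see the on-demand behavior directly is to count map invocations with a side-effect counter:

```ruby
# Counter records how many source elements the lazy chain actually touches
processed = 0
result = (1..1_000_000).lazy
                       .map { |n| processed += 1; n * n }
                       .select(&:odd?)
                       .first(3)

result     # => [1, 9, 25]
processed  # => 5 - only five source elements were ever mapped
```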

Custom enumerators with complex state management enable sophisticated iteration patterns. These enumerators can maintain multiple internal variables, implement stateful algorithms, and generate values based on previous outputs or external conditions.

# Stateful enumerator with memory: moving average over a source
def moving_average(source, window_size = 3)
  Enumerator.new do |yielder|
    history = []
    source.each do |value|
      history << value
      history.shift if history.size > window_size
      yielder << history.sum.to_f / history.size if history.size == window_size
    end
  end
end

moving_average([1, 2, 3, 4, 5]).to_a  # => [2.0, 3.0, 4.0]

Enumerator composition combines multiple enumerators into unified iteration structures. The #chain method concatenates enumerators sequentially, while custom composition logic can interleave or merge enumerators based on specific criteria.

# Enumerator chaining and composition
first_half = (1..5).each
second_half = (6..10).each
combined = first_half.chain(second_half)

combined.to_a  # => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Custom interleaving composition
def interleave(enum1, enum2)
  Enumerator.new do |yielder|
    loop do
      begin
        yielder << enum1.next
        yielder << enum2.next
      rescue StopIteration
        break
      end
    end
  end
end

odds = [1, 3, 5].each
evens = [2, 4, 6].each
interleaved = interleave(odds, evens)
interleaved.to_a  # => [1, 2, 3, 4, 5, 6]

Advanced enumerator patterns include implementing custom enumerable classes that return specialized enumerators. These classes can provide domain-specific iteration methods while maintaining compatibility with standard Ruby enumeration protocols.

class TimeRange
  def initialize(start_time, end_time, step)
    @start_time = start_time
    @end_time = end_time  
    @step = step
  end
  
  def each
    return enum_for(__method__) unless block_given?
    
    current = @start_time
    while current <= @end_time
      yield current
      current += @step
    end
  end
  
  def business_hours_only
    # Lazy keeps the chain deferred; a plain #select would build the full array
    each.lazy.select do |time|
      time.hour >= 9 && time.hour < 17 &&
        ![0, 6].include?(time.wday)
    end
  end
end

# Usage with custom enumerators
start = Time.new(2024, 1, 1, 8, 0)
finish = Time.new(2024, 1, 3, 18, 0)
range = TimeRange.new(start, finish, 3600) # Hourly steps

business_enum = range.business_hours_only
business_enum.first(5)  # Array of the first 5 business-hour times

Performance & Memory

Lazy evaluation provides significant performance advantages when processing large datasets or implementing filtering pipelines. Traditional enumeration creates intermediate arrays at each transformation step, while lazy enumerators process elements individually through the entire chain without intermediate storage.

# Memory comparison: eager vs lazy evaluation
require 'benchmark'

data = (1..10_000_000)

# Eager evaluation - creates intermediate arrays
Benchmark.measure do
  result = data.map { |n| n * 2 }
               .select { |n| n % 4 == 0 }
               .map { |n| n.to_s }
               .first(100)
end
# Higher memory usage, multiple array allocations

# Lazy evaluation - processes elements on-demand  
Benchmark.measure do
  result = data.lazy
               .map { |n| n * 2 }
               .select { |n| n % 4 == 0 }
               .map { |n| n.to_s }
               .first(100)
end
# Lower memory usage, single-pass processing

Enumerator state management impacts performance through object allocation and garbage collection pressure. Reusing enumerators reduces object creation overhead, while creating new enumerators for each operation increases allocation costs. Understanding these trade-offs helps optimize enumeration-heavy code.

# Efficient enumerator reuse pattern
class DataProcessor
  def initialize(source)
    @source_enum = source.each
  end

  def process_batches(batch_size)
    done = false
    until done
      batch = []
      begin
        batch_size.times { batch << @source_enum.next }
      rescue StopIteration
        done = true  # Source exhausted; flush any partial batch below
      end
      yield batch unless batch.empty?
    end
    @source_enum.rewind  # Reset for potential reuse
  end
end

# Usage minimizes object allocation
processor = DataProcessor.new((1..1000))
processor.process_batches(50) { |batch| puts batch.sum }

Infinite enumerators require careful memory management to prevent unbounded growth. While the enumerator itself maintains minimal state, consumer code must implement appropriate termination conditions to avoid memory exhaustion or infinite loops.

# Memory-safe infinite enumerator consumption
def process_infinite_stream(enumerator, max_items: 1000)
  processed = 0

  enumerator.each do |item|
    yield item
    processed += 1
    break if processed >= max_items
  end
  # Finite enumerators simply finish; #each never raises StopIteration here
end

# Infinite sequence with built-in limits
fibonacci_limited = Enumerator.new do |yielder|
  a, b = 0, 1
  1000.times do  # Explicit limit prevents runaway generation
    yielder << a
    a, b = b, a + b
  end
end

Performance optimization strategies for enumerators include minimizing block complexity, avoiding unnecessary conversions, and leveraging built-in methods when possible. Complex blocks within enumeration chains create bottlenecks, while simple operations maintain optimal performance characteristics.

# Performance optimization examples
large_dataset = (1..1_000_000)

# Optimized: simple operations, minimal allocations
optimized = large_dataset.lazy
                         .select(&:odd?)  # Built-in predicate
                         .map(&:to_s)     # Built-in conversion
                         .take(1000)
                         .to_a

# Avoid: complex blocks with multiple operations
avoid = large_dataset.lazy
                     .select do |n|
                       n.odd? && n > 100 && n.to_s.length < 5
                     end
                     .map do |n|
                       result = n.to_s
                       result.upcase!
                       result.reverse
                     end
                     .take(1000)
                     .to_a

Common Pitfalls

Enumerator state persistence causes unexpected behavior when multiple consumers access the same enumerator instance. Each call to #next advances the internal position permanently, affecting all subsequent operations. This statefulness differs from functional programming expectations where operations create new iterators.

# State sharing pitfall
enum = [1, 2, 3, 4, 5].each

# First consumer advances the shared position
first_two = []
2.times { first_two << enum.next }
first_two  # => [1, 2]

# Second consumer continues from where the first stopped
enum.next  # => 3 - not 1!

# Note: internal iteration ignores the external position entirely -
# #to_a re-runs #each from the start and returns the full array
enum.to_a  # => [1, 2, 3, 4, 5]

# Solution: create separate enumerators per consumer
source = [1, 2, 3, 4, 5]
enum1 = source.each
enum2 = source.each

enum1.next  # => 1
enum2.next  # => 1 - independent position

Lazy enumerator evaluation timing creates debugging challenges when operations contain side effects. Side effects execute during element consumption rather than chain construction, potentially causing confusion about execution order and timing.

# Side effect timing confusion
puts "Creating lazy chain..."

lazy_chain = (1..5).lazy.map do |n|
  puts "Processing #{n}"  # Side effect
  n * 2
end

puts "Chain created, no processing yet"
puts "Consuming first element..."
result = lazy_chain.first
puts "Result: #{result}"

# Output order may surprise developers:
# Creating lazy chain...
# Chain created, no processing yet  
# Consuming first element...
# Processing 1
# Result: 2

StopIteration exceptions require proper handling when manually iterating with #next. An unhandled StopIteration propagates like any other exception, but Kernel#loop rescues it automatically, so a bare loop terminates cleanly once the enumerator is exhausted. Note that #next_values is not an exception-free alternative: it also raises StopIteration when exhausted, differing from #next only in returning the yielded values as an array.

# Explicit exception handling pattern
def safe_iteration(enumerator)
  results = []
  begin
    while true
      results << enumerator.next
    end
  rescue StopIteration
    # Exhausted - fall through with collected results
  end
  results
end

# Simpler: Kernel#loop rescues StopIteration automatically
def safer_iteration(enumerator)
  results = []
  loop { results << enumerator.next }
  results
end

Enumerator chaining with different types creates type confusion when elements flow through multiple transformation stages. Type mismatches in later chain stages can cause runtime errors that are difficult to trace back to their source.

# Type confusion in chained operations
mixed_data = ['1', 2, '3', 4, '5']

# Problematic chain - fails when an Integer reaches #upcase
problematic = mixed_data.lazy
                        .map(&:upcase)   # Integer has no #upcase
                        .map(&:to_i)
# problematic.first(2)  # raises NoMethodError at the second element

# Safe approach - normalize types first
safe_chain = mixed_data.lazy
                       .map { |item| item.to_i }  # Integer#to_i returns self
                       .select(&:even?)
                       .map(&:to_s)

safe_chain.to_a  # => ["2", "4"]

Infinite enumerator termination requires explicit bounds to prevent runaway execution. Missing termination conditions in consuming code can cause infinite loops or memory exhaustion when processing unbounded sequences.

# Unbounded source - safe to define, dangerous to consume without limits
unbounded_numbers = Enumerator.new do |yielder|
  n = 1
  loop do
    yielder << n
    n += 1
  end
end

# This would run forever:
# unbounded_numbers.each { |n| puts n }

# Safe: explicit bounds
def bounded_consumption(infinite_enum, max_count = 1000)
  count = 0
  infinite_enum.each do |value|
    yield value
    count += 1
    break if count >= max_count
  end
end

bounded_consumption(unbounded_numbers) { |n| puts n }

Production Patterns

Web application integration commonly uses enumerators for efficient data streaming and pagination. Rails applications benefit from lazy evaluation when processing large datasets, reducing memory consumption and improving response times through streaming responses.

# Rails streaming with enumerators
class ReportsController < ApplicationController
  def export_users
    # Stream large dataset without loading everything into memory
    user_enumerator = User.find_each.lazy
                         .map(&:to_csv_row)
                         .each_with_index
    
    response.headers['Content-Type'] = 'text/csv'
    response.headers['Content-Disposition'] = 'attachment; filename="users.csv"'
    
    self.response_body = Enumerator.new do |yielder|
      yielder << User.csv_header
      
      user_enumerator.each do |csv_row, index|
        yielder << csv_row
        
        # Periodic memory cleanup for very large datasets
        GC.start if index % 10_000 == 0
      end
    end
  end
end

Background job processing leverages enumerators for batch operations and queue management. Enumerators enable efficient job batching while maintaining memory bounds and providing progress tracking capabilities.

# Background job batching with enumerators
class BatchProcessor
  def initialize(job_class, batch_size: 100)
    @job_class = job_class
    @batch_size = batch_size
  end
  
  def process_records(record_source)
    record_enumerator = record_source.find_each
    batch_enumerator = record_enumerator.each_slice(@batch_size)
    total_batches = estimate_batch_count(record_source)  # Query the count once, not per batch

    batch_enumerator.with_index do |batch, batch_number|
      @job_class.perform_later(
        batch.map(&:id),
        batch_number: batch_number,
        total_batches: total_batches
      )

      # Rate limiting between batch submissions
      sleep(0.1)
    end
  end
  
  private
  
  def estimate_batch_count(source)
    (source.count.to_f / @batch_size).ceil
  end
end

# Usage in production
processor = BatchProcessor.new(EmailDeliveryJob, batch_size: 500)
processor.process_records(User.active.where(newsletter: true))

API response streaming uses enumerators to handle large result sets without overwhelming server memory. This pattern proves essential for data export endpoints and real-time feed processing where response sizes can vary dramatically.

# API streaming pattern
class DataStreamingService
  def initialize(query_params)
    @query_params = query_params
    @page_size = query_params.fetch(:page_size, 1000)
  end
  
  def stream_json_response
    Enumerator.new do |yielder|
      yielder << '{"data":['
      first_record = true
      
      data_enumerator.each do |record|
        yielder << ',' unless first_record
        yielder << record.to_json
        first_record = false
      end
      
      yielder << '],"metadata":' + metadata.to_json + '}'
    end
  end
  
  private
  
  def data_enumerator
    # Efficient pagination using enumerators
    Enumerator.new do |yielder|
      offset = 0
      
      loop do
        batch = fetch_batch(offset, @page_size)
        break if batch.empty?
        
        batch.each { |record| yielder << record }
        offset += @page_size
      end
    end
  end
  
  def fetch_batch(offset, limit)
    Model.where(@query_params.except(:page_size))
         .offset(offset)
         .limit(limit)
  end
  
  def metadata
    {
      generated_at: Time.current,
      query_params: @query_params
    }
  end
end

Log processing and analysis systems use enumerators for efficient file parsing and data transformation pipelines. Enumerators handle large log files without loading complete contents into memory, enabling processing of multi-gigabyte files on resource-constrained systems.

# Production log processing system
class LogAnalyzer
  def initialize(log_file_path)
    @log_file = File.open(log_file_path, 'r')
  end
  
  def analyze_patterns
    log_enumerator = @log_file.each_line.lazy
                             .map(&:strip)
                             .reject(&:empty?)
                             .map { |line| parse_log_entry(line) }
                             .reject(&:nil?)
    
    # Process in memory-efficient chunks
    error_patterns = Hash.new(0)
    request_times = []
    
    log_enumerator.each_slice(10_000) do |chunk|
      chunk.each do |entry|
        error_patterns[entry[:error_type]] += 1 if entry[:error_type]
        request_times << entry[:response_time] if entry[:response_time]
      end
      
      # Bound memory - note this discards older samples, so the final
      # average reflects only the most recent entries
      request_times.clear if request_times.size > 50_000
    end
    
    {
      error_summary: error_patterns,
      average_response_time: calculate_average(request_times)
    }
  ensure
    @log_file.close
  end
  
  private
  
  def parse_log_entry(line)
    # Complex parsing logic that might return nil for invalid entries
    return nil unless line.match?(/^\d{4}-\d{2}-\d{2}/)
    
    {
      timestamp: extract_timestamp(line),
      level: extract_level(line),
      message: extract_message(line),
      error_type: extract_error_type(line),
      response_time: extract_response_time(line)
    }
  end
end

Reference

Core Methods

Method Parameters Returns Description
#next None Object Returns next element, raises StopIteration when exhausted
#peek None Object Returns next element without advancing position
#rewind None self Resets enumerator to beginning, returns self
#each Block (optional) Enumerator or result Iterates through elements, returns enumerator without block
#with_index(offset=0) offset (Integer) Enumerator Adds index to enumerated elements starting at offset
#with_object(obj) obj (Object) Enumerator Passes object to each iteration, returns modified object
#next_values None Array Returns next yielded values as an array; raises StopIteration when exhausted
#peek_values None Array Returns next yielded values as an array without advancing position
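A quick sketch of the two decorator methods from the table:

```ruby
# with_index counts from the given offset; with_object threads an accumulator
%w[a b c].each.with_index(1).to_a
# => [["a", 1], ["b", 2], ["c", 3]]

%w[a b c].each.with_object([]) { |letter, acc| acc << letter.upcase }
# => ["A", "B", "C"]
```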

Lazy Evaluation Methods

Method Parameters Returns Description
#lazy None Enumerator::Lazy Returns lazy enumerator for deferred evaluation
#force None Array Forces evaluation of lazy enumerator, returns array
#take(n) n (Integer) Enumerator::Lazy Returns lazy enumerator of first n elements
#drop(n) n (Integer) Enumerator::Lazy Returns lazy enumerator skipping first n elements
#take_while Block Enumerator::Lazy Returns elements while block returns true
#drop_while Block Enumerator::Lazy Skips elements while block returns true
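These compose naturally over infinite sources, since each stage stays lazy until forced:

```ruby
# drop_while skips the prefix, take_while bounds the result
window = (1..Float::INFINITY).lazy
                             .drop_while { |n| n < 10 }
                             .take_while { |n| n < 15 }
                             .to_a
# => [10, 11, 12, 13, 14]
```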

Constructor Methods

Method Parameters Returns Description
Enumerator.new size (Integer, optional), Block Enumerator Creates custom enumerator with yielder block
.produce(init) init (Object), Block Enumerator Creates infinite enumerator from initial value and block
#chain(*enums) *enums (Enumerator) Enumerator::Chain Chains multiple enumerators sequentially
#each_slice(n) n (Integer) Enumerator Groups elements into arrays of size n
#each_cons(n) n (Integer) Enumerator Returns sliding windows of size n
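Short examples of the constructor-style methods above:

```ruby
# produce: each value computed from the previous one
Enumerator.produce(1) { |n| n * 2 }.take(5)  # => [1, 2, 4, 8, 16]

# each_slice groups; each_cons slides
(1..6).each_slice(2).to_a  # => [[1, 2], [3, 4], [5, 6]]
(1..4).each_cons(2).to_a   # => [[1, 2], [2, 3], [3, 4]]
```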

Enumerable Integration

Method Parameters Returns Description
#map Block (optional) Enumerator or Array Transforms elements, returns enumerator without block
#select Block (optional) Enumerator or Array Filters elements, returns enumerator without block
#reject Block (optional) Enumerator or Array Excludes elements, returns enumerator without block
#collect Block (optional) Enumerator or Array Alias for map
#find_all Block (optional) Enumerator or Array Alias for select
#grep(pattern) pattern (Object), Block (optional) Array Filters by pattern match
#zip(*others) *others (Enumerable) Array Combines with other enumerables elementwise
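grep and zip in brief - grep filters via #===, with an optional block that maps the matches:

```ruby
%w[ant bear cat].each.grep(/^a/)                  # => ["ant"]
%w[ant bear cat].each.grep(/a/) { |w| w.upcase }  # => ["ANT", "BEAR", "CAT"]

# zip pairs elements positionally
[1, 2, 3].each.zip([4, 5, 6])  # => [[1, 4], [2, 5], [3, 6]]
```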

Size and Information

Method Parameters Returns Description
#size None Integer or nil Returns size if determinable, nil otherwise
#inspect None String Returns string representation of enumerator
#feed(value) value (Object) nil Sets value returned by yield in generator block
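The #feed mechanism deserves an example: the fed value becomes the return value of the generator's yield when it next resumes.

```ruby
# feed: the next resume of the generator block sees the fed value
gen = Enumerator.new do |yielder|
  reply = yielder.yield(:question)  # returns the fed value on resume
  yielder << reply
end

gen.next          # => :question
gen.feed(:answer)
gen.next          # => :answer
```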

Exception Classes

Exception Parent Description
StopIteration IndexError Raised when enumerator is exhausted
FiberError StandardError Raised on fiber-related errors in custom enumerators

Common Patterns

External Iterator Pattern

enum = collection.each
while true
  begin
    item = enum.next
    process(item)
  rescue StopIteration
    break
  end
end

Lazy Chain Pattern

result = source.lazy
              .operation1
              .operation2
              .operation3
              .take(limit)
              .to_a

Custom Generator Pattern

generator = Enumerator.new do |yielder|
  # Custom generation logic
  yielder << computed_value
end

Batch Processing Pattern

collection.each_slice(batch_size) do |batch|
  process_batch(batch)
end

Performance Characteristics

Operation Time Complexity Space Complexity Notes
#next O(1) O(1) Constant time advancement
#peek O(1) O(1) No state modification
#rewind O(1) O(1) Resets to beginning
Lazy chain O(1) O(1) Deferred evaluation
#to_a on lazy O(n) O(n) Forces full evaluation
Custom generator Varies Varies Depends on yielder logic