CrackedRuby - Iterator Pattern

Overview

The Iterator Pattern defines a standard way to traverse collections by separating the traversal logic from the collection structure. This pattern originated from object-oriented design principles and became formalized in the Gang of Four design patterns catalog. The pattern addresses the problem that collections can have different internal structures (arrays, trees, graphs, hash tables) but often need similar traversal operations.

The pattern establishes a uniform interface for iteration regardless of the collection's internal organization. This abstraction allows client code to work with different collection types through the same iteration protocol. The iterator maintains the current position in the traversal and knows how to get the next element.

Modern programming languages incorporate iterator concepts directly into their syntax and standard libraries. Ruby provides built-in iterator support through the Enumerable module and blocks, making iteration a first-class language feature. The pattern influences how developers design collection classes and process sequential data.

# Basic iterator concept
collection = [1, 2, 3, 4, 5]

# External iterator approach
iterator = collection.each
iterator.next  # => 1
iterator.next  # => 2

# Internal iterator approach
collection.each { |item| puts item }

Key Principles

The Iterator Pattern consists of four primary components. The Iterator interface declares operations for traversing elements, typically including methods to get the next element, check if more elements exist, and potentially reset to the beginning. The Concrete Iterator implements this interface for a specific collection type and tracks the current position during traversal.

The Aggregate interface declares a method for creating an iterator object. This factory method returns an iterator configured to traverse the aggregate. The Concrete Aggregate implements the iterator creation method and returns an instance of the appropriate concrete iterator. This separation means the collection doesn't need to implement traversal logic directly.

The iterator encapsulates the traversal algorithm and maintains traversal state independently from the collection. Multiple iterators can traverse the same collection simultaneously because each iterator maintains its own position. The collection doesn't need to track iteration state or support multiple simultaneous traversals directly.

Two main iterator approaches exist. External iterators give clients explicit control over iteration. The client requests the next element when ready, controlling the traversal pace and logic. Internal iterators control the iteration internally and apply client-provided operations to each element. Ruby emphasizes internal iterators through blocks and the Enumerable module.

# External iterator - client controls advancement
class ExternalCounter
  def initialize(limit)
    @limit = limit
    @current = 0
  end
  
  def next?
    @current < @limit
  end
  
  def next
    return nil unless next?
    value = @current
    @current += 1
    value
  end
end

counter = ExternalCounter.new(3)
while counter.next?
  puts counter.next
end

# Internal iterator - collection controls traversal
class InternalCounter
  def initialize(limit)
    @limit = limit
  end
  
  def each
    (0...@limit).each { |i| yield i }
  end
end

InternalCounter.new(3).each { |i| puts i }

The pattern implements the Single Responsibility Principle by extracting traversal logic into separate iterator classes. Collections handle storage and organization while iterators handle traversal. This separation simplifies both classes and makes the code more maintainable.

The Open/Closed Principle applies because new iterator types can be added without modifying existing collection classes. Different traversal algorithms (forward, backward, filtered, transformed) can be implemented as new iterator classes that work with existing collections.

Ruby Implementation

Ruby integrates iteration deeply into the language through the Enumerable module and block syntax. The core iteration protocol requires a collection class to implement the each method. Once each is defined, mixing in Enumerable provides dozens of iteration methods built on that foundation.

The each method represents Ruby's internal iterator pattern. The collection controls the iteration loop and yields each element to a client-provided block. This approach differs from languages that emphasize external iterators where clients explicitly request each element.

class CustomCollection
  include Enumerable
  
  def initialize(items)
    @items = items
  end
  
  def each
    return enum_for(:each) unless block_given?
    @items.each { |item| yield item }
  end
end

collection = CustomCollection.new([1, 2, 3, 4, 5])

# Enumerable methods work automatically
collection.map { |x| x * 2 }        # => [2, 4, 6, 8, 10]
collection.select { |x| x.even? }   # => [2, 4]
collection.reduce(:+)               # => 15

The each method checks if a block was provided using block_given?. When no block exists, it returns an Enumerator object representing an external iterator. This pattern supports both internal iteration (with a block) and external iteration (without a block).

Enumerator objects provide Ruby's external iterator implementation. These objects maintain iteration state and allow explicit control over traversal. Enumerators support methods like next, rewind, and peek for fine-grained iteration control.

array = [1, 2, 3]
enum = array.each

# External iteration control
enum.next   # => 1
enum.peek   # => 2 (doesn't advance)
enum.next   # => 2
enum.next   # => 3
enum.next   # raises StopIteration

# Rewind to start over
enum.rewind
enum.next   # => 1

Custom enumerators can be created using Enumerator.new with a block that defines the iteration logic. The block receives a yielder object used to emit values to the iterator's consumer. This approach enables lazy evaluation and infinite sequences.

fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end

# Take first 10 Fibonacci numbers
fibonacci.take(10)  # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Infinite sequence but only evaluates what's needed
fibonacci.select { |n| n.even? }.take(5)  # => [0, 2, 8, 34, 144]

Ruby's enum_for method creates an enumerator for any method that accepts a block. This enables external iteration for methods designed for internal iteration. The pattern appears frequently in Ruby's standard library to support both iteration styles.

class DataSource
  def read_records
    return enum_for(:read_records) unless block_given?
    
    # Simulate reading records from a data source
    5.times do |i|
      yield "Record #{i}"
    end
  end
end

source = DataSource.new

# Internal iteration
source.read_records { |record| puts record }

# External iteration
enum = source.read_records
enum.next  # => "Record 0"
enum.next  # => "Record 1"

The Enumerable module provides lazy enumerators through the lazy method. Lazy enumerators defer computation until results are actually needed. This optimization matters for large collections or infinite sequences where generating all intermediate results would be expensive or impossible.

# Without lazy - generates all intermediate arrays
(1..1_000_000).map { |x| x * 2 }.select { |x| x > 100 }.take(5)

# With lazy - only computes what's needed
(1..1_000_000).lazy.map { |x| x * 2 }.select { |x| x > 100 }.take(5).to_a
# => [102, 104, 106, 108, 110]

Practical Examples

Building a custom tree structure demonstrates how the Iterator Pattern enables multiple traversal strategies for the same data structure. Trees can be traversed depth-first or breadth-first, and each approach requires different iteration logic.

class TreeNode
  attr_accessor :value, :children
  
  def initialize(value)
    @value = value
    @children = []
  end
  
  def add_child(node)
    @children << node
  end
  
  # Depth-first traversal
  def each_depth_first(&block)
    return enum_for(:each_depth_first) unless block_given?
    
    yield self
    @children.each do |child|
      child.each_depth_first(&block)
    end
  end
  
  # Breadth-first traversal
  def each_breadth_first
    return enum_for(:each_breadth_first) unless block_given?
    
    queue = [self]
    until queue.empty?
      node = queue.shift
      yield node
      queue.concat(node.children)
    end
  end
end

# Build tree structure
root = TreeNode.new('A')
b = TreeNode.new('B')
c = TreeNode.new('C')
root.add_child(b)
root.add_child(c)
b.add_child(TreeNode.new('D'))
b.add_child(TreeNode.new('E'))
c.add_child(TreeNode.new('F'))

# Different traversals of same tree
root.each_depth_first { |node| print "#{node.value} " }
# Output: A B D E C F

puts
root.each_breadth_first { |node| print "#{node.value} " }
# Output: A B C D E F

Processing large files line-by-line demonstrates iterators for I/O operations. Loading entire files into memory fails with large datasets. An iterator pattern processes one line at a time, maintaining constant memory usage regardless of file size.

class LogFileIterator
  def initialize(filename)
    @filename = filename
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    File.open(@filename, 'r') do |file|
      file.each_line do |line|
        yield line.chomp
      end
    end
  end
  
  # Filter iterator that wraps another iterator
  def error_lines
    return enum_for(:error_lines) unless block_given?
    
    each do |line|
      yield line if line.include?('[ERROR]')
    end
  end
  
  # Transform iterator
  def parsed_entries
    return enum_for(:parsed_entries) unless block_given?
    
    each do |line|
      timestamp, level, message = line.split(' - ', 3)
      yield({ timestamp: timestamp, level: level, message: message })
    end
  end
end

# Assuming log file exists
log = LogFileIterator.new('application.log')

# Process only error lines
log.error_lines.each do |line|
  puts "Error found: #{line}"
end

# Transform each line into structured data
log.parsed_entries.take(5).each do |entry|
  puts "#{entry[:level]}: #{entry[:message]}"
end

Implementing pagination for database queries shows iterators for external data sources. Database queries shouldn't load all results into memory. An iterator fetches pages of results on demand, providing the illusion of iterating through all records while keeping memory usage bounded.

class PaginatedQuery
  def initialize(base_query, page_size: 100)
    @base_query = base_query
    @page_size = page_size
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    offset = 0
    loop do
      # Simulate database query with LIMIT/OFFSET
      results = fetch_page(offset, @page_size)
      break if results.empty?
      
      results.each { |record| yield record }
      offset += @page_size
    end
  end
  
  private
  
  def fetch_page(offset, limit)
    # Simulated database query
    # In practice: @base_query.limit(limit).offset(offset)
    all_records = (1..500).to_a  # Simulated large dataset
    all_records[offset, limit] || []
  end
end

query = PaginatedQuery.new("SELECT * FROM users", page_size: 100)

# Process all records without loading everything into memory
query.each do |record|
  # Process one record at a time
  puts record
end

# Combine with Enumerable methods
query.select { |record| record > 250 }.take(10)

Creating a composite iterator that merges multiple sorted sequences demonstrates advanced iterator composition. When processing multiple sorted data sources, a merging iterator produces a single sorted sequence without materializing all data first.

class MergingIterator
  def initialize(*iterators)
    @iterators = iterators.map do |iter|
      enum = iter.is_a?(Enumerator) ? iter : iter.each
      { enum: enum, current: get_next(enum) }
    end.reject { |iter| iter[:current].nil? }
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    until @iterators.empty?
      # Find iterator with smallest current value
      min_iter = @iterators.min_by { |iter| iter[:current] }
      
      yield min_iter[:current]
      
      # Advance that iterator
      min_iter[:current] = get_next(min_iter[:enum])
      
      # Remove exhausted iterators
      @iterators.reject! { |iter| iter[:current].nil? }
    end
  end
  
  private
  
  def get_next(enum)
    enum.next
  rescue StopIteration
    nil
  end
end

# Three sorted sequences
iter1 = [1, 4, 7, 10].each
iter2 = [2, 5, 8].each
iter3 = [3, 6, 9, 11, 12].each

merged = MergingIterator.new(iter1, iter2, iter3)
merged.each { |value| print "#{value} " }
# Output: 1 2 3 4 5 6 7 8 9 10 11 12

Design Considerations

The choice between internal and external iterators affects API design and flexibility. Internal iterators (Ruby's block-based approach) offer simpler client code and better encapsulation. The collection controls the iteration loop and clients provide operations to perform on each element. External iterators provide more control but require clients to manage iteration state explicitly.

Internal iterators work well for straightforward traversals where the iteration logic doesn't need to be interrupted or interleaved with other operations. They excel at operations like mapping, filtering, and reducing because these operations process all elements in a standard way. Ruby's Enumerable module builds on this foundation effectively.

# Internal iterator - simple and clean
collection.select { |item| item.even? }.map { |item| item * 2 }

# External iterator - more control but verbose
result = []
enum = collection.each
while true
  begin
    item = enum.next
    result << item * 2 if item.even?
  rescue StopIteration
    break
  end
end

External iterators become necessary when iteration needs to be paused, interleaved with other operations, or coordinated across multiple collections. Comparing elements from two collections requires advancing both iterators in lockstep, which internal iterators cannot express easily.

The decision to implement custom iterators versus using Enumerable depends on the collection's characteristics. Simple sequential collections should mix in Enumerable and implement each. Collections with multiple natural traversal orders (like trees) benefit from providing multiple iterator methods. Collections that need lazy evaluation or represent potentially infinite sequences require custom Enumerator implementations.

Iterator composition—building new iterators from existing ones—creates processing pipelines without intermediate data structures. Ruby's lazy enumerators support this pattern. Composed iterators can filter, transform, and combine data streams efficiently.

# Iterator composition
def process_large_dataset(filename)
  File.open(filename).lazy
    .select { |line| line.start_with?('ERROR') }
    .map { |line| parse_error(line) }
    .select { |error| error.severity > 3 }
    .take(10)
end

Thread safety considerations arise with iterators and mutable collections. If a collection changes during iteration, the iterator may produce inconsistent results or raise exceptions. Some options include:

Make collections immutable during iteration
Copy the collection before iterating
Use synchronization to prevent concurrent modification
Document iteration behavior during modification
Implement fail-fast iterators that detect concurrent modification

Performance implications vary by iterator type. Internal iterators often perform better because they avoid allocation and method call overhead of external iterator objects. However, external iterators enable optimizations like lazy evaluation that can dramatically improve performance for large datasets.

Common Patterns

The enumerable pattern implements iteration by mixing in Enumerable and defining each. This pattern provides dozens of iteration methods for free. Collections that follow this pattern integrate seamlessly with Ruby's iteration infrastructure.

class Range2D
  include Enumerable
  
  def initialize(x_range, y_range)
    @x_range = x_range
    @y_range = y_range
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    @x_range.each do |x|
      @y_range.each do |y|
        yield [x, y]
      end
    end
  end
end

grid = Range2D.new(1..3, 1..3)
grid.map { |x, y| x * y }
# => [1, 2, 3, 2, 4, 6, 3, 6, 9]

grid.select { |x, y| (x + y).even? }
# => [[1, 1], [1, 3], [2, 2], [3, 1], [3, 3]]

The external iterator pattern returns an Enumerator when no block is provided. This dual-mode operation supports both iteration styles. The pattern appears throughout Ruby's standard library.

def custom_iterator(data)
  return enum_for(:custom_iterator, data) unless block_given?
  
  data.each { |item| yield item }
end

# With block - internal iteration
custom_iterator([1, 2, 3]) { |x| puts x }

# Without block - external iteration
enum = custom_iterator([1, 2, 3])
enum.next  # => 1

The filter iterator pattern creates a new iterator that selects elements from an underlying iterator based on a condition. This pattern chains iterators to build processing pipelines.

class FilterIterator
  include Enumerable
  
  def initialize(source)
    @source = source
  end
  
  def each(&block)
    return enum_for(:each) unless block_given?
    
    @source.each do |item|
      yield item if condition(item)
    end
  end
  
  def condition(item)
    raise NotImplementedError
  end
end

class EvenFilter < FilterIterator
  def condition(item)
    item.even?
  end
end

numbers = 1..10
filtered = EvenFilter.new(numbers)
filtered.to_a  # => [2, 4, 6, 8, 10]

The transform iterator pattern creates an iterator that applies a transformation to each element from a source iterator. This pattern differs from simple mapping because the iterator remains lazy—transformations happen on demand rather than upfront.

class TransformIterator
  include Enumerable
  
  def initialize(source, &transform)
    @source = source
    @transform = transform
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    @source.each do |item|
      yield @transform.call(item)
    end
  end
end

numbers = 1..5
squared = TransformIterator.new(numbers) { |x| x ** 2 }
squared.take(3)  # => [1, 4, 9]

The composite iterator pattern coordinates iteration across multiple collections. This pattern appears when processing parallel data structures or merging multiple streams.

class ZipIterator
  def initialize(*sources)
    @sources = sources
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    enumerators = @sources.map(&:each)
    
    loop do
      values = enumerators.map do |enum|
        enum.next
      end
      yield values
    end
  rescue StopIteration
    # Stop when any iterator exhausted
  end
end

names = ['Alice', 'Bob', 'Carol']
ages = [30, 25, 35]
cities = ['NYC', 'LA', 'Chicago']

zipped = ZipIterator.new(names, ages, cities)
zipped.each { |name, age, city| puts "#{name}, #{age}, #{city}" }

The generator pattern creates iterators for computed sequences without storing all values. Ruby's Enumerator constructor enables this pattern for lazy, potentially infinite sequences.

def prime_generator
  Enumerator.new do |yielder|
    primes = []
    candidate = 2
    
    loop do
      is_prime = primes.none? { |p| candidate % p == 0 }
      if is_prime
        primes << candidate
        yielder << candidate
      end
      candidate += 1
    end
  end
end

primes = prime_generator
primes.take(10)  # => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

Reference

Iterator Protocol Methods

Method	Purpose	Returns
each	Primary iteration method	Enumerator or nil
each_with_index	Iterate with element index	Enumerator or nil
next	Get next element (external)	Next element
peek	View next without advancing	Next element
rewind	Reset to beginning	self
next?	Check if more elements exist	Boolean

Enumerable Core Methods

Method	Operation	Example Result
map	Transform elements	[1,2,3].map{
select	Filter elements	[1,2,3].select(&:even?) => [2]
reject	Inverse filter	[1,2,3].reject(&:even?) => [1,3]
reduce	Aggregate elements	[1,2,3].reduce(:+) => 6
find	First matching element	[1,2,3].find(&:even?) => 2
take	First n elements	[1,2,3].take(2) => [1,2]
drop	Skip first n elements	[1,2,3].drop(2) => [3]

Creating Enumerators

Approach	Use Case	Syntax
Return from each	Standard pattern	return enum_for(:each) unless block_given?
Enumerator.new	Custom sequences	Enumerator.new {
enum_for	Convert any method	obj.enum_for(:method_name)
to_enum	Alias for enum_for	obj.to_enum(:method_name)
lazy	Lazy evaluation	collection.lazy.map{...}.select{...}

Iterator Pattern Components

Component	Responsibility	Ruby Implementation
Iterator	Defines traversal interface	Enumerator class
Concrete Iterator	Implements traversal for collection	Enumerator instances
Aggregate	Declares iterator creation	Object with each method
Concrete Aggregate	Creates iterator instance	Classes mixing Enumerable

Iterator Types

Type	Control	Use Case	Example
Internal	Collection controls loop	Simple traversals	array.each {
External	Client controls loop	Interleaved iteration	enum.next; enum.next
Lazy	Deferred evaluation	Large/infinite sequences	(1..Float::INFINITY).lazy
Active	Maintains current position	Stateful traversal	File.open(f).each_line

Common Iterator Patterns

Pattern	Description	Implementation
Filter	Select elements by condition	Wrap source, yield conditionally
Transform	Apply function to elements	Wrap source, yield transformed
Chain	Sequential multiple sources	Track sources, switch when exhausted
Zip	Parallel multiple sources	Advance all, yield array of values
Generator	Compute on demand	Use Enumerator.new with yielder

Thread Safety Strategies

Strategy	Approach	Trade-off
Immutable	Freeze during iteration	Prevents all modification
Copy	Clone before iterating	Memory overhead
Synchronized	Lock collection	Performance impact
Fail-fast	Detect modification	Error on concurrent change
Thread-local	Per-thread iterators	Complex state management

Performance Characteristics

Operation	Internal Iterator	External Iterator	Lazy Iterator
Memory	Constant	Constant + iterator	Constant
Overhead	Minimal	Method calls	Deferred computation
Composition	Intermediate arrays	Nested calls	Chained operations
Early termination	Break from block	Stop calling next	Stop evaluation

Iterator Pattern