Overview
The Iterator Pattern defines a standard way to traverse collections by separating the traversal logic from the collection structure. This pattern originated from object-oriented design principles and became formalized in the Gang of Four design patterns catalog. The pattern addresses the problem that collections can have different internal structures (arrays, trees, graphs, hash tables) but often need similar traversal operations.
The pattern establishes a uniform interface for iteration regardless of the collection's internal organization. This abstraction allows client code to work with different collection types through the same iteration protocol. The iterator maintains the current position in the traversal and knows how to get the next element.
Modern programming languages incorporate iterator concepts directly into their syntax and standard libraries. Ruby provides built-in iterator support through the Enumerable module and blocks, making iteration a first-class language feature. The pattern influences how developers design collection classes and process sequential data.
# Basic iterator concept
collection = [1, 2, 3, 4, 5]
# External iterator approach
iterator = collection.each
iterator.next # => 1
iterator.next # => 2
# Internal iterator approach
collection.each { |item| puts item }
Key Principles
The Iterator Pattern consists of four primary components. The Iterator interface declares operations for traversing elements, typically including methods to get the next element, check if more elements exist, and potentially reset to the beginning. The Concrete Iterator implements this interface for a specific collection type and tracks the current position during traversal.
The Aggregate interface declares a method for creating an iterator object. This factory method returns an iterator configured to traverse the aggregate. The Concrete Aggregate implements the iterator creation method and returns an instance of the appropriate concrete iterator. This separation means the collection doesn't need to implement traversal logic directly.
The iterator encapsulates the traversal algorithm and maintains traversal state independently from the collection. Multiple iterators can traverse the same collection simultaneously because each iterator maintains its own position. The collection doesn't need to track iteration state or support multiple simultaneous traversals directly.
Two main iterator approaches exist. External iterators give clients explicit control over iteration. The client requests the next element when ready, controlling the traversal pace and logic. Internal iterators control the iteration internally and apply client-provided operations to each element. Ruby emphasizes internal iterators through blocks and the Enumerable module.
# External iterator - client controls advancement
class ExternalCounter
def initialize(limit)
@limit = limit
@current = 0
end
def next?
@current < @limit
end
def next
return nil unless next?
value = @current
@current += 1
value
end
end
counter = ExternalCounter.new(3)
while counter.next?
puts counter.next
end
# Internal iterator - collection controls traversal
class InternalCounter
def initialize(limit)
@limit = limit
end
def each
(0...@limit).each { |i| yield i }
end
end
InternalCounter.new(3).each { |i| puts i }
The pattern implements the Single Responsibility Principle by extracting traversal logic into separate iterator classes. Collections handle storage and organization while iterators handle traversal. This separation simplifies both classes and makes the code more maintainable.
The Open/Closed Principle applies because new iterator types can be added without modifying existing collection classes. Different traversal algorithms (forward, backward, filtered, transformed) can be implemented as new iterator classes that work with existing collections.
Ruby Implementation
Ruby integrates iteration deeply into the language through the Enumerable module and block syntax. The core iteration protocol requires a collection class to implement the each method. Once each is defined, mixing in Enumerable provides dozens of iteration methods built on that foundation.
The each method represents Ruby's internal iterator pattern. The collection controls the iteration loop and yields each element to a client-provided block. This approach differs from languages that emphasize external iterators where clients explicitly request each element.
class CustomCollection
include Enumerable
def initialize(items)
@items = items
end
def each
return enum_for(:each) unless block_given?
@items.each { |item| yield item }
end
end
collection = CustomCollection.new([1, 2, 3, 4, 5])
# Enumerable methods work automatically
collection.map { |x| x * 2 } # => [2, 4, 6, 8, 10]
collection.select { |x| x.even? } # => [2, 4]
collection.reduce(:+) # => 15
The each method checks if a block was provided using block_given?. When no block exists, it returns an Enumerator object representing an external iterator. This pattern supports both internal iteration (with a block) and external iteration (without a block).
Enumerator objects provide Ruby's external iterator implementation. These objects maintain iteration state and allow explicit control over traversal. Enumerators support methods like next, rewind, and peek for fine-grained iteration control.
array = [1, 2, 3]
enum = array.each
# External iteration control
enum.next # => 1
enum.peek # => 2 (doesn't advance)
enum.next # => 2
enum.next # => 3
enum.next # raises StopIteration
# Rewind to start over
enum.rewind
enum.next # => 1
Custom enumerators can be created using Enumerator.new with a block that defines the iteration logic. The block receives a yielder object used to emit values to the iterator's consumer. This approach enables lazy evaluation and infinite sequences.
fibonacci = Enumerator.new do |yielder|
a, b = 0, 1
loop do
yielder << a
a, b = b, a + b
end
end
# Take first 10 Fibonacci numbers
fibonacci.take(10) # => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
# Infinite sequence but only evaluates what's needed
fibonacci.select { |n| n.even? }.take(5) # => [0, 2, 8, 34, 144]
Ruby's enum_for method creates an enumerator for any method that accepts a block. This enables external iteration for methods designed for internal iteration. The pattern appears frequently in Ruby's standard library to support both iteration styles.
class DataSource
def read_records
return enum_for(:read_records) unless block_given?
# Simulate reading records from a data source
5.times do |i|
yield "Record #{i}"
end
end
end
source = DataSource.new
# Internal iteration
source.read_records { |record| puts record }
# External iteration
enum = source.read_records
enum.next # => "Record 0"
enum.next # => "Record 1"
The Enumerable module provides lazy enumerators through the lazy method. Lazy enumerators defer computation until results are actually needed. This optimization matters for large collections or infinite sequences where generating all intermediate results would be expensive or impossible.
# Without lazy - generates all intermediate arrays
(1..1_000_000).map { |x| x * 2 }.select { |x| x > 100 }.take(5)
# With lazy - only computes what's needed
(1..1_000_000).lazy.map { |x| x * 2 }.select { |x| x > 100 }.take(5).to_a
# => [102, 104, 106, 108, 110]
Practical Examples
Building a custom tree structure demonstrates how the Iterator Pattern enables multiple traversal strategies for the same data structure. Trees can be traversed depth-first or breadth-first, and each approach requires different iteration logic.
class TreeNode
attr_accessor :value, :children
def initialize(value)
@value = value
@children = []
end
def add_child(node)
@children << node
end
# Depth-first traversal
def each_depth_first(&block)
return enum_for(:each_depth_first) unless block_given?
yield self
@children.each do |child|
child.each_depth_first(&block)
end
end
# Breadth-first traversal
def each_breadth_first
return enum_for(:each_breadth_first) unless block_given?
queue = [self]
until queue.empty?
node = queue.shift
yield node
queue.concat(node.children)
end
end
end
# Build tree structure
root = TreeNode.new('A')
b = TreeNode.new('B')
c = TreeNode.new('C')
root.add_child(b)
root.add_child(c)
b.add_child(TreeNode.new('D'))
b.add_child(TreeNode.new('E'))
c.add_child(TreeNode.new('F'))
# Different traversals of same tree
root.each_depth_first { |node| print "#{node.value} " }
# Output: A B D E C F
puts
root.each_breadth_first { |node| print "#{node.value} " }
# Output: A B C D E F
Processing large files line-by-line demonstrates iterators for I/O operations. Loading entire files into memory fails with large datasets. An iterator pattern processes one line at a time, maintaining constant memory usage regardless of file size.
class LogFileIterator
def initialize(filename)
@filename = filename
end
def each
return enum_for(:each) unless block_given?
File.open(@filename, 'r') do |file|
file.each_line do |line|
yield line.chomp
end
end
end
# Filter iterator that wraps another iterator
def error_lines
return enum_for(:error_lines) unless block_given?
each do |line|
yield line if line.include?('[ERROR]')
end
end
# Transform iterator
def parsed_entries
return enum_for(:parsed_entries) unless block_given?
each do |line|
timestamp, level, message = line.split(' - ', 3)
yield({ timestamp: timestamp, level: level, message: message })
end
end
end
# Assuming log file exists
log = LogFileIterator.new('application.log')
# Process only error lines
log.error_lines.each do |line|
puts "Error found: #{line}"
end
# Transform each line into structured data
log.parsed_entries.take(5).each do |entry|
puts "#{entry[:level]}: #{entry[:message]}"
end
Implementing pagination for database queries shows iterators for external data sources. Database queries shouldn't load all results into memory. An iterator fetches pages of results on demand, providing the illusion of iterating through all records while keeping memory usage bounded.
class PaginatedQuery
def initialize(base_query, page_size: 100)
@base_query = base_query
@page_size = page_size
end
def each
return enum_for(:each) unless block_given?
offset = 0
loop do
# Simulate database query with LIMIT/OFFSET
results = fetch_page(offset, @page_size)
break if results.empty?
results.each { |record| yield record }
offset += @page_size
end
end
private
def fetch_page(offset, limit)
# Simulated database query
# In practice: @base_query.limit(limit).offset(offset)
all_records = (1..500).to_a # Simulated large dataset
all_records[offset, limit] || []
end
end
query = PaginatedQuery.new("SELECT * FROM users", page_size: 100)
# Process all records without loading everything into memory
query.each do |record|
# Process one record at a time
puts record
end
# Combine with Enumerable methods
query.select { |record| record > 250 }.take(10)
Creating a composite iterator that merges multiple sorted sequences demonstrates advanced iterator composition. When processing multiple sorted data sources, a merging iterator produces a single sorted sequence without materializing all data first.
class MergingIterator
def initialize(*iterators)
@iterators = iterators.map do |iter|
enum = iter.is_a?(Enumerator) ? iter : iter.each
{ enum: enum, current: get_next(enum) }
end.reject { |iter| iter[:current].nil? }
end
def each
return enum_for(:each) unless block_given?
until @iterators.empty?
# Find iterator with smallest current value
min_iter = @iterators.min_by { |iter| iter[:current] }
yield min_iter[:current]
# Advance that iterator
min_iter[:current] = get_next(min_iter[:enum])
# Remove exhausted iterators
@iterators.reject! { |iter| iter[:current].nil? }
end
end
private
def get_next(enum)
enum.next
rescue StopIteration
nil
end
end
# Three sorted sequences
iter1 = [1, 4, 7, 10].each
iter2 = [2, 5, 8].each
iter3 = [3, 6, 9, 11, 12].each
merged = MergingIterator.new(iter1, iter2, iter3)
merged.each { |value| print "#{value} " }
# Output: 1 2 3 4 5 6 7 8 9 10 11 12
Design Considerations
The choice between internal and external iterators affects API design and flexibility. Internal iterators (Ruby's block-based approach) offer simpler client code and better encapsulation. The collection controls the iteration loop and clients provide operations to perform on each element. External iterators provide more control but require clients to manage iteration state explicitly.
Internal iterators work well for straightforward traversals where the iteration logic doesn't need to be interrupted or interleaved with other operations. They excel at operations like mapping, filtering, and reducing because these operations process all elements in a standard way. Ruby's Enumerable module builds on this foundation effectively.
# Internal iterator - simple and clean
collection.select { |item| item.even? }.map { |item| item * 2 }
# External iterator - more control but verbose
result = []
enum = collection.each
while true
begin
item = enum.next
result << item * 2 if item.even?
rescue StopIteration
break
end
end
External iterators become necessary when iteration needs to be paused, interleaved with other operations, or coordinated across multiple collections. Comparing elements from two collections requires advancing both iterators in lockstep, which internal iterators cannot express easily.
The decision to implement custom iterators versus using Enumerable depends on the collection's characteristics. Simple sequential collections should mix in Enumerable and implement each. Collections with multiple natural traversal orders (like trees) benefit from providing multiple iterator methods. Collections that need lazy evaluation or represent potentially infinite sequences require custom Enumerator implementations.
Iterator composition—building new iterators from existing ones—creates processing pipelines without intermediate data structures. Ruby's lazy enumerators support this pattern. Composed iterators can filter, transform, and combine data streams efficiently.
# Iterator composition
def process_large_dataset(filename)
File.open(filename).lazy
.select { |line| line.start_with?('ERROR') }
.map { |line| parse_error(line) }
.select { |error| error.severity > 3 }
.take(10)
end
Thread safety considerations arise with iterators and mutable collections. If a collection changes during iteration, the iterator may produce inconsistent results or raise exceptions. Some options include:
- Make collections immutable during iteration
- Copy the collection before iterating
- Use synchronization to prevent concurrent modification
- Document iteration behavior during modification
- Implement fail-fast iterators that detect concurrent modification
Performance implications vary by iterator type. Internal iterators often perform better because they avoid allocation and method call overhead of external iterator objects. However, external iterators enable optimizations like lazy evaluation that can dramatically improve performance for large datasets.
Common Patterns
The enumerable pattern implements iteration by mixing in Enumerable and defining each. This pattern provides dozens of iteration methods for free. Collections that follow this pattern integrate seamlessly with Ruby's iteration infrastructure.
class Range2D
include Enumerable
def initialize(x_range, y_range)
@x_range = x_range
@y_range = y_range
end
def each
return enum_for(:each) unless block_given?
@x_range.each do |x|
@y_range.each do |y|
yield [x, y]
end
end
end
end
grid = Range2D.new(1..3, 1..3)
grid.map { |x, y| x * y }
# => [1, 2, 3, 2, 4, 6, 3, 6, 9]
grid.select { |x, y| (x + y).even? }
# => [[1, 1], [1, 3], [2, 2], [3, 1], [3, 3]]
The external iterator pattern returns an Enumerator when no block is provided. This dual-mode operation supports both iteration styles. The pattern appears throughout Ruby's standard library.
def custom_iterator(data)
return enum_for(:custom_iterator, data) unless block_given?
data.each { |item| yield item }
end
# With block - internal iteration
custom_iterator([1, 2, 3]) { |x| puts x }
# Without block - external iteration
enum = custom_iterator([1, 2, 3])
enum.next # => 1
The filter iterator pattern creates a new iterator that selects elements from an underlying iterator based on a condition. This pattern chains iterators to build processing pipelines.
class FilterIterator
include Enumerable
def initialize(source)
@source = source
end
def each(&block)
return enum_for(:each) unless block_given?
@source.each do |item|
yield item if condition(item)
end
end
def condition(item)
raise NotImplementedError
end
end
class EvenFilter < FilterIterator
def condition(item)
item.even?
end
end
numbers = 1..10
filtered = EvenFilter.new(numbers)
filtered.to_a # => [2, 4, 6, 8, 10]
The transform iterator pattern creates an iterator that applies a transformation to each element from a source iterator. This pattern differs from simple mapping because the iterator remains lazy—transformations happen on demand rather than upfront.
class TransformIterator
include Enumerable
def initialize(source, &transform)
@source = source
@transform = transform
end
def each
return enum_for(:each) unless block_given?
@source.each do |item|
yield @transform.call(item)
end
end
end
numbers = 1..5
squared = TransformIterator.new(numbers) { |x| x ** 2 }
squared.take(3) # => [1, 4, 9]
The composite iterator pattern coordinates iteration across multiple collections. This pattern appears when processing parallel data structures or merging multiple streams.
class ZipIterator
def initialize(*sources)
@sources = sources
end
def each
return enum_for(:each) unless block_given?
enumerators = @sources.map(&:each)
loop do
values = enumerators.map do |enum|
enum.next
end
yield values
end
rescue StopIteration
# Stop when any iterator exhausted
end
end
names = ['Alice', 'Bob', 'Carol']
ages = [30, 25, 35]
cities = ['NYC', 'LA', 'Chicago']
zipped = ZipIterator.new(names, ages, cities)
zipped.each { |name, age, city| puts "#{name}, #{age}, #{city}" }
The generator pattern creates iterators for computed sequences without storing all values. Ruby's Enumerator constructor enables this pattern for lazy, potentially infinite sequences.
def prime_generator
Enumerator.new do |yielder|
primes = []
candidate = 2
loop do
is_prime = primes.none? { |p| candidate % p == 0 }
if is_prime
primes << candidate
yielder << candidate
end
candidate += 1
end
end
end
primes = prime_generator
primes.take(10) # => [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
Reference
Iterator Protocol Methods
| Method | Purpose | Returns |
|---|---|---|
| each | Primary iteration method | Enumerator or nil |
| each_with_index | Iterate with element index | Enumerator or nil |
| next | Get next element (external) | Next element |
| peek | View next without advancing | Next element |
| rewind | Reset to beginning | self |
| next? | Check if more elements exist | Boolean |
Enumerable Core Methods
| Method | Operation | Example Result |
|---|---|---|
| map | Transform elements | [1,2,3].map{ |
| select | Filter elements | [1,2,3].select(&:even?) => [2] |
| reject | Inverse filter | [1,2,3].reject(&:even?) => [1,3] |
| reduce | Aggregate elements | [1,2,3].reduce(:+) => 6 |
| find | First matching element | [1,2,3].find(&:even?) => 2 |
| take | First n elements | [1,2,3].take(2) => [1,2] |
| drop | Skip first n elements | [1,2,3].drop(2) => [3] |
Creating Enumerators
| Approach | Use Case | Syntax |
|---|---|---|
| Return from each | Standard pattern | return enum_for(:each) unless block_given? |
| Enumerator.new | Custom sequences | Enumerator.new { |
| enum_for | Convert any method | obj.enum_for(:method_name) |
| to_enum | Alias for enum_for | obj.to_enum(:method_name) |
| lazy | Lazy evaluation | collection.lazy.map{...}.select{...} |
Iterator Pattern Components
| Component | Responsibility | Ruby Implementation |
|---|---|---|
| Iterator | Defines traversal interface | Enumerator class |
| Concrete Iterator | Implements traversal for collection | Enumerator instances |
| Aggregate | Declares iterator creation | Object with each method |
| Concrete Aggregate | Creates iterator instance | Classes mixing Enumerable |
Iterator Types
| Type | Control | Use Case | Example |
|---|---|---|---|
| Internal | Collection controls loop | Simple traversals | array.each { |
| External | Client controls loop | Interleaved iteration | enum.next; enum.next |
| Lazy | Deferred evaluation | Large/infinite sequences | (1..Float::INFINITY).lazy |
| Active | Maintains current position | Stateful traversal | File.open(f).each_line |
Common Iterator Patterns
| Pattern | Description | Implementation |
|---|---|---|
| Filter | Select elements by condition | Wrap source, yield conditionally |
| Transform | Apply function to elements | Wrap source, yield transformed |
| Chain | Sequential multiple sources | Track sources, switch when exhausted |
| Zip | Parallel multiple sources | Advance all, yield array of values |
| Generator | Compute on demand | Use Enumerator.new with yielder |
Thread Safety Strategies
| Strategy | Approach | Trade-off |
|---|---|---|
| Immutable | Freeze during iteration | Prevents all modification |
| Copy | Clone before iterating | Memory overhead |
| Synchronized | Lock collection | Performance impact |
| Fail-fast | Detect modification | Error on concurrent change |
| Thread-local | Per-thread iterators | Complex state management |
Performance Characteristics
| Operation | Internal Iterator | External Iterator | Lazy Iterator |
|---|---|---|---|
| Memory | Constant | Constant + iterator | Constant |
| Overhead | Minimal | Method calls | Deferred computation |
| Composition | Intermediate arrays | Nested calls | Chained operations |
| Early termination | Break from block | Stop calling next | Stop evaluation |