CrackedRuby

Object Enumeration

Comprehensive guide to Ruby's enumeration methods for processing collections and implementing custom iterators.

Overview

Ruby provides enumeration capabilities through the Enumerable module, which defines iteration methods that work with any class implementing #each. Arrays, hashes, and ranges already include it, and custom classes can include it to gain the same collection-processing methods.

The Enumerable module contains over 50 methods for filtering, transforming, and aggregating data. Core methods include #map, #select, #reject, #reduce, and #each_with_index. Ruby also provides Enumerator objects for lazy evaluation and custom iteration patterns.

# Basic enumeration with Array
numbers = [1, 2, 3, 4, 5]
doubled = numbers.map { |n| n * 2 }
# => [2, 4, 6, 8, 10]

# Hash enumeration
scores = { alice: 95, bob: 87, carol: 92 }
passing = scores.select { |name, score| score >= 90 }
# => { alice: 95, carol: 92 }

Classes gain enumeration by including Enumerable and defining #each:

class WordList
  include Enumerable
  
  def initialize(words)
    @words = words
  end
  
  def each
    @words.each { |word| yield word.upcase }
  end
end

list = WordList.new(['hello', 'world'])
list.map(&:reverse)
# => ["OLLEH", "DLROW"]
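
Including Enumerable provides far more than #map. The sketch below uses a hypothetical Playlist class (not part of the original guide) to show a few of the inherited methods, and it also returns an Enumerator when #each is called without a block, which is a common convention rather than a requirement:

```ruby
# Hypothetical class for illustration only.
class Playlist
  include Enumerable

  def initialize(*titles)
    @titles = titles
  end

  # Returning an Enumerator when no block is given lets callers
  # chain helpers such as with_index onto each.
  def each
    return enum_for(:each) unless block_given?

    @titles.each { |title| yield title }
    self
  end
end

playlist = Playlist.new('Intro', 'Bridge', 'Coda')

playlist.sort_by(&:length)        # => ["Coda", "Intro", "Bridge"]
playlist.include?('Bridge')       # => true
playlist.first(2)                 # => ["Intro", "Bridge"]
playlist.each.with_index(1).to_a  # => [["Intro", 1], ["Bridge", 2], ["Coda", 3]]
```

None of #sort_by, #include?, or #first is defined in Playlist; they all come from Enumerable and rely only on #each.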

Basic Usage

Enumeration methods accept blocks that define the processing logic. #each iterates for side effects and returns the receiver, while #map and #select leave the receiver untouched and return new collections.

Collection filtering uses #select for inclusion and #reject for exclusion:

ages = [15, 22, 8, 35, 19, 67]
adults = ages.select { |age| age >= 18 }
# => [22, 35, 19, 67]

minors = ages.reject { |age| age >= 18 }
# => [15, 8]

The #map method transforms each element:

names = ['alice', 'bob', 'carol']
capitalized = names.map(&:capitalize)
# => ["Alice", "Bob", "Carol"]

lengths = names.map(&:length)
# => [5, 3, 5]

Aggregation methods reduce collections to single values. The #reduce method (aliased as #inject) accumulates results:

# Sum calculation
total = [1, 2, 3, 4, 5].reduce(0) { |sum, n| sum + n }
# => 15

# String concatenation
sentence = %w[Ruby makes enumeration easy].reduce { |result, word| "#{result} #{word}" }
# => "Ruby makes enumeration easy"

# Hash building
word_counts = %w[apple banana apple cherry banana apple].reduce(Hash.new(0)) do |counts, word|
  counts[word] += 1
  counts
end
# => {"apple"=>3, "banana"=>2, "cherry"=>1}
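
For common aggregations, Ruby offers shorter alternatives to a hand-written #reduce. The following sketch reuses the word list from the example above and assumes Ruby 2.7 or later for #tally:

```ruby
words = %w[apple banana apple cherry banana apple]

# each_with_object passes the same accumulator to every iteration,
# so the block does not have to return it (unlike reduce).
counts = words.each_with_object(Hash.new(0)) { |word, acc| acc[word] += 1 }

# tally (Ruby 2.7+) performs the same counting in one call.
tallied = words.tally

counts == tallied                     # => true
counts['apple']                       # => 3

# sum and the symbol form of reduce cover simple numeric folds.
total = [1, 2, 3, 4, 5].sum           # => 15
product = [1, 2, 3, 4, 5].reduce(:*)  # => 120
```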

The #find method returns the first matching element, while #find_all is an alias for #select:

users = [
  { name: 'Alice', age: 30 },
  { name: 'Bob', age: 25 },
  { name: 'Carol', age: 35 }
]

first_adult = users.find { |user| user[:age] >= 30 }
# => { name: 'Alice', age: 30 }
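
Because #find returns nil when nothing matches, callers usually need a guard. A short sketch using the same users data, with #find_all and #find_index alongside for comparison:

```ruby
users = [
  { name: 'Alice', age: 30 },
  { name: 'Bob', age: 25 },
  { name: 'Carol', age: 35 }
]

# find returns nil when no element satisfies the block.
centenarian = users.find { |user| user[:age] >= 100 }
label = centenarian ? centenarian[:name] : 'nobody'
# => "nobody"

# find_all (select) always returns an array, possibly an empty one.
over_thirty = users.find_all { |user| user[:age] > 30 }
# => [{ name: 'Carol', age: 35 }]

# find_index reports the position of the first match instead of the element.
position = users.find_index { |user| user[:name] == 'Bob' }
# => 1
```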

Advanced Usage

Method chaining creates processing pipelines by combining multiple enumeration operations. Ruby evaluates chains eagerly by default, creating intermediate collections:

data = [
  { name: 'Alice', department: 'Engineering', salary: 95000 },
  { name: 'Bob', department: 'Marketing', salary: 65000 },
  { name: 'Carol', department: 'Engineering', salary: 87000 },
  { name: 'David', department: 'Sales', salary: 72000 }
]

# Complex chain processing
high_earners = data
  .select { |emp| emp[:salary] > 70000 }
  .group_by { |emp| emp[:department] }
  .transform_values { |emps| emps.map { |emp| emp[:name] } }
# => {"Engineering"=>["Alice", "Carol"], "Sales"=>["David"]}

The Enumerator class supports custom and external iteration, and Enumerator::Lazy (obtained by calling #lazy) adds lazy evaluation. Lazy enumerators process elements on demand rather than creating intermediate collections:

# Infinite sequence with lazy evaluation
fibonacci = Enumerator.new do |yielder|
  a, b = 0, 1
  loop do
    yielder << a
    a, b = b, a + b
  end
end.lazy

first_ten = fibonacci.take(10).to_a
# => [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Large file processing without loading the whole file into memory
# (parse_log_entry is assumed to be defined elsewhere)
def process_large_file(filename)
  File.foreach(filename).lazy
    .map(&:chomp)
    .select { |line| line.match?(/ERROR/) }
    .map { |line| parse_log_entry(line) }
    .take(100)
    .to_a
end
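
The file-processing pipeline above can be tried without a real log file. In this sketch a StringIO stands in for the file, and parse_log_entry is a hypothetical parser with an invented "LEVEL message" line format:

```ruby
require 'stringio'

# Hypothetical parser; the line format is an assumption for this demo.
def parse_log_entry(line)
  level, message = line.split(' ', 2)
  { level: level, message: message }
end

# StringIO stands in for a file on disk so the sketch runs anywhere.
log = StringIO.new(<<~LOG)
  INFO boot complete
  ERROR disk full
  INFO request served
  ERROR timeout talking to cache
  ERROR retry limit reached
LOG

# each_line without a block returns an Enumerator, like File.foreach.
first_two_errors = log.each_line.lazy
  .map(&:chomp)
  .select { |line| line.match?(/ERROR/) }
  .map { |line| parse_log_entry(line) }
  .first(2)

first_two_errors.map { |entry| entry[:message] }
# => ["disk full", "timeout talking to cache"]
```

Each line passes through the whole chain before the next one is read, and the chain stops pulling lines once two matches have been collected.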

Custom enumerators handle specialized iteration requirements:

class TreeNode
  include Enumerable
  
  def initialize(value, children = [])
    @value = value
    @children = children
  end
  
  # Depth-first traversal
  def each(&block)
    return enum_for(:each) unless block_given?
    
    yield @value
    @children.each { |child| child.each(&block) }
  end
  
  # Breadth-first traversal
  def breadth_first
    return enum_for(:breadth_first) unless block_given?
    
    queue = [self]
    until queue.empty?
      node = queue.shift
      yield node.value
      queue.concat(node.children)
    end
  end
  
  attr_reader :value, :children
end

tree = TreeNode.new('root', [
  TreeNode.new('child1', [TreeNode.new('grandchild1')]),
  TreeNode.new('child2')
])

tree.map(&:upcase)
# => ["ROOT", "CHILD1", "GRANDCHILD1", "CHILD2"]

tree.breadth_first.to_a
# => ["root", "child1", "child2", "grandchild1"]

The #group_by method creates hash partitions, while #partition splits into two arrays:

# Grouping by multiple criteria
transactions = [
  { amount: 100, type: 'credit', date: '2023-01-01' },
  { amount: 50, type: 'debit', date: '2023-01-01' },
  { amount: 200, type: 'credit', date: '2023-01-02' }
]

by_date_and_type = transactions.group_by { |t| [t[:date], t[:type]] }
# => {["2023-01-01", "credit"]=>[{:amount=>100, :type=>"credit", :date=>"2023-01-01"}], ...}

# Binary partition
credits, debits = transactions.partition { |t| t[:type] == 'credit' }
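
To make the results concrete, this sketch repeats the transactions data and totals each side of the partition. It also shows #group_by combined with #transform_values as one possible way to compute per-key totals:

```ruby
transactions = [
  { amount: 100, type: 'credit', date: '2023-01-01' },
  { amount: 50, type: 'debit', date: '2023-01-01' },
  { amount: 200, type: 'credit', date: '2023-01-02' }
]

# partition returns [matching, non_matching].
credits, debits = transactions.partition { |t| t[:type] == 'credit' }
credit_total = credits.sum { |t| t[:amount] }  # => 300
debit_total = debits.sum { |t| t[:amount] }    # => 50

# group_by handles any number of buckets; transform_values then
# reduces each bucket to a single figure.
totals_by_type = transactions
  .group_by { |t| t[:type] }
  .transform_values { |group| group.sum { |t| t[:amount] } }

totals_by_type['credit']  # => 300
totals_by_type['debit']   # => 50
```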

Performance & Memory

The choice of enumeration method affects both running time and memory use. Lazy evaluation prevents unnecessary computation and avoids intermediate arrays when only part of a large dataset is needed:

# Eager evaluation - processes entire collection
def find_expensive_items_eager(products)
  products
    .map { |p| expensive_calculation(p) }  # Processes all items
    .select { |result| result[:score] > 0.8 }
    .first
end

# Lazy evaluation - stops at first match
def find_expensive_items_lazy(products)
  products.lazy
    .map { |p| expensive_calculation(p) }  # Processes until match found
    .select { |result| result[:score] > 0.8 }
    .first
end
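
The difference can be observed by counting calls. In this sketch the lambda named score is a hypothetical stand-in for expensive_calculation, and integer scores replace the 0.8 threshold to keep the arithmetic exact:

```ruby
calls = 0

# Hypothetical stand-in for expensive_calculation; it counts its invocations.
score = lambda do |n|
  calls += 1
  { id: n, score: n * 10 }
end

products = (1..10).to_a

# Eager: map runs over all ten products before select and first.
calls = 0
eager = products.map(&score).select { |r| r[:score] > 30 }.first
eager_calls = calls  # => 10

# Lazy: each product flows through the chain until the first match.
calls = 0
lazy = products.lazy.map(&score).select { |r| r[:score] > 30 }.first
lazy_calls = calls   # => 4

eager == lazy        # => true, both are { id: 4, score: 40 }
```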

Some methods create full result arrays, while others can short-circuit:

# Memory-intensive operations
large_array = (1..1_000_000).to_a

# Creates intermediate arrays
result1 = large_array.map { |n| n * 2 }.select { |n| n > 500_000 }.first(10)

# Memory-efficient with lazy evaluation
result2 = large_array.lazy.map { |n| n * 2 }.select { |n| n > 500_000 }.first(10)

# Short-circuiting methods
any_large = large_array.any? { |n| n > 900_000 }  # Stops at first match
all_positive = large_array.all? { |n| n > 0 }     # Stops at first false

Hash lookups take roughly constant time, while Array#find scans the array linearly, so lookup-heavy enumeration is usually faster after building a hash index once:

# Inefficient - repeated array searches
def categorize_slow(items, categories)
  items.map do |item|
    category = categories.find { |cat| cat[:id] == item[:category_id] }
    { item: item, category: category }
  end
end

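# Efficient - build a hash index once, then look up by key
# (index_by is an ActiveSupport extension; plain Ruby can use to_h with a block)
def categorize_fast(items, categories)
  category_hash = categories.to_h { |cat| [cat[:id], cat] }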
  items.map do |item|
    { item: item, category: category_hash[item[:category_id]] }
  end
end
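
A rough way to compare the two approaches is Benchmark.realtime from the standard library. The data below is synthetic and the sizes are arbitrary; absolute timings will vary by machine, so treat this as a sketch rather than a measurement:

```ruby
require 'benchmark'

# Synthetic data; sizes chosen only to make the difference visible.
categories = (1..2_000).map { |id| { id: id, name: "category-#{id}" } }
items = (1..2_000).map { |n| { sku: n, category_id: (n % 2_000) + 1 } }

linear = nil
indexed = nil

# Linear scan: every item searches the categories array from the start.
linear_time = Benchmark.realtime do
  linear = items.map do |item|
    categories.find { |cat| cat[:id] == item[:category_id] }
  end
end

# Hash index: one pass to build the index, then a key lookup per item.
indexed_time = Benchmark.realtime do
  by_id = categories.to_h { |cat| [cat[:id], cat] }
  indexed = items.map { |item| by_id[item[:category_id]] }
end

puts format('linear scan: %.4fs, hash index: %.4fs', linear_time, indexed_time)
```

Both strategies return the same categories; only the amount of work per lookup differs.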

Common Pitfalls

Modifying a collection while enumerating it causes skipped elements, because each deletion shifts the remaining elements under the iterator's index:

# Dangerous - modifying collection during enumeration
items = [1, 2, 3, 4, 5]
items.each do |item|
  items.delete(item) if item.even?  # Shifts later elements left
end
# items => [1, 3, 5], but 3 and 5 were never yielded to the block

# Safe approach
items = [1, 2, 3, 4, 5]
items = items.reject(&:even?)
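
The skipping becomes visible when every element qualifies for deletion. This sketch records which elements the block actually sees, then shows #delete_if as an in-place alternative:

```ruby
evens = [2, 4, 6, 8]
visited = []

evens.each do |n|
  visited << n
  evens.delete(n) if n.even?  # each deletion shifts the next element past the iterator
end

visited  # => [2, 6]
evens    # => [4, 8], two even numbers survived

# delete_if (or reject!) removes matching elements in place safely.
cleaned = [2, 4, 6, 8]
cleaned.delete_if(&:even?)
cleaned  # => []
```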

Mixing up #map and #each is a common mistake. Use #each for side effects and #map for transformations:

# Unidiomatic - using each to build a transformed array
names = []
['alice', 'bob'].each { |name| names << name.capitalize }

# Correct - using map for transformation
names = ['alice', 'bob'].map(&:capitalize)

# Wrong - using map for side effects
['alice', 'bob'].map { |name| puts name.capitalize }  # Returns [nil, nil]

# Correct - using each for side effects
['alice', 'bob'].each { |name| puts name.capitalize }

Symbol-to-proc conversion (&:method) has limitations with method arguments:

# Works - no arguments
numbers = [1, 2, 3]
strings = numbers.map(&:to_s)

# Fails - &:name cannot carry arguments
strings = ['hello', 'world']
# strings.map(&:ljust(10))  # SyntaxError

# Correct - use explicit block
padded = strings.map { |s| s.ljust(10) }
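
When an explicit block feels verbose, a few other forms can stand in for &:name. The sketch below shows numbered block parameters (Ruby 2.7+), a Method object passed with &, and a curried lambda; the pad lambda is a hypothetical helper:

```ruby
strings = %w[hello world]

# Numbered parameter: _1 refers to the first block argument (Ruby 2.7+).
padded = strings.map { _1.ljust(10) }
# => ["hello     ", "world     "]

# A Method object converts to a block with &, passing each element as the argument.
integers = %w[3 14 15].map(&method(:Integer))
# => [3, 14, 15]

# Hypothetical helper: a curried lambda fixes the width and awaits the string.
pad = ->(width, s) { s.ljust(width) }
also_padded = strings.map(&pad.curry[10])
# => ["hello     ", "world     "]
```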

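Hash enumeration yields each entry as a two-element [key, value] array. Ruby destructures it automatically when the block declares two parameters, but a block with one parameter receives the whole pair:

scores = { alice: 95, bob: 87 }

# Wrong - a single parameter is the [key, value] array, so * repeats the array
# scores.map { |pair| pair * 2 }  # => [[:alice, 95, :alice, 95], [:bob, 87, :bob, 87]]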

# Correct - separate key and value parameters
doubled = scores.map { |name, score| [name, score * 2] }.to_h
# => { alice: 190, bob: 174 }
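
A few related forms are worth knowing when iterating hashes. This sketch assumes Ruby 2.6 or later for #to_h with a block:

```ruby
scores = { alice: 95, bob: 87 }

# A single block parameter receives each entry as a [key, value] array.
pairs = scores.map { |pair| pair }
# => [[:alice, 95], [:bob, 87]]

# transform_values avoids the map-then-to_h round trip when keys stay the same.
doubled = scores.transform_values { |score| score * 2 }
# => { alice: 190, bob: 174 }

# to_h with a block rebuilds both keys and values in one pass.
by_string = scores.to_h { |name, score| [name.to_s, score] }

# Parentheses destructure the pair when the block receives extra arguments.
ranked = scores.each_with_index.map do |(name, score), index|
  "#{index + 1}. #{name}: #{score}"
end
# => ["1. alice: 95", "2. bob: 87"]
```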

Production Patterns

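Database record processing benefits from batching and lazy evaluation to avoid memory exhaustion. The example assumes a Rails ActiveRecord model named User; find_each is an ActiveRecord method, not part of core Ruby: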

# Process large result sets in batches
class ReportGenerator
  def generate_user_report
    User.find_each(batch_size: 1000) do |user|
      process_user_data(user)
    end
  end
  
  def calculate_metrics(users)
    users.lazy
      .select(&:active?)
      .map { |user| UserMetrics.new(user) }
      .each { |metrics| store_metrics(metrics) }
  end
end

API response processing requires error handling within enumeration blocks:

class APIDataProcessor
  def process_api_responses(urls)
    results = []
    errors = []
    
    urls.each_with_index do |url, index|
      begin
        response = fetch_api_data(url)
        parsed = JSON.parse(response.body)
        results << transform_response(parsed)
      rescue StandardError => error
        errors << { url: url, index: index, error: error.message }
      end
    end
    
    { results: results, errors: errors }
  end
end

Configuration processing with enumeration handles nested data structures:

class ConfigProcessor
  def process_environment_config(config)
    config.each_with_object({}) do |(key, value), processed|
      processed_key = key.to_s.downcase.gsub('-', '_')
      
      processed[processed_key] = case value
      when Hash
        process_environment_config(value)
      when Array
        value.map { |item| item.is_a?(Hash) ? process_environment_config(item) : item }
      else
        value
      end
    end
  end
end

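Log processing can stream a file line by line with File.foreach. Note that #group_by ends the lazy chain and holds every parsed entry in memory, so this version suits logs whose parsed entries fit in RAM: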

class LogAnalyzer
  def analyze_access_logs(log_file)
    stats = Hash.new { |h, k| h[k] = Hash.new(0) }
    
    File.foreach(log_file).lazy
      .map(&:chomp)
      .filter_map { |line| parse_log_line(line) }
      .group_by { |entry| entry[:hour] }
      .each do |hour, entries|
        stats[hour][:requests] = entries.size
        stats[hour][:unique_ips] = entries.map { |e| e[:ip] }.uniq.size
        stats[hour][:status_codes] = entries.group_by { |e| e[:status] }
                                           .transform_values(&:size)
      end
    
    stats
  end
end
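
When the parsed entries may not fit in memory, one alternative is to update counters as each line is read instead of grouping first. The sketch below is an assumption-laden variant: parse_log_line and its "HOUR IP STATUS" line format are hypothetical, and a StringIO stands in for the log file:

```ruby
require 'set'
require 'stringio'

# Hypothetical parser; returns nil for lines that do not match the format.
def parse_log_line(line)
  hour, ip, status = line.split
  return nil unless hour && ip && status

  { hour: hour, ip: ip, status: status }
end

log = StringIO.new(<<~LOG)
  09 10.0.0.1 200
  09 10.0.0.2 404
  09 10.0.0.1 200
  malformed
  10 10.0.0.3 500
LOG

# Each hour gets its own counters the first time it is seen.
stats = Hash.new do |hash, hour|
  hash[hour] = { requests: 0, ips: Set.new, status_codes: Hash.new(0) }
end

# Only the running totals are retained; parsed entries are discarded.
log.each_line do |line|
  entry = parse_log_line(line.chomp)
  next unless entry

  bucket = stats[entry[:hour]]
  bucket[:requests] += 1
  bucket[:ips] << entry[:ip]
  bucket[:status_codes][entry[:status]] += 1
end

stats['09'][:requests]      # => 3
stats['09'][:ips].size      # => 2
stats['10'][:status_codes]  # => {"500"=>1}
```

The set of unique IPs still grows with the data, but the individual entries no longer accumulate.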

Reference

Core Enumerable Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #each | &block | Enumerable | Yields each element to block |
| #map | &block | Array | Transforms each element via block |
| #select | &block | Array | Returns elements where block is truthy |
| #reject | &block | Array | Returns elements where block is falsy |
| #find | &block | Object or nil | Returns first element where block is truthy |
| #reduce | initial=nil, &block | Object | Accumulates result via block |
| #each_with_index | &block | Enumerable | Yields element and index |
| #each_with_object | object, &block | Object | Yields element and object |

Filtering and Searching Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #filter | &block | Array | Alias for select |
| #find_all | &block | Array | Alias for select |
| #detect | &block | Object or nil | Alias for find |
| #grep | pattern | Array | Elements matching pattern |
| #grep_v | pattern | Array | Elements not matching pattern |

Testing Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #all? | &block | Boolean | True if block returns truthy for all |
| #any? | &block | Boolean | True if block returns truthy for any |
| #none? | &block | Boolean | True if block returns falsy for all |
| #one? | &block | Boolean | True if block returns truthy for exactly one |
| #include? | object | Boolean | True if collection includes object |

Grouping and Partitioning

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #group_by | &block | Hash | Groups elements by block result |
| #partition | &block | Array | Splits into two arrays based on block |
| #chunk | &block | Enumerator | Groups consecutive elements |
| #slice_when | &block | Enumerator | Splits when block returns truthy |
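
#chunk and #slice_when are less familiar than #group_by, so a short sketch with made-up sample data may help. Both return Enumerators, hence the trailing #to_a:

```ruby
readings = [1, 2, 4, 9, 10, 11, 15]

# slice_when starts a new slice wherever the block returns true
# for a pair of neighbouring elements.
runs = readings.slice_when { |prev, curr| curr != prev + 1 }.to_a
# => [[1, 2], [4], [9, 10, 11], [15]]

# chunk groups consecutive elements that share the same block result,
# so the same key can appear more than once (unlike group_by).
parity = [2, 4, 1, 3, 6].chunk(&:even?).to_a
# => [[true, [2, 4]], [false, [1, 3]], [true, [6]]]
```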

Aggregation Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| #count | object=nil, &block | Integer | Count of elements |
| #sum | init=0, &block | Object | Sum of elements |
| #min | &block | Object | Minimum element |
| #max | &block | Object | Maximum element |
| #minmax | &block | Array | Array of min and max |

Enumerator Class

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| Enumerator.new | &block | Enumerator | Creates custom enumerator |
| #lazy | (none) | Enumerator::Lazy | Returns lazy enumerator |
| #with_index | offset=0 | Enumerator | Adds index to enumeration |
| #next | (none) | Object | Returns next element |
| #rewind | (none) | Enumerator | Resets enumerator position |
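
The external-iteration methods in this table can be sketched as follows. #next raises StopIteration once the sequence is exhausted, and #peek (not listed above) looks at the next element without advancing:

```ruby
letters = %w[a b c].each  # each without a block returns an Enumerator

first = letters.next      # => "a"
second = letters.next     # => "b"
upcoming = letters.peek   # => "c", position unchanged
third = letters.next      # => "c"

# next raises StopIteration when no elements remain.
finished = false
begin
  letters.next
rescue StopIteration
  finished = true
end

# rewind moves the position back to the start.
letters.rewind
restarted = letters.next  # => "a"

# with_index accepts a starting offset.
numbered = %w[a b c].each.with_index(1).map { |letter, i| "#{i}:#{letter}" }
# => ["1:a", "2:b", "3:c"]
```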

Common Block Patterns

# Symbol to proc conversion
collection.map(&:method_name)
collection.select(&:predicate?)

# Index access
collection.each_with_index { |item, index| }
collection.map.with_index { |item, index| }

# Hash iteration
hash.each { |key, value| }
hash.map { |key, value| [new_key, new_value] }

# Nested enumeration
matrix.each { |row| row.each { |cell| } }
matrix.flatten.each { |cell| }