Array Transformation (map, select, reject)

Ruby array transformation methods map, select, and reject for filtering and converting array elements.


Overview

Array transformation in Ruby centers on three fundamental methods: map, select, and reject. These methods process arrays by applying blocks to elements, returning new arrays based on different criteria. The Array#map method transforms each element using a block and returns a new array containing the transformed values. The Array#select method filters elements by keeping only those for which the block returns a truthy value. The Array#reject method performs the inverse operation of select, keeping elements for which the block returns a falsy value.

numbers = [1, 2, 3, 4, 5]

# Transform elements
doubled = numbers.map { |n| n * 2 }
# => [2, 4, 6, 8, 10]

# Filter elements (keep truthy results)
evens = numbers.select { |n| n.even? }
# => [2, 4]

# Filter elements (keep falsy results)
odds = numbers.reject { |n| n.even? }
# => [1, 3, 5]

Array implements these methods directly, and the Enumerable module provides equivalents for other collections. Each method returns a new array without modifying the original, which makes them natural building blocks for functional-style code. The block receives one array element at a time as its argument.

words = ['ruby', 'python', 'javascript']
lengths = words.map(&:length)
# => [4, 6, 10]

short_words = words.select { |word| word.length < 6 }
# => ["ruby"]

The methods accept either a literal block or a Proc or Method object passed with the & operator. When called without a block, they return an Enumerator that can be chained with other enumerable methods.
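
The following sketch illustrates both behaviors described above: calling map and select without a block, and passing a Proc in place of a literal block. The variable names (enum, indexed, halve) are invented for this example.

```ruby
numbers = [10, 20, 30]

# Without a block, map returns an Enumerator rather than an Array.
enum = numbers.map
enum.class # => Enumerator

# The Enumerator can be chained, here with with_index to expose positions.
indexed = numbers.map.with_index { |n, i| n * i }
# => [0, 20, 60]

# select without a block chains the same way.
even_positions = numbers.select.with_index { |_n, i| i.even? }
# => [10, 30]

# A Proc (here a lambda) can stand in for a literal block via &.
halve = ->(n) { n / 2 }
halved = numbers.map(&halve)
# => [5, 10, 15]
```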

Basic Usage

The map method applies a transformation block to every element in the array. The block's return value becomes the corresponding element in the new array. This creates a one-to-one mapping between input and output elements.

prices = [10.99, 15.50, 8.75, 22.00]
tax_rate = 0.08

# Calculate total prices with tax
total_prices = prices.map { |price| price * (1 + tax_rate) }
# => roughly [11.8692, 16.74, 9.45, 23.76] (floating-point results may carry extra trailing digits)

# Round to currency precision
final_prices = total_prices.map { |price| price.round(2) }
# => [11.87, 16.74, 9.45, 23.76]

The select method evaluates each element against a block condition. Elements that make the block return a truthy value are included in the result array. Elements that make the block return nil or false are excluded.

inventory = [
  { name: 'Widget A', stock: 15, price: 12.99 },
  { name: 'Widget B', stock: 0, price: 8.50 },
  { name: 'Widget C', stock: 23, price: 15.75 },
  { name: 'Widget D', stock: 0, price: 9.25 }
]

# Find items in stock
available_items = inventory.select { |item| item[:stock] > 0 }
# => [{:name=>"Widget A", :stock=>15, :price=>12.99}, 
#     {:name=>"Widget C", :stock=>23, :price=>15.75}]

# Find premium items (price > $10)
premium_items = inventory.select { |item| item[:price] > 10 }
# => [{:name=>"Widget A", :stock=>15, :price=>12.99}, 
#     {:name=>"Widget C", :stock=>23, :price=>15.75}]

The reject method works as the logical inverse of select. Elements that make the block return a truthy value are excluded from the result. Elements that make the block return nil or false are included.

user_ages = [16, 21, 17, 25, 19, 14, 28]

# Remove underage users (keep 18 and older)
adults = user_ages.reject { |age| age < 18 }
# => [21, 25, 19, 28]

# Remove users over retirement age
working_adults = adults.reject { |age| age >= 65 }
# => [21, 25, 19, 28]

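Each method returns a new Array object, leaving the original array unchanged. This non-destructive behavior prevents accidental data modification and supports functional programming patterns. The elements themselves are not copied, so mutating an element inside a block still affects the original array's contents.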

original_data = [1, 2, 3, 4]
processed_data = original_data.map { |n| n ** 2 }

original_data    # => [1, 2, 3, 4] (unchanged)
processed_data   # => [1, 4, 9, 16] (new array)

Advanced Usage

Method chaining creates powerful data transformation pipelines by combining multiple array operations. Each method returns an array, making subsequent method calls straightforward.

transaction_data = [
  { amount: 150.00, category: 'food', date: '2024-01-15', status: 'cleared' },
  { amount: -75.25, category: 'refund', date: '2024-01-16', status: 'pending' },
  { amount: 85.50, category: 'food', date: '2024-01-17', status: 'cleared' },
  { amount: 200.00, category: 'entertainment', date: '2024-01-18', status: 'cleared' },
  { amount: -25.00, category: 'fee', date: '2024-01-19', status: 'cleared' }
]

# Complex transformation pipeline
monthly_food_spending = transaction_data
  .select { |t| t[:status] == 'cleared' }           # Only processed transactions
  .select { |t| t[:category] == 'food' }            # Food purchases only
  .map { |t| t[:amount] }                           # Extract amounts
  .select { |amount| amount > 0 }                   # Exclude refunds
  .map { |amount| amount.round(2) }                 # Round to cents
# => [150.0, 85.5]

Block parameters can destructure complex data structures, making transformations more readable when working with nested arrays or hashes.

coordinate_pairs = [[1, 2], [3, 4], [5, 6], [7, 8]]

# Calculate distances from origin using destructured parameters
distances = coordinate_pairs.map { |x, y| Math.sqrt(x**2 + y**2).round(2) }
# => [2.24, 5.0, 7.81, 10.63]

# Select points in specific quadrants
first_quadrant = coordinate_pairs.select { |x, y| x > 0 && y > 0 }
# => [[1, 2], [3, 4], [5, 6], [7, 8]]
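
Destructuring also works one level deeper when part of the parameter list is wrapped in parentheses. The sketch below uses made-up data (labeled_points) to show the nested form, along with the common each_with_index pairing.

```ruby
# Each element is [label, [x, y]]; the parentheses unpack the inner array.
labeled_points = [['a', [1, 2]], ['b', [3, 4]]]
sums = labeled_points.map { |label, (x, y)| "#{label}: #{x + y}" }
# => ["a: 3", "b: 7"]

# each_with_index yields [element, index], so the element is unpacked in parentheses.
pairs = [[1, 2], [3, 4]]
scaled = pairs.each_with_index.map { |(x, y), index| (x + y) * index }
# => [0, 7]
```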

Proc and Method objects can be passed in place of a literal block using the & operator. Symbol-to-proc conversion (&:method_name) covers simple method calls, while custom Proc objects and Method objects handle more complex logic.

class DataProcessor
  def self.normalize_score(score)
    [[score, 100].min, 0].max
  end
  
  def self.valid_score?(score)
    score.is_a?(Numeric) && score >= 0 && score <= 100
  end
end

raw_scores = [85, 92, 105, -5, 88, 'invalid', 76]

# Using method objects for complex processing
normalizer = DataProcessor.method(:normalize_score)
validator = DataProcessor.method(:valid_score?)

# Chain method objects with type filtering
processed_scores = raw_scores
  .select { |score| score.is_a?(Numeric) }    # Remove non-numeric values
  .map(&normalizer)                           # Apply normalization method
  .select(&validator)                         # Apply validation method
# => [85, 92, 100, 0, 88, 76]
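
Custom Proc objects can be used the same way as the Method objects above. As a minimal sketch, assuming Ruby 2.6 or later for the >> composition operator, two lambdas (clamp_score and letter_grade, both invented for this example) are combined into a single callable and passed to map.

```ruby
# Clamp a score into the 0..100 range.
clamp_score = ->(score) { score.clamp(0, 100) }

# Simplified two-level grading rule, for illustration only.
letter_grade = ->(score) { score >= 90 ? 'A' : 'B' }

# >> builds a new callable that runs clamp_score first, then letter_grade.
grade_for = clamp_score >> letter_grade

grades = [105, 85, 92, -5].map(&grade_for)
# => ["A", "B", "A", "B"]
```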

Conditional transformations can be implemented using ternary operators or case statements within blocks, providing branching logic during processing.

product_data = [
  { name: 'Laptop', category: 'electronics', price: 1200, warranty: true },
  { name: 'Book', category: 'media', price: 15, warranty: false },
  { name: 'Phone', category: 'electronics', price: 800, warranty: true },
  { name: 'Desk', category: 'furniture', price: 300, warranty: false }
]

# Apply different pricing rules based on category and conditions
final_prices = product_data.map do |product|
  base_price = product[:price]
  
  adjusted_price = case product[:category]
  when 'electronics'
    # Electronics get 10% markup for extended warranty
    product[:warranty] ? base_price * 1.10 : base_price
  when 'media'
    # Media gets bulk discount for orders over $20
    base_price > 20 ? base_price * 0.95 : base_price
  else
    base_price
  end
  
  # to_f keeps integer prices consistent with the float results (15 becomes 15.0)
  { name: product[:name], final_price: adjusted_price.to_f.round(2) }
end
# => [{:name=>"Laptop", :final_price=>1320.0}, 
#     {:name=>"Book", :final_price=>15.0}, 
#     {:name=>"Phone", :final_price=>880.0}, 
#     {:name=>"Desk", :final_price=>300.0}]

Performance & Memory

Array transformation methods create new arrays, which impacts memory usage with large datasets. Each method allocates memory for the result array, and chained operations create intermediate arrays.

# Memory-intensive transformation chain
large_dataset = (1..1_000_000).to_a

# Each step allocates a new array (the select step keeps half of the elements)
result = large_dataset
  .map { |n| n * 2 }        # First intermediate array
  .select { |n| n % 4 == 0 } # Second intermediate array  
  .map { |n| n.to_s }       # Final result array

For memory-critical applications, consider using lazy evaluation to process elements on-demand without creating intermediate arrays.

# Lazy evaluation processes elements individually
result = large_dataset.lazy
  .map { |n| n * 2 }
  .select { |n| n % 4 == 0 }
  .map { |n| n.to_s }
  .to_a  # Materialize the result when needed
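
Lazy evaluation also stops early when only part of the result is needed. The sketch below counts block invocations with a local counter (evaluated, a name chosen for this example) to show that only the elements required for first(3) are processed.

```ruby
evaluated = 0

first_three = (1..1_000_000).lazy
  .map { |n| evaluated += 1; n * 2 }   # Count how many elements are actually mapped
  .select { |n| n % 4 == 0 }
  .first(3)                            # Stops as soon as three matches are found
# => [4, 8, 12]

evaluated
# => 6 (only the first six elements were touched)
```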

Block complexity significantly affects performance. Simple operations like arithmetic or method calls are fast, while complex operations like regular expression matching or database queries can create bottlenecks.

require 'benchmark'

emails = Array.new(100_000) { "user#{rand(1000)}@example.com" }

Benchmark.bm(20) do |x|
  # Fast: Simple method call
  x.report("length check:") do
    emails.select { |email| email.length > 15 }
  end
  
  # Slower: Regular expression matching
  x.report("regex validation:") do
    emails.select { |email| email.match?(/\A[\w+\-.]+@[a-z\d\-]+(\.[a-z\d\-]+)*\.[a-z]+\z/i) }
  end
end

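The choice between transformation methods affects clarity and, to a lesser degree, performance. Filtering with select or reject before map avoids running the transformation on elements that will be discarded and avoids producing nil placeholders that compact must strip afterwards. Both versions below still make two passes and allocate one intermediate array.

mixed_data = [1, '2', :three, 4.9]

# Less direct: map produces nil placeholders, then compact removes them
valid_numbers = mixed_data.map { |item| item.to_i if item.respond_to?(:to_i) }.compact
# => [1, 2, 4]

# More direct: filter first, then transform only the survivors
valid_numbers = mixed_data.select { |item| item.respond_to?(:to_i) }.map(&:to_i)
# => [1, 2, 4]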

Consider using filter_map (Ruby 2.7+) for combined filtering and mapping operations. It keeps only the truthy block results in a single pass, so no intermediate array is created. Note that it discards false as well as nil.

# Traditional approach with intermediate array
processed = data
  .map { |item| process_item(item) }
  .reject(&:nil?)

# Single pass with filter_map (also drops false results)
processed = data.filter_map { |item| process_item(item) }
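
As a concrete sketch with made-up input (raw_inputs), filter_map can parse strings and drop the ones that fail in one step. It relies on Integer(text, exception: false), which returns nil instead of raising (Ruby 2.6+).

```ruby
raw_inputs = ['42', 'abc', '7', '', '19']

# Unparseable strings yield nil and are dropped from the result.
parsed = raw_inputs.filter_map { |text| Integer(text, exception: false) }
# => [42, 7, 19]

# false is discarded along with nil; 0 is truthy and is kept.
kept = [true, false, nil, 0].filter_map { |value| value }
# => [true, 0]
```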

Error Handling & Debugging

Transformation methods can raise exceptions when blocks encounter unexpected data types or when the array contains incompatible elements.

mixed_data = [1, '2', 3, nil, 'four', 5.0]

begin
  # Raises NoMethodError at the first element without ** (here the String '2')
  squared = mixed_data.map { |item| item ** 2 }
rescue NoMethodError => e
  puts "Error processing element: #{e.message}"
  # Handle the error or provide fallback behavior
end

Implement defensive programming by checking element types within blocks or providing fallback values for problematic elements.

# Safe transformation with type checking
safe_squared = mixed_data.map do |item|
  if item.respond_to?(:**)
    item ** 2
  else
    0  # Fallback value for non-numeric elements
  end
end
# => [1, 0, 9, 0, 0, 25.0]

# Alternative: skip invalid elements entirely
numeric_squared = mixed_data
  .select { |item| item.is_a?(Numeric) }
  .map { |item| item ** 2 }
# => [1, 9, 25.0]

Debugging complex transformation chains requires systematic inspection of intermediate results. Use tap to examine values at each stage without modifying the chain.

def debug_transformation(data)
  data
    .tap { |d| puts "Original: #{d.inspect}" }
    .select { |item| item.is_a?(String) }
    .tap { |d| puts "After select: #{d.inspect}" }
    .map(&:upcase)
    .tap { |d| puts "After map: #{d.inspect}" }
    .reject { |item| item.length < 3 }
    .tap { |d| puts "Final result: #{d.inspect}" }
end

mixed_input = [1, 'hello', 2, 'hi', 'world', 3]
debug_transformation(mixed_input)
# Original: [1, "hello", 2, "hi", "world", 3]
# After select: ["hello", "hi", "world"]  
# After map: ["HELLO", "HI", "WORLD"]
# Final result: ["HELLO", "WORLD"]

Handle exceptions at the appropriate level in your application. For data processing pipelines, consider collecting errors rather than halting execution.

def process_user_data(users)
  results = []
  errors = []
  
  users.each_with_index do |user, index|
    begin
      processed = user
        .select { |k, v| ['name', 'email', 'age'].include?(k.to_s) }
        .map { |k, v| [k, v.to_s.strip] }
        .to_h
      
      results << processed
    rescue StandardError => e
      errors << { index: index, user: user, error: e.message }
    end
  end
  
  { results: results, errors: errors }
end
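
The same error-collecting pattern can be shown on a smaller scale. The sketch below uses a hypothetical helper, parse_quantities, that converts strings to integers and records which inputs failed instead of aborting on the first bad value.

```ruby
# parse_quantities is a hypothetical helper written for this sketch.
def parse_quantities(inputs)
  results = []
  errors = []

  inputs.each_with_index do |text, index|
    begin
      results << Integer(text) # Raises on malformed strings and on nil
    rescue ArgumentError, TypeError => e
      errors << { index: index, input: text, error: e.message }
    end
  end

  { results: results, errors: errors }
end

report = parse_quantities(['12', 'abc', nil, '7'])
report[:results]                        # => [12, 7]
report[:errors].map { |e| e[:index] }   # => [1, 2]
```

Rescuing only ArgumentError and TypeError, rather than StandardError, keeps unrelated bugs visible while still tolerating bad input.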

Common Pitfalls

Modifying arrays during iteration leads to unexpected behavior. Transformation methods iterate over the original array, but modifying it during iteration can skip elements or cause index errors.

numbers = [1, 2, 3, 4, 5]

# WRONG: Modifying original array during transformation
dangerous_result = numbers.map do |n|
  numbers.delete(n) if n.even?  # Modifies array during iteration
  n * 2
end
# Unpredictable results due to concurrent modification

The correct approach uses the transformation result or operates on a copy of the array.

# CORRECT: Transform without modifying original
transformed = numbers.map { |n| n * 2 }
filtered = transformed.reject { |n| (n / 2).even? }

# CORRECT: Create new array with desired elements
result = numbers
  .reject(&:even?)  # Remove evens first
  .map { |n| n * 2 }  # Then transform odds
# => [2, 6, 10]

Block return values in select and reject are evaluated for truthiness, not equality. In Ruby only nil and false are falsy, which surprises developers coming from languages where 0 and the empty string are also falsy.

scores = [0, 85, 92, nil, 78, '', 95]

# Unexpected behavior: 0 and empty string are truthy, so only nil is removed
passing_scores = scores.select { |score| score }
# => [0, 85, 92, 78, "", 95]

# CORRECT: Explicit condition for numeric values
passing_scores = scores.select { |score| score.is_a?(Numeric) && score >= 70 }
# => [85, 92, 78, 95]

Naming every intermediate result adds noise without changing the work performed. When the intermediate values are not reused, a single chain expresses the same passes more clearly.

product_names = ['Widget A', 'Widget B', 'Gadget C', 'Widget D']

# VERBOSE: Intermediate variables that are never reused
widgets = product_names.select { |name| name.include?('Widget') }
uppercased = widgets.map(&:upcase)
sorted = uppercased.sort

# CLEARER: Single logical chain (same number of passes)
result = product_names
  .select { |name| name.include?('Widget') }
  .map(&:upcase)
  .sort
# => ["WIDGET A", "WIDGET B", "WIDGET D"]

Assuming that transformation methods mutate the original array is a frequent error. These methods return new arrays and leave the original unchanged.

original_prices = [10.99, 15.50, 8.75]

# WRONG: Expecting mutation
original_prices.map { |price| price * 1.08 }
p original_prices  # Still [10.99, 15.5, 8.75]

# CORRECT: Assign the result (rounded to avoid floating-point noise)
tax_included_prices = original_prices.map { |price| (price * 1.08).round(2) }
p tax_included_prices  # [11.87, 16.74, 9.45]

Using complex nested blocks without extracting methods reduces readability and makes debugging difficult.

# HARD TO READ: Complex nested logic in block
complex_result = data.map do |item|
  if item[:category] == 'premium'
    if item[:stock] > 0
      { name: item[:name], price: item[:price] * 1.15, available: true }
    else
      { name: item[:name], price: item[:price], available: false }
    end
  else
    { name: item[:name], price: item[:price], available: item[:stock] > 0 }
  end
end

# BETTER: Extract transformation logic to methods
def transform_premium_item(item)
  {
    name: item[:name],
    price: item[:stock] > 0 ? item[:price] * 1.15 : item[:price],
    available: item[:stock] > 0
  }
end

def transform_regular_item(item)
  {
    name: item[:name],
    price: item[:price],
    available: item[:stock] > 0
  }
end

clear_result = data.map do |item|
  item[:category] == 'premium' ? transform_premium_item(item) : transform_regular_item(item)
end

Reference

Core Transformation Methods

Method                          Parameters                      Returns      Description
#map { |item| block }           Block receiving each element    Array        Transform each element using block return value
#map                            None (no block)                 Enumerator   Returns enumerator for chaining
#select { |item| block }        Block receiving each element    Array        Keep elements where block returns truthy value
#select                         None (no block)                 Enumerator   Returns enumerator for chaining
#reject { |item| block }        Block receiving each element    Array        Keep elements where block returns falsy value
#reject                         None (no block)                 Enumerator   Returns enumerator for chaining
#filter_map { |item| block }    Block receiving each element    Array        Map and filter in single operation (Ruby 2.7+)

Method Aliases and Variants

Method      Alias        Behavior Difference
#map        #collect     Identical functionality
#select     #filter      Identical functionality
#map!       #collect!    Mutates original array instead of creating new one
#select!    #filter!     Mutates original array, returns nil if no changes
#reject!    N/A          Mutates original array, returns nil if no changes
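
The return values of the mutating variants are easy to misread, so here is a short sketch with throwaway data (values). It shows that map! returns the receiver itself, while select! and reject! return nil when nothing was removed.

```ruby
values = [1, 2, 3, 4]

# map! replaces the elements in place and returns the same array object.
same = values.map! { |n| n * 10 }
same.equal?(values) # => true
values              # => [10, 20, 30, 40]

# select! returns nil when every element is kept (no changes made).
unchanged = values.select! { |n| n > 0 }
# => nil

# reject! returns the array when at least one element was removed.
trimmed = values.reject! { |n| n > 25 }
# => [10, 20]
```

Because of the nil return, chaining further calls directly onto select! or reject! is risky; keep_if and delete_if always return the array and are safer for that purpose.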

Block Return Value Evaluation

Return Value         select Behavior    reject Behavior    Notes
true                 Include element    Exclude element    Explicitly truthy
false                Exclude element    Include element    Explicitly falsy
nil                  Exclude element    Include element    Falsy in Ruby
0                    Include element    Exclude element    Truthy in Ruby (unlike some other languages)
"" (empty string)    Include element    Exclude element    Truthy in Ruby
[] (empty array)     Include element    Exclude element    Truthy in Ruby
{} (empty hash)      Include element    Exclude element    Truthy in Ruby
Any object           Include element    Exclude element    All objects are truthy except nil and false
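
The truthiness rules can be checked directly. This short sketch passes each value through select and reject with an identity block; candidates is a name chosen for the example.

```ruby
candidates = [true, false, nil, 0, '', [], {}]

# Only false and nil are falsy, so everything else is kept by select.
kept = candidates.select { |value| value }
# => [true, 0, "", [], {}]

# reject keeps exactly the falsy values.
dropped = candidates.reject { |value| value }
# => [false, nil]
```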

Performance Characteristics

Operation          Time Complexity    Memory Usage        Notes
map                O(n)               O(n) new array      Always creates array same size as input
select             O(n)               O(k) where k ≤ n    Creates array with filtered elements only
reject             O(n)               O(k) where k ≤ n    Creates array with remaining elements
Method chaining    O(n × m)           O(n × m)            m = number of chained operations
lazy chaining      O(k)               O(k)                k = number of elements actually processed

Common Error Types

Error               Cause                                                   Example                                  Solution
NoMethodError       Calling method on incompatible type                     nil.upcase in map block                  Add type checking or use safe navigation
LocalJumpError      yield called in a method when no block was given        Custom wrapper method that yields        Check block_given? before yielding
ArgumentError       Lambda with strict arity receives wrong argument count  [1, 2].map(&->(a, b) { a + b })          Match the lambda's parameters to the element, or use a plain block
SystemStackError    Infinite recursion in block                             Recursive call without base case         Add termination condition
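
Plain blocks are lenient about parameter counts, while lambdas passed with & are strict. The sketch below contrasts the two on the same input; the lambda name strict_sum is invented for the example.

```ruby
# A plain block tolerates a missing argument: the extra parameter becomes nil.
lenient = [1, 2].map { |a, b| [a, b] }
# => [[1, nil], [2, nil]]

# A lambda enforces its arity, so one yielded element for two parameters raises.
strict_sum = ->(a, b) { a + b }

error_class = begin
  [1, 2].map(&strict_sum)
  nil
rescue ArgumentError => e
  e.class
end
# => ArgumentError
```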

Integration Patterns

# Chaining with other Enumerable methods
array.map(&:method).select(&:predicate?).sort.reverse

# Using with ranges and infinite sequences  
(1..Float::INFINITY).lazy.map { |n| n ** 2 }.select(&:even?).first(10)

# Integration with case statements
array.map { |item| case item when String then item.upcase else item end }

# Pattern matching integration (Ruby 3.0+; parentheses keep the `in` test separate from the ternary)
array.map { |item| (item in String) ? item.upcase : item }