Hash Transformation

Documentation for transforming hash data structures in Ruby using built-in methods and custom patterns.

Overview

Hash transformation in Ruby means producing a hash with modified keys, values, or both, with or without changing its overall shape. The main built-in tools are transform_keys, transform_values, map, filter_map, and compact. Of these, transform_keys, transform_values, and compact have destructive (bang) counterparts that mutate the receiver; map and filter_map always return a new Array, which to_h can turn back into a hash.

The Hash class includes methods that return new hash instances and methods that modify existing instances. Non-destructive methods like transform_keys and transform_values create new hashes, while their exclamation mark variants modify the original hash. Ruby also supports transformations through enumeration methods that can reconstruct hashes with different structures.

# Basic key transformation
user_data = { 'first_name' => 'John', 'last_name' => 'Doe' }
symbolized = user_data.transform_keys(&:to_sym)
# => {:first_name=>"John", :last_name=>"Doe"}

# Value transformation
scores = { math: '85', science: '92', english: '78' }
numeric_scores = scores.transform_values(&:to_i)
# => {:math=>85, :science=>92, :english=>78}

Hash transformations form the backbone of data processing in Ruby applications, particularly when interfacing between different data formats, APIs, or when normalizing user input.

Basic Usage

The transform_keys method creates a new hash with transformed keys while preserving values. The method accepts a block that receives each key and returns the transformed key.

# String keys to symbols
config = { 'database_url' => 'localhost', 'port' => 5432 }
symbolized_config = config.transform_keys { |key| key.to_sym }
# => {:database_url=>"localhost", :port=>5432}

# Case transformation
headers = { 'Content-Type' => 'json', 'Accept-Language' => 'en' }
downcased = headers.transform_keys(&:downcase)
# => {"content-type"=>"json", "accept-language"=>"en"}
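Key transformations can collapse distinct keys into one. The sketch below uses made-up header data to show that, when two keys map to the same new key, the later entry overwrites the earlier one, and one way to detect such collisions before transforming.

```ruby
# Two distinct keys that normalize to the same string
headers = { 'Content-Type' => 'json', 'content-type' => 'xml', 'Accept' => 'text' }

# The later entry overwrites the earlier one
downcased = headers.transform_keys(&:downcase)
# => {"content-type"=>"xml", "accept"=>"text"}

# Detect collisions up front by grouping the original keys on their new form
collisions = headers.keys.group_by(&:downcase).select { |_, originals| originals.size > 1 }
# => {"content-type"=>["Content-Type", "content-type"]}
```

Checking for collisions first is worthwhile whenever the input comes from an external source, because the overwrite happens silently.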

The transform_values method operates on hash values while maintaining the original keys. This method processes each value through the provided block.

# String values to integers
form_data = { age: '25', salary: '50000', years_experience: '3' }
parsed_data = form_data.transform_values(&:to_i)
# => {:age=>25, :salary=>50000, :years_experience=>3}

# Conditional value transformation
inventory = { apples: 0, bananas: 5, oranges: 0, grapes: 12 }
stocked_items = inventory.transform_values { |count| count > 0 ? "#{count} available" : "out of stock" }
# => {:apples=>"out of stock", :bananas=>"5 available", :oranges=>"out of stock", :grapes=>"12 available"}

The map method yields each key-value pair and returns an Array of the block's results. When the block returns [key, value] pairs, calling to_h on that array rebuilds a hash, which gives complete control over both keys and values.

# Simultaneous key and value transformation
raw_params = { 'user_name' => ' JOHN DOE ', 'email_address' => ' JOHN@EXAMPLE.COM ' }
cleaned_params = raw_params.map { |k, v| [k.to_sym, v.strip.downcase] }.to_h
# => {:user_name=>"john doe", :email_address=>"john@example.com"}
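On Ruby 2.6 and later, Hash#to_h also accepts a block, which produces the same result without the intermediate array of pairs. A short comparison, reusing the data from the example above:

```ruby
raw_params = { 'user_name' => ' JOHN DOE ', 'email_address' => ' JOHN@EXAMPLE.COM ' }

# map builds an array of pairs first, then to_h converts it
via_map = raw_params.map { |k, v| [k.to_sym, v.strip.downcase] }.to_h

# to_h with a block builds the hash directly from the returned pairs
via_to_h = raw_params.to_h { |k, v| [k.to_sym, v.strip.downcase] }

same_result = (via_map == via_to_h) # => true
```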

The destructive variants transform_keys! and transform_values! modify the original hash instead of creating a new instance. Both always return the receiver, even when no key or value actually changes.

user_prefs = { 'theme' => 'dark', 'language' => 'english' }
user_prefs.transform_keys!(&:to_sym)
puts user_prefs
# => {:theme=>"dark", :language=>"english"}
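Return values differ between bang methods, so it helps to check them before chaining. The following sketch, with illustrative data, contrasts transform_values! with compact!:

```ruby
prefs = { theme: 'dark', language: nil }

# transform_values! returns the receiver itself, even if nothing changes
same = prefs.transform_values! { |v| v }
returned_receiver = same.equal?(prefs) # => true

# compact! returns the receiver when it removed something...
first_compact = prefs.compact!   # => {:theme=>"dark"}
# ...and nil when there was nothing to remove
second_compact = prefs.compact!  # => nil
```

Because compact! can return nil, chaining another call directly onto it is unsafe, while chaining onto transform_keys! or transform_values! is not affected by this.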

Advanced Usage

Complex hash transformations often involve nested structures, conditional logic, and method chaining. Ruby supports deep transformation through recursive approaches and custom methods.

# Nested hash transformation with recursion
def deep_transform_keys(hash, &block)
  hash.each_with_object({}) do |(key, value), result|
    new_key = block.call(key)
    result[new_key] = case value
                      when Hash
                        deep_transform_keys(value, &block)
                      when Array
                        value.map { |item| item.is_a?(Hash) ? deep_transform_keys(item, &block) : item }
                      else
                        value
                      end
  end
end

nested_data = {
  'user_info' => {
    'personal_details' => { 'first_name' => 'Jane', 'last_name' => 'Smith' },
    'contact_info' => [{ 'phone_number' => '555-1234' }, { 'email_address' => 'jane@example.com' }]
  }
}

symbolized_nested = deep_transform_keys(nested_data, &:to_sym)
# => {:user_info=>{:personal_details=>{:first_name=>"Jane", :last_name=>"Smith"}, 
#     :contact_info=>[{:phone_number=>"555-1234"}, {:email_address=>"jane@example.com"}]}}

Method chaining enables complex transformations through sequential operations. Ruby's enumerable methods combine with hash methods to create transformation pipelines.

# Multi-step transformation pipeline
raw_survey_data = {
  'q1_rating' => '4',
  'q2_rating' => '5',
  'q3_rating' => '',
  'participant_name' => '  John Smith  ',
  'submission_date' => '2023-10-15'
}

processed_survey = raw_survey_data
  .transform_keys { |k| k.gsub('_', '-').to_sym }
  .transform_values { |v| v.is_a?(String) ? v.strip : v }
  .reject { |k, v| v.empty? }
  .transform_values { |v| v.match?(/^\d+$/) ? v.to_i : v }

# => {:"q1-rating"=>4, :"q2-rating"=>5, :"participant-name"=>"John Smith", :"submission-date"=>"2023-10-15"}

The filter_map method (Ruby 2.7+) combines filtering and transformation in a single pass: it keeps the truthy block results and discards nil and false. Called on a hash it returns an Array, so the result is converted back with to_h.

# Selective transformation with filtering
product_data = {
  laptop: { price: '999.99', category: 'electronics', in_stock: true },
  shirt: { price: '29.99', category: 'clothing', in_stock: false },
  phone: { price: '699.99', category: 'electronics', in_stock: true },
  shoes: { price: '79.99', category: 'clothing', in_stock: true }
}

available_electronics = product_data.filter_map do |name, details|
  next unless details[:category] == 'electronics' && details[:in_stock]
  [name, { name: name.to_s.capitalize, price: details[:price].to_f }]
end.to_h

# => {:laptop=>{:name=>"Laptop", :price=>999.99}, :phone=>{:name=>"Phone", :price=>699.99}}

Custom transformation classes provide reusable transformation logic with configurable behavior and state management.

class HashNormalizer
  def initialize(key_transform: :to_sym, value_transforms: {})
    @key_transform = key_transform
    @value_transforms = value_transforms
  end
  
  def normalize(hash)
    hash.each_with_object({}) do |(key, value), result|
      normalized_key = key.send(@key_transform)
      normalized_value = apply_value_transform(normalized_key, value)
      result[normalized_key] = normalized_value
    end
  end
  
  private
  
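  def apply_value_transform(key, value)
    transform = @value_transforms[key]
    return value unless transform

    # Lambdas and procs are called; symbols are sent as method names
    transform.respond_to?(:call) ? transform.call(value) : value.public_send(transform)
  end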
end

normalizer = HashNormalizer.new(
  key_transform: :to_sym,
  value_transforms: { age: :to_i, salary: :to_f, active: ->(v) { v == 'true' } }
)

user_input = { 'name' => 'Alice', 'age' => '30', 'salary' => '75000.50', 'active' => 'true' }
normalized = normalizer.normalize(user_input)
# => {:name=>"Alice", :age=>30, :salary=>75000.5, :active=>true}
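One way to support both symbols and lambdas as value transforms is to convert every transform to a callable up front with to_proc. The helper name normalize_with below is hypothetical and only illustrates the idea; it is a sketch, not a replacement for the class above.

```ruby
# Hypothetical helper: every transform becomes a callable before use.
# Symbol#to_proc turns :to_i into a proc; Proc#to_proc returns the proc itself.
def normalize_with(hash, transforms)
  callables = transforms.transform_values(&:to_proc)

  hash.each_with_object({}) do |(key, value), result|
    sym = key.to_sym
    result[sym] = callables.key?(sym) ? callables[sym].call(value) : value
  end
end

transforms = { age: :to_i, salary: :to_f, active: ->(v) { v == 'true' } }
user_input = { 'name' => 'Alice', 'age' => '30', 'salary' => '75000.50', 'active' => 'true' }

normalized = normalize_with(user_input, transforms)
# => {:name=>"Alice", :age=>30, :salary=>75000.5, :active=>true}
```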

Performance & Memory

Hash transformation performance varies significantly based on the operation type, hash size, and transformation complexity. Non-destructive methods create new hash instances, consuming additional memory, while destructive methods modify existing structures.

Memory allocation patterns differ between transformation approaches. The transform_keys and transform_values methods allocate new hash objects plus individual key or value objects. The map method allocates intermediate arrays before hash reconstruction.

require 'benchmark'
require 'memory_profiler' # third-party gem: gem install memory_profiler

# Performance comparison for different transformation approaches
large_hash = (1..10_000).each_with_object({}) { |i, h| h["key_#{i}"] = "value_#{i}" }

# Destructive transformation (copy made outside the report so only the transform is measured)
test_hash = large_hash.dup
memory_report = MemoryProfiler.report do
  test_hash.transform_keys!(&:to_sym)
end

puts "Destructive transformation:"
puts "Total allocated: #{memory_report.total_allocated_memsize} bytes"
puts "Total retained: #{memory_report.total_retained_memsize} bytes"

# Non-destructive transformation (allocates a new hash for the result)
memory_report = MemoryProfiler.report do
  large_hash.transform_keys(&:to_sym)
end

puts "Non-destructive transformation:"
puts "Total allocated: #{memory_report.total_allocated_memsize} bytes"
puts "Total retained: #{memory_report.total_retained_memsize} bytes"
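When installing a profiling gem is not an option, CRuby's GC.stat counter gives a rough allocation count using only the standard runtime. The helper name allocated_objects is hypothetical, and the counter is specific to CRuby, so treat this as a sketch for quick comparisons rather than a precise measurement.

```ruby
# Count objects allocated while the block runs (CRuby-specific counter)
def allocated_objects
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable
end

sample = (1..1_000).each_with_object({}) { |i, h| h["key_#{i}"] = i }

via_map    = allocated_objects { sample.map { |k, v| [k, v * 2] }.to_h }
via_values = allocated_objects { sample.transform_values { |v| v * 2 } }

puts "map + to_h:       #{via_map} objects"
puts "transform_values: #{via_values} objects"
```

The map version allocates at least one pair array per entry, while transform_values only needs the result hash here because small integers are not heap objects.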

Benchmark comparisons show how the approaches differ in practice. Results depend on the Ruby version, the hash size, and the cost of the block, so measure on the target workload rather than assuming that destructive operations are faster.

# Benchmark different transformation approaches
large_hash = (1..50_000).each_with_object({}) { |i, h| h[i.to_s] = i * 2 }

Benchmark.bm(20) do |x|
  x.report('transform_keys') do
    large_hash.transform_keys(&:to_i)
  end
  
  # Copy outside the timed block so the dup is not part of the measurement
  test_hash = large_hash.dup
  x.report('transform_keys!') do
    test_hash.transform_keys!(&:to_i)
  end
  
  x.report('map + to_h') do
    large_hash.map { |k, v| [k.to_i, v] }.to_h
  end
  
  x.report('each_with_object') do
    large_hash.each_with_object({}) { |(k, v), h| h[k.to_i] = v }
  end
end

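# Relative timings depend on Ruby version and data, so run this on your own workload.
# What each approach allocates:
# map + to_h         one [key, value] array per entry plus the outer array
# each_with_object   only the result hash, filled from the block
# transform_keys     only the result hash and the new keys
# transform_keys!    new key objects; the receiver is reused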

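Deferred transformation pipelines record operations and run them only when the result is requested. The class below defers the work, but it still builds one intermediate hash per step when to_h runs, so placing select first keeps those intermediates small.

# Deferred transformation pipeline: steps are recorded, then applied in order by to_h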
class LazyHashTransformer
  def initialize(hash)
    @hash = hash
    @transformations = []
  end
  
  def transform_keys(&block)
    @transformations << [:keys, block]
    self
  end
  
  def transform_values(&block)
    @transformations << [:values, block]
    self
  end
  
  def select(&block)
    @transformations << [:select, block]
    self
  end
  
  def to_h
    @transformations.reduce(@hash) do |current_hash, (operation, block)|
      case operation
      when :keys then current_hash.transform_keys(&block)
      when :values then current_hash.transform_values(&block)
      when :select then current_hash.select(&block)
      end
    end
  end
end

# Filter first so the later steps only process matching entries
large_dataset = (1..100_000).each_with_object({}) { |i, h| h["item_#{i}"] = { value: i, category: i.even? ? 'even' : 'odd' } }

result = LazyHashTransformer.new(large_dataset)
  .select { |k, v| v[:category] == 'even' }
  .transform_keys(&:to_sym)
  .transform_values { |v| v[:value] * 2 }
  .to_h
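For pipelines that only need the first few matches, Ruby's built-in Enumerator::Lazy processes entries one at a time and stops as soon as enough results are found. A minimal sketch, using the same dataset shape as above:

```ruby
large_dataset = (1..100_000).each_with_object({}) do |i, h|
  h["item_#{i}"] = { value: i, category: i.even? ? 'even' : 'odd' }
end

# Each entry flows through select and map individually;
# iteration stops after three results have been produced.
first_three = large_dataset.lazy
  .select { |_key, entry| entry[:category] == 'even' }
  .map { |key, entry| [key.to_sym, entry[:value] * 2] }
  .first(3)
  .to_h
# => {:item_2=>4, :item_4=>8, :item_6=>12}
```

Lazy enumeration pays off when the pipeline can stop early; when every entry must be processed anyway, the eager hash methods are usually simpler.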

Common Pitfalls

Hash transformation operations exhibit several behavioral characteristics that commonly confuse developers. Key type consistency represents a frequent source of errors, particularly when mixing string and symbol keys.

# Key type mixing creates lookup problems
mixed_hash = { 'name' => 'John', :age => 30, 'email' => 'john@example.com' }
transformed = mixed_hash.transform_keys(&:to_s)

# This fails because original had symbol :age, now string 'age'
puts transformed[:age]  # => nil (key doesn't exist)
puts transformed['age'] # => 30 (correct string key)

# Safe approach: normalize key types first
def normalize_keys(hash)
  hash.each_with_object({}) do |(key, value), result|
    normalized_key = key.to_s.to_sym  # Convert all to symbols
    result[normalized_key] = value
  end
end

Mutating versus non-mutating behavior catches developers who expect consistent interfaces across Ruby methods. Some hash methods modify the receiver while others return new instances.

original = { a: 1, b: 2, c: 3 }

# These create new hashes
new_hash1 = original.transform_keys(&:to_s)
new_hash2 = original.select { |k, v| v > 1 }
puts original  # => {:a=>1, :b=>2, :c=>3} (unchanged)

# These modify the original hash
original.transform_keys!(&:to_s)
puts original  # => {"a"=>1, "b"=>2, "c"=>3} (changed!)

# Some destructive methods (compact!, select!, reject!) return nil when nothing changes
result = { a: 1 }.compact!
p result  # => nil (hash had no nil values to remove)
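Freezing a hash is one way to guard shared data against accidental in-place transformation. The example below uses illustrative data to show that non-destructive methods keep working on a frozen hash while destructive ones raise.

```ruby
defaults = { theme: 'dark', language: 'en' }.freeze

# Non-destructive methods still work and return a new, unfrozen hash
copy = defaults.transform_keys(&:to_s)
copy_frozen = copy.frozen? # => false

# Destructive methods raise instead of silently mutating shared data
begin
  defaults.transform_keys!(&:to_s)
rescue FrozenError => e
  error_class = e.class
end
# error_class => FrozenError
```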

Block parameter expectations differ between methods, leading to incorrect assumptions about available data. transform_keys and transform_values yield a single key or value, while map, select, and each_with_object yield key-value pairs.

hash = { name: 'Alice', age: 25 }

# This works - transform_values yields only values
hash.transform_values { |value| value.to_s.upcase }

# This silently misbehaves - transform_values yields only the value,
# so `key` receives the value and `value` is nil (blocks do not enforce arity)
hash.transform_values { |key, value| "#{key}: #{value}" }
# => {:name=>"Alice: ", :age=>"25: "}

# Use map for key-value pair access
hash.map { |key, value| [key, "#{key}: #{value}"] }.to_h

Nested hash transformation requires careful handling of different value types. Applying transformations designed for flat hashes to nested structures produces unexpected results.

# Problematic nested transformation
nested = {
  user: { name: 'John', details: { age: 30, city: 'NYC' } },
  settings: { theme: 'dark', notifications: true }
}

# This only transforms top-level keys
partial_transform = nested.transform_keys(&:to_s)
# => {"user"=>{:name=>"John", :details=>{:age=>30, :city=>"NYC"}}, "settings"=>{:theme=>"dark", :notifications=>true}}

# Transforming one level down without type checking fails on non-hash values
nested[:settings].transform_values { |v| v.transform_keys(&:to_s) }
# => NoMethodError: 'dark' is a String and does not respond to transform_keys

# Correct approach with type checking
def safe_deep_transform(hash)
  hash.transform_values do |value|
    case value
    when Hash then safe_deep_transform(value) # the recursive call stringifies the nested keys
    else value
    end
  end.transform_keys(&:to_s)
end

Block return values are stored exactly as returned, which produces surprising results when a block returns an unintended type or nil.

# Incorrect block return types
numbers = { a: '1', b: '2', c: '3' }

# Easy to miss - the block returns an array, so every value becomes an array
numbers.transform_values { |v| [v.to_i, v.to_f] }
# => {:a=>[1, 1.0], :b=>[2, 2.0], :c=>[3, 3.0]} (probably not intended)

# Easy to miss - a nil block result is stored as the value; the entry is not removed
mixed_data = { valid: '123', invalid: 'abc', empty: '' }
mixed_data.transform_values { |v| v.to_i if v.match?(/^\d+$/) }
# => {:valid=>123, :invalid=>nil, :empty=>nil} (nil values retained)

# Use compact to remove nil values after transformation
cleaned = mixed_data.transform_values { |v| v.to_i if v.match?(/^\d+$/) }.compact
# => {:valid=>123}
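On Ruby 2.7 and later, filter_map can filter and convert in one pass, so rejected entries never get a nil placeholder. A short sketch with the same data:

```ruby
mixed_data = { valid: '123', invalid: 'abc', empty: '' }

# The block returns a [key, value] pair for numeric strings and nil otherwise;
# filter_map drops the nil results before to_h rebuilds the hash.
cleaned = mixed_data.filter_map { |key, value| [key, value.to_i] if value.match?(/\A\d+\z/) }.to_h
# => {:valid=>123}
```

Unlike transform_values followed by compact, this version cannot accidentally drop entries whose converted value is legitimately nil, because the decision is made on the original value.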

Production Patterns

Production hash transformation patterns address real-world requirements including error handling, logging, validation, and integration with web frameworks. These patterns ensure robust data processing in applications handling user input, API responses, and configuration management.

API response normalization represents a common production use case where external data requires transformation for internal consumption. This pattern handles inconsistent data formats and missing fields gracefully.

# Assumes a Rails/ActiveSupport environment (String#underscore, Time.current)
class APIResponseNormalizer
  def self.normalize_user_data(api_response)
    normalized = api_response.transform_keys { |k| k.to_s.underscore.to_sym }
    
    # Handle nested address data
    if normalized[:address_info]
      normalized[:address] = normalize_address(normalized.delete(:address_info))
    end
    
    # Convert timestamps
    [:created_at, :updated_at].each do |timestamp_field|
      if normalized[timestamp_field]
        normalized[timestamp_field] = Time.parse(normalized[timestamp_field])
      end
    end
    
    # Fill in defaults for missing fields; values from the response take precedence
    normalized.merge(default_user_fields) { |_key, supplied, _default| supplied }
  end
  
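  # Helper class methods. A bare `private` does not apply to `def self.` methods;
  # use private_class_method to hide them from callers.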
  
  def self.normalize_address(address_data)
    return {} unless address_data.is_a?(Hash)
    
    address_data.transform_keys { |k| k.to_s.underscore.to_sym }
                .slice(:street, :city, :state, :zip_code, :country)
                .compact
  end
  
  def self.default_user_fields
    {
      active: true,
      role: :user,
      preferences: {},
      created_at: Time.current
    }
  end
end

# Usage in Rails controller or service class
# Address keys must underscore to the names that normalize_address slices
api_data = {
  'firstName' => 'Jane',
  'lastName' => 'Doe',
  'emailAddress' => 'jane@example.com',
  'addressInfo' => {
    'street' => '123 Main St',
    'city' => 'Boston',
    'state' => 'MA',
    'zipCode' => '02101'
  },
  'createdAt' => '2023-10-15T14:30:00Z'
}

normalized_user = APIResponseNormalizer.normalize_user_data(api_data)
# => {:first_name=>"Jane", :last_name=>"Doe", :email_address=>"jane@example.com",
#     :created_at=>2023-10-15 14:30:00 UTC,
#     :address=>{:street=>"123 Main St", :city=>"Boston", :state=>"MA", :zip_code=>"02101"},
#     :active=>true, :role=>:user, :preferences=>{}}
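The underscore method used above comes from ActiveSupport. Outside Rails, a small helper can cover the simple camelCase case. The name snake_case is hypothetical, and the regular expression handles only plain camelCase keys, not acronyms or namespaced names.

```ruby
# Hypothetical stand-in for ActiveSupport's String#underscore (simple camelCase only)
def snake_case(key)
  key.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

api_data = { 'firstName' => 'Jane', 'emailAddress' => 'jane@example.com', 'zipCode' => '02101' }

normalized = api_data.transform_keys { |key| snake_case(key).to_sym }
# => {:first_name=>"Jane", :email_address=>"jane@example.com", :zip_code=>"02101"}
```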

Configuration management systems require transformation patterns that handle environment variables, type coercion, and validation. These patterns ensure application configuration remains consistent across different deployment environments.

class ConfigurationTransformer
  BOOLEAN_VALUES = {
    'true' => true, '1' => true, 'yes' => true, 'on' => true,
    'false' => false, '0' => false, 'no' => false, 'off' => false
  }.freeze
  
  def self.process_environment_config(env_hash)
    env_hash
      .select { |key, _| key.start_with?('APP_') }
      .transform_keys { |key| key.sub('APP_', '').downcase.to_sym }
      .transform_values { |value| coerce_value(value) }
      .tap { |config| validate_required_config(config) }
  end
  
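  # Helper class methods. A bare `private` does not apply to `def self.` methods;
  # use private_class_method to hide them from callers.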
  
  def self.coerce_value(value)
    case value
    when /^\d+$/
      value.to_i
    when /^\d+\.\d+$/
      value.to_f
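    when /^(true|false|yes|no|on|off)$/i
      # '0' and '1' never reach this branch: the integer pattern above matches them first
      BOOLEAN_VALUES[value.downcase]
    when /,/
      # Comma-separated lists become arrays of trimmed strings
      value.split(',').map(&:strip)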
    else
      value
    end
  end
  
  def self.validate_required_config(config)
    required_keys = [:database_url, :secret_key, :port]
    missing_keys = required_keys - config.keys
    
    if missing_keys.any?
      raise "Missing required configuration: #{missing_keys.join(', ')}"
    end
  end
end

# Environment variables transformation
env_vars = {
  'APP_DATABASE_URL' => 'postgres://localhost:5432/myapp',
  'APP_SECRET_KEY' => 'super_secret_key_123',
  'APP_PORT' => '3000',
  'APP_DEBUG' => 'true',
  'APP_ALLOWED_HOSTS' => 'localhost,127.0.0.1,example.com',
  'PATH' => '/usr/bin:/bin',  # Non-app env var, should be ignored
  'APP_MAX_CONNECTIONS' => '100'
}

config = ConfigurationTransformer.process_environment_config(env_vars)
# => {:database_url=>"postgres://localhost:5432/myapp", :secret_key=>"super_secret_key_123", 
#     :port=>3000, :debug=>true, :allowed_hosts=>["localhost", "127.0.0.1", "example.com"], 
#     :max_connections=>100}
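Once configuration has been transformed, Hash#fetch makes the difference between required and optional settings explicit at the point of use. A brief sketch with illustrative values:

```ruby
config = { database_url: 'postgres://localhost:5432/myapp', port: 3000, debug: true }

# Required setting: fetch returns the value or raises KeyError if it is absent
port = config.fetch(:port) # => 3000

# Optional setting: the second argument is used when the key is missing
pool = config.fetch(:max_connections, 5) # => 5

begin
  config.fetch(:secret_key)
rescue KeyError => e
  message = e.message # => "key not found: :secret_key"
end
```

Compared with config[:secret_key], which quietly returns nil, fetch surfaces a missing key where it is read.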

Form data processing requires transformation patterns that handle validation, sanitization, and type conversion while preserving user experience through meaningful error messages.

class FormDataProcessor
  def self.process_user_registration(form_params)
    transformed_params = form_params
      .transform_keys { |k| k.to_s.underscore.to_sym }
      .transform_values { |v| v.is_a?(String) ? v.strip : v }
      .reject { |_, v| v.blank? }
    
    process_nested_attributes(transformed_params)
    validate_and_convert_types(transformed_params)
  rescue ValidationError => e
    { success: false, errors: e.errors, data: transformed_params }
  end
  
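  # Helper class methods. A bare `private` does not apply to `def self.` methods;
  # use private_class_method to hide them from callers.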
  
  def self.process_nested_attributes(params)
    # Handle nested attributes like profile_attributes
    params.keys.select { |k| k.to_s.end_with?('_attributes') }.each do |nested_key|
      base_key = nested_key.to_s.sub('_attributes', '').to_sym
      params[base_key] = params.delete(nested_key)
    end
  end
  
  def self.validate_and_convert_types(params)
    type_conversions = {
      age: ->(v) { Integer(v) if v.present? },
      birth_date: ->(v) { Date.parse(v) if v.present? },
      terms_accepted: ->(v) { ['1', 'true', true].include?(v) },
      salary: ->(v) { BigDecimal(v.to_s.gsub(/[,$]/, '')) if v.present? }
    }
    
    errors = {}
    
    type_conversions.each do |field, converter|
      next unless params.key?(field)
      
      begin
        params[field] = converter.call(params[field])
      rescue ArgumentError, TypeError => e
        errors[field] = "Invalid #{field} format"
        params.delete(field)
      end
    end
    
    raise ValidationError.new(errors) if errors.any?
    { success: true, data: params }
  end
end

class ValidationError < StandardError
  attr_reader :errors
  
  def initialize(errors)
    @errors = errors
    super("Validation failed: #{errors.keys.join(', ')}")
  end
end

# Process user registration form
form_data = {
  'firstName' => '  John  ',
  'lastName' => 'Doe',
  'emailAddress' => 'john.doe@example.com',
  'age' => '25',
  'birthDate' => '1998-05-15',
  'salary' => '$75,000.00',
  'termsAccepted' => '1',
  'profileAttributes' => {
    'bio' => 'Software developer',
    'location' => 'San Francisco'
  }
}

result = FormDataProcessor.process_user_registration(form_data)
# => {:success=>true, :data=>{:first_name=>"John", :last_name=>"Doe", 
#     :email_address=>"john.doe@example.com", :age=>25, :birth_date=>#<Date: 1998-05-15>, 
#     :salary=>0.75e5, :terms_accepted=>true,
#     :profile=>{:bio=>"Software developer", :location=>"San Francisco"}}}
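The processor above uses Integer() rather than to_i for validation, and the difference matters: to_i never fails, while Integer() rejects malformed input. A short comparison:

```ruby
lenient_ok      = '25'.to_i     # => 25
lenient_garbage = 'abc'.to_i    # => 0  (silently wrong for validation)
lenient_partial = '12abc'.to_i  # => 12 (trailing junk ignored)

strict_ok = Integer('25') # => 25

# Ruby 2.6+: return nil instead of raising
strict_nil = Integer('12abc', exception: false) # => nil

begin
  Integer('abc')
rescue ArgumentError => e
  strict_error = e.class # => ArgumentError
end
```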

Reference

Core Transformation Methods

Method                      Parameters                 Returns      Description
#transform_keys(&block)     Block yielding key         Hash         Returns new hash with keys transformed by block
#transform_keys!(&block)    Block yielding key         Hash         Modifies hash keys in place; always returns the receiver
#transform_values(&block)   Block yielding value       Hash         Returns new hash with values transformed by block
#transform_values!(&block)  Block yielding value       Hash         Modifies hash values in place; always returns the receiver
#map(&block)                Block yielding key, value  Array        Returns array of block results
#filter_map(&block)         Block yielding key, value  Array        Returns array of truthy block results (Ruby 2.7+)
#compact                    None                       Hash         Returns new hash with nil values removed
#compact!                   None                       Hash or nil  Removes nil values in place, returns nil if no changes

Enumeration Methods for Transformation

Method                         Parameters                 Returns  Description
#each_with_object(obj, &block) Initial object, block      Object   Iterates with accumulator object
#inject(initial=nil, &block)   Initial value, block       Object   Reduces hash to single value
#select(&block)                Block yielding key, value  Hash     Returns hash with entries matching block
#reject(&block)                Block yielding key, value  Hash     Returns hash with entries not matching block
#slice(*keys)                  Key list                   Hash     Returns hash containing only specified keys
#except(*keys)                 Key list                   Hash     Returns hash excluding specified keys

Common Transformation Patterns

Pattern                     Syntax                                                      Use Case
Key symbolization           hash.transform_keys(&:to_sym)                               Converting string keys to symbols
Key stringification         hash.transform_keys(&:to_s)                                 Converting symbol keys to strings
Value parsing               hash.transform_values(&:to_i)                               Converting string values to integers
Conditional transformation  hash.transform_values { |v| condition ? transform(v) : v }  Selective value modification
Nested key access           hash.dig(:key1, :key2, :key3)                               Safe nested value retrieval
Key normalization           hash.transform_keys { |k| k.to_s.downcase.to_sym }          Consistent key formatting

Error Types and Handling

Error Type     Cause                                                                  Prevention
NoMethodError  Calling method on wrong type                                           Type checking before transformation
ArgumentError  Wrong argument count for a lambda or method, or invalid Integer() input  Match lambda arity; plain blocks do not raise on arity mismatch
TypeError      Invalid type conversion                                                Validate types before conversion
KeyError       Missing required keys                                                  Use dig or fetch with defaults
RuntimeError   Custom validation failures                                             Implement proper error handling

Performance Characteristics (approximate; verify with a benchmark)

Method             Memory Usage                            Speed     Mutates Original
transform_keys     High (new hash + keys)                  Moderate  No
transform_keys!    Moderate (new key objects, same hash)   Varies    Yes
transform_values   High (new hash + values)                Moderate  No
transform_values!  Low (value replacement)                 Fast      Yes
map.to_h           High (intermediate array)               Slow      No
each_with_object   Moderate (single allocation)            Fast      No
Block Parameter Patterns

Method            Block Parameters        Example
transform_keys    key                     { |key| key.to_sym }
transform_values  value                   { |value| value.to_i }
map               key, value              { |k, v| [k.to_sym, v] }
select            key, value              { |k, v| k.start_with?('user') }
each_with_object  (key, value), object    { |(k, v), h| h[k] = v.upcase }

Type Conversion Helpers

Method    Input Type      Output Type  Example
to_sym    String          Symbol       "name".to_sym # => :name
to_s      Symbol/Other    String       :name.to_s # => "name"
to_i      String/Numeric  Integer      "123".to_i # => 123
to_f      String/Numeric  Float        "123.45".to_f # => 123.45
strip     String          String       " text ".strip # => "text"
downcase  String          String       "TEXT".downcase # => "text"