Hash Merging and Updating

Complete guide to Ruby's hash merging and updating operations, covering merge methods, performance considerations, and production usage patterns.


Overview

Ruby provides multiple approaches for combining hash data through merge and update operations. The core methods Hash#merge, Hash#merge!, and Hash#update enable combining key-value pairs from multiple hashes with different strategies for handling conflicts and mutations.

The merge method creates a new hash containing entries from both the receiver and argument hash, while merge! and update modify the receiver hash directly. When duplicate keys exist, values from the argument hash take precedence by default, though custom conflict resolution logic can be provided through blocks.

base = { a: 1, b: 2 }
extension = { b: 3, c: 4 }

# Non-destructive merge
result = base.merge(extension)
# => { a: 1, b: 3, c: 4 }
# base remains { a: 1, b: 2 }

# Destructive merge
base.merge!(extension)
# => { a: 1, b: 3, c: 4 }
# base is now { a: 1, b: 3, c: 4 }

Ruby treats Hash#update as an alias for Hash#merge!, providing identical functionality with different naming conventions. Both methods modify the receiver hash and return the modified hash object.
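
A quick check confirms the alias behavior, including that update returns the receiver itself:

settings = { host: "localhost" }
returned = settings.update(port: 3000)
# => { host: "localhost", port: 3000 }

returned.equal?(settings)
# => true; update returned the same, now-modified hash object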

The merge operations accept any object that can be implicitly converted to a Hash through #to_hash, making them flexible for various data integration scenarios. Key comparison uses the same equality semantics as hash key lookup, supporting strings, symbols, and custom objects as keys.
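
For example, an object that defines #to_hash participates in merges directly (a small illustration; EnvOptions is a made-up class):

class EnvOptions
  def initialize(pairs)
    @pairs = pairs
  end

  # Implicit conversion protocol consulted by Hash#merge
  def to_hash
    @pairs
  end
end

{ a: 1 }.merge(EnvOptions.new(b: 2))
# => { a: 1, b: 2 }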

Basic Usage

The Hash#merge method combines two hashes without modifying the original, creating a new hash with combined key-value pairs. When keys appear in both hashes, the argument hash values override the receiver hash values.

defaults = { timeout: 30, retries: 3, debug: false }
options = { timeout: 60, ssl: true }

config = defaults.merge(options)
# => { timeout: 60, retries: 3, debug: false, ssl: true }

puts defaults
# => { timeout: 30, retries: 3, debug: false }

The destructive variants Hash#merge! and Hash#update modify the receiver hash directly, adding or overwriting keys from the argument hash. These methods return the modified receiver hash, enabling method chaining.

settings = { host: "localhost", port: 3000 }
settings.merge!(ssl: true, port: 443)
# => { host: "localhost", port: 443, ssl: true }

# settings hash is permanently modified
puts settings
# => { host: "localhost", port: 443, ssl: true }

Multiple hashes can be merged in sequence, with later arguments taking precedence over earlier ones. This pattern supports building configuration objects from multiple sources with clear precedence rules.

base_config = { timeout: 10, retries: 1 }
env_config = { timeout: 30, logging: true }
user_config = { retries: 5, cache: false }

final_config = base_config.merge(env_config).merge(user_config)
# => { timeout: 30, retries: 5, logging: true, cache: false }
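
Since Ruby 2.6, merge also accepts multiple hashes in a single call, which avoids the intermediate hash allocated by each chained merge:

final_config = base_config.merge(env_config, user_config)
# => { timeout: 30, retries: 5, logging: true, cache: false }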

Block-based merging provides custom conflict resolution when keys exist in both hashes. The block receives the key, old value, and new value, returning the value to use in the merged hash.

sales_q1 = { "east" => 1000, "west" => 800, "north" => 600 }
sales_q2 = { "east" => 1200, "west" => 900, "south" => 700 }

total_sales = sales_q1.merge(sales_q2) { |region, q1, q2| q1 + q2 }
# => { "east" => 2200, "west" => 1700, "north" => 600, "south" => 700 }

Advanced Usage

Complex merge scenarios often require sophisticated conflict resolution strategies beyond simple value replacement. Block-based merging enables custom logic for handling duplicate keys, including mathematical operations, array concatenation, and nested hash merging.

def deep_merge(base, other)
  base.merge(other) do |key, old_val, new_val|
    if old_val.is_a?(Hash) && new_val.is_a?(Hash)
      deep_merge(old_val, new_val)
    else
      new_val
    end
  end
end

config1 = {
  database: { host: "localhost", pool: 5 },
  cache: { ttl: 300 }
}

config2 = {
  database: { port: 5432, pool: 10 },
  logging: { level: "info" }
}

merged = deep_merge(config1, config2)
# => {
#      database: { host: "localhost", pool: 10, port: 5432 },
#      cache: { ttl: 300 },
#      logging: { level: "info" }
#    }

Conditional merging strategies can preserve certain values based on custom criteria. This approach proves useful for maintaining defaults while selectively updating specific fields.

def smart_merge(base, updates, preserve: [])
  base.merge(updates) do |key, old_val, new_val|
    if preserve.include?(key)
      old_val
    elsif new_val.respond_to?(:empty?) && new_val.empty?
      old_val   # ignore blank updates, keeping the existing value
    else
      new_val
    end
  end
end

user_prefs = { theme: "dark", notifications: true, email: nil }
form_data = { theme: "light", email: "", timezone: "UTC" }

result = smart_merge(user_prefs, form_data, preserve: [:notifications])
# => { theme: "light", notifications: true, email: nil, timezone: "UTC" }

Array value merging requires explicit handling since default merge behavior replaces arrays entirely. Custom merge logic can concatenate, union, or apply set operations to array values.

def merge_with_arrays(base, other, array_strategy: :concat)
  base.merge(other) do |key, old_val, new_val|
    if old_val.is_a?(Array) && new_val.is_a?(Array)
      case array_strategy
      when :concat
        old_val + new_val
      when :union
        (old_val + new_val).uniq
      when :intersect
        old_val & new_val
      else
        new_val
      end
    else
      new_val
    end
  end
end

tags1 = { post: ["ruby", "programming"], categories: ["tech", "tutorial"] }
tags2 = { post: ["hash", "programming"], categories: ["guide"] }

combined = merge_with_arrays(tags1, tags2, array_strategy: :union)
# => { post: ["ruby", "programming", "hash"], categories: ["tech", "tutorial", "guide"] }

Merge operations can be chained with transformation blocks to modify values during the merge process. This pattern enables data normalization and validation during hash combination.

class ConfigMerger
  def self.merge_and_transform(base, updates, &block)
    merged = base.merge(updates)
    return merged unless block_given?
    
    merged.transform_values(&block)
  end
end

base = { timeout: "30", max_connections: "100" }
updates = { timeout: "60", retry_attempts: "3" }

config = ConfigMerger.merge_and_transform(base, updates) do |value|
  value.is_a?(String) && value.match?(/^\d+$/) ? value.to_i : value
end
# => { timeout: 60, max_connections: 100, retry_attempts: 3 }

Common Pitfalls

Hash merging operations contain several subtle behaviors that can lead to unexpected results. The most common issue involves confusion between destructive and non-destructive merge methods, particularly when working with shared hash references.

original = { a: 1, b: 2 }
shared_ref = original

# This modifies the original hash
shared_ref.merge!(c: 3)
puts original
# => { a: 1, b: 2, c: 3 } - unexpected modification!

# Safe approach uses non-destructive merge
safe_result = original.merge(d: 4)
puts original
# => { a: 1, b: 2, c: 3 } - only previous modifications remain

Nested hash structures do not merge recursively by default. The merge operation replaces entire nested hash values rather than combining their contents, leading to data loss in complex structures.

user1 = {
  name: "Alice",
  preferences: { theme: "dark", notifications: true, language: "en" }
}

user2 = {
  preferences: { theme: "light", timezone: "PST" }
}

# Nested preferences are completely replaced, not merged
merged = user1.merge(user2)
puts merged[:preferences]
# => { theme: "light", timezone: "PST" }
# Lost: notifications and language settings
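
The deep_merge helper shown earlier in Advanced Usage preserves the nested entries instead:

merged = deep_merge(user1, user2)
merged[:preferences]
# => { theme: "light", notifications: true, language: "en", timezone: "PST" }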

Key type mismatches create silent merge failures where string and symbol keys are treated as distinct, resulting in duplicate conceptual keys in the merged hash.

config1 = { "timeout" => 30, port: 3000 }
config2 = { timeout: 60, "port" => 8080 }

merged = config1.merge(config2)
# => { "timeout" => 30, port: 3000, timeout: 60, "port" => 8080 }
# Both string and symbol versions of each key coexist
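
Normalizing key types before merging removes the ambiguity; transform_keys converts in one pass:

merged = config1.transform_keys(&:to_sym)
                .merge(config2.transform_keys(&:to_sym))
# => { timeout: 60, port: 8080 }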

Block-based merge methods can produce confusing results when the block returns nil or other falsy values. The merge operation uses the block's return value directly, including nil and false.

hash1 = { a: 10, b: 20 }
hash2 = { a: nil, c: 15 }

# Intended to keep the larger value, but the guard fails when either side is nil
result = hash1.merge(hash2) { |k, v1, v2| v1 > v2 ? v1 : v2 if v1 && v2 }
# => { a: nil, b: 20, c: 15 }
# Block returned nil for key :a because the condition v1 && v2 was false
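
An explicit fallback keeps the block total over all inputs:

result = hash1.merge(hash2) do |_key, v1, v2|
  if v1 && v2
    v1 > v2 ? v1 : v2
  else
    v1 || v2  # keep whichever value exists
  end
end
# => { a: 10, b: 20, c: 15 }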

Frozen hash objects raise exceptions when destructive merge operations are attempted. This behavior can interrupt program flow unexpectedly when working with immutable configuration objects.

frozen_config = { debug: false }.freeze

begin
  frozen_config.merge!(verbose: true)
rescue FrozenError => e
  puts "Cannot modify frozen hash: #{e.message}"
end

# Safe approach creates new hash
new_config = frozen_config.merge(verbose: true)

Performance & Memory

Hash merge operations exhibit different performance characteristics based on the size of input hashes and the chosen merge strategy. Non-destructive merging allocates new hash objects and copies all key-value pairs, while destructive merging modifies existing hash structures with lower memory overhead.

require 'benchmark'

large_base = {}
1000.times { |i| large_base["key_#{i}"] = i }

small_update = { "new_key" => 999, "key_1" => 1001 }

Benchmark.bm(15) do |x|
  x.report("merge") { large_base.merge(small_update) }
  x.report("merge!") { large_base.dup.merge!(small_update) }
end

# merge copies all ~1000 entries into a new hash on every call
# merge! touches only the updated keys; the dup above exists solely
# to keep the base hash unchanged between benchmark iterations

Memory allocation patterns differ significantly between merge approaches. The merge method allocates memory proportional to the combined size of both hashes, while merge! allocates memory only for new keys and modified internal structures.

def measure_memory_usage
  before = GC.stat(:total_allocated_objects)
  yield
  after = GC.stat(:total_allocated_objects)
  after - before
end

base = { a: 1, b: 2, c: 3 }
update = { d: 4, e: 5 }

non_destructive = measure_memory_usage { base.merge(update) }
destructive = measure_memory_usage { base.merge!(update) }  # mutates base

puts "merge allocations: #{non_destructive}"
puts "merge! allocations: #{destructive}"
# merge! typically shows lower counts since no new hash is built

Complex merge operations with custom blocks add computational overhead proportional to the number of conflicting keys. Block evaluation occurs for each key collision, making merge performance dependent on hash overlap and block complexity.

def expensive_merge_block(key, old_val, new_val)
  # Simulated expensive operation
  sleep(0.001)
  old_val + new_val
end

overlapping_keys = {}
100.times { |i| overlapping_keys["shared_#{i}"] = i }

base_data = overlapping_keys.dup
update_data = overlapping_keys.transform_values { |v| v * 2 }

# Block called for each of 100 overlapping keys
start_time = Time.now
result = base_data.merge(update_data, &method(:expensive_merge_block))
duration = Time.now - start_time

puts "Merge with expensive block took #{duration} seconds"
# Time increases linearly with number of key conflicts

Large-scale merge operations benefit from minimizing hash growth during the merge. Duplicating the largest input copies its table in a single step, so the remaining hashes merge into an already-sized structure instead of growing an empty hash entry by entry.

def optimized_bulk_merge(base, *hashes)
  # dup copies the base table (including its compare_by_identity
  # setting) in one step, avoiding incremental growth from an empty hash
  result = base.dup
  hashes.each { |h| result.merge!(h) }
  result
end

# Performance improvement visible with large hash sets
hash_set = Array.new(50) { |i| { "group_#{i}" => (1..100).to_a } }
base_hash = { "initial" => "value" }

optimized_result = optimized_bulk_merge(base_hash, *hash_set)

Production Patterns

Configuration management represents the most common production use case for hash merging, where applications combine settings from multiple sources with defined precedence rules. Environment variables, configuration files, and runtime parameters merge to create final application configuration.

class AppConfig
  def self.build
    base_config = load_defaults
    env_config = load_environment_specific
    runtime_config = load_runtime_overrides
    
    base_config
      .merge(env_config)
      .merge(runtime_config)
      .merge(load_secrets)
  end

  def self.load_defaults
    {
      database_pool: 5,
      timeout: 30,
      retry_attempts: 3,
      log_level: 'info'
    }
  end

  def self.load_environment_specific
    env = ENV['RAILS_ENV'] || 'development'
    case env
    when 'production'
      { database_pool: 20, log_level: 'warn', monitoring: true }
    when 'test'
      { database_pool: 1, timeout: 5 }
    else
      { debug: true, log_level: 'debug' }
    end
  end

  def self.load_runtime_overrides
    overrides = {}
    overrides[:timeout] = ENV['APP_TIMEOUT'].to_i if ENV['APP_TIMEOUT']
    overrides[:debug] = true if ENV['DEBUG']
    overrides
  end

  def self.load_secrets
    secrets_file = ENV['SECRETS_FILE'] || 'secrets.yml'
    return {} unless File.exist?(secrets_file)
    
    YAML.load_file(secrets_file) || {}
  end
  private_class_method :load_defaults, :load_environment_specific,
                       :load_runtime_overrides, :load_secrets
end

API parameter processing commonly uses merge operations to combine default parameters with user-supplied values, ensuring complete parameter sets while respecting client preferences.

class ApiController
  DEFAULT_PAGINATION = { page: 1, per_page: 20, max_per_page: 100 }.freeze
  DEFAULT_FILTERS = { status: 'active', sort: 'created_at' }.freeze

  def index
    pagination = build_pagination_params
    filters = build_filter_params
    
    query_params = DEFAULT_FILTERS
      .merge(filters)
      .merge(pagination: pagination)
    
    render json: fetch_records(query_params)
  end

  private

  def build_pagination_params
    user_pagination = params.permit(:page, :per_page).to_h.symbolize_keys
    
    DEFAULT_PAGINATION.merge(user_pagination) do |key, default, user_value|
      case key
      when :per_page
        [user_value.to_i, DEFAULT_PAGINATION[:max_per_page]].min
      when :page
        [user_value.to_i, 1].max
      else
        user_value
      end
    end
  end

  def build_filter_params
    permitted_filters = params.permit(:status, :category, :sort).to_h
    permitted_filters.reject { |k, v| v.blank? }.symbolize_keys
  end
end

Cache key generation often employs hash merging to combine base cache identifiers with dynamic parameters, creating comprehensive cache keys for complex caching strategies.

class CacheKeyBuilder
  def self.build(base_key, **options)
    base_components = { key: base_key, version: cache_version }
    
    dynamic_components = options.merge(timestamp: current_cache_window)
    
    all_components = base_components.merge(dynamic_components)
    
    generate_key(all_components)
  end

  def self.cache_version
    Rails.cache.fetch('app_cache_version', expires_in: 1.hour) do
      Time.current.to_i / 3600 # Hourly cache version
    end
  end

  def self.current_cache_window
    (Time.current.to_i / 300) * 300 # 5-minute windows
  end

  def self.generate_key(components)
    stable_hash = components.sort.to_h
    Digest::SHA256.hexdigest(stable_hash.to_json)[0, 16]
  end
  private_class_method :cache_version, :current_cache_window, :generate_key
end

# Usage in application
cache_key = CacheKeyBuilder.build(
  'user_dashboard',
  user_id: current_user.id,
  role: current_user.role,
  features: enabled_features
)

Feature flag systems utilize hash merging to combine global feature settings with user-specific overrides and A/B testing configurations.

class FeatureFlags
  def self.for_user(user)
    global_flags = load_global_flags
    user_overrides = load_user_overrides(user)
    ab_test_flags = load_ab_test_flags(user)
    
    global_flags
      .merge(user_overrides)
      .merge(ab_test_flags)
      .merge(emergency_overrides)
  end

  def self.load_global_flags
    Rails.cache.fetch('global_feature_flags', expires_in: 5.minutes) do
      FeatureFlag.where(scope: 'global').pluck(:name, :enabled).to_h
    end
  end

  def self.load_user_overrides(user)
    return {} unless user&.admin?
    
    user.feature_flag_overrides.pluck(:flag_name, :enabled).to_h
  end

  def self.load_ab_test_flags(user)
    return {} unless user

    AbTest.active.each_with_object({}) do |test, flags|
      variant = test.variant_for_user(user)
      flags.merge!(test.flags_for_variant(variant))
    end
  end

  def self.emergency_overrides
    # Critical flags that override everything else
    emergency_file = Rails.root.join('config', 'emergency_flags.yml')
    return {} unless File.exist?(emergency_file)
    
    YAML.load_file(emergency_file) || {}
  end
  private_class_method :load_global_flags, :load_user_overrides,
                       :load_ab_test_flags, :emergency_overrides
end

Reference

Core Methods

Method | Parameters | Returns | Description
#merge(other_hash) | other_hash (Hash or #to_hash) | Hash | Returns new hash with combined entries
#merge(other_hash, &block) | other_hash (Hash), block | Hash | Returns new hash using block for conflicts
#merge!(other_hash) | other_hash (Hash or #to_hash) | Hash | Modifies receiver, adds entries from other_hash
#merge!(other_hash, &block) | other_hash (Hash), block | Hash | Modifies receiver using block for conflicts
#update(other_hash) | other_hash (Hash or #to_hash) | Hash | Alias for merge!, modifies receiver
#update(other_hash, &block) | other_hash (Hash), block | Hash | Alias for merge! with block

Block Parameters

Parameter | Type | Description
key | Object | The conflicting key present in both hashes
old_value | Object | Value from the receiver hash
new_value | Object | Value from the argument hash

Common Merge Patterns

Pattern | Example | Use Case
Default Override | defaults.merge(options) | Configuration with fallbacks
Conditional Merge | hash.merge(other) { |k, o, n| condition ? o : n } | Custom conflict resolution
Array Combination | h1.merge(h2) { |k, o, n| o + n } | Combining array values
Deep Merge | deep_merge(base, extension) | Nested hash structures
Bulk Merge | [h1, h2, h3].reduce(&:merge) | Multiple hash combination
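
The reduce form allocates a fresh intermediate hash at every step; for long lists of hashes, accumulating with merge! into a single result avoids those intermediates (a sketch assuming plain Hash inputs):

hashes = [{ a: 1 }, { b: 2 }, { c: 3 }]

# Allocates a new hash per step
combined = hashes.reduce({}, :merge)

# Mutates one accumulator in place
combined = hashes.each_with_object({}) { |h, acc| acc.merge!(h) }
# => { a: 1, b: 2, c: 3 }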

Performance Characteristics

Operation | Memory | Time Complexity | Notes
merge | O(n+m) | O(n+m) | Allocates new hash with all entries
merge! | O(k) | O(m) | Only allocates for new keys
Block merge | O(n+m) | O(n+m+c) | Additional cost c for conflict resolution
Deep merge | O(d*n) | O(d*n) | Recursive depth d multiplies cost

(n = receiver size, m = argument size, k = keys new to the receiver, c = total block cost, d = nesting depth)

Error Conditions

Error | Cause | Prevention
FrozenError | Calling merge! on frozen hash | Use merge or dup before merge!
TypeError | Argument cannot be converted to a Hash | Ensure argument is a Hash or defines #to_hash
SystemStackError | Deep recursion in merge block | Limit recursion depth in merge blocks
NoMethodError | Missing method on values | Validate value types in merge blocks

Memory Optimization Tips

Technique | Benefit | Trade-off
Use merge! for large base hashes | Reduces memory allocation | Modifies original
Start from a dup of the largest hash | Reduces rehashing during growth | Additional copy up front
Avoid block merges for simple cases | Eliminates block call overhead | Less flexibility
Use dup strategically | Preserves originals safely | Additional memory cost

Thread Safety Notes

Method | Thread Safe | Notes
merge | Yes | Creates a new hash; inputs must still not be mutated concurrently
merge! | No | Modifies receiver hash
update | No | Alias for merge!
Block-based merges | Depends | Block execution determines safety
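
When threads share a hash that must be updated in place, serializing the destructive merge behind a Mutex is a minimal safeguard (a sketch; thread-safe collections such as those in the concurrent-ruby gem are an alternative):

shared = { requests: 0 }
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    100.times do
      # merge! is not atomic; the lock makes the read-modify-write safe
      lock.synchronize { shared.merge!(requests: shared[:requests] + 1) }
    end
  end
end
threads.each(&:join)

shared[:requests]
# => 400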