
CrackedRuby

dump and load Methods

Complete guide to Ruby's dump and load methods for object serialization across Marshal, JSON, YAML, and custom implementations.


Overview

Ruby provides multiple mechanisms for serializing objects into storable formats and deserializing them back into memory. The dump and load pattern appears across several standard library modules, each serving different use cases and data formats.

Marshal offers Ruby's native binary serialization format, preserving Ruby object types and internal structure. JSON provides text-based serialization compatible with web standards but limited to basic data types. YAML delivers human-readable serialization with broader type support than JSON while maintaining cross-language compatibility.

# Marshal preserves Ruby object identity
data = { name: "Alice", scores: [95, 87, 92] }
serialized = Marshal.dump(data)
restored = Marshal.load(serialized)
# => {:name=>"Alice", :scores=>[95, 87, 92]}

# JSON converts to basic types
json_data = JSON.dump(data)
json_restored = JSON.load(json_data)  
# => {"name"=>"Alice", "scores"=>[95, 87, 92]} (keys become strings)

# YAML maintains readability
yaml_data = YAML.dump(data)
yaml_restored = YAML.load(yaml_data)
# => {:name=>"Alice", :scores=>[95, 87, 92]}

Custom classes can implement dump and load behavior through several hooks. The marshal_dump and marshal_load methods control Marshal serialization, while to_json (and, in Rails, as_json) controls JSON conversion. Classes can also define custom serialization logic for specific business requirements.
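
A minimal sketch of these hooks on a hypothetical Point class (marshal_dump, marshal_load, and to_json are real hooks; the class itself is illustrative):

```ruby
require 'json'

# Hypothetical class showing both serialization hooks
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  # Marshal calls these instead of copying instance variables directly
  def marshal_dump
    [@x, @y]
  end

  def marshal_load(array)
    @x, @y = array
  end

  # JSON conversion goes through an explicit hash representation
  def to_json(*args)
    { x: @x, y: @y }.to_json(*args)
  end
end

point = Point.new(3, 4)
restored = Marshal.load(Marshal.dump(point))
puts restored.x       # => 3
puts point.to_json    # => {"x":3,"y":4}
```

Returning a compact array from marshal_dump keeps the serialized form small and decoupled from instance variable names.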

Each serialization format involves tradeoffs between performance, portability, security, and feature completeness. Marshal provides the fastest serialization for Ruby objects but creates Ruby-specific binary data. JSON offers broad compatibility but loses type information. YAML balances readability with reasonable performance while supporting complex data structures.

Basic Usage

The fundamental dump and load pattern follows consistent syntax across Ruby's serialization libraries. The dump method converts objects into serialized form, while load reconstructs objects from serialized data.

# Marshal operations
original_object = [1, 2, { key: "value" }]
marshaled_data = Marshal.dump(original_object)
reconstructed = Marshal.load(marshaled_data)

# File-based Marshal serialization
File.open("data.marshal", "wb") do |file|
  Marshal.dump(original_object, file)
end

loaded_from_file = File.open("data.marshal", "rb") do |file|
  Marshal.load(file)
end

JSON serialization converts Ruby objects to JavaScript Object Notation, handling strings, numbers, arrays, and hashes. Ruby symbols convert to strings, and other object types require explicit handling.

# Basic JSON operations
data = { users: ["Alice", "Bob"], count: 2 }
json_string = JSON.dump(data)
# => "{\"users\":[\"Alice\",\"Bob\"],\"count\":2}"

parsed_data = JSON.load(json_string)
# => {"users"=>["Alice", "Bob"], "count"=>2}

# JSON with custom objects requires conversion
class User
  attr_accessor :name, :email
  
  def initialize(name, email)
    @name, @email = name, email
  end
  
  def to_h
    { name: @name, email: @email }
  end
end

user = User.new("Alice", "alice@example.com")
json_data = JSON.dump(user.to_h)
user_data = JSON.load(json_data)
restored_user = User.new(user_data["name"], user_data["email"])

YAML serialization preserves more Ruby types while maintaining human readability. YAML handles symbols, dates, and custom objects more naturally than JSON.

# YAML with complex data
complex_data = {
  timestamp: Time.now,
  config: { debug: true, timeout: 30 },
  tags: [:important, :urgent]
}

yaml_output = YAML.dump(complex_data)
puts yaml_output
# ---
# :timestamp: 2024-03-15 10:30:45.123456 -04:00
# :config:
#   :debug: true
#   :timeout: 30
# :tags:
# - :important
# - :urgent

# Psych 4 (Ruby 3.1+) restricts YAML.load; permit Time explicitly
yaml_loaded = YAML.load(yaml_output, permitted_classes: [Symbol, Time])
# Preserves symbols and Time objects

Stream-based operations handle large datasets without loading entire objects into memory. Both Marshal and YAML support streaming through file handles or IO objects.

# Streaming multiple objects
objects = (1..1000).map { |i| { id: i, data: "item_#{i}" } }

File.open("stream.marshal", "wb") do |file|
  objects.each { |obj| Marshal.dump(obj, file) }
end

# Reading streamed objects
loaded_objects = []
File.open("stream.marshal", "rb") do |file|
  until file.eof?
    loaded_objects << Marshal.load(file)
  end
end

Error Handling & Debugging

Serialization operations encounter various failure modes requiring robust error handling strategies. Marshal raises TypeError for unserializable objects, while JSON and YAML have distinct error patterns for invalid data or parsing failures.

# Marshal error handling
class NonSerializable
  def initialize
    @proc = Proc.new { "cannot serialize" }
  end
end

begin
  Marshal.dump(NonSerializable.new)
rescue TypeError => e
  puts "Marshal error: #{e.message}"
  # Handle by converting to serializable form
  serializable_data = { class_name: "NonSerializable", data: "converted" }
  Marshal.dump(serializable_data)
end

# JSON error patterns
invalid_json = '{"incomplete": true'
begin
  JSON.load(invalid_json)
rescue JSON::ParserError => e
  puts "JSON parsing failed: #{e.message}"
  # Fall back to a safe default; never eval malformed input
  fallback_data = {}
end

# Circular reference handling
circular = {}
circular[:self] = circular

begin
  JSON.generate(circular)
rescue JSON::NestingError => e
  puts "Circular reference detected: #{e.message}"
  # Break circular references before serialization
  safe_circular = circular.dup
  safe_circular.delete(:self)
  JSON.dump(safe_circular)
end

YAML parsing errors require special attention due to its flexibility in representing data types. Invalid YAML syntax or security concerns with loaded objects demand careful validation.

# YAML error handling with validation
suspicious_yaml = <<-YAML
--- !ruby/object:User
name: "Alice"
instance_eval: "system('rm -rf /')"
YAML

begin
  # Never use YAML.load with untrusted input; safe_load rejects unexpected classes
  YAML.safe_load(suspicious_yaml)
rescue Psych::DisallowedClass => e
  puts "YAML security violation: #{e.message}"
  # Fall back to a safe default instead of instantiating the tagged object
  safe_data = {}
rescue Psych::SyntaxError => e
  puts "YAML syntax error: #{e.message}"
  # Attempt line-by-line validation for debugging
  lines = suspicious_yaml.split("\n")
  lines.each_with_index do |line, index|
    begin
      YAML.safe_load("---\n#{line}")
    rescue Psych::SyntaxError
      puts "Error on line #{index + 1}: #{line}"
    end
  end
end

Version compatibility issues arise when serialized data contains objects or formats incompatible with current Ruby versions. Implementing version checks prevents runtime failures.

# Version-aware serialization
class VersionedData
  VERSION = "2.1.0".freeze
  
  def self.dump(object)
    versioned = { version: VERSION, data: object, timestamp: Time.now }
    Marshal.dump(versioned)
  end
  
  def self.load(serialized_data)
    container = Marshal.load(serialized_data)
    
    case container[:version]
    when "1.0.0"
      migrate_from_v1(container[:data])
    when "2.0.0"
      migrate_from_v2(container[:data])
    when VERSION
      container[:data]
    else
      raise "Unsupported version: #{container[:version]}"
    end
  end
  
  def self.migrate_from_v1(data)
    # Handle v1 to current version migration
    data.respond_to?(:transform_keys) ? data.transform_keys(&:to_sym) : data
  end

  def self.migrate_from_v2(data)
    # Handle v2 to current version migration
    data[:migrated_at] = Time.now
    data
  end

  # `private` does not affect singleton methods; hide them explicitly
  private_class_method :migrate_from_v1, :migrate_from_v2
end

Debugging serialization issues requires systematic approaches to identify problematic objects or data structures. Implementing diagnostic tools helps locate serialization failures in complex object graphs.

# Serialization diagnostic tools
module SerializationDebugger
  def self.find_unserializable(object, path = "root")
    case object
    when Hash
      object.each do |key, value|
        find_unserializable(key, "#{path}[#{key.inspect}] (key)")
        find_unserializable(value, "#{path}[#{key.inspect}]")
      end
    when Array
      object.each_with_index do |item, index|
        find_unserializable(item, "#{path}[#{index}]")
      end
    else
      begin
        Marshal.dump(object)
      rescue TypeError => e
        puts "Cannot serialize #{object.class} at #{path}: #{e.message}"
        puts "Object: #{object.inspect}"
      end
    end
  end
  
  def self.size_analysis(object)
    marshal_size = Marshal.dump(object).bytesize
    json_size = JSON.dump(object).bytesize rescue nil
    yaml_size = YAML.dump(object).bytesize rescue nil
    
    puts "Serialization sizes:"
    puts "Marshal: #{marshal_size} bytes"
    puts "JSON: #{json_size || 'N/A'} bytes"
    puts "YAML: #{yaml_size || 'N/A'} bytes"
  end
end

Performance & Memory

Serialization performance varies significantly across formats and data types. Marshal provides the fastest serialization for Ruby-native objects, while JSON offers better performance for simple data structures with cross-platform requirements.

require 'benchmark'

# Performance comparison across formats
test_data = {
  users: (1..1000).map do |i|
    {
      id: i,
      name: "User #{i}",
      email: "user#{i}@example.com",
      metadata: { created_at: Time.now - i * 3600 }
    }
  end
}

Benchmark.bm(15) do |x|
  x.report("Marshal dump:") { Marshal.dump(test_data) }
  x.report("JSON dump:") { JSON.dump(test_data) }
  x.report("YAML dump:") { YAML.dump(test_data) }
end

# Typical results:
#                      user     system      total        real
# Marshal dump:    0.015000   0.000000   0.015000 (  0.016234)
# JSON dump:       0.025000   0.000000   0.025000 (  0.026891)  
# YAML dump:       0.180000   0.010000   0.190000 (  0.192356)

Memory usage patterns differ between formats, with Marshal creating the most compact representation for Ruby objects while YAML generates larger output due to human-readable formatting.

# Memory usage analysis
def memory_usage(&block)
  before = GC.stat(:total_allocated_objects)
  yield
  after = GC.stat(:total_allocated_objects)
  after - before
end

large_array = (1..10000).to_a

marshal_objects = memory_usage { Marshal.dump(large_array) }
json_objects = memory_usage { JSON.dump(large_array) }
yaml_objects = memory_usage { YAML.dump(large_array) }

puts "Object allocations during serialization:"
puts "Marshal: #{marshal_objects}"
puts "JSON: #{json_objects}" 
puts "YAML: #{yaml_objects}"

# Size comparison
marshal_data = Marshal.dump(large_array)
json_data = JSON.dump(large_array)
yaml_data = YAML.dump(large_array)

puts "\nSerialized data sizes:"
puts "Marshal: #{marshal_data.bytesize} bytes"
puts "JSON: #{json_data.bytesize} bytes"
puts "YAML: #{yaml_data.bytesize} bytes"

Streaming operations minimize memory consumption when processing large datasets. Implementing custom streaming serializers prevents memory exhaustion with large object collections.

# Custom streaming serializer
class StreamingSerializer
  def initialize(io)
    @io = io
    @count = 0
  end
  
  def dump_object(object)
    Marshal.dump(object, @io)
    @count += 1
  end
  
  def each_object
    @io.rewind
    @count.times do
      yield Marshal.load(@io)
    end
  end
  
  def close
    @io.close
  end
end

# Usage for large datasets
File.open("large_dataset.marshal", "wb") do |file|
  serializer = StreamingSerializer.new(file)
  
  # Process large dataset in chunks
  (1..100000).each_slice(1000) do |chunk|
    chunk_data = { batch: chunk, processed_at: Time.now }
    serializer.dump_object(chunk_data)
  end
end

# Memory-efficient reading
File.open("large_dataset.marshal", "rb") do |file|
  serializer = StreamingSerializer.new(file)
  
  serializer.each_object do |batch_data|
    # Process each batch without loading entire dataset
    process_batch(batch_data[:batch])
  end
end

Optimization strategies focus on reducing serialization overhead through caching, selective serialization, and format-specific techniques.

# Serialization cache for expensive objects
class SerializationCache
  def initialize
    @cache = {}
    @timestamps = {}
  end
  
  def get(key, object)
    cache_key = "#{key}_#{object.object_id}"
    
    if @cache[cache_key] && fresh?(cache_key)
      @cache[cache_key]
    else
      @cache[cache_key] = Marshal.dump(object)
      @timestamps[cache_key] = Time.now
      @cache[cache_key]
    end
  end
  
  def fresh?(cache_key, ttl = 300) # 5 minutes
    @timestamps[cache_key] && (Time.now - @timestamps[cache_key]) < ttl
  end
  
  def clear
    @cache.clear
    @timestamps.clear
  end
end

# Selective serialization for large objects
class SelectiveSerializer
  def self.dump(object, options = {})
    case options[:level]
    when :summary
      extract_summary(object)
    when :full
      object
    else
      extract_essential(object)
    end.then { |data| Marshal.dump(data) }
  end
  
  def self.extract_summary(object)
    case object
    when Hash
      object.select { |k, _v| [:id, :name, :type].include?(k) }
    when Array
      { count: object.size, sample: object.first(3) }
    else
      { class: object.class.name, id: object.object_id }
    end
  end

  def self.extract_essential(object)
    # Custom logic for essential data extraction
    object.respond_to?(:to_essential) ? object.to_essential : object
  end

  # `private` does not affect singleton methods; hide them explicitly
  private_class_method :extract_summary, :extract_essential
end

Production Patterns

Production environments require robust serialization strategies that handle failures gracefully, maintain data integrity, and support system monitoring. Implementing reliable dump and load patterns prevents data loss and enables system recovery.

# Production-grade serialization with retries and logging
class ProductionSerializer
  
  def initialize(logger: Rails.logger, max_retries: 3)
    @logger = logger
    @max_retries = max_retries
  end
  
  def dump(object, format: :marshal, file: nil)
    retries = 0
    
    begin
      case format
      when :marshal
        data = Marshal.dump(object)
      when :json
        data = JSON.dump(object)
      when :yaml
        data = YAML.dump(object)
      else
        raise ArgumentError, "Unsupported format: #{format}"
      end
      
      if file
        write_to_file(data, file)
      else
        data
      end
      
    rescue => e
      retries += 1
      @logger.error("Serialization failed (attempt #{retries}): #{e.message}")
      
      if retries < @max_retries
        sleep(2 ** retries) # Exponential backoff
        retry
      else
        @logger.fatal("Serialization failed permanently after #{retries} attempts")
        raise
      end
    end
  end
  
  def load(data_or_file, format: :marshal)
    data = data_or_file.is_a?(String) && File.exist?(data_or_file) ? 
           File.read(data_or_file) : data_or_file
    
    case format
    when :marshal
      Marshal.load(data)
    when :json
      JSON.load(data)
    when :yaml
      YAML.safe_load(data)
    else
      raise ArgumentError, "Unsupported format: #{format}"
    end
    
  rescue => e
    @logger.error("Deserialization failed: #{e.message}")
    attempt_recovery(data, format)
  end
  
  private
  
  def write_to_file(data, file)
    temp_file = "#{file}.tmp"
    File.write(temp_file, data)
    File.rename(temp_file, file)
  ensure
    File.delete(temp_file) if File.exist?(temp_file)
  end
  
  def attempt_recovery(data, format)
    case format
    when :json
      # Attempt to fix common JSON issues
      cleaned = data.gsub(/,\s*}/, '}').gsub(/,\s*]/, ']')
      JSON.load(cleaned)
    else
      raise "Recovery not available for format: #{format}"
    end
  end
end

Distributed systems require serialization coordination across multiple services and data stores. Implementing version-aware serialization prevents compatibility issues during rolling deployments.

# Distributed serialization with version management
class DistributedSerializer
  CURRENT_VERSION = "3.2.1".freeze
  
  def self.dump(object, metadata = {})
    envelope = {
      version: CURRENT_VERSION,
      timestamp: Time.now.utc.iso8601,
      checksum: calculate_checksum(object),
      metadata: metadata,
      data: object
    }
    
    Marshal.dump(envelope)
  end
  
  def self.load(serialized_data, strict: false)
    envelope = Marshal.load(serialized_data)
    
    validate_version(envelope[:version], strict)
    validate_checksum(envelope[:data], envelope[:checksum])
    
    {
      data: envelope[:data],
      version: envelope[:version],
      timestamp: Time.parse(envelope[:timestamp]),
      metadata: envelope[:metadata] || {}
    }
  end
  
  def self.validate_version(version, strict)
    major, minor, patch = version.split('.').map(&:to_i)
    current_major, current_minor, current_patch = CURRENT_VERSION.split('.').map(&:to_i)
    
    if strict && version != CURRENT_VERSION
      raise "Version mismatch: #{version} != #{CURRENT_VERSION}"
    elsif major > current_major || (major == current_major && minor > current_minor)
      raise "Future version not supported: #{version}"
    end
  end
  
  def self.calculate_checksum(object)
    Digest::SHA256.hexdigest(Marshal.dump(object))
  end
  
  def self.validate_checksum(object, expected)
    actual = calculate_checksum(object)
    raise "Checksum mismatch" unless actual == expected
  end
end

Database integration patterns handle serialization for persistent storage, implementing custom ActiveRecord serializers and handling schema evolution.

# ActiveRecord integration with custom serialization
class ConfigurationSettings < ActiveRecord::Base
  serialize :data, JSON
  
  # Custom serializer for complex objects
  def self.dump(object)
    case object
    when Hash, Array
      JSON.dump(object)
    when Time, Date
      JSON.dump({ _type: object.class.name, _value: object.iso8601 })
    else
      JSON.dump({ _type: 'Object', _class: object.class.name, _value: object.to_h })
    end
  end
  
  def self.load(json_string)
    return nil if json_string.blank?
    
    data = JSON.parse(json_string)
    
    case data
    when Hash
      if data['_type']
        case data['_type']
        when 'Time'
          Time.parse(data['_value'])
        when 'Date'
          Date.parse(data['_value'])
        else
          data
        end
      else
        data.symbolize_keys
      end
    else
      data
    end
  rescue JSON::ParserError => e
    Rails.logger.error("Failed to deserialize configuration: #{e.message}")
    nil
  end
end

# Usage in application
config = ConfigurationSettings.create!(
  name: 'api_settings',
  data: {
    timeout: 30,
    retries: 3,
    endpoints: ['api.service.com', 'backup.service.com'],
    last_updated: Time.current
  }
)

Monitoring and alerting systems track serialization performance and failures, enabling proactive system maintenance.

# Serialization monitoring and metrics
class SerializationMonitor
  def self.with_monitoring(operation_name, &block)
    start_time = Time.current
    
    begin
      result = yield
      record_success(operation_name, Time.current - start_time)
      result
    rescue => e
      record_failure(operation_name, e, Time.current - start_time)
      raise
    end
  end
  
  def self.record_success(operation, duration)
    Rails.logger.info("Serialization success: #{operation} (#{duration.round(3)}s)")
    
    # Send metrics to monitoring system
    StatsD.timing("serialization.#{operation}.duration", duration * 1000)
    StatsD.increment("serialization.#{operation}.success")
  end
  
  def self.record_failure(operation, error, duration)
    Rails.logger.error("Serialization failure: #{operation} - #{error.message} (#{duration.round(3)}s)")
    
    StatsD.timing("serialization.#{operation}.duration", duration * 1000)
    StatsD.increment("serialization.#{operation}.failure")
    StatsD.increment("serialization.#{operation}.#{error.class.name.underscore}")
  end
end

# Usage in application code
class DataProcessor
  def process_batch(batch_data)
    SerializationMonitor.with_monitoring("batch_processing") do
      serialized = Marshal.dump(batch_data)
      Redis.current.setex("batch:#{batch_data[:id]}", 3600, serialized)
    end
  end
  
  def retrieve_batch(batch_id)
    SerializationMonitor.with_monitoring("batch_retrieval") do
      serialized = Redis.current.get("batch:#{batch_id}")
      return nil unless serialized
      Marshal.load(serialized)
    end
  end
end

Common Pitfalls

Security vulnerabilities represent the most critical pitfall in dump and load operations. Marshal.load executes arbitrary Ruby code during deserialization, creating remote code execution risks with untrusted data.

# DANGEROUS - Never do this with untrusted input
untrusted_data = params[:serialized_data] # From user input
dangerous_object = Marshal.load(Base64.decode64(untrusted_data))

# SAFE - Validate and sanitize before deserialization
class SafeDeserializer
  ALLOWED_CLASSES = [String, Integer, Float, Array, Hash, Symbol, Time].freeze
  
  def self.load(serialized_data, allowed_classes: ALLOWED_CLASSES)
    # Use JSON for untrusted data
    JSON.parse(serialized_data, create_additions: false)
  rescue JSON::ParserError
    raise "Invalid serialized data"
  end
  
  def self.marshal_load(data, validate: true)
    if validate
      # Only load from trusted sources
      raise "Untrusted Marshal data" unless trusted_source?
    end
    
    Marshal.load(data)
  end
  
  def self.trusted_source?
    # Implement source validation logic
    Thread.current[:trusted_serialization] == true
  end

  private_class_method :trusted_source?
end

Symbol exhaustion is a risk when deserializing untrusted data containing symbols. Ruby versions before 2.2 never garbage collected symbols, making this a memory exhaustion attack; modern Rubies do collect dynamically created symbols, but unbounded symbol creation from user input still wastes memory and remains poor practice.

# VULNERABLE - Symbols from untrusted data
user_data = JSON.parse(params[:data], symbolize_names: true)
# Attacker can create unlimited symbols: {"aaaa": 1, "bbbb": 2, ...}

# SAFE - String keys with manual conversion
user_data = JSON.parse(params[:data])
safe_symbols = user_data.select { |k, v| k.in?(['name', 'email', 'id']) }
                       .transform_keys(&:to_sym)

# Monitor symbol table growth
class SymbolMonitor
  def self.check_symbol_count(threshold: 100_000)
    count = Symbol.all_symbols.count
    Rails.logger.warn("High symbol count: #{count}") if count > threshold
    count
  end
end

Object identity and reference issues arise when serialization breaks object relationships or creates unexpected duplicates.

# Reference identity problems
shared_object = { data: "shared" }
container = {
  first: shared_object,
  second: shared_object
}

# Marshal preserves identity
marshaled = Marshal.dump(container)
restored = Marshal.load(marshaled)
restored[:first].object_id == restored[:second].object_id
# => true (identity preserved)

# JSON breaks identity  
json_data = JSON.dump(container)
json_restored = JSON.parse(json_data)
json_restored["first"].object_id == json_restored["second"].object_id
# => false (separate objects created)

# Handle circular references
class CircularSafeHash < Hash
  def to_json(*args)
    # Track visited objects to prevent infinite recursion
    visited = Thread.current[:json_visited] ||= Set.new
    
    if visited.include?(self.object_id)
      return '{"_circular_reference": true}'
    end
    
    visited.add(self.object_id)
    result = super
    visited.delete(self.object_id)
    result
  end
end

Encoding and character set issues create corruption when serializing text data across systems with different default encodings.

# Encoding problems in serialization
text_data = { message: "Hello 你好".encode("UTF-8") }

# Marshal preserves encoding
marshaled = Marshal.dump(text_data)
File.write("data.marshal", marshaled, mode: "wb")
restored = Marshal.load(File.read("data.marshal", mode: "rb"))
restored[:message].encoding
# => #<Encoding:UTF-8>

# JSON may lose encoding information
json_data = JSON.dump(text_data)
File.write("data.json", json_data, encoding: "ASCII")
# => Encoding::UndefinedConversionError

# Safe encoding handling
class EncodingAwareSerializer
  def self.dump(object, encoding: Encoding::UTF_8)
    case object
    when String
      { _content: object.force_encoding(encoding).encode(encoding), 
        _encoding: encoding.name }
    when Hash
      object.transform_values { |v| dump(v, encoding: encoding) }
    when Array
      object.map { |item| dump(item, encoding: encoding) }
    else
      object
    end
  end
  
  def self.load(data)
    case data
    when Hash
      if data[:_content] && data[:_encoding]
        data[:_content].force_encoding(data[:_encoding])
      else
        data.transform_values { |v| load(v) }
      end
    when Array
      data.map { |item| load(item) }
    else
      data
    end
  end
end

Version compatibility failures occur when serialized objects contain classes or methods unavailable in different Ruby versions or application deployments.

# Version compatibility issues
class DeprecatedFeature
  def initialize(data)
    @data = data
    @legacy_method = method(:old_behavior) # Method may not exist in newer versions
  end
  
  def marshal_dump
    [@data, @legacy_method.name]
  end
  
  def marshal_load(array)
    @data, method_name = array
    @legacy_method = method(method_name) if respond_to?(method_name, true)
  end
end

# Safe versioned serialization
class VersionedClass
  VERSION = 2
  
  def marshal_dump
    [VERSION, @data, @new_field]
  end
  
  def marshal_load(array)
    version = array.first
    
    case version
    when 1
      @data = array[1]
      @new_field = nil # Provide default for missing field
    when 2
      @data, @new_field = array[1], array[2]
    else
      raise "Unsupported version: #{version}"
    end
  end
end

# Handle missing constants during deserialization
module ConstantMissing
  def self.handle_missing_constant(name)
    case name
    when 'OldClassName'
      # Map to new class name
      NewClassName
    when 'RemovedClass'
      # Create placeholder
      Class.new do
        def initialize(*args); end
        def method_missing(*args); end
      end
    else
      raise NameError, "uninitialized constant #{name}"
    end
  end
end

# Monkey patch for graceful degradation
class Module
  alias_method :original_const_missing, :const_missing
  
  def const_missing(name)
    ConstantMissing.handle_missing_constant(name)
  rescue NameError
    original_const_missing(name)
  end
end

Performance degradation in production often stems from serializing oversized objects, inefficient format choices, or excessive serialization frequency without caching.

# Performance pitfalls and solutions
class InefficientModel
  def initialize
    @large_dataset = (1..1_000_000).to_a # Avoid serializing large collections
    @cached_calculation = expensive_calculation # Don't recalculate on each dump
  end
  
  # BAD - Serializes entire large dataset
  def marshal_dump
    [@large_dataset, @cached_calculation]
  end
end

class OptimizedModel  
  def initialize
    @large_dataset = (1..1_000_000).to_a
    @cached_calculation = expensive_calculation
  end
  
  # GOOD - Serialize only essential data
  def marshal_dump
    {
      dataset_size: @large_dataset.size,
      dataset_checksum: Digest::MD5.hexdigest(@large_dataset.join),
      cached_calculation: @cached_calculation
    }
  end
  
  def marshal_load(data)
    # Reconstruct large dataset only when needed
    @dataset_size = data[:dataset_size]
    @dataset_checksum = data[:dataset_checksum] 
    @cached_calculation = data[:cached_calculation]
    @large_dataset = nil # Lazy load when accessed
  end
  
  def large_dataset
    @large_dataset ||= reconstruct_dataset
  end
  
  private
  
  def reconstruct_dataset
    # Rebuild dataset from database or cache
    (1..@dataset_size).to_a
  end
end

Reference

Marshal Methods

Method | Parameters | Returns | Description
Marshal.dump(obj, port = nil) | obj (Object), port (IO, optional) | String, or port when an IO is given | Serializes object to binary format; writes to port if provided
Marshal.load(source, proc = nil) | source (String/IO), proc (Proc, optional) | Object | Deserializes binary data; calls proc for each deserialized object if provided
Marshal.restore(source) | source (String/IO) | Object | Alias for Marshal.load
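
The rarely used proc argument to Marshal.load receives each object as it is deserialized, which can help with auditing; a small sketch (the proc returns each object unchanged):

```ruby
data = Marshal.dump({ name: "Alice", scores: [95, 87] })

# Count the objects Marshal materializes during deserialization
count = 0
restored = Marshal.load(data, ->(obj) { count += 1; obj })

puts restored.inspect  # the original hash, reconstructed
puts count             # number of objects the proc observed
```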

JSON Methods

Method | Parameters | Returns | Description
JSON.dump(obj, io = nil, limit = nil) | obj (Object), io (IO, optional), limit (Integer, optional) | String, or io when given | Converts object to a JSON string; writes to io if provided
JSON.load(source, proc = nil, options = {}) | source (String/IO), proc (Proc, optional), options (Hash) | Object | Parses JSON data with an optional processing proc
JSON.parse(source, opts = {}) | source (String), opts (Hash) | Object | Parses a JSON string with configuration options
JSON.generate(obj, opts = {}) | obj (Object), opts (Hash) | String | Generates a JSON string with formatting options

YAML Methods

Method | Parameters | Returns | Description
YAML.dump(obj, io = nil) | obj (Object), io (IO, optional) | String, or io when given | Converts object to YAML; writes to io if provided
YAML.load(yaml, filename: nil) | yaml (String/IO), filename (String, optional) | Object | Deserializes YAML data (unsafe with untrusted input before Psych 4)
YAML.safe_load(yaml, permitted_classes: [], aliases: false) | yaml (String/IO), permitted_classes (Array), aliases (Boolean) | Object | Safely deserializes YAML with class restrictions
YAML.load_file(filename) | filename (String) | Object | Loads and parses YAML from a file

Custom Serialization Methods

Method | Parameters | Returns | Description
#marshal_dump | none | Object | Returns the data Marshal serializes for the object
#marshal_load(obj) | obj (Object) | nil | Restores object state from Marshal serialization data
#to_json(*args) | *args (varied) | String | Defines the object's JSON representation
#as_json(options = {}) | options (Hash) | Object | Returns a basic-type representation for JSON serialization (ActiveSupport)

Common Options

JSON Options:

  • symbolize_names: true - Convert string keys to symbols
  • allow_nan: true - Allow NaN and Infinity values
  • max_nesting: 100 - Maximum nesting depth
  • create_additions: false - Disable object creation from JSON
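
A sketch of these options in use:

```ruby
require 'json'

json = '{"name": "Alice", "score": 95}'

# symbolize_names converts string keys to symbols
JSON.parse(json, symbolize_names: true)
# => {:name=>"Alice", :score=>95}

# max_nesting bounds parser recursion depth
begin
  JSON.parse('[[[[1]]]]', max_nesting: 2)
rescue JSON::NestingError => e
  puts "Too deep: #{e.message}"
end

# allow_nan permits the non-standard NaN/Infinity literals
JSON.parse('[NaN, Infinity]', allow_nan: true)
# => [NaN, Infinity]
```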

YAML Safe Load Options:

  • permitted_classes: [Class1, Class2] - Allow specific classes
  • permitted_symbols: [:symbol1, :symbol2] - Allow specific symbols
  • aliases: true - Enable YAML aliases and anchors
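
A sketch of these options together (the YAML content here is hypothetical):

```ruby
require 'yaml'
require 'date'

yaml = <<~YAML
  defaults: &defaults
    timeout: 30
  production:
    <<: *defaults
    deployed_on: 2024-03-15
YAML

# Aliases (and the Date value) are rejected by default
begin
  YAML.safe_load(yaml)
rescue Psych::Exception => e
  puts "Rejected: #{e.class}"
end

# Permit Date values and enable the alias/anchor machinery
data = YAML.safe_load(yaml, permitted_classes: [Date], aliases: true)
puts data["production"]["timeout"]  # inherited through the merge key
```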

Error Types

Error Class | Raised By | Description
TypeError | Marshal | Object cannot be serialized
ArgumentError | Marshal | Invalid serialized data format
JSON::ParserError | JSON | Invalid JSON syntax
JSON::NestingError | JSON | Maximum nesting depth exceeded
JSON::GeneratorError | JSON | Object cannot be converted to JSON
Psych::SyntaxError | YAML | Invalid YAML syntax
Psych::DisallowedClass | YAML | Class not permitted by safe_load

Security Considerations

Never Use With Untrusted Data:

  • Marshal.load - Executes arbitrary code
  • YAML.load - Can instantiate arbitrary objects (before Psych 4 / Ruby 3.1)
  • JSON.load with create_additions: true

Safe Alternatives:

  • JSON.parse for basic data types
  • YAML.safe_load with permitted classes
  • Custom validation before deserialization
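
The safe calls side by side (values here are illustrative):

```ruby
require 'json'
require 'yaml'

untrusted_json = '{"role": "admin", "level": 3}'
untrusted_yaml = "timeout: 30\nretries: 3"

# JSON.parse never instantiates custom classes with create_additions: false
data = JSON.parse(untrusted_json, create_additions: false)

# safe_load restricts YAML to plain data types unless classes are permitted
config = YAML.safe_load(untrusted_yaml)

puts data["role"]       # => admin
puts config["timeout"]  # => 30
```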

Performance Guidelines

Speed Ranking (fastest to slowest):

  1. Marshal - Binary format, Ruby-native
  2. JSON - Text format, simple types
  3. YAML - Text format, complex types

Memory Usage:

  • Marshal: Most compact for Ruby objects
  • JSON: Moderate size, cross-platform
  • YAML: Largest output due to formatting