Overview
Object serialization transforms Ruby objects into portable formats that can be stored, transmitted, or reconstructed later. Ruby provides multiple serialization mechanisms, each optimized for different scenarios and compatibility requirements.
The Marshal
module handles Ruby's native binary serialization format, preserving object structure and Ruby-specific data types with maximum fidelity. The JSON
module converts objects to JavaScript Object Notation, prioritizing cross-language compatibility and human readability. The YAML
module serializes to YAML format, balancing readability with support for complex data structures.
user = { name: "Alice", age: 30, roles: ["admin", "user"] }
# Native Ruby serialization
marshal_data = Marshal.dump(user)
# => "\x04\b{\bI\"\tname\x06:\x06EFI\"\nAlice\x06;\x00F..."
# JSON serialization
json_data = JSON.dump(user)
# => "{\"name\":\"Alice\",\"age\":30,\"roles\":[\"admin\",\"user\"]}"
# YAML serialization
yaml_data = YAML.dump(user)
# => "---\n:name: Alice\n:age: 30\n:roles:\n- admin\n- user\n"
Each format handles object reconstruction through corresponding load methods. Marshal.load
reconstructs the exact Ruby object structure, while JSON.parse
and YAML.load
create new objects with equivalent data.
Marshal.load(marshal_data)
# => {:name=>"Alice", :age=>30, :roles=>["admin", "user"]}
JSON.parse(json_data)
# => {"name"=>"Alice", "age"=>30, "roles"=>["admin", "user"]}
YAML.load(yaml_data)
# => {:name=>"Alice", :age=>30, :roles=>["admin", "user"]}
Custom objects require additional considerations. Marshal preserves class information and instance variables automatically. JSON and YAML require explicit conversion logic through to_json
and to_yaml
methods, or custom serialization handlers.
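For YAML, a minimal sketch of such a handler uses Psych's encode_with and init_with hooks; the Point class below is purely illustrative.
require 'yaml'

class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Called by Psych when dumping: choose exactly which fields appear in the document
  def encode_with(coder)
    coder["x"] = @x
    coder["y"] = @y
  end

  # Called by Psych when loading a !ruby/object:Point node
  def init_with(coder)
    @x = coder["x"]
    @y = coder["y"]
  end
end

yaml = YAML.dump(Point.new(1, 2))
# Reviving a custom class requires permitting it explicitly under Psych 4+
point = YAML.safe_load(yaml, permitted_classes: [Point])
point.x
# => 1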
Basic Usage
Marshal provides the most complete serialization for Ruby objects, handling complex data structures, custom classes, and maintaining object relationships including circular references.
class User
attr_accessor :name, :email, :created_at
def initialize(name, email)
@name = name
@email = email
@created_at = Time.now
end
end
user = User.new("Bob", "bob@example.com")
user_data = Marshal.dump(user)
restored_user = Marshal.load(user_data)
restored_user.name
# => "Bob"
restored_user.created_at.class
# => Time
JSON serialization requires converting objects to Hash representations. The JSON
module automatically handles basic Ruby types like strings, numbers, arrays, and hashes.
# Hash serialization
data = { users: [{ id: 1, active: true }, { id: 2, active: false }] }
json_string = JSON.generate(data)
# => "{\"users\":[{\"id\":1,\"active\":true},{\"id\":2,\"active\":false}]}"
parsed_data = JSON.parse(json_string)
# => {"users"=>[{"id"=>1, "active"=>true}, {"id"=>2, "active"=>false}]}
# Array serialization
numbers = [1, 2.5, -10, 1000]
JSON.generate(numbers)
# => "[1,2.5,-10,1000]"
Custom objects need explicit JSON conversion logic. Define to_json
methods or use the object's hash representation.
class Product
attr_accessor :name, :price, :category
def initialize(name, price, category)
@name = name
@price = price
@category = category
end
def to_json(*args)
{
name: @name,
price: @price,
category: @category
}.to_json(*args)
end
end
product = Product.new("Laptop", 999.99, "Electronics")
JSON.generate(product)
# => "{\"name\":\"Laptop\",\"price\":999.99,\"category\":\"Electronics\"}"
YAML handles more complex Ruby data types than JSON, including symbols, dates, and ranges. The syntax remains human-readable while supporting nested structures.
config = {
database: {
host: "localhost",
port: 5432,
credentials: {
username: :admin,
password: "secret123"
}
},
features: {
enabled: true,
beta_range: (1..10),
timeout: 30
}
}
yaml_output = YAML.dump(config)
puts yaml_output
# ---
# :database:
#   :host: localhost
#   :port: 5432
#   :credentials:
#     :username: :admin
#     :password: secret123
# :features:
#   :enabled: true
#   :beta_range: !ruby/range
#     begin: 1
#     end: 10
#     excl: false
#   :timeout: 30
# Psych 4+ (Ruby 3.1+) loads safely by default, so non-basic types such as Range must be permitted
YAML.load(yaml_output, permitted_classes: [Symbol, Range])[:features][:beta_range]
# => 1..10
File-based serialization patterns handle persistence scenarios. Each format provides convenient file I/O methods.
# Marshal file operations
File.open("user_data.marshal", "wb") { |f| Marshal.dump(user, f) }
loaded_user = File.open("user_data.marshal", "rb") { |f| Marshal.load(f) }
# JSON file operations
File.write("config.json", JSON.pretty_generate(data))
loaded_data = JSON.parse(File.read("config.json"))
# YAML file operations
File.write("settings.yml", YAML.dump(config))
loaded_config = YAML.load_file("settings.yml", permitted_classes: [Symbol, Range]) # Psych 4+
Error Handling & Debugging
Serialization operations encounter various error conditions that require specific handling strategies. Marshal operations can fail due to unsupported objects, corrupted data, or class loading issues.
# Handling unsupported objects
proc_object = proc { puts "Hello" }
begin
Marshal.dump(proc_object)
rescue TypeError => e
puts "Cannot serialize: #{e.message}"
# => Cannot serialize: no _dump_data is defined for class Proc
end
# Handling corrupted marshal data
corrupted_data = "invalid marshal data"
begin
Marshal.load(corrupted_data)
rescue ArgumentError => e
puts "Marshal error: #{e.message}"
# => Marshal error: marshal data too short
end
JSON parsing errors occur with malformed data, unsupported types, or encoding issues. The parser provides detailed error information including position data.
# Handling malformed JSON
malformed_json = '{"name": "Alice", "age":}'
begin
JSON.parse(malformed_json)
rescue JSON::ParserError => e
puts "JSON parsing failed: #{e.message}"
# => JSON parsing failed: unexpected token at '}'
puts "Error occurred around position: #{e.to_s.scan(/\d+/).first}"
end
# Handling encoding issues
binary_data = "\x80\x81\x82".force_encoding("UTF-8")
begin
JSON.generate({ data: binary_data })
rescue JSON::GeneratorError, EncodingError => e
puts "Encoding error: #{e.message}"
# Handle by encoding to base64 or cleaning data
safe_data = { data: [binary_data].pack('m0') } # base64 encoding
JSON.generate(safe_data)
end
YAML deserialization presents security risks when loading untrusted data. Ruby provides safe loading options to prevent code execution vulnerabilities.
# Unsafe YAML with embedded Ruby code
dangerous_yaml = <<~YAML
---
- !ruby/object:OpenStruct
table:
:name: "Alice"
:command: !ruby/object:Kernel
YAML
# Safe loading approach
begin
safe_data = YAML.safe_load(dangerous_yaml, permitted_classes: [])
rescue Psych::DisallowedClass => e
puts "Blocked unsafe class: #{e.message}"
# => Blocked unsafe class: Tried to load unspecified class: OpenStruct
end
# Permitted classes for controlled deserialization
permitted_classes = [Date, Time, Symbol]
yaml_string = YAML.dump({ name: "Alice", joined_at: Time.now, plan: :pro })
safe_config = YAML.safe_load(yaml_string, permitted_classes: permitted_classes)
Circular reference detection prevents infinite loops during serialization. Marshal handles circular references automatically, while JSON and YAML require manual detection.
# Marshal handles circular references
parent = { name: "Parent" }
child = { name: "Child", parent: parent }
parent[:child] = child
marshal_data = Marshal.dump(parent) # Works fine
restored = Marshal.load(marshal_data)
restored[:child][:parent].equal?(restored) # => true (same object, so the cycle is preserved)
# JSON requires circular reference detection
begin
JSON.generate(parent)
rescue JSON::NestingError => e
puts "Circular reference detected: #{e.message}"
# Manual handling approach (Set needs an explicit require on Ruby < 3.2)
require 'set'
def serialize_with_refs(obj, refs = Set.new)
case obj
when Hash
if refs.include?(obj.object_id)
return { "__circular_ref__" => obj.object_id }
end
refs.add(obj.object_id)
result = obj.transform_values { |v| serialize_with_refs(v, refs) }
refs.delete(obj.object_id)
result
else
obj
end
end
safe_parent = serialize_with_refs(parent)
JSON.generate(safe_parent)
end
Version compatibility issues arise when deserializing data created with different Ruby versions or gem versions. Implement version checking and migration strategies.
# Version-aware serialization
module VersionedSerialization
VERSION = "1.2"
def self.dump(obj)
versioned_data = {
version: VERSION,
data: obj,
timestamp: Time.now.iso8601
}
Marshal.dump(versioned_data)
end
def self.load(data)
begin
container = Marshal.load(data)
if container[:version] != VERSION
puts "Version mismatch: #{container[:version]} vs #{VERSION}"
# Apply migration logic here
migrate_data(container[:data], container[:version])
else
container[:data]
end
rescue => e
puts "Failed to load versioned data: #{e.message}"
nil
end
end
private_class_method def self.migrate_data(data, from_version)
case from_version
when "1.1"
# Apply 1.1 to 1.2 migration
migrate_1_1_to_1_2(data)
when "1.0"
# Apply 1.0 to 1.2 migration
migrate_1_0_to_1_2(data)
else
raise "Unknown version: #{from_version}"
end
end
end
Performance & Memory
Serialization performance varies significantly between formats, with trade-offs between speed, size, and compatibility. Marshal provides the fastest serialization for Ruby-to-Ruby communication, JSON trades some speed for cross-language compatibility, and YAML is the slowest but the easiest for humans to read and edit.
require 'benchmark'
# Test data: complex nested structure
data = {
users: (1..1000).map do |i|
{
id: i,
name: "User #{i}",
email: "user#{i}@example.com",
metadata: {
created_at: Time.now - rand(365) * 24 * 3600,
preferences: {
theme: ["light", "dark"].sample,
notifications: rand > 0.5,
features: (1..rand(10)).map { |f| "feature_#{f}" }
}
}
}
end
}
# Benchmark serialization speed
Benchmark.bm(15) do |x|
x.report("Marshal dump:") { 100.times { Marshal.dump(data) } }
x.report("JSON generate:") { 100.times { JSON.generate(data) } }
x.report("YAML dump:") { 100.times { YAML.dump(data) } }
end
# Typical results (times vary by system):
# user system total real
# Marshal dump: 0.050000 0.000000 0.050000 ( 0.052341)
# JSON generate: 0.180000 0.010000 0.190000 ( 0.191250)
# YAML dump: 2.340000 0.020000 2.360000 ( 2.387453)
Size efficiency impacts storage requirements and network transmission times. Marshal produces compact binary data, JSON creates readable but larger text, and YAML generates the most verbose output.
marshal_size = Marshal.dump(data).bytesize
json_size = JSON.generate(data).bytesize
yaml_size = YAML.dump(data).bytesize
puts "Marshal: #{marshal_size} bytes"
puts "JSON: #{json_size} bytes (#{json_size.to_f / marshal_size:.1f}x larger)"
puts "YAML: #{yaml_size} bytes (#{yaml_size.to_f / marshal_size:.1f}x larger)"
# Example output:
# Marshal: 89432 bytes
# JSON: 142851 bytes (1.6x larger)
# YAML: 198347 bytes (2.2x larger)
Memory usage patterns differ during serialization and deserialization. Large objects can cause memory spikes, particularly with YAML processing.
# Memory monitoring during serialization
def measure_memory
GC.start
GC.disable
memory_before = `ps -o rss= -p #{Process.pid}`.to_i
yield
memory_after = `ps -o rss= -p #{Process.pid}`.to_i
GC.enable
memory_after - memory_before
end
large_array = (1..100_000).map { |i| { id: i, data: "x" * 100 } }
marshal_memory = measure_memory { Marshal.dump(large_array) }
json_memory = measure_memory { JSON.generate(large_array) }
yaml_memory = measure_memory { YAML.dump(large_array) }
puts "Memory usage (KB):"
puts "Marshal: #{marshal_memory}"
puts "JSON: #{json_memory}"
puts "YAML: #{yaml_memory}"
Streaming serialization prevents memory exhaustion when processing large datasets. Implement custom streaming for JSON arrays and YAML documents.
# Streaming JSON array serialization
class JSONStreamer
def initialize(io)
@io = io
@first = true
end
def start_array
@io.write("[")
end
def write_object(obj)
@io.write(",") unless @first
@io.write(JSON.generate(obj))
@first = false
end
def end_array
@io.write("]")
end
end
# Usage for large dataset
File.open("large_data.json", "w") do |file|
streamer = JSONStreamer.new(file)
streamer.start_array
(1..1_000_000).each do |i|
record = { id: i, timestamp: Time.now.to_i }
streamer.write_object(record)
# Process in batches to control memory
GC.start if i % 10_000 == 0
end
streamer.end_array
end
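A comparable approach works for YAML because every YAML.dump starts a new document with ---, so appending dumps produces a multi-document stream that YAML.load_stream can read back incrementally. The sketch below uses an illustrative file name and record shape.
require 'yaml'

# Write one YAML document per record instead of one giant document
File.open("large_data.yml", "w") do |file|
  (1..100_000).each do |i|
    file.write(YAML.dump({ "id" => i, "timestamp" => Time.now.to_i }))
  end
end

# Read the stream back one document at a time
count = 0
File.open("large_data.yml") do |file|
  YAML.load_stream(file) { |doc| count += 1 }
end
count
# => 100000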
Object pooling and reuse strategies reduce garbage collection pressure during high-frequency serialization operations.
require 'stringio'
class SerializationPool
def initialize
@json_parsers = []
@marshal_buffers = []
end
def with_json_parser
parser = @json_parsers.pop || JSON
begin
yield parser
ensure
@json_parsers.push(parser) if @json_parsers.length < 10
end
end
def with_marshal_buffer
buffer = @marshal_buffers.pop || StringIO.new
buffer.rewind
buffer.truncate(0)
begin
yield buffer
ensure
@marshal_buffers.push(buffer) if @marshal_buffers.length < 10
end
end
end
# High-frequency serialization with pooling
pool = SerializationPool.new
1000.times do |i|
data = { request_id: i, payload: "data_#{i}" }
pool.with_marshal_buffer do |buffer|
Marshal.dump(data, buffer)
serialized = buffer.string
# Process serialized data
end
end
Production Patterns
Web API serialization requires consistent data formatting, error handling, and performance optimization. JSON dominates API responses due to broad client support and reasonable performance characteristics.
# Rails API serialization pattern
class UserSerializer
def self.serialize(user, options = {})
base_data = {
id: user.id,
name: user.name,
email: user.email,
created_at: user.created_at.iso8601
}
if options[:include_roles]
base_data[:roles] = user.roles.map(&:name)
end
if options[:include_preferences]
base_data[:preferences] = serialize_preferences(user.preferences)
end
base_data
end
def self.serialize_collection(users, options = {})
{
data: users.map { |user| serialize(user, options) },
meta: {
total: users.respond_to?(:total_count) ? users.total_count : users.size,
serialized_at: Time.current.iso8601
}
}
end
private_class_method def self.serialize_preferences(preferences)
return nil unless preferences
{
theme: preferences[:theme] || "default",
notifications: {
email: preferences.dig(:notifications, :email) != false,
push: preferences.dig(:notifications, :push) != false
},
privacy: {
profile_visible: preferences.dig(:privacy, :profile_visible) != false
}
}
end
end
# Controller usage with error handling
class Api::UsersController < ApplicationController
def index
users = User.includes(:roles).page(params[:page])
render json: UserSerializer.serialize_collection(
users,
include_roles: params[:include_roles],
include_preferences: params[:include_preferences]
)
rescue => e
render json: {
error: "Serialization failed",
message: e.message
}, status: 500
end
end
Caching strategies optimize repeated serialization operations. Implement cache invalidation based on object changes and serialization options.
class CachedSerializer
CACHE_TTL = 1.hour
def self.serialize_with_cache(object, options = {})
cache_key = generate_cache_key(object, options)
Rails.cache.fetch(cache_key, expires_in: CACHE_TTL) do
perform_serialization(object, options)
end
end
def self.invalidate_cache(object)
# Clear all cached versions for this object
pattern = "serialized:#{object.class.name}:#{object.id}:*"
Rails.cache.delete_matched(pattern)
end
private_class_method def self.generate_cache_key(object, options)
option_hash = Digest::MD5.hexdigest(options.to_json)
timestamp = object.respond_to?(:updated_at) ? object.updated_at.to_i : Time.current.to_i
"serialized:#{object.class.name}:#{object.id}:#{timestamp}:#{option_hash}"
end
private_class_method def self.perform_serialization(object, options)
# Actual serialization logic here
case object
when User
UserSerializer.serialize(object, options)
when Product
ProductSerializer.serialize(object, options)
else
raise "Unknown object type: #{object.class}"
end
end
end
# Model integration for automatic cache invalidation
class User < ActiveRecord::Base
after_update :invalidate_serialization_cache
after_destroy :invalidate_serialization_cache
private
def invalidate_serialization_cache
CachedSerializer.invalidate_cache(self)
end
end
Configuration management uses YAML for environment-specific settings with validation and type coercion.
class ConfigManager
CONFIG_PATH = Rails.root.join("config", "application.yml")
def self.load_config
raw_config = YAML.load_file(CONFIG_PATH)
environment_config = raw_config[Rails.env] || {}
validate_config(environment_config)
coerce_types(environment_config)
rescue Psych::SyntaxError => e
raise "Invalid YAML configuration: #{e.message}"
rescue Errno::ENOENT
raise "Configuration file not found: #{CONFIG_PATH}"
end
def self.validate_config(config)
required_keys = %w[database_url redis_url secret_key_base]
missing_keys = required_keys - config.keys
if missing_keys.any?
raise "Missing required configuration: #{missing_keys.join(', ')}"
end
# Validate specific formats
unless config["database_url"].start_with?("postgres://", "postgresql://")
raise "Invalid database_url format"
end
end
def self.coerce_types(config)
# Convert string values to appropriate types
config["worker_threads"] = config["worker_threads"].to_i if config["worker_threads"]
config["enable_ssl"] = config["enable_ssl"] == "true" if config.key?("enable_ssl")
config["timeout"] = config["timeout"].to_f if config["timeout"]
# Parse complex nested values
if config["feature_flags"].is_a?(String)
config["feature_flags"] = config["feature_flags"].split(",").map(&:strip)
end
config
end
end
# Application initialization
begin
APP_CONFIG = ConfigManager.load_config
rescue => e
puts "Configuration error: #{e.message}"
exit 1
end
Background job serialization requires handling complex data structures and maintaining job queue compatibility.
# Sidekiq-compatible job serialization
class DataProcessingJob
include Sidekiq::Worker
def perform(serialized_data, options = {})
# Deserialize complex data structures
data = JSON.parse(serialized_data)
case data["type"]
when "user_export"
process_user_export(data["user_ids"], options)
when "report_generation"
generate_report(data["report_config"], options)
else
raise "Unknown job type: #{data['type']}"
end
end
def self.enqueue_user_export(user_ids, options = {})
job_data = {
type: "user_export",
user_ids: user_ids,
timestamp: Time.current.iso8601
}
perform_async(JSON.generate(job_data), options)
end
def self.enqueue_report_generation(report_config, options = {})
# Sanitize config for serialization
safe_config = sanitize_report_config(report_config)
job_data = {
type: "report_generation",
report_config: safe_config,
timestamp: Time.current.iso8601
}
perform_async(JSON.generate(job_data), options)
end
private_class_method def self.sanitize_report_config(config)
# Remove non-serializable elements
config.except(:callbacks, :lambdas).tap do |safe_config|
# Convert dates to ISO strings
safe_config["start_date"] = config["start_date"].iso8601 if config["start_date"].respond_to?(:iso8601)
safe_config["end_date"] = config["end_date"].iso8601 if config["end_date"].respond_to?(:iso8601)
end
end
end
Common Pitfalls
Symbol and string key inconsistencies create subtle bugs when switching between serialization formats. JSON converts symbol keys to strings, while YAML preserves symbols.
original_data = { name: "Alice", :age => 30, "email" => "alice@example.com" }
# JSON converts all keys to strings
json_round_trip = JSON.parse(JSON.generate(original_data))
# => {"name"=>"Alice", "age"=>30, "email"=>"alice@example.com"}
# YAML preserves symbol keys
yaml_round_trip = YAML.load(YAML.dump(original_data))
# => {:name=>"Alice", :age=>30, "email"=>"alice@example.com"}
# Accessing data fails due to key type changes
json_round_trip[:name] # => nil (key is now string)
json_round_trip["name"] # => "Alice"
# Solution: normalize keys consistently
def normalize_keys(obj, symbolize: false)
case obj
when Hash
method = symbolize ? :to_sym : :to_s
obj.each_with_object({}) do |(key, value), result|
result[key.send(method)] = normalize_keys(value, symbolize: symbolize)
end
when Array
obj.map { |item| normalize_keys(item, symbolize: symbolize) }
else
obj
end
end
consistent_data = normalize_keys(json_round_trip, symbolize: true)
consistent_data[:name] # => "Alice"
Time zone and date handling varies between formats, leading to data corruption during round-trips. Marshal preserves exact Time objects, while JSON loses time zone information.
# Time handling in JSON (Time.parse and Time.iso8601 need require 'time')
require 'time'
original_time = Time.new(2024, 1, 15, 14, 30, 0, "-05:00") # EST offset
puts "Original: #{original_time} (offset #{original_time.utc_offset})"
# JSON has no time type: the value is serialized as a plain string
json_time = JSON.parse({ timestamp: original_time }.to_json)["timestamp"]
parsed_time = Time.parse(json_time)
puts "After JSON: #{parsed_time} (offset #{parsed_time.utc_offset})"
# The Time class, zone name, and sub-second precision are not preserved;
# exactly what survives depends on how Time#to_json formats the string
# Solution: explicit ISO 8601 formatting
safe_json_data = { timestamp: original_time.iso8601 }
json_string = JSON.generate(safe_json_data)
restored_data = JSON.parse(json_string)
restored_time = Time.iso8601(restored_data["timestamp"])
puts "ISO 8601 restored: #{restored_time} (#{restored_time.zone})"
# YAML preserves Time objects, but Psych 4+ requires permitting them explicitly
yaml_data = YAML.dump({ timestamp: original_time })
yaml_restored = YAML.load(yaml_data, permitted_classes: [Time, Symbol])
puts "YAML restored: #{yaml_restored[:timestamp]} (offset #{yaml_restored[:timestamp].utc_offset})"
Encoding issues cause serialization failures, particularly with binary data or non-UTF-8 strings. Each format handles encoding differently.
# Binary data in different formats
binary_data = "\xFF\xFE\x00\x01".b # Binary string
text_with_encoding = "Café".encode("ISO-8859-1")
begin
# JSON fails with binary data
JSON.generate({ binary: binary_data })
rescue Encoding::UndefinedConversionError => e
puts "JSON encoding error: #{e.message}"
# Solution: Base64 encoding for binary data
require 'base64'
json_safe = JSON.generate({
binary: Base64.strict_encode64(binary_data),
encoding: "base64"
})
# Decoding
parsed = JSON.parse(json_safe)
if parsed["encoding"] == "base64"
restored_binary = Base64.strict_decode64(parsed["binary"])
restored_binary == binary_data # => true
end
end
# Marshal handles encodings naturally
marshal_data = Marshal.dump({
binary: binary_data,
text: text_with_encoding
})
restored = Marshal.load(marshal_data)
restored[:binary].encoding.name # => "ASCII-8BIT"
restored[:text].encoding.name # => "ISO-8859-1"
# YAML stores binary strings base64-encoded under a !binary tag
yaml_binary = YAML.dump({ binary: binary_data })
# Round-tripping restores the BINARY (ASCII-8BIT) encoded string
Class loading dependencies create runtime errors when deserializing objects whose classes are not available. This commonly affects Marshal data.
# Define a class for serialization
class CustomData
attr_accessor :value, :metadata
def initialize(value, metadata = {})
@value = value
@metadata = metadata
end
end
custom_obj = CustomData.new("test", { created: Time.now })
marshal_data = Marshal.dump(custom_obj)
# Simulate class not being available
Object.send(:remove_const, :CustomData)
begin
Marshal.load(marshal_data)
rescue ArgumentError => e
puts "Class loading error: #{e.message}"
# => undefined class/module CustomData
end
# Solution: graceful handling with fallback
module SafeDeserialization
def self.load_marshal(data)
Marshal.load(data)
rescue ArgumentError => e
if e.message.include?("undefined class")
puts "Warning: #{e.message}"
# Return metadata about the object instead
{
error: "class_not_found",
message: e.message,
data_size: data.bytesize
}
else
raise
end
end
end
Deeply nested or self-referential data structures can exceed nesting limits, overflow the stack, or loop indefinitely during serialization. Implement depth limiting and circular reference detection.
# Create problematic recursive structure
def create_recursive_hash(depth = 1000)
current = { level: depth }
(depth - 1).downto(1) do |i|
current = { level: i, child: current }
end
current
end
deep_structure = create_recursive_hash(5000)
# JSON refuses deep nesting by default (max_nesting 100); disabling the limit risks a stack overflow
begin
JSON.generate(deep_structure)
rescue JSON::NestingError, SystemStackError => e
puts "Deep structure rejected: #{e.class}"
end
# Solution: depth-limited serialization
class SafeSerializer
MAX_DEPTH = 100
def self.serialize(obj, max_depth: MAX_DEPTH)
serialize_recursive(obj, 0, max_depth)
end
private_class_method def self.serialize_recursive(obj, current_depth, max_depth)
if current_depth >= max_depth
return { __truncated: true, type: obj.class.name }
end
case obj
when Hash
obj.each_with_object({}) do |(key, value), result|
result[key] = serialize_recursive(value, current_depth + 1, max_depth)
end
when Array
obj.map { |item| serialize_recursive(item, current_depth + 1, max_depth) }
else
obj
end
end
end
safe_data = SafeSerializer.serialize(deep_structure, max_depth: 50)
JSON.generate(safe_data) # Stays within the default nesting limit
Version compatibility breaks deserialization when Ruby versions or gem versions change. The Marshal binary format itself rarely changes, but differences in class definitions and internal object layout between versions can make old dumps unloadable.
# Version-specific serialization wrapper
class CompatibleSerializer
RUBY_VERSION_MAP = {
"2.7" => :ruby_27,
"3.0" => :ruby_30,
"3.1" => :ruby_31,
"3.2" => :ruby_32
}.freeze
def self.dump(obj)
metadata = {
ruby_version: RUBY_VERSION,
marshal_version: Marshal::MAJOR_VERSION.to_s + "." + Marshal::MINOR_VERSION.to_s,
timestamp: Time.now.to_i,
serializer_version: "1.0"
}
Marshal.dump([metadata, obj])
end
def self.load(data)
begin
metadata, obj = Marshal.load(data)
if metadata[:ruby_version] != RUBY_VERSION
puts "Warning: Data serialized with Ruby #{metadata[:ruby_version]}, " \
"loading with Ruby #{RUBY_VERSION}"
end
obj
rescue TypeError, ArgumentError => e
# Attempt fallback strategies
load_with_fallback(data, e)
end
end
private_class_method def self.load_with_fallback(data, original_error)
# Try loading as raw Marshal data (older format)
begin
Marshal.load(data)
rescue
# Try JSON if Marshal fails completely
begin
JSON.parse(data.force_encoding("UTF-8"))
rescue
raise original_error
end
end
end
end
Reference
Marshal Module
Method | Parameters | Returns | Description |
---|---|---|---|
Marshal.dump(obj, port=nil) | obj (Object), port (IO, optional) | String or writes to IO | Serializes object to binary format |
Marshal.load(source, proc=nil) | source (String/IO), proc (Proc, optional) | Object | Deserializes binary data to object |
Marshal.restore(source) | source (String/IO) | Object | Alias for Marshal.load |
Marshal Constants:
- Marshal::MAJOR_VERSION - Major version number (4)
- Marshal::MINOR_VERSION - Minor version number (8 in Ruby 3.2)
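These constants correspond to the first two bytes of every dump, which makes for a quick illustrative sanity check:
header = Marshal.dump(nil).bytes.first(2)
header == [Marshal::MAJOR_VERSION, Marshal::MINOR_VERSION]
# => true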
Serializable Types: All basic Ruby types, custom classes with instance variables, modules, constants, global variables (with restrictions)
Non-serializable Types: Proc, Method, UnboundMethod, IO, File, Dir, singleton objects
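Objects that merely contain a non-serializable value can still opt in by defining marshal_dump and marshal_load to exclude and rebuild that state; the LogSession class below is illustrative.
class LogSession
  attr_reader :name

  def initialize(name)
    @name = name
    @io = File.open(File::NULL, "w") # IO handles cannot be marshaled
  end

  # Marshal.dump calls this instead of walking instance variables
  def marshal_dump
    { name: @name }
  end

  # Marshal.load calls this with the dumped state to rebuild the object
  def marshal_load(state)
    @name = state[:name]
    @io = File.open(File::NULL, "w")
  end
end

restored = Marshal.load(Marshal.dump(LogSession.new("audit")))
restored.name
# => "audit"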
JSON Module
Method | Parameters | Returns | Description |
---|---|---|---|
JSON.generate(obj, opts={}) | obj (Object), opts (Hash) | String | Converts object to JSON string |
JSON.dump(obj, io=nil, limit=nil) | obj (Object), io (IO), limit (Integer) | String or writes to IO | Serializes with recursion limit |
JSON.parse(source, opts={}) | source (String), opts (Hash) | Object | Parses JSON string to Ruby object |
JSON.load(source, proc=nil, opts={}) | source (String/IO), proc (Proc), opts (Hash) | Object | Loads JSON with optional processing |
JSON.pretty_generate(obj, opts={}) | obj (Object), opts (Hash) | String | Formatted JSON output |
JSON Generation Options:
- :max_nesting - Maximum nesting depth (default 100)
- :allow_nan - Allow NaN and Infinity values
- :indent - Indentation string for pretty printing
- :space - Space after colon and comma
- :object_nl - Newline after objects
- :array_nl - Newline after arrays
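Two illustrative calls showing how these options change the generator's behavior (NaN handling and the formatting knobs that pretty_generate sets for you):
require 'json'

# NaN and Infinity are rejected by default
JSON.generate([Float::NAN], allow_nan: true)
# => "[NaN]"

# Manual pretty printing via formatting options
JSON.generate({ "a" => 1 }, indent: "  ", space: " ", object_nl: "\n", array_nl: "\n")
# => "{\n  \"a\": 1\n}"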
JSON Parsing Options:
- :symbolize_names - Convert keys to symbols
- :create_additions - Enable JSON additions
- :object_class - Class to create for JSON objects (default Hash)
- :array_class - Class to create for JSON arrays (default Array)
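A short illustration of the two most commonly used parsing options:
require 'json'
require 'ostruct'

JSON.parse('{"name":"Alice","age":30}', symbolize_names: true)
# => {:name=>"Alice", :age=>30}

record = JSON.parse('{"name":"Alice"}', object_class: OpenStruct)
record.name
# => "Alice"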
YAML Module
Method | Parameters | Returns | Description |
---|---|---|---|
YAML.dump(obj, io=nil) | obj (Object), io (IO, optional) | String or writes to IO | Serializes object to YAML |
YAML.load(yaml, filename=nil) | yaml (String), filename (String, optional) | Object | Deserializes YAML to object |
YAML.safe_load(yaml, permitted_classes: [], aliases: false) | yaml (String), options (Hash) | Object | Safe YAML loading with restrictions |
YAML.load_file(filename) | filename (String) | Object | Loads YAML from file |
YAML.dump_stream(*objects) | *objects (Array) | String | Multiple documents in one stream |
YAML Safe Loading Options:
- :permitted_classes - Array of allowed classes
- :permitted_symbols - Array of allowed symbols
- :aliases - Allow aliases (default false)
- :filename - Filename for error reporting
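A brief sketch of these options in use, assuming a document that mixes a Date and a Symbol:
require 'yaml'
require 'date'

yaml = YAML.dump({ "launched_on" => Date.new(2024, 1, 15), "status" => :active })

begin
  YAML.safe_load(yaml)
rescue Psych::DisallowedClass => e
  puts e.message # Date and Symbol are not in the default permitted set
end

loaded = YAML.safe_load(yaml, permitted_classes: [Date, Symbol])
loaded["status"]
# => :active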
YAML-specific Types: Symbols, Ranges, Regular expressions, Complex numbers, Rational numbers, Sets, custom tagged types
Error Classes
Exception | Module | Description |
---|---|---|
TypeError | Marshal | Unsupported object type for serialization |
ArgumentError | Marshal | Invalid marshal data or format |
JSON::ParserError | JSON | Malformed JSON syntax |
JSON::NestingError | JSON | Maximum nesting depth exceeded |
JSON::GeneratorError | JSON | Object cannot be converted to JSON |
Psych::SyntaxError | YAML | Invalid YAML syntax |
Psych::DisallowedClass | YAML | Class not permitted in safe loading |
Psych::BadAlias | YAML | Invalid alias reference |
Performance Characteristics
Format | Serialization Speed | Size Efficiency | Cross-platform | Human Readable |
---|---|---|---|---|
Marshal | Fastest | Most compact | Ruby only | No |
JSON | Moderate | Moderate | Universal | Yes |
YAML | Slowest | Least compact | Wide support | Yes |
Type Mapping
Ruby Type | Marshal | JSON | YAML |
---|---|---|---|
String | Preserved with encoding | UTF-8 string | String with encoding |
Symbol | Preserved | Converted to string | Preserved |
Integer | All sizes preserved | Number (limited range) | Integer |
Float | Preserved | Number | Float |
Array | Preserved | Array | Sequence |
Hash | Preserved | Object | Mapping |
Time | Preserved with timezone | ISO 8601 string | Timestamp |
Date | Preserved | String representation | Date |
Range | Preserved | Not supported | Range |
Regexp | Preserved | Not supported | Regular expression |
NilClass | Preserved | null | null |
TrueClass/FalseClass | Preserved | true/false | true/false |
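The "Not supported" entries for JSON do not raise; the generator falls back to to_s, so the type is silently lost. A quick illustrative round-trip:
require 'json'

JSON.parse(JSON.generate({ "range" => (1..5), "pattern" => /abc/ }))
# => {"range"=>"1..5", "pattern"=>"(?-mix:abc)"}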