Overview
Ruby provides multiple mechanisms for serializing objects into storable formats and deserializing them back into memory. The dump and load pattern appears across several standard library modules, each serving different use cases and data formats.
Marshal offers Ruby's native binary serialization format, preserving Ruby object types and internal structure. JSON provides text-based serialization compatible with web standards but limited to basic data types. YAML delivers human-readable serialization with broader type support than JSON while maintaining cross-language compatibility.
# Marshal preserves Ruby object identity
data = { name: "Alice", scores: [95, 87, 92] }
serialized = Marshal.dump(data)
restored = Marshal.load(serialized)
# => {:name=>"Alice", :scores=>[95, 87, 92]}
# JSON converts to basic types
json_data = JSON.dump(data)
json_restored = JSON.load(json_data)
# => {"name"=>"Alice", "scores"=>[95, 87, 92]} (keys become strings)
# YAML maintains readability
yaml_data = YAML.dump(data)
yaml_restored = YAML.load(yaml_data)
# => {:name=>"Alice", :scores=>[95, 87, 92]}
Custom classes can implement dump and load behavior through several approaches. The marshal_dump and marshal_load methods control Marshal serialization, while to_json and JSON parsing handle JSON conversion. Classes can also define custom serialization logic for specific business requirements.
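As a minimal sketch of these hooks (the Point class and its fields are illustrative), a class can opt into both Marshal and JSON serialization:

```ruby
require 'json'

# Illustrative class defining both Marshal and JSON serialization hooks
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x, @y = x, y
  end

  # Marshal calls these automatically during dump and load
  def marshal_dump
    [x, y]
  end

  def marshal_load(array)
    @x, @y = array
  end

  # JSON generation dispatches to to_json on the object
  def to_json(*args)
    { x: x, y: y }.to_json(*args)
  end
end

point = Point.new(3, 4)
restored = Marshal.load(Marshal.dump(point))
restored.x       # => 3
point.to_json    # => "{\"x\":3,\"y\":4}"
```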
Each serialization format involves tradeoffs between performance, portability, security, and feature completeness. Marshal provides the fastest serialization for Ruby objects but creates Ruby-specific binary data. JSON offers broad compatibility but loses type information. YAML balances readability with reasonable performance while supporting complex data structures.
Basic Usage
The fundamental dump and load pattern follows consistent syntax across Ruby's serialization libraries. The dump method converts objects into serialized form, while load reconstructs objects from serialized data.
# Marshal operations
original_object = [1, 2, { key: "value" }]
marshaled_data = Marshal.dump(original_object)
reconstructed = Marshal.load(marshaled_data)
# File-based Marshal serialization
File.open("data.marshal", "wb") do |file|
Marshal.dump(original_object, file)
end
loaded_from_file = File.open("data.marshal", "rb") do |file|
Marshal.load(file)
end
JSON serialization converts Ruby objects to JavaScript Object Notation, handling strings, numbers, arrays, and hashes. Ruby symbols convert to strings, and other object types require explicit handling.
# Basic JSON operations
data = { users: ["Alice", "Bob"], count: 2 }
json_string = JSON.dump(data)
# => "{\"users\":[\"Alice\",\"Bob\"],\"count\":2}"
parsed_data = JSON.load(json_string)
# => {"users"=>["Alice", "Bob"], "count"=>2}
# JSON with custom objects requires conversion
class User
attr_accessor :name, :email
def initialize(name, email)
@name, @email = name, email
end
def to_h
{ name: @name, email: @email }
end
end
user = User.new("Alice", "alice@example.com")
json_data = JSON.dump(user.to_h)
user_data = JSON.load(json_data)
restored_user = User.new(user_data["name"], user_data["email"])
YAML serialization preserves more Ruby types while maintaining human readability. YAML handles symbols, dates, and custom objects more naturally than JSON.
# YAML with complex data
complex_data = {
timestamp: Time.now,
config: { debug: true, timeout: 30 },
tags: [:important, :urgent]
}
yaml_output = YAML.dump(complex_data)
puts yaml_output
# ---
# :timestamp: 2024-03-15 10:30:45.123456 -04:00
# :config:
#   :debug: true
#   :timeout: 30
# :tags:
# - :important
# - :urgent
yaml_loaded = YAML.load(yaml_output, permitted_classes: [Symbol, Time])
# Preserves symbols and Time objects. On Psych 4 (Ruby 3.1+), YAML.load applies
# safe_load rules, so Time must be explicitly permitted
Stream-based operations handle large datasets without loading entire objects into memory. Both Marshal and YAML support streaming through file handles or IO objects.
# Streaming multiple objects
objects = (1..1000).map { |i| { id: i, data: "item_#{i}" } }
File.open("stream.marshal", "wb") do |file|
objects.each { |obj| Marshal.dump(obj, file) }
end
# Reading streamed objects
loaded_objects = []
File.open("stream.marshal", "rb") do |file|
while !file.eof?
loaded_objects << Marshal.load(file)
end
end
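The read loop above can also be wrapped in an Enumerator so callers process records lazily with the usual Enumerable methods. A small sketch (the file name is illustrative):

```ruby
# Yield each Marshal record from a file; return an Enumerator if no block given
def each_marshaled(path)
  return enum_for(:each_marshaled, path) unless block_given?
  File.open(path, "rb") do |file|
    yield Marshal.load(file) until file.eof?
  end
end

File.open("stream_demo.marshal", "wb") do |file|
  (1..5).each { |i| Marshal.dump({ id: i }, file) }
end

# Only the first three records are read from disk
each_marshaled("stream_demo.marshal").first(3)
# => [{:id=>1}, {:id=>2}, {:id=>3}]
```

Breaking out of the enumeration (as `first` does) unwinds through the `File.open` block, so the file handle is still closed.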
Error Handling & Debugging
Serialization operations encounter various failure modes requiring robust error handling strategies. Marshal raises TypeError for unserializable objects, while JSON and YAML have distinct error patterns for invalid data or parsing failures.
# Marshal error handling
class NonSerializable
def initialize
@proc = Proc.new { "cannot serialize" }
end
end
begin
Marshal.dump(NonSerializable.new)
rescue TypeError => e
puts "Marshal error: #{e.message}"
# Handle by converting to serializable form
serializable_data = { class_name: "NonSerializable", data: "converted" }
Marshal.dump(serializable_data)
end
# JSON error patterns
invalid_json = '{"incomplete": true'
begin
JSON.load(invalid_json)
rescue JSON::ParserError => e
puts "JSON parsing failed: #{e.message}"
# Never eval malformed JSON to "recover" it (code execution risk);
# fall back to a safe default instead
fallback_data = {}
end
# Circular reference handling
circular = {}
circular[:self] = circular
begin
# JSON.generate enforces max_nesting (default 100), so a self-referencing
# structure raises; JSON.dump disables this check and would overflow the stack
JSON.generate(circular)
rescue JSON::NestingError => e
puts "Circular reference detected: #{e.message}"
# Break circular references before serialization
safe_circular = circular.dup
safe_circular.delete(:self)
JSON.dump(safe_circular)
end
YAML parsing errors require special attention due to its flexibility in representing data types. Invalid YAML syntax or security concerns with loaded objects demand careful validation.
# YAML error handling with validation
suspicious_yaml = <<-YAML
--- !ruby/object:User
name: "Alice"
instance_eval: "system('rm -rf /')"
YAML
begin
# Never use YAML.load (or unsafe_load) with untrusted input
YAML.safe_load(suspicious_yaml)
rescue Psych::DisallowedClass => e
puts "YAML security violation: #{e.message}"
# Fall back to a safe default rather than retrying with broader permissions
safe_data = {}
rescue Psych::SyntaxError => e
puts "YAML syntax error: #{e.message}"
# Attempt line-by-line validation for debugging
lines = suspicious_yaml.split("\n")
lines.each_with_index do |line, index|
begin
YAML.safe_load("---\n#{line}")
rescue Psych::SyntaxError
puts "Error on line #{index + 1}: #{line}"
end
end
end
Version compatibility issues arise when serialized data contains objects or formats incompatible with current Ruby versions. Implementing version checks prevents runtime failures.
# Version-aware serialization
class VersionedData
VERSION = "2.1.0".freeze
def self.dump(object)
versioned = { version: VERSION, data: object, timestamp: Time.now }
Marshal.dump(versioned)
end
def self.load(serialized_data)
container = Marshal.load(serialized_data)
case container[:version]
when "1.0.0"
migrate_from_v1(container[:data])
when "2.0.0"
migrate_from_v2(container[:data])
when VERSION
container[:data]
else
raise "Unsupported version: #{container[:version]}"
end
end
def self.migrate_from_v1(data)
# Handle v1 to current version migration
data.transform_keys(&:to_sym) if data.respond_to?(:transform_keys)
end
def self.migrate_from_v2(data)
# Handle v2 to current version migration
data[:migrated_at] = Time.now
data
end
# A bare `private` does not apply to `def self.` methods
private_class_method :migrate_from_v1, :migrate_from_v2
end
Debugging serialization issues requires systematic approaches to identify problematic objects or data structures. Implementing diagnostic tools helps locate serialization failures in complex object graphs.
# Serialization diagnostic tools
module SerializationDebugger
def self.find_unserializable(object, path = "root")
case object
when Hash
object.each do |key, value|
find_unserializable(key, "#{path}[#{key.inspect}] (key)")
find_unserializable(value, "#{path}[#{key.inspect}]")
end
when Array
object.each_with_index do |item, index|
find_unserializable(item, "#{path}[#{index}]")
end
else
begin
Marshal.dump(object)
rescue TypeError => e
puts "Cannot serialize #{object.class} at #{path}: #{e.message}"
puts "Object: #{object.inspect}"
end
end
end
def self.size_analysis(object)
marshal_size = Marshal.dump(object).bytesize
json_size = JSON.dump(object).bytesize rescue nil
yaml_size = YAML.dump(object).bytesize rescue nil
puts "Serialization sizes:"
puts "Marshal: #{marshal_size} bytes"
puts "JSON: #{json_size || 'N/A'} bytes"
puts "YAML: #{yaml_size || 'N/A'} bytes"
end
end
Performance & Memory
Serialization performance varies significantly across formats and data types. Marshal provides the fastest serialization for Ruby-native objects, while JSON offers better performance for simple data structures with cross-platform requirements.
require 'benchmark'
# Performance comparison across formats
test_data = {
users: (1..1000).map do |i|
{
id: i,
name: "User #{i}",
email: "user#{i}@example.com",
metadata: { created_at: Time.now - i * 3600 }
}
end
}
Benchmark.bm(15) do |x|
x.report("Marshal dump:") { Marshal.dump(test_data) }
x.report("JSON dump:") { JSON.dump(test_data) }
x.report("YAML dump:") { YAML.dump(test_data) }
end
# Typical results:
# user system total real
# Marshal dump: 0.015000 0.000000 0.015000 ( 0.016234)
# JSON dump: 0.025000 0.000000 0.025000 ( 0.026891)
# YAML dump: 0.180000 0.010000 0.190000 ( 0.192356)
Memory usage patterns differ between formats, with Marshal creating the most compact representation for Ruby objects while YAML generates larger output due to human-readable formatting.
# Memory usage analysis
def memory_usage(&block)
before = GC.stat(:total_allocated_objects)
yield
after = GC.stat(:total_allocated_objects)
after - before
end
large_array = (1..10000).to_a
marshal_objects = memory_usage { Marshal.dump(large_array) }
json_objects = memory_usage { JSON.dump(large_array) }
yaml_objects = memory_usage { YAML.dump(large_array) }
puts "Object allocations during serialization:"
puts "Marshal: #{marshal_objects}"
puts "JSON: #{json_objects}"
puts "YAML: #{yaml_objects}"
# Size comparison
marshal_data = Marshal.dump(large_array)
json_data = JSON.dump(large_array)
yaml_data = YAML.dump(large_array)
puts "\nSerialized data sizes:"
puts "Marshal: #{marshal_data.bytesize} bytes"
puts "JSON: #{json_data.bytesize} bytes"
puts "YAML: #{yaml_data.bytesize} bytes"
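Because Marshal output for regular data is highly repetitive, compressing it before storage often shrinks it dramatically. A sketch using the stdlib Zlib module:

```ruby
require 'zlib'

numbers = (1..10_000).to_a
raw = Marshal.dump(numbers)
compressed = Zlib::Deflate.deflate(raw)

puts "raw:        #{raw.bytesize} bytes"
puts "compressed: #{compressed.bytesize} bytes"

# Round trip: inflate first, then load
restored = Marshal.load(Zlib::Inflate.inflate(compressed))
restored == numbers
# => true
```

The tradeoff is extra CPU on every dump and load, so measure before adopting this on hot paths.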
Streaming operations minimize memory consumption when processing large datasets. Implementing custom streaming serializers prevents memory exhaustion with large object collections.
# Custom streaming serializer
class StreamingSerializer
def initialize(io)
@io = io
@count = 0
end
def dump_object(object)
Marshal.dump(object, @io)
@count += 1
end
def each_object
@io.rewind
@count.times do
yield Marshal.load(@io)
end
end
def close
@io.close
end
end
# Usage for large datasets
File.open("large_dataset.marshal", "wb") do |file|
serializer = StreamingSerializer.new(file)
# Process large dataset in chunks
(1..100000).each_slice(1000) do |chunk|
chunk_data = { batch: chunk, processed_at: Time.now }
serializer.dump_object(chunk_data)
end
end
# Memory-efficient reading
File.open("large_dataset.marshal", "rb") do |file|
serializer = StreamingSerializer.new(file)
serializer.each_object do |batch_data|
# Process each batch without loading entire dataset
process_batch(batch_data[:batch])
end
end
Optimization strategies focus on reducing serialization overhead through caching, selective serialization, and format-specific techniques.
# Serialization cache for expensive objects
class SerializationCache
def initialize
@cache = {}
@timestamps = {}
end
def get(key, object)
cache_key = "#{key}_#{object.object_id}"
if @cache[cache_key] && fresh?(cache_key)
@cache[cache_key]
else
@cache[cache_key] = Marshal.dump(object)
@timestamps[cache_key] = Time.now
@cache[cache_key]
end
end
def fresh?(cache_key, ttl = 300) # 5 minutes
@timestamps[cache_key] && (Time.now - @timestamps[cache_key]) < ttl
end
def clear
@cache.clear
@timestamps.clear
end
end
# Selective serialization for large objects
class SelectiveSerializer
def self.dump(object, options = {})
case options[:level]
when :summary
extract_summary(object)
when :full
object
else
extract_essential(object)
end.then { |data| Marshal.dump(data) }
end
def self.extract_summary(object)
case object
when Hash
object.select { |k, v| [:id, :name, :type].include?(k) }
when Array
{ count: object.size, sample: object.first(3) }
else
{ class: object.class.name, id: object.object_id }
end
end
def self.extract_essential(object)
# Custom logic for essential data extraction
object.respond_to?(:to_essential) ? object.to_essential : object
end
# A bare `private` does not apply to `def self.` methods
private_class_method :extract_summary, :extract_essential
end
Production Patterns
Production environments require robust serialization strategies that handle failures gracefully, maintain data integrity, and support system monitoring. Implementing reliable dump and load patterns prevents data loss and enables system recovery.
# Production-grade serialization with retries and logging
class ProductionSerializer
include Logger::Severity
def initialize(logger: Rails.logger, max_retries: 3)
@logger = logger
@max_retries = max_retries
end
def dump(object, format: :marshal, file: nil)
retries = 0
begin
case format
when :marshal
data = Marshal.dump(object)
when :json
data = JSON.dump(object)
when :yaml
data = YAML.dump(object)
else
raise ArgumentError, "Unsupported format: #{format}"
end
if file
write_to_file(data, file)
else
data
end
rescue => e
retries += 1
@logger.error("Serialization failed (attempt #{retries}): #{e.message}")
if retries < @max_retries
sleep(2 ** retries) # Exponential backoff
retry
else
@logger.fatal("Serialization failed permanently after #{retries} attempts")
raise
end
end
end
def load(data_or_file, format: :marshal)
data = data_or_file.is_a?(String) && File.exist?(data_or_file) ?
File.read(data_or_file) : data_or_file
case format
when :marshal
Marshal.load(data)
when :json
JSON.load(data)
when :yaml
YAML.safe_load(data)
else
raise ArgumentError, "Unsupported format: #{format}"
end
rescue => e
@logger.error("Deserialization failed: #{e.message}")
attempt_recovery(data, format)
end
private
def write_to_file(data, file)
temp_file = "#{file}.tmp"
File.write(temp_file, data)
File.rename(temp_file, file)
ensure
File.delete(temp_file) if File.exist?(temp_file)
end
def attempt_recovery(data, format)
case format
when :json
# Attempt to fix common JSON issues
cleaned = data.gsub(/,\s*}/, '}').gsub(/,\s*]/, ']')
JSON.load(cleaned)
else
raise "Recovery not available for format: #{format}"
end
end
end
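The temp-file-plus-rename idiom inside write_to_file deserves emphasis on its own: File.rename is atomic on POSIX filesystems when both paths are on the same volume, so readers never observe a half-written file. A standalone sketch (file names are illustrative):

```ruby
# Atomic file write: readers see either the old file or the new one,
# never a partial write
def atomic_write(path, data)
  temp_path = "#{path}.tmp"
  File.binwrite(temp_path, data)
  File.rename(temp_path, path)
ensure
  File.delete(temp_path) if File.exist?(temp_path)
end

atomic_write("app_state.marshal", Marshal.dump({ state: "ready", retries: 3 }))
Marshal.load(File.binread("app_state.marshal"))
# => {:state=>"ready", :retries=>3}
```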
Distributed systems require serialization coordination across multiple services and data stores. Implementing version-aware serialization prevents compatibility issues during rolling deployments.
# Distributed serialization with version management
class DistributedSerializer
CURRENT_VERSION = "3.2.1".freeze
def self.dump(object, metadata = {})
envelope = {
version: CURRENT_VERSION,
timestamp: Time.now.utc.iso8601,
checksum: calculate_checksum(object),
metadata: metadata,
data: object
}
Marshal.dump(envelope)
end
def self.load(serialized_data, strict: false)
envelope = Marshal.load(serialized_data)
validate_version(envelope[:version], strict)
validate_checksum(envelope[:data], envelope[:checksum])
{
data: envelope[:data],
version: envelope[:version],
timestamp: Time.parse(envelope[:timestamp]),
metadata: envelope[:metadata] || {}
}
end
def self.validate_version(version, strict)
major, minor, patch = version.split('.').map(&:to_i)
current_major, current_minor, current_patch = CURRENT_VERSION.split('.').map(&:to_i)
if strict && version != CURRENT_VERSION
raise "Version mismatch: #{version} != #{CURRENT_VERSION}"
elsif major > current_major || (major == current_major && minor > current_minor)
raise "Future version not supported: #{version}"
end
end
def self.calculate_checksum(object)
Digest::SHA256.hexdigest(Marshal.dump(object))
end
def self.validate_checksum(object, expected)
actual = calculate_checksum(object)
raise "Checksum mismatch" unless actual == expected
end
end
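A stripped-down standalone version of the envelope idea (method names here are illustrative) shows the checksum round trip. Note that it hashes the serialized bytes themselves, so validation never depends on re-serializing the object identically:

```ruby
require 'digest'

# Wrap a payload with a SHA-256 checksum computed over its serialized bytes
def dump_with_checksum(object)
  payload = Marshal.dump(object)
  Marshal.dump({ checksum: Digest::SHA256.hexdigest(payload), payload: payload })
end

def load_with_checksum(data)
  envelope = Marshal.load(data)
  actual = Digest::SHA256.hexdigest(envelope[:payload])
  raise "Checksum mismatch" unless actual == envelope[:checksum]
  Marshal.load(envelope[:payload])
end

record = { id: 7, tags: [:a, :b] }
load_with_checksum(dump_with_checksum(record)) == record
# => true
```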
Database integration patterns handle serialization for persistent storage, implementing custom ActiveRecord serializers and handling schema evolution.
# ActiveRecord integration with custom serialization
class ConfigurationSettings < ActiveRecord::Base
# This class doubles as its own coder via the dump/load methods defined below
# (on Rails 7.1+, write: serialize :data, coder: self)
serialize :data, self
# Custom serializer for complex objects
def self.dump(object)
case object
when Hash, Array
JSON.dump(object)
when Time, Date
JSON.dump({ _type: object.class.name, _value: object.iso8601 })
else
JSON.dump({ _type: 'Object', _class: object.class.name, _value: object.to_h })
end
end
def self.load(json_string)
return nil if json_string.blank?
data = JSON.parse(json_string)
case data
when Hash
if data['_type']
case data['_type']
when 'Time'
Time.parse(data['_value'])
when 'Date'
Date.parse(data['_value'])
else
data
end
else
data.symbolize_keys
end
else
data
end
rescue JSON::ParserError => e
Rails.logger.error("Failed to deserialize configuration: #{e.message}")
nil
end
end
# Usage in application
config = ConfigurationSettings.create!(
name: 'api_settings',
data: {
timeout: 30,
retries: 3,
endpoints: ['api.service.com', 'backup.service.com'],
last_updated: Time.current
}
)
Monitoring and alerting systems track serialization performance and failures, enabling proactive system maintenance.
# Serialization monitoring and metrics
class SerializationMonitor
def self.with_monitoring(operation_name, &block)
start_time = Time.current
begin
result = yield
record_success(operation_name, Time.current - start_time)
result
rescue => e
record_failure(operation_name, e, Time.current - start_time)
raise
end
end
def self.record_success(operation, duration)
Rails.logger.info("Serialization success: #{operation} (#{duration.round(3)}s)")
# Send metrics to monitoring system
StatsD.timing("serialization.#{operation}.duration", duration * 1000)
StatsD.increment("serialization.#{operation}.success")
end
def self.record_failure(operation, error, duration)
Rails.logger.error("Serialization failure: #{operation} - #{error.message} (#{duration.round(3)}s)")
StatsD.timing("serialization.#{operation}.duration", duration * 1000)
StatsD.increment("serialization.#{operation}.failure")
StatsD.increment("serialization.#{operation}.#{error.class.name.underscore}")
end
end
# Usage in application code
class DataProcessor
def process_batch(batch_data)
SerializationMonitor.with_monitoring("batch_processing") do
serialized = Marshal.dump(batch_data)
Redis.current.setex("batch:#{batch_data[:id]}", 3600, serialized)
end
end
def retrieve_batch(batch_id)
SerializationMonitor.with_monitoring("batch_retrieval") do
serialized = Redis.current.get("batch:#{batch_id}")
return nil unless serialized
Marshal.load(serialized)
end
end
end
Common Pitfalls
Security vulnerabilities represent the most critical pitfall in dump and load operations. Marshal.load executes arbitrary Ruby code during deserialization, creating remote code execution risks with untrusted data.
# DANGEROUS - Never do this with untrusted input
untrusted_data = params[:serialized_data] # From user input
dangerous_object = Marshal.load(Base64.decode64(untrusted_data))
# SAFE - Validate and sanitize before deserialization
class SafeDeserializer
ALLOWED_CLASSES = [String, Integer, Float, Array, Hash, Symbol, Time].freeze
def self.load(serialized_data)
# Use JSON for untrusted data; create_additions: false blocks object revival
JSON.parse(serialized_data, create_additions: false)
rescue JSON::ParserError
raise "Invalid serialized data"
end
def self.marshal_load(data, validate: true)
if validate
# Only load from trusted sources
raise "Untrusted Marshal data" unless trusted_source?
end
Marshal.load(data)
end
def self.trusted_source?
# Implement source validation logic
Thread.current[:trusted_serialization] == true
end
# A bare `private` does not apply to `def self.` methods
private_class_method :trusted_source?
end
Symbol memory leaks occur when deserializing untrusted data containing symbols. Before Ruby 2.2, symbols were never garbage collected, making unbounded symbol creation a memory exhaustion attack; modern Rubies can collect dynamically created symbols, but a flood of unique keys still bloats memory until collection runs.
# VULNERABLE - Symbols from untrusted data
user_data = JSON.parse(params[:data], symbolize_names: true)
# Attacker can create unlimited symbols: {"aaaa": 1, "bbbb": 2, ...}
# SAFE - String keys with manual conversion
user_data = JSON.parse(params[:data])
safe_symbols = user_data.select { |k, v| k.in?(['name', 'email', 'id']) }
.transform_keys(&:to_sym)
# Monitor symbol table growth
class SymbolMonitor
def self.check_symbol_count(threshold: 100_000)
count = Symbol.all_symbols.count
Rails.logger.warn("High symbol count: #{count}") if count > threshold
count
end
end
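Since Ruby 2.2, dynamically created symbols are eligible for garbage collection, so the classic unbounded-leak attack is largely historical, but bursts of unique keys still inflate the symbol table until collection runs. A quick way to see the growth (the symbol names are arbitrary):

```ruby
before = Symbol.all_symbols.size
# Interning previously unseen strings adds entries to the symbol table;
# hold references so they cannot be collected mid-demo
held = 100.times.map { |i| "symbol_demo_#{i}_#{Process.pid}".to_sym }
after = Symbol.all_symbols.size

puts "symbol table grew by #{after - before} entries"
```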
Object identity and reference issues arise when serialization breaks object relationships or creates unexpected duplicates.
# Reference identity problems
shared_object = { data: "shared" }
container = {
first: shared_object,
second: shared_object
}
# Marshal preserves identity
marshaled = Marshal.dump(container)
restored = Marshal.load(marshaled)
restored[:first].object_id == restored[:second].object_id
# => true (identity preserved)
# JSON breaks identity
json_data = JSON.dump(container)
json_restored = JSON.parse(json_data)
json_restored["first"].object_id == json_restored["second"].object_id
# => false (separate objects created)
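Identity preservation is also why a Marshal round trip is the standard idiom for deep-copying nested structures, since dup and clone copy only the top level:

```ruby
original = { settings: { theme: "dark" }, tags: ["a", "b"] }

shallow = original.dup                        # nested hash and array are shared
deep = Marshal.load(Marshal.dump(original))   # fully independent copy

shallow[:settings][:theme] = "light"
original[:settings][:theme]
# => "light" (shallow copy mutated the shared nested hash)

deep[:tags] << "c"
original[:tags]
# => ["a", "b"] (deep copy left the original untouched)
```

This idiom only works for Marshal-serializable objects; values like procs, IO handles, or objects with singleton methods raise TypeError.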
# Handle circular references
class CircularSafeHash < Hash
def to_json(*args)
# Track visited objects to prevent infinite recursion
visited = Thread.current[:json_visited] ||= Set.new
return '{"_circular_reference": true}' if visited.include?(object_id)
visited.add(object_id)
begin
super
ensure
# Always unwind, even if a nested value raises
visited.delete(object_id)
end
end
end
Encoding and character set issues create corruption when serializing text data across systems with different default encodings.
# Encoding problems in serialization
text_data = { message: "Hello 你好".encode("UTF-8") }
# Marshal preserves encoding
marshaled = Marshal.dump(text_data)
File.write("data.marshal", marshaled, mode: "wb")
restored = Marshal.load(File.read("data.marshal", mode: "rb"))
restored[:message].encoding
# => #<Encoding:UTF-8>
# JSON may lose encoding information
json_data = JSON.dump(text_data)
File.write("data.json", json_data, encoding: "ASCII")
# => Encoding::UndefinedConversionError
# Safe encoding handling
class EncodingAwareSerializer
def self.dump(object, encoding: Encoding::UTF_8)
case object
when String
# encode transcodes the bytes; force_encoding would only relabel them
{ _content: object.encode(encoding), _encoding: encoding.name }
when Hash
object.transform_values { |v| dump(v, encoding: encoding) }
when Array
object.map { |item| dump(item, encoding: encoding) }
else
object
end
end
def self.load(data)
case data
when Hash
if data[:_content] && data[:_encoding]
data[:_content].force_encoding(data[:_encoding])
else
data.transform_values { |v| load(v) }
end
when Array
data.map { |item| load(item) }
else
data
end
end
end
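The encode versus force_encoding distinction underpins the serializer above: encode transcodes the underlying bytes, while force_encoding merely relabels them, which is the usual source of silent corruption:

```ruby
utf8 = "héllo"                                     # 6 bytes in UTF-8 (é is 2 bytes)

transcoded = utf8.encode("ISO-8859-1")             # converts é to its 1-byte Latin-1 form
relabeled  = utf8.dup.force_encoding("ISO-8859-1") # same 6 bytes, label changed only

transcoded.bytesize
# => 5
relabeled.bytesize
# => 6
transcoded.encode("UTF-8") == utf8
# => true (transcoding round-trips cleanly)
```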
Version compatibility failures occur when serialized objects contain classes or methods unavailable in different Ruby versions or application deployments.
# Version compatibility issues
class DeprecatedFeature
def initialize(data)
@data = data
@legacy_method = method(:old_behavior) # Method may not exist in newer versions
end
def marshal_dump
[@data, @legacy_method.name]
end
def marshal_load(array)
@data, method_name = array
@legacy_method = method(method_name) if respond_to?(method_name, true)
end
end
# Safe versioned serialization
class VersionedClass
VERSION = 2
def marshal_dump
[VERSION, @data, @new_field]
end
def marshal_load(array)
version = array.first
case version
when 1
@data = array[1]
@new_field = nil # Provide default for missing field
when 2
@data, @new_field = array[1], array[2]
else
raise "Unsupported version: #{version}"
end
end
end
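The versioned format above can be exercised without crafting old binary dumps: Class#allocate plus a direct marshal_load call feeds a v1-shaped array straight into the migration path. A self-contained sketch (the Record class is illustrative):

```ruby
# Versioned Marshal format with an explicit migration path for old dumps
class Record
  attr_reader :data, :label

  def initialize(data, label = nil)
    @data, @label = data, label
  end

  def marshal_dump
    [2, @data, @label]                          # current format: version 2
  end

  def marshal_load(array)
    case array.first
    when 1 then @data, @label = array[1], nil   # v1 dumps carried no label
    when 2 then @data, @label = array[1], array[2]
    else raise "Unsupported version: #{array.first}"
    end
  end
end

current = Marshal.load(Marshal.dump(Record.new("payload", "x")))
current.label
# => "x"

legacy = Record.allocate                        # bypass initialize, as Marshal does
legacy.marshal_load([1, "old payload"])         # simulate loading a v1 dump
legacy.label
# => nil
```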
# Handle missing constants during deserialization
module ConstantMissing
def self.handle_missing_constant(name)
case name
when 'OldClassName'
# Map to new class name
NewClassName
when 'RemovedClass'
# Create placeholder
Class.new do
def initialize(*args); end
def method_missing(*args); end
end
else
raise NameError, "uninitialized constant #{name}"
end
end
end
# Monkey patch for graceful degradation
class Module
alias_method :original_const_missing, :const_missing
def const_missing(name)
ConstantMissing.handle_missing_constant(name)
rescue NameError
original_const_missing(name)
end
end
Performance degradation in production often stems from serializing oversized objects, inefficient format choices, or excessive serialization frequency without caching.
# Performance pitfalls and solutions
class InefficientModel
def initialize
@large_dataset = (1..1_000_000).to_a # Avoid serializing large collections
@cached_calculation = expensive_calculation # Don't recalculate on each dump
end
# BAD - Serializes entire large dataset
def marshal_dump
[@large_dataset, @cached_calculation]
end
end
class OptimizedModel
def initialize
@large_dataset = (1..1_000_000).to_a
@cached_calculation = expensive_calculation
end
# GOOD - Serialize only essential data
def marshal_dump
{
dataset_size: @large_dataset.size,
dataset_checksum: Digest::MD5.hexdigest(@large_dataset.join),
cached_calculation: @cached_calculation
}
end
def marshal_load(data)
# Reconstruct large dataset only when needed
@dataset_size = data[:dataset_size]
@dataset_checksum = data[:dataset_checksum]
@cached_calculation = data[:cached_calculation]
@large_dataset = nil # Lazy load when accessed
end
def large_dataset
@large_dataset ||= reconstruct_dataset
end
private
def reconstruct_dataset
# Rebuild dataset from database or cache
(1..@dataset_size).to_a
end
end
Reference
Marshal Methods
| Method | Parameters | Returns | Description |
|---|---|---|---|
| `Marshal.dump(obj, port = nil)` | `obj` (Object), `port` (IO, optional) | String, or `port` if given | Serializes object to binary format, writing to `port` if provided |
| `Marshal.load(source, proc = nil)` | `source` (String/IO), `proc` (Proc, optional) | Object | Deserializes binary data, calling `proc` for each loaded object if provided |
| `Marshal.restore(source)` | `source` (String/IO) | Object | Alias for `Marshal.load` |
JSON Methods
| Method | Parameters | Returns | Description |
|---|---|---|---|
| `JSON.dump(obj, io = nil, limit = nil)` | `obj` (Object), `io` (IO, optional), `limit` (Integer, optional) | String, or `io` if given | Converts object to a JSON string, writing to `io` if provided |
| `JSON.load(source, proc = nil, options = {})` | `source` (String/IO), `proc` (Proc, optional), `options` (Hash) | Object | Parses JSON data with an optional processing proc; avoid with untrusted input |
| `JSON.parse(source, opts = {})` | `source` (String), `opts` (Hash) | Object | Parses a JSON string with configuration options |
| `JSON.generate(obj, opts = {})` | `obj` (Object), `opts` (Hash) | String | Generates a JSON string with formatting options |
YAML Methods
| Method | Parameters | Returns | Description |
|---|---|---|---|
| `YAML.dump(obj, io = nil)` | `obj` (Object), `io` (IO, optional) | String, or `io` if given | Converts object to YAML, writing to `io` if provided |
| `YAML.load(yaml, filename: nil)` | `yaml` (String/IO), `filename` (String, optional) | Object | Deserializes YAML; unsafe with untrusted input before Psych 4, applies `safe_load` rules on Psych 4 (Ruby 3.1+) |
| `YAML.safe_load(yaml, permitted_classes: [], aliases: false)` | `yaml` (String/IO), `permitted_classes` (Array), `aliases` (Boolean) | Object | Safely deserializes YAML with class restrictions |
| `YAML.load_file(filename)` | `filename` (String) | Object | Loads and parses YAML from a file |
Custom Serialization Methods
| Method | Parameters | Returns | Description |
|---|---|---|---|
| `#marshal_dump` | None | Object | Defines custom Marshal serialization data |
| `#marshal_load(obj)` | `obj` (Object) | nil | Restores object state from Marshal serialization data |
| `#to_json(*args)` | `*args` (varied) | String | Defines custom JSON representation |
| `#as_json(options = {})` | `options` (Hash) | Object | Returns a basic-type representation for JSON serialization (ActiveSupport) |
Common Options
JSON Options:
- `symbolize_names: true` - Convert string keys to symbols
- `allow_nan: true` - Allow NaN and Infinity values
- `max_nesting: 100` - Maximum nesting depth
- `create_additions: false` - Disable object creation from JSON
YAML Safe Load Options:
- `permitted_classes: [Class1, Class2]` - Allow specific classes
- `permitted_symbols: [:symbol1, :symbol2]` - Allow specific symbols
- `aliases: true` - Enable YAML aliases and anchors
Error Types
| Error Class | Raised By | Description |
|---|---|---|
| `TypeError` | Marshal | Object cannot be serialized |
| `ArgumentError` | Marshal | Invalid serialized data format |
| `JSON::ParserError` | JSON | Invalid JSON syntax |
| `JSON::NestingError` | JSON | Maximum nesting depth exceeded |
| `JSON::GeneratorError` | JSON | Object cannot be converted to JSON |
| `Psych::SyntaxError` | YAML | Invalid YAML syntax |
| `Psych::DisallowedClass` | YAML | Class not permitted in `safe_load` |
Security Considerations
Never Use With Untrusted Data:
- `Marshal.load` - Can be exploited to execute arbitrary code during deserialization
- `YAML.load` (Psych < 4) and `YAML.unsafe_load` - Can instantiate dangerous objects
- `JSON.load` with `create_additions: true` - Revives arbitrary registered classes
Safe Alternatives:
- `JSON.parse` for basic data types
- `YAML.safe_load` with permitted classes
- Custom validation before deserialization
Performance Guidelines
Speed Ranking (fastest to slowest):
- Marshal - Binary format, Ruby-native
- JSON - Text format, simple types
- YAML - Text format, complex types
Memory Usage:
- Marshal: Most compact for Ruby objects
- JSON: Moderate size, cross-platform
- YAML: Largest output due to formatting