Overview
Ruby provides two primary classes for creating simple data structures: Struct and Data. Both classes generate value objects with named attributes, but they serve different purposes and exhibit distinct behaviors. Struct creates mutable objects with optional method definitions, while Data produces immutable value objects focused on data integrity and functional programming patterns.
Struct has been part of Ruby since early versions, designed as a convenient way to create classes with named attributes and accessor methods. The Struct.new method returns a new class with the specified attributes, supporting both positional and keyword arguments for initialization.
Person = Struct.new(:name, :age)
person = Person.new("Alice", 30)
person.name = "Bob" # Mutable
# => "Bob"
Data was introduced in Ruby 3.2 as an immutable alternative. Data objects cannot be modified after creation, making them suitable for functional programming patterns and situations requiring data integrity guarantees.
Person = Data.define(:name, :age)
person = Person.new(name: "Alice", age: 30)
person.with(name: "Bob") # Returns new instance
# => #<data Person name="Bob", age=30>
The fundamental difference lies in mutability. Struct instances can be modified after creation, while Data instances are frozen and immutable. This affects allocation behavior, thread safety, and programming patterns. Data objects also have stricter initialization semantics: every attribute must be supplied, whereas Struct fills missing attributes with nil.
Both classes automatically generate reader methods, equality comparisons, hash methods, and pattern matching support. However, they differ in their approach to customization and inheritance. Both accept a block for defining methods; Struct additionally generates setters and collection-style methods such as #each and #[], while Data keeps a minimal interface focused on pure data representation.
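As a brief illustration of the generated equality and hash methods, the sketch below compares equal instances of each kind and uses them as Hash keys. The names StructPoint and DataPoint are chosen for this example only, and Ruby 3.2 or later is assumed.

```ruby
# Hypothetical names for illustration; requires Ruby 3.2+ for Data.
StructPoint = Struct.new(:x, :y)
DataPoint = Data.define(:x, :y)

s1 = StructPoint.new(1, 2)
s2 = StructPoint.new(1, 2)
d1 = DataPoint.new(x: 1, y: 2)
d2 = DataPoint.new(x: 1, y: 2)

# Equality compares class and attribute values, not object identity.
puts s1 == s2        # true
puts d1 == d2        # true
puts s1.equal?(s2)   # false (two distinct objects)

# Equal instances produce equal hashes, so they collapse to one Hash key.
lookup = { d1 => "first" }
lookup[d2] = "second"
puts lookup.size     # 1

# A Struct and a Data holding the same values are still different classes.
puts s1 == d1        # false
```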
Basic Usage
Struct creation supports multiple initialization patterns. The most common approach defines attributes as symbols, creating a new class with accessor methods for each attribute.
# Basic struct definition
Point = Struct.new(:x, :y)
point = Point.new(10, 20)
point.x # => 10
point.y = 30 # Modifies existing instance
# Keyword arguments
Person = Struct.new(:name, :age, keyword_init: true)
person = Person.new(name: "Carol", age: 25)
person.age = 26 # Direct modification
Data uses a different creation syntax. The Data.define method creates an immutable class with the specified attributes. Instances accept either keyword or positional arguments, but every attribute must be supplied.
# Basic data definition
Point = Data.define(:x, :y)
point = Point.new(x: 10, y: 20)
point.x # => 10
# point.y = 30 # Raises NoMethodError (Data defines no setters)
# Creating modified copies
new_point = point.with(y: 30)
# => #<data Point x=10, y=30>
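The sketch below shows the calling styles Data.new accepts and the errors raised for a missing attribute and for attempted assignment. Coordinate is a hypothetical class name used only here; Ruby 3.2 or later is assumed.

```ruby
# Requires Ruby 3.2+. Coordinate is a hypothetical name for this sketch.
Coordinate = Data.define(:lat, :lng)

by_keyword  = Coordinate.new(lat: 52.5, lng: 13.4)
by_position = Coordinate.new(52.5, 13.4)   # positional arguments also work
by_brackets = Coordinate[52.5, 13.4]       # shorthand for .new

puts by_keyword == by_position   # true
puts by_position == by_brackets  # true

# Every attribute is required: omitting one raises ArgumentError.
begin
  Coordinate.new(lat: 52.5)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end

# No setters are generated, so assignment fails with NoMethodError.
begin
  by_keyword.lat = 0
rescue NoMethodError => e
  puts "NoMethodError: #{e.message}"
end
```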
Both structures support destructuring and pattern matching. Struct can be destructured using array-like syntax through #to_a and also works with hash patterns, while Data has no #to_a and is matched with hash-style patterns through #deconstruct_keys.
# Struct destructuring
Point = Struct.new(:x, :y)
point = Point.new(5, 15)
x, y = point.to_a
# => [5, 15]
# Data pattern matching
Point = Data.define(:x, :y)
point = Point.new(x: 5, y: 15)
case point
in Point(x: 0, y:)
puts "On Y axis: #{y}"
in Point(x:, y: 0)
puts "On X axis: #{x}"
in Point(x:, y:)
puts "Point at #{x}, #{y}"
end
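Struct works with the same keyword-style patterns, and with array patterns as well. The following is a minimal sketch, assuming Ruby 3.0 or later; Size and describe_size are illustrative names.

```ruby
# Size and describe_size are hypothetical names for this sketch.
Size = Struct.new(:width, :height)

def describe_size(size)
  case size
  in { width: 0 } | { height: 0 }              # hash patterns, via #deconstruct_keys
    "empty"
  in Size(width:, height:) if width == height  # class check plus a guard clause
    "square of side #{width}"
  in [width, height]                           # array pattern, via #deconstruct
    "#{width} by #{height}"
  end
end

puts describe_size(Size.new(0, 4))  # empty
puts describe_size(Size.new(3, 3))  # square of side 3
puts describe_size(Size.new(2, 5))  # 2 by 5
```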
Default values are not built in to either class. Struct fills omitted attributes with nil, so defaults come from a custom initialize, while Data requires every attribute and supplies defaults through a factory method or an initialize override with keyword defaults.
# Struct with defaults
Config = Struct.new(:host, :port, :timeout) do
def initialize(host: "localhost", port: 8080, timeout: 30)
super(host, port, timeout)
end
end
# Data with defaults
Config = Data.define(:host, :port, :timeout) do
def self.default
new(host: "localhost", port: 8080, timeout: 30)
end
end
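An alternative to a factory method is to override initialize with keyword defaults. This is a sketch under the assumption of Ruby 3.2 or later; ServerConfig is a hypothetical name.

```ruby
# Requires Ruby 3.2+. ServerConfig is a hypothetical name for this sketch.
ServerConfig = Data.define(:host, :port, :timeout) do
  # Data.new converts positional arguments to keywords before calling
  # initialize, so keyword defaults here cover both calling styles.
  def initialize(host: "localhost", port: 8080, timeout: 30)
    super(host: host, port: port, timeout: timeout)
  end
end

puts ServerConfig.new.inspect
puts ServerConfig.new(port: 9090).port      # 9090
puts ServerConfig.new("example.org").host   # example.org
```

Because the defaults live in initialize, they apply on every construction path, which a separate factory method cannot guarantee.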
Advanced Usage
Both Struct and Data support method definition, but with different philosophies. Struct encourages adding behavior directly to the generated class, while Data promotes composition and functional patterns.
# Struct with custom methods
class Rectangle < Struct.new(:width, :height)
def area
width * height
end
def resize!(factor)
self.width *= factor
self.height *= factor
self
end
def perimeter
2 * (width + height)
end
end
rect = Rectangle.new(10, 5)
rect.resize!(2) # Modifies in place
rect.area # => 100
Data classes focus on immutable transformations and functional composition. Method definitions typically return new instances rather than modifying existing ones.
Rectangle = Data.define(:width, :height) do
def area
width * height
end
def resize(factor)
with(width: width * factor, height: height * factor)
end
def perimeter
2 * (width + height)
end
def scale_to_area(target_area)
factor = Math.sqrt(target_area.to_f / area)
resize(factor)
end
end
rect = Rectangle.new(width: 10, height: 5)
bigger_rect = rect.resize(2) # Returns new instance
scaled_rect = rect.scale_to_area(200)
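Inheritance patterns differ significantly. A Struct subclass can add behavior and its own mutable instance variables, although such variables are not attributes and take no part in equality or #to_h. A class created by Data.define has a fixed attribute list: subclasses can add methods but not attributes, and instances stay frozen across the hierarchy.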
# Struct inheritance
Animal = Struct.new(:name, :species)
class Dog < Animal
def initialize(name, breed)
super(name, "dog")
@breed = breed
end
attr_reader :breed
def bark
"#{name} says woof!"
end
end
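# Data inheritance: attributes are fixed by Data.define, so a variant with
# an extra attribute is declared as its own class
Animal = Data.define(:name, :species)
Dog = Data.define(:name, :species, :breed) do
  def initialize(name:, breed:, species: "dog")
    super(name: name, species: species, breed: breed)
  end
  def bark
    "#{name} says woof!"
  end
end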
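When no new attribute is needed, a Data class can be subclassed to add behavior or defaults. The sketch below assumes Ruby 3.2 or later; Pet and Cat are hypothetical names.

```ruby
# Requires Ruby 3.2+. Pet and Cat are hypothetical names for this sketch.
Pet = Data.define(:name, :species)

class Cat < Pet
  # Supply a default for an inherited attribute.
  def initialize(name:, species: "cat")
    super(name: name, species: species)
  end

  def speak
    "#{name} says meow!"
  end
end

cat = Cat.new(name: "Tom")
puts cat.species                    # cat
puts cat.speak                      # Tom says meow!
puts Cat.members.inspect            # [:name, :species]
puts cat.is_a?(Pet)                 # true
puts cat.frozen?                    # true
puts cat.with(name: "Felix").class  # Cat
```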
Complex initialization and validation logic requires different approaches. A Struct can normalize its attributes after calling super because it stays mutable, while a Data class must validate and normalize before calling super, since the instance is frozen once initialization completes.
# Struct with validation
class EmailContact < Struct.new(:email, :name)
def initialize(email, name = nil)
raise ArgumentError, "Invalid email" unless email.include?("@")
super
normalize_email!
end
private
def normalize_email!
self.email = email.downcase.strip
end
end
# Data with validation
EmailContact = Data.define(:email, :name) do
def initialize(email:, name: nil)
raise ArgumentError, "Invalid email" unless email.include?("@")
super(email: email.downcase.strip, name: name)
end
def update_email(new_email)
self.class.new(email: new_email, name: name)
end
end
Performance & Memory
Memory usage patterns differ between Struct and Data mainly because of mutability rather than object size. Updating a Struct in place allocates no new object for the Struct itself, while every Data modification through #with allocates a new instance. In exchange, a Data instance can be shared freely instead of being copied defensively. Measure with your own workload before relying on either.
require 'benchmark/memory' # provided by the benchmark-memory gem
# Memory comparison for creation
Benchmark.memory do |x|
Point = Struct.new(:x, :y)
x.report("Struct creation") do
1000.times { Point.new(rand(100), rand(100)) }
end
DataPoint = Data.define(:x, :y)
x.report("Data creation") do
1000.times { DataPoint.new(x: rand(100), y: rand(100)) }
end
x.compare!
end
Performance characteristics vary based on usage patterns. Struct excels at in-place modifications and scenarios requiring frequent updates. Data trades an allocation per change for values that can be passed between components without copying, which suits functional pipelines with many intermediate values.
require 'benchmark'
# Performance comparison for modifications
Benchmark.bm do |x|
struct_point = Struct.new(:x, :y).new(0, 0)
data_point = Data.define(:x, :y).new(x: 0, y: 0)
x.report("Struct mutation") do
point = struct_point.dup
1000.times do |i|
point.x = i
point.y = i * 2
end
end
x.report("Data transformation") do
point = data_point
1000.times do |i|
point = point.with(x: i, y: i * 2)
end
end
end
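Timing results depend on the machine, but the allocation difference can be observed directly. The sketch below counts allocated objects with GC.stat; the allocations helper and the class names are assumptions made for this example, and Ruby 3.2 or later is assumed.

```ruby
# Requires Ruby 3.2+. Names below are hypothetical, for illustration only.
MutablePoint = Struct.new(:x, :y)
FrozenPoint = Data.define(:x, :y)

# Counts objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

struct_point = MutablePoint.new(0, 0)
data_point = FrozenPoint.new(x: 0, y: 0)

mutation_allocs = allocations do
  1000.times { |i| struct_point.x = i }
end

with_allocs = allocations do
  1000.times { |i| data_point = data_point.with(x: i) }
end

puts "Struct mutation allocations: #{mutation_allocs}"
puts "Data#with allocations:       #{with_allocs}"
```

Exact counts vary by Ruby version, so the figures are best read as a comparison rather than as absolute numbers.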
Hash and equality operations behave alike in both classes: #hash and #== are computed from the current attribute values. The practical difference is stability. A Data key cannot have its attributes reassigned after insertion, while a Struct key that is mutated changes its hash and can no longer be found until the Hash is rehashed.
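# Hash performance comparison (each class is defined once, outside the generator blocks)
require 'benchmark'
StructPoint = Struct.new(:x, :y)
DataPoint = Data.define(:x, :y)
struct_points = Array.new(1000) { StructPoint.new(rand(100), rand(100)) }
data_points = Array.new(1000) { DataPoint.new(x: rand(100), y: rand(100)) }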
Benchmark.bm do |x|
x.report("Struct hash operations") do
hash = {}
struct_points.each { |point| hash[point] = true }
end
x.report("Data hash operations") do
hash = {}
data_points.each { |point| hash[point] = true }
end
end
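Beyond raw speed, the more common problem with Struct keys is mutation after insertion. The following sketch illustrates it; the names are hypothetical and Ruby 3.2 or later is assumed.

```ruby
# Requires Ruby 3.2+. Names are hypothetical, for illustration only.
StructKey = Struct.new(:id)
DataKey = Data.define(:id)

struct_keys = (1..20).map { |i| StructKey.new(i) }
index = struct_keys.to_h { |key| [key, "row #{key.id}"] }

first = struct_keys.first
puts index[first].inspect   # "row 1"

first.id = 100              # mutating the key changes first.hash
puts index[first].inspect   # nil: the entry is still filed under the old hash

index.rehash                # recompute hashes for the current key states
puts index[first].inspect   # "row 1"

# The same mistake is not possible with Data: there is no id= to call.
data_key = DataKey.new(id: 1)
puts data_key.respond_to?(:id=)  # false
```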
Memory sharing scenarios favor Data objects. Since their attributes cannot be reassigned, multiple references to the same Data instance don't risk unexpected mutations, provided the attribute values themselves are immutable. Struct instances require defensive copying in shared contexts.
# Memory sharing example
shared_config = Data.define(:host, :port, :ssl).new(
host: "api.example.com",
port: 443,
ssl: true
)
# Safe to share across threads and contexts
clients = Array.new(10) do |i|
# Each client can safely reference shared config
{ id: i, config: shared_config }
end
# Struct requires defensive copying
StructConfig = Struct.new(:host, :port, :ssl)
base_config = StructConfig.new("api.example.com", 443, true)
clients = Array.new(10) do |i|
# Must duplicate to prevent accidental mutations
{ id: i, config: base_config.dup }
end
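One caveat is that Data freezes only the instance, not the objects it refers to. The sketch below shows the gap and one possible guard; Settings and StrictSettings are hypothetical names, and Ruby 3.2 or later is assumed.

```ruby
# Requires Ruby 3.2+. Names are hypothetical, for illustration only.
Settings = Data.define(:name, :tags)

settings = Settings.new(name: "api", tags: ["stable"])
settings.tags << "beta"          # allowed: the Array itself is not frozen
puts settings.tags.inspect       # ["stable", "beta"]

# One possible guard: freeze copies of the values during initialization.
StrictSettings = Data.define(:name, :tags) do
  def initialize(name:, tags:)
    super(name: name.dup.freeze, tags: tags.map { |tag| tag.dup.freeze }.freeze)
  end
end

strict = StrictSettings.new(name: "api", tags: ["stable"])
begin
  strict.tags << "beta"
rescue FrozenError => e
  puts "FrozenError: #{e.message}"
end
```

Copying before freezing keeps the caller's own objects unfrozen, at the cost of one extra allocation per value.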
Common Pitfalls
Mutability assumptions cause frequent errors when switching between Struct and Data. Code expecting mutable behavior fails with Data objects, while functional code may not account for Struct mutations.
# Dangerous assumption with Data
def update_coordinates(point, x, y)
point.x = x # NoMethodError with Data objects (no setters)
point.y = y
point
end
# Correct approach for both
def update_coordinates(point, x, y)
if point.respond_to?(:with)
point.with(x: x, y: y) # Data
else
point.dup.tap { |p| p.x = x; p.y = y } # Struct
end
end
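Initialization syntax differences create subtle bugs. Struct accepts positional arguments, keyword arguments, or a mix depending on configuration, and silently fills anything missing with nil. Data accepts positional or keyword arguments but raises ArgumentError unless every attribute is supplied.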
# Struct flexibility can hide bugs (behavior shown for Ruby 3.2+)
Person = Struct.new(:name, :age)
person1 = Person.new("Alice", 30) # Positional
person2 = Person.new(age: 25) # Partial keyword: name is nil
person3 = Person.new("Bob", age: 40) # Mixed - dangerous! age becomes the Hash {age: 40}
# Data consistency: either style, but never a partial set
Person = Data.define(:name, :age)
person1 = Person.new(name: "Alice", age: 30) # Keywords
person2 = Person.new("Bob", 25) # Positional also works
# person3 = Person.new(name: "Carol") # ArgumentError: missing keyword
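The keyword_init option controls how a Struct interprets its arguments. The sketch below contrasts the three settings as they behave on Ruby 3.2 and later; the class names are hypothetical.

```ruby
# Behavior shown is for Ruby 3.2+. Names are hypothetical.
Flexible = Struct.new(:name, :age)                        # keyword_init: nil (default)
KeywordOnly = Struct.new(:name, :age, keyword_init: true)
PositionalOnly = Struct.new(:name, :age, keyword_init: false)

# Default: keywords on their own are treated as keywords...
puts Flexible.new(name: "Ann", age: 30).age.inspect    # 30
# ...but next to a positional value they become one positional Hash.
puts Flexible.new("Ann", age: 30).age.inspect          # a Hash, not 30

# keyword_init: true rejects positional arguments.
begin
  KeywordOnly.new("Ann", 30)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end

# keyword_init: false always treats the keywords as a single Hash value.
puts PositionalOnly.new(name: "Ann", age: 30).name.inspect  # a Hash
```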
Pattern matching behavior varies between the classes. Both define #deconstruct_keys, so hash patterns match Struct and Data alike. Array patterns depend on #deconstruct, which Struct provides, so code that must accept either class should rely on hash patterns and include an else branch.
# Pattern matching pitfall
def process_point(point)
  case point
  in { x: 0, y: } # Hash pattern: works with both Struct and Data
    "On Y axis"
  in [0, y] # Array pattern: relies on #deconstruct, which Struct provides
    "On Y axis"
  end # No else branch: unmatched input raises NoMatchingPatternError
end
# Robust pattern matching
def process_point(point)
  case point
  in { x: 0, y: }
    "On Y axis: #{y}"
  in { x:, y: 0 }
    "On X axis: #{x}"
  in { x:, y: }
    "Point at #{x}, #{y}"
  else
    "Not a point: #{point.inspect}"
  end
end
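Thread safety misconceptions occur frequently. Data instances cannot have attributes reassigned, so they are safe to read from many threads as long as the attribute values are not mutated, but replacing a shared reference with a new instance still needs synchronization. Struct instances require synchronization for any shared mutation.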
# Thread safety pitfall with Struct
counter = Struct.new(:value).new(0)
threads = 10.times.map do
Thread.new do
1000.times { counter.value += 1 } # Race condition
end
end
threads.each(&:join)
# counter.value is unpredictable
# Data approach requires different pattern
Counter = Data.define(:value)
counter = Counter.new(value: 0)
mutex = Mutex.new
threads = 10.times.map do
Thread.new do
1000.times do
mutex.synchronize do
counter = counter.with(value: counter.value + 1)
end
end
end
end
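For comparison, the Struct race above can be removed with the same Mutex approach. This is a minimal sketch; Tally and count_with_lock are hypothetical names.

```ruby
# Tally and count_with_lock are hypothetical names for this sketch.
Tally = Struct.new(:value)

def count_with_lock(threads: 10, increments: 1000)
  tally = Tally.new(0)
  lock = Mutex.new
  workers = Array.new(threads) do
    Thread.new do
      increments.times do
        # The read-modify-write happens entirely inside the lock.
        lock.synchronize { tally.value += 1 }
      end
    end
  end
  workers.each(&:join)
  tally.value
end

puts count_with_lock  # 10000
```

In both variants the lock protects the read-modify-write step; immutability removes the need for it only when no shared reference is being replaced.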
Serialization and deserialization behavior differs subtly. Both classes expose #to_h, so a portable approach is to serialize the attribute hash and rebuild the object on load; built-in support for dumping the objects directly varies by serializer and version. A rebuilt Data instance is frozen again, while a rebuilt Struct is mutable.
# Serialization behavior
require 'yaml'
StructPoint = Struct.new(:x, :y)
DataPoint = Data.define(:x, :y)
struct_point = StructPoint.new(10, 20)
data_point = DataPoint.new(x: 10, y: 20)
# Both serialize similarly through their attribute hashes
struct_yaml = YAML.dump(struct_point.to_h)
data_yaml = YAML.dump(data_point.to_h)
# But deserialize with different mutability
restored_struct = StructPoint.new(**YAML.load(struct_yaml))
restored_data = DataPoint.new(**YAML.load(data_yaml))
restored_struct.x = 30 # Works
# restored_data.x = 30 # NoMethodError
Production Patterns
Web application contexts often require different approaches for Struct and Data usage. Data objects work well for configuration, request/response objects, and functional pipelines, while Struct fits mutable model attributes and builder patterns.
# API response modeling with Data
APIResponse = Data.define(:status, :data, :errors) do
def success?
status == 200 && errors.empty?
end
def with_error(error)
with(errors: errors + [error])
end
def transform_data(&block)
return self unless success?
with(data: block.call(data))
end
end
# Usage in Rails controller
class UsersController < ApplicationController
def show
user = User.find(params[:id])
response = APIResponse.new(
status: 200,
data: user.as_json,
errors: []
)
enriched = response
.transform_data { |data| data.merge(preferences: user.preferences) }
.transform_data { |data| data.merge(avatar_url: avatar_service.url_for(user)) }
render json: enriched.data
rescue ActiveRecord::RecordNotFound => e
error_response = APIResponse.new(status: 404, data: nil, errors: [e.message])
render json: error_response.to_h, status: 404
end
end
Database integration patterns highlight the differences in approach. Struct objects can represent mutable Active Record-style attributes, while Data objects work better for value objects and immutable domain models.
# Struct for mutable database representations
class UserProfile < Struct.new(:user_id, :bio, :website, :location, keyword_init: true)
def self.from_database(row)
new(
user_id: row['user_id'],
bio: row['bio'],
website: row['website'],
location: row['location']
)
end
def update_from_params(params)
self.bio = params[:bio] if params.key?(:bio)
self.website = params[:website] if params.key?(:website)
self.location = params[:location] if params.key?(:location)
end
def to_database_hash
{ user_id: user_id, bio: bio, website: website, location: location }
end
end
# Data for immutable domain models
Address = Data.define(:street, :city, :state, :zip_code) do
def self.from_string(address_string)
parts = address_string.split(', ')
new(
street: parts[0],
city: parts[1],
state: parts[2]&.split(' ')&.first,
zip_code: parts[2]&.split(' ')&.last
)
end
def formatted
"#{street}, #{city}, #{state} #{zip_code}"
end
def in_state?(target_state)
state.downcase == target_state.downcase
end
end
Background job processing shows clear distinctions in usage patterns. Data objects excel as immutable job parameters, while Struct objects work well for mutable job state tracking.
# Data for immutable job parameters
EmailJob = Data.define(:recipient, :subject, :template, :variables) do
def perform
EmailService.send_email(
to: recipient,
subject: subject,
body: TemplateRenderer.render(template, variables)
)
end
def retry_with_delay(delay_seconds)
with(variables: variables.merge(retry_delay: delay_seconds))
end
end
# Struct for mutable job tracking
class JobStatus < Struct.new(:job_id, :status, :progress, :started_at, :completed_at, keyword_init: true)
def start!
self.status = 'running'
self.started_at = Time.current
save_to_redis
end
def update_progress!(percent)
self.progress = percent
save_to_redis
end
def complete!
self.status = 'completed'
self.progress = 100
self.completed_at = Time.current
save_to_redis
end
private
def save_to_redis
# REDIS is assumed to be an application-level client; Redis.current was removed in redis-rb 5
REDIS.setex("job:#{job_id}", 3600, to_h.to_json)
end
end
Caching strategies require different approaches. Data objects make reliable in-process Hash keys because their attributes cannot be reassigned after insertion, while Struct objects need careful handling to avoid stale entries and cache invalidation issues. For cache keys shared across processes, derive the key from the attribute values rather than from #hash, which is seeded differently in each Ruby process.
# Data objects as cache keys
require 'digest'
UserPreferences = Data.define(:theme, :language, :timezone, :notifications) do
  def cache_key
    # Digest of the attribute values: stable across processes, unlike #hash
    "preferences:#{Digest::SHA256.hexdigest(to_h.to_json)}"
  end
  def self.cached_for_user(user_id)
    cache_key = "user_preferences:#{user_id}"
    Rails.cache.fetch(cache_key, expires_in: 1.hour) do
      # Load from database and return Data object
      row = Database.query("SELECT * FROM user_preferences WHERE user_id = ?", user_id).first
      new(
        theme: row['theme'],
        language: row['language'],
        timezone: row['timezone'],
        notifications: JSON.parse(row['notifications'])
      )
    end
  end
end
# Struct requires cache invalidation management
class MutableUserPreferences < Struct.new(:user_id, :theme, :language, :timezone, :notifications, keyword_init: true)
def save!
Database.query("UPDATE user_preferences SET ... WHERE user_id = ?", user_id)
invalidate_cache
end
def update_theme(new_theme)
self.theme = new_theme
save!
end
private
def invalidate_cache
Rails.cache.delete("user_preferences:#{user_id}")
end
end
Reference
Class Creation Methods
Method | Parameters | Returns | Description |
---|---|---|---|
Struct.new(*attrs, keyword_init: nil, &block) | attrs (Symbols), keyword_init (Boolean or nil) | Class | Creates new Struct class with specified attributes |
Data.define(*attrs, &block) | attrs (Symbols) | Class | Creates new Data class with specified attributes |
Instance Creation
Method | Parameters | Returns | Description |
---|---|---|---|
StructClass.new(*values) | values (Array) | StructInstance | Creates struct instance with positional arguments |
StructClass.new(**kwargs) | kwargs (Hash) | StructInstance | Creates struct instance with keyword arguments (unless keyword_init: false) |
DataClass.new(*values) | values (Array) | DataInstance | Creates data instance with positional arguments, one per attribute |
DataClass.new(**kwargs) | kwargs (Hash) | DataInstance | Creates data instance with keyword arguments; all attributes required |
Instance Methods - Common
Method | Parameters | Returns | Description |
---|---|---|---|
#to_h | None | Hash | Returns hash of attribute name/value pairs |
#== | other (Object) | Boolean | Compares objects by class and attribute values |
#eql? | other (Object) | Boolean | Strict equality comparison (attributes compared with eql?) |
#hash | None | Integer | Returns hash value computed from attribute values |
#inspect | None | String | Returns string representation |
#deconstruct_keys | keys (Array or nil) | Hash | Returns hash for pattern matching |
Instance Methods - Struct Only
Method | Parameters | Returns | Description |
---|---|---|---|
#to_a | None | Array | Returns array of attribute values |
#deconstruct | None | Array | Returns array for pattern matching |
#[](name_or_index) | name_or_index (Symbol/Integer) | Object | Gets attribute value by name or index |
#[]=(name_or_index, value) | name_or_index (Symbol/Integer), value (Object) | Object | Sets attribute value by name or index |
#each | &block | Enumerator/self | Iterates over attribute values |
#each_pair | &block | Enumerator/self | Iterates over attribute name/value pairs |
#length | None | Integer | Returns number of attributes |
#size | None | Integer | Alias for length |
Instance Methods - Data Only
Method | Parameters | Returns | Description |
---|---|---|---|
#with(**kwargs) | kwargs (Hash) | DataInstance | Returns new instance with updated attributes |
Mutability Characteristics
Feature | Struct | Data |
---|---|---|
Attribute Modification | Mutable via attr= methods | Immutable, frozen after creation |
In-place Updates | Supported | Not supported: no setters exist (NoMethodError); other mutation attempts raise FrozenError |
Thread Safety | Requires synchronization | Safe to read concurrently as long as attribute values are themselves immutable |
Memory Sharing | Requires defensive copying | Safe to share references |
Functional Patterns | Requires explicit copying | Built-in via #with method |
Initialization Patterns
Pattern | Struct | Data |
---|---|---|
Positional Args | Point.new(x, y) | Point.new(x, y) |
Keyword Args | Point.new(x: 1, y: 2) (unless keyword_init: false) | Point.new(x: 1, y: 2) |
Mixed Args | Accepted, but trailing keywords become one positional Hash; discouraged | Not supported |
Partial Init | Fills missing with nil | Requires all attributes |
Default Values | Via custom initialize | Via factory methods or custom initialize |
Performance Characteristics
Operation | Struct | Data |
---|---|---|
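Creation | Faster | Slightly slower (arguments are converted to keywords, instance is frozen) |
Modification | In-place, very fast | Creates new instance, slower |
Hash Operations | Computed from attributes on each call | Computed from attributes on each call; stable because attributes cannot be reassigned |
Equality Checks | Attribute-by-attribute comparison | Attribute-by-attribute comparison |
Memory Usage | In-place updates allocate no new instance | Each #with allocates a new instance; unchanged instances can be shared |
GC Pressure | Lower for mutations | Higher for transformations |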
Pattern Matching Support
Feature | Struct | Data |
---|---|---|
Array Patterns | in [x, y] | Not supported (Data is not array-like); use hash patterns |
Hash Patterns | in {x:, y:} | in {x:, y:} |
Deconstruction | Via #to_a, #deconstruct and #deconstruct_keys | Via #deconstruct_keys |
Variable Binding | Automatic via pattern matching | Automatic via pattern matching |
Guard Clauses | Supported (in ... if condition) | Supported (in ... if condition) |
Common Error Conditions
Error | Struct | Data |
---|---|---|
FrozenError | Only if explicitly frozen | Raised by low-level mutation such as instance_variable_set |
ArgumentError | Too many arguments or unknown keywords | Missing or unknown keywords, too many positional arguments |
NoMethodError | Invalid attribute names | Invalid attribute names, any setter call |
TypeError | Type mismatches in custom logic | Type mismatches in custom logic |