CrackedRuby CrackedRuby

Entity-Relationship Modeling

Overview

Entity-Relationship Modeling represents the logical structure of databases through entities, attributes, and relationships. Developed by Peter Chen in 1976, ER modeling provides a visual and conceptual framework for database design that precedes physical implementation. The model abstracts real-world scenarios into a schema that database management systems can implement.

An entity represents a distinguishable object or concept in the problem domain. Attributes describe properties of entities. Relationships define associations between entities. These three components form the foundation of database schema design.

# Conceptual ER model represented in Ruby classes
class Customer
  attr_accessor :customer_id, :name, :email
  # Relationship: Customer has_many Orders
end

class Order
  attr_accessor :order_id, :order_date, :total
  # Relationship: Order belongs_to Customer
  # Relationship: Order has_many LineItems
end

ER modeling serves as the bridge between business requirements and database implementation. The process converts narrative descriptions of data requirements into structured schemas that eliminate redundancy and ensure data integrity. Without proper ER modeling, databases suffer from update anomalies, deletion anomalies, and insertion anomalies that compromise data quality.

The modeling process operates at three levels of abstraction: conceptual, logical, and physical. Conceptual models capture high-level business requirements independent of any database system. Logical models refine the conceptual model with additional detail while remaining platform-independent. Physical models translate the logical design into database-specific implementation including indexes, constraints, and storage parameters.

Key Principles

Entities represent objects or concepts that require persistent storage. An entity possesses independent existence and contains multiple instances. Each instance must be distinguishable from other instances through unique identification. Strong entities exist independently while weak entities depend on other entities for identification.

# Strong entity - exists independently
class Product
  attr_accessor :product_id, :name, :price
end

# Weak entity - depends on Order for identification
class LineItem
  attr_accessor :order_id, :line_number, :quantity, :price
end

Attributes describe properties of entities. Attributes divide into simple (atomic), composite (divisible into subparts), derived (calculated from other attributes), and multivalued (multiple values per entity). The choice of attribute type affects normalization and query efficiency.

Simple attributes contain indivisible values. A product price represents a simple attribute. Composite attributes divide into constituent parts. An address breaks down into street, city, state, and postal code. Derived attributes calculate from other values. Total order amount derives from line item quantities and prices. Multivalued attributes store multiple values. A product might have multiple images or tags.

Relationships connect entities through associations. Relationships possess cardinality (how many instances relate) and participation (whether all instances must participate). Cardinality includes one-to-one (1:1), one-to-many (1:N), and many-to-many (M:N) relationships.

# One-to-many relationship
class Author
  attr_accessor :author_id, :name
  # One author writes many books
end

class Book
  attr_accessor :book_id, :title, :author_id
  # Many books belong to one author
end

# Many-to-many relationship requires junction table
class Student
  attr_accessor :student_id, :name
end

class Course
  attr_accessor :course_id, :title
end

class Enrollment
  attr_accessor :student_id, :course_id, :enrollment_date, :grade
end

Participation constraints specify whether entity instances must participate in relationships. Total participation requires every entity instance to participate. Partial participation allows instances to exist without participating. An order must have a customer (total participation), but a customer need not have orders (partial participation).

Cardinality ratios define the maximum number of relationship instances an entity can participate in. One-to-one relationships limit both entities to single instances. One-to-many relationships allow one entity to relate to multiple instances of another. Many-to-many relationships permit multiple instances on both sides.

Keys uniquely identify entity instances. A primary key consists of one or more attributes that uniquely identify each instance. Candidate keys represent alternative unique identifiers. Foreign keys reference primary keys in related entities to establish relationships. Composite keys combine multiple attributes for unique identification.

# Primary key: user_id
# Foreign key: department_id references departments
class User
  attr_accessor :user_id, :username, :email, :department_id
end

# Composite key: course_id + semester
class CourseOffering
  attr_accessor :course_id, :semester, :instructor, :capacity
end

Design Considerations

Granularity decisions affect storage efficiency and query complexity. Fine-grained entities with many small tables increase join operations but reduce redundancy. Coarse-grained entities with fewer large tables simplify queries but introduce duplication. The optimal granularity balances query performance against storage efficiency.

Consider a contact management system. Separate entities for Person, EmailAddress, PhoneNumber, and Address maximize normalization but require multiple joins. A single Contact entity with composite attributes simplifies queries but duplicates data when persons share addresses.

# Fine-grained approach
class Person
  attr_accessor :person_id, :first_name, :last_name
end

class EmailAddress
  attr_accessor :email_id, :person_id, :email, :email_type
end

# Coarse-grained approach
class Contact
  attr_accessor :contact_id, :name, :email, :phone, :address
end

Relationship representation choices impact query patterns and data integrity. Many-to-many relationships require junction tables that add complexity but maintain flexibility. Converting many-to-many to two one-to-many relationships through junction entities allows additional relationship attributes.

A product-supplier relationship illustrates this pattern. Products and suppliers have a many-to-many relationship since products come from multiple suppliers and suppliers provide multiple products. The junction table stores supply-specific data like cost and lead time.

class Product
  attr_accessor :product_id, :name
end

class Supplier
  attr_accessor :supplier_id, :name
end

class ProductSupplier
  attr_accessor :product_id, :supplier_id, :cost, :lead_time_days
end

Attribute placement determines which entity owns particular data. Attributes belong to the entity they most directly describe. Misplaced attributes create update anomalies and complicate queries. An order date belongs to Order, not Customer, even though customers place orders.

Redundancy elimination through normalization removes update anomalies. First normal form eliminates repeating groups. Second normal form removes partial dependencies. Third normal form eliminates transitive dependencies. Higher normal forms address specific anomaly types. Over-normalization creates excessive joins while under-normalization causes anomalies.

# Unnormalized - order contains repeating product data
class Order
  attr_accessor :order_id, :customer_id, :product_id, :product_name, :product_price
end

# Normalized - product data separated
class Order
  attr_accessor :order_id, :customer_id
end

class OrderLine
  attr_accessor :order_id, :product_id, :quantity
end

class Product
  attr_accessor :product_id, :name, :price
end

Inheritance modeling represents specialization hierarchies. Supertype entities contain common attributes while subtype entities add specialized attributes. Implementation strategies include single table inheritance, class table inheritance, and concrete table inheritance. Each strategy trades query simplicity against storage efficiency.

Single table inheritance stores all subtypes in one table with a type discriminator column. This approach simplifies queries but wastes space with null values. Class table inheritance creates separate tables for supertype and subtypes with shared primary keys. Concrete table inheritance creates independent tables for each subtype with duplicated common attributes.

Temporal considerations capture how data changes over time. Effective dating tracks when relationships or attributes are valid. Historical tracking maintains audit trails of changes. Current-state-only models simplify queries but lose history. The temporal strategy depends on business requirements for historical analysis and audit capabilities.

Ruby Implementation

Ruby implements ER concepts primarily through Object-Relational Mapping (ORM) frameworks. ActiveRecord, the ORM in Ruby on Rails, translates ER models into Ruby classes and database tables. The framework provides declarative syntax for defining entities, relationships, and constraints.

Entity Definition uses Ruby classes that inherit from ActiveRecord::Base. Each class represents an entity with attributes mapped to table columns. Migrations define the database schema with column types and constraints.

class User < ApplicationRecord
  # Implicit attribute definition through database schema
  # Columns: id, username, email, created_at, updated_at
end

# Migration defines entity structure
class CreateUsers < ActiveRecord::Migration[7.0]
  def change
    create_table :users do |t|
      t.string :username, null: false, limit: 50
      t.string :email, null: false
      t.timestamps
      
      t.index :username, unique: true
      t.index :email, unique: true
    end
  end
end

Relationship Declaration uses association methods that generate relationship behavior. The belongs_to method defines the many side of one-to-many relationships. The has_many method defines the one side. The has_one method creates one-to-one relationships. The has_and_belongs_to_many and has_many :through methods implement many-to-many relationships.

class Author < ApplicationRecord
  has_many :books, dependent: :destroy
  has_one :biography
end

class Book < ApplicationRecord
  belongs_to :author
  has_many :reviews
  has_many :book_categories
  has_many :categories, through: :book_categories
end

class Category < ApplicationRecord
  has_many :book_categories
  has_many :books, through: :book_categories
end

class BookCategory < ApplicationRecord
  belongs_to :book
  belongs_to :category
end

Foreign Key Constraints enforce referential integrity at the database level. Rails migrations add foreign key constraints that prevent orphaned records and invalid relationships.

class CreateOrders < ActiveRecord::Migration[7.0]
  def change
    create_table :orders do |t|
      t.references :customer, null: false, foreign_key: true
      t.decimal :total, precision: 10, scale: 2
      t.timestamps
    end
    
    add_foreign_key :orders, :customers, on_delete: :restrict
  end
end

Polymorphic Associations model relationships where an entity relates to multiple entity types. A comment might belong to either a post or a photo. Polymorphic associations store the related entity type and ID.

class Comment < ApplicationRecord
  belongs_to :commentable, polymorphic: true
end

class Post < ApplicationRecord
  has_many :comments, as: :commentable
end

class Photo < ApplicationRecord
  has_many :comments, as: :commentable
end

# Migration for polymorphic association
class CreateComments < ActiveRecord::Migration[7.0]
  def change
    create_table :comments do |t|
      t.references :commentable, polymorphic: true, null: false
      t.text :content
      t.timestamps
    end
  end
end

Single Table Inheritance implements specialization hierarchies with a type column that distinguishes subtypes. All subtypes share the same table with subtype-specific columns nullable.

class Vehicle < ApplicationRecord
  # Type column: 'Car', 'Truck', 'Motorcycle'
end

class Car < Vehicle
  # Inherits all Vehicle attributes
  # Specific behavior for cars
end

class Truck < Vehicle
  # Specific behavior for trucks
end

# Migration includes type column
class CreateVehicles < ActiveRecord::Migration[7.0]
  def change
    create_table :vehicles do |t|
      t.string :type, null: false
      t.string :make
      t.string :model
      t.integer :year
      t.integer :cargo_capacity  # Truck-specific, null for others
      t.integer :passenger_count # Car-specific, null for others
      t.timestamps
    end
    
    t.index :type
  end
end

Delegated Types provide an alternative to single table inheritance with separate tables for each subtype sharing a common interface through a delegator table.

class Entry < ApplicationRecord
  delegated_type :entryable, types: %w[Message Comment]
end

class Message < ApplicationRecord
  has_one :entry, as: :entryable, touch: true
end

class Comment < ApplicationRecord
  has_one :entry, as: :entryable, touch: true
end

Composite Primary Keys identify records through multiple columns. The composite_primary_keys gem extends ActiveRecord for composite key support.

class CourseEnrollment < ApplicationRecord
  self.primary_keys = :student_id, :course_id
  
  belongs_to :student
  belongs_to :course
end

Practical Examples

E-commerce Order System demonstrates entity decomposition and relationship modeling. The system tracks customers, orders, products, and inventory.

class Customer < ApplicationRecord
  has_many :orders
  has_one :loyalty_account
  
  validates :email, presence: true, uniqueness: true
end

class Order < ApplicationRecord
  belongs_to :customer
  has_many :line_items, dependent: :destroy
  has_many :products, through: :line_items
  
  validates :order_date, presence: true
  validates :status, inclusion: { in: %w[pending confirmed shipped delivered cancelled] }
end

class LineItem < ApplicationRecord
  belongs_to :order
  belongs_to :product
  
  validates :quantity, numericality: { greater_than: 0 }
  validates :price, numericality: { greater_than_or_equal_to: 0 }
  
  def subtotal
    quantity * price
  end
end

class Product < ApplicationRecord
  has_many :line_items
  has_many :orders, through: :line_items
  belongs_to :category
  
  validates :sku, presence: true, uniqueness: true
  validates :name, presence: true
  validates :price, numericality: { greater_than: 0 }
end

class Category < ApplicationRecord
  has_many :products
  belongs_to :parent_category, class_name: 'Category', optional: true
  has_many :subcategories, class_name: 'Category', foreign_key: :parent_category_id
end

This schema separates concerns across entities. Customer contains customer-specific data. Order tracks order-level information like date and status. LineItem represents the many-to-many relationship between orders and products while storing order-specific pricing and quantities. Products maintain current catalog information independent of historical orders.

Content Management System illustrates polymorphic relationships and inheritance hierarchies.

class User < ApplicationRecord
  has_many :articles, foreign_key: :author_id
  has_many :comments
end

class Article < ApplicationRecord
  belongs_to :author, class_name: 'User'
  has_many :comments, as: :commentable
  has_many :taggings, as: :taggable
  has_many :tags, through: :taggings
  
  validates :title, presence: true, length: { maximum: 200 }
  validates :content, presence: true
end

class Comment < ApplicationRecord
  belongs_to :user
  belongs_to :commentable, polymorphic: true
  belongs_to :parent_comment, class_name: 'Comment', optional: true
  has_many :replies, class_name: 'Comment', foreign_key: :parent_comment_id
  
  validates :content, presence: true, length: { minimum: 1, maximum: 1000 }
end

class Tag < ApplicationRecord
  has_many :taggings
  has_many :articles, through: :taggings, source: :taggable, source_type: 'Article'
  
  validates :name, presence: true, uniqueness: true
end

class Tagging < ApplicationRecord
  belongs_to :tag
  belongs_to :taggable, polymorphic: true
end

The polymorphic commentable association allows comments on multiple entity types without code duplication. The self-referential parent_comment relationship enables threaded discussions. Tags connect to articles through a join table supporting many-to-many relationships.

Project Management System demonstrates complex relationships and temporal data.

class Project < ApplicationRecord
  belongs_to :organization
  has_many :project_memberships
  has_many :users, through: :project_memberships
  has_many :tasks
  
  validates :name, presence: true
  validates :status, inclusion: { in: %w[planning active completed archived] }
end

class Task < ApplicationRecord
  belongs_to :project
  belongs_to :assigned_to, class_name: 'User', optional: true
  belongs_to :created_by, class_name: 'User'
  belongs_to :parent_task, class_name: 'Task', optional: true
  has_many :subtasks, class_name: 'Task', foreign_key: :parent_task_id
  has_many :task_dependencies, foreign_key: :task_id
  has_many :blocking_tasks, through: :task_dependencies, source: :depends_on_task
  
  validates :title, presence: true
  validates :priority, inclusion: { in: %w[low medium high critical] }
end

class TaskDependency < ApplicationRecord
  belongs_to :task
  belongs_to :depends_on_task, class_name: 'Task'
  
  validates :dependency_type, inclusion: { in: %w[finish_to_start start_to_start] }
end

class ProjectMembership < ApplicationRecord
  belongs_to :project
  belongs_to :user
  
  validates :role, inclusion: { in: %w[owner admin member viewer] }
  validates :joined_at, presence: true
end

This design captures project hierarchy, task dependencies, and user roles. The TaskDependency entity models relationships between tasks with dependency types. ProjectMembership tracks user participation with role-based access control. Self-referential tasks support task breakdown into subtasks.

Common Patterns

Audit Trail Pattern tracks changes to entity state over time through history tables or event logs.

class Document < ApplicationRecord
  has_many :document_versions, dependent: :destroy
  
  after_save :create_version
  
  private
  
  def create_version
    document_versions.create(
      content: content,
      version_number: document_versions.count + 1,
      created_by: current_user_id
    )
  end
end

class DocumentVersion < ApplicationRecord
  belongs_to :document
  belongs_to :creator, class_name: 'User', foreign_key: :created_by
  
  validates :version_number, presence: true
  validates :content, presence: true
end

Soft Delete Pattern marks records as deleted without physical removal, maintaining referential integrity and enabling recovery.

class Account < ApplicationRecord
  default_scope { where(deleted_at: nil) }
  
  def soft_delete
    update(deleted_at: Time.current)
  end
  
  def restore
    update(deleted_at: nil)
  end
  
  def self.with_deleted
    unscope(where: :deleted_at)
  end
end

Effective Dating Pattern tracks when relationships or attributes are valid through start and end dates.

class EmployeeDepartment < ApplicationRecord
  belongs_to :employee
  belongs_to :department
  
  validates :effective_from, presence: true
  validate :dates_must_be_logical
  
  scope :current, -> { where('effective_from <= ? AND (effective_to IS NULL OR effective_to > ?)', 
                              Time.current, Time.current) }
  
  private
  
  def dates_must_be_logical
    return unless effective_to.present?
    errors.add(:effective_to, 'must be after effective_from') if effective_to <= effective_from
  end
end

State Machine Pattern models entities with defined states and transitions.

class Workflow < ApplicationRecord
  has_many :workflow_transitions
  
  enum status: {
    draft: 0,
    submitted: 1,
    approved: 2,
    rejected: 3,
    completed: 4
  }
  
  validates :status, presence: true
  
  def submit!
    return false unless draft?
    
    transaction do
      submitted!
      workflow_transitions.create!(
        from_state: 'draft',
        to_state: 'submitted',
        transitioned_at: Time.current
      )
    end
  end
end

class WorkflowTransition < ApplicationRecord
  belongs_to :workflow
  
  validates :from_state, :to_state, :transitioned_at, presence: true
end

Party Pattern unifies person and organization entities through a common supertype for systems dealing with multiple party types.

class Party < ApplicationRecord
  has_many :contacts
  has_many :addresses
  
  validates :type, presence: true
end

class Person < Party
  validates :first_name, :last_name, presence: true
  
  def full_name
    "#{first_name} #{last_name}"
  end
end

class Organization < Party
  validates :legal_name, presence: true
  has_many :employees, class_name: 'Person'
end

Common Pitfalls

Many-to-Many Without Junction Attributes causes data loss when relationship attributes exist but no junction table stores them. A student-course relationship requires enrollment date and grade storage in the junction table.

# Incorrect - loses enrollment data
class Student < ApplicationRecord
  has_and_belongs_to_many :courses
end

# Correct - preserves relationship attributes
class Student < ApplicationRecord
  has_many :enrollments
  has_many :courses, through: :enrollments
end

class Enrollment < ApplicationRecord
  belongs_to :student
  belongs_to :course
  
  validates :enrolled_at, presence: true
  validates :grade, inclusion: { in: %w[A B C D F] }, allow_nil: true
end

Circular Dependencies create update and deletion complexities when entities reference each other. Employee has a manager who is also an employee. Department has a head who is an employee in that department.

# Problematic without careful handling
class Employee < ApplicationRecord
  belongs_to :department
  belongs_to :manager, class_name: 'Employee', optional: true
end

class Department < ApplicationRecord
  belongs_to :head, class_name: 'Employee', optional: true
  has_many :employees
end

# Requires deferred constraint checking or careful ordering

Redundant Relationships complicate maintenance when multiple paths connect entities. If Order belongs to Customer and LineItem belongs to both Order and Customer, the Customer-LineItem relationship duplicates information.

# Redundant - customer_id duplicated
class LineItem < ApplicationRecord
  belongs_to :order
  belongs_to :customer  # Redundant, derivable through order
  belongs_to :product
end

# Sufficient relationship
class LineItem < ApplicationRecord
  belongs_to :order
  belongs_to :product
  
  delegate :customer, to: :order
end

Null Foreign Keys in Required Relationships allow orphaned records that violate business rules. An order must belong to a customer, but nullable customer_id permits invalid data.

# Permits invalid data
class CreateOrders < ActiveRecord::Migration[7.0]
  def change
    create_table :orders do |t|
      t.references :customer  # Nullable
    end
  end
end

# Enforces requirement
class CreateOrders < ActiveRecord::Migration[7.0]
  def change
    create_table :orders do |t|
      t.references :customer, null: false, foreign_key: true
    end
  end
end

Overusing Inheritance complicates schemas when composition provides better flexibility. Single table inheritance with many subtypes creates sparse tables with numerous null values.

# Overused inheritance
class Animal < ApplicationRecord
  # Many subtype-specific nullable columns
end

class Dog < Animal
end

class Cat < Animal
end

class Bird < Animal
end

# Better with composition
class Animal < ApplicationRecord
  belongs_to :species
  has_many :animal_traits
end

class Species < ApplicationRecord
  has_many :animals
end

class AnimalTrait < ApplicationRecord
  belongs_to :animal
end

Missing Indexes on Foreign Keys degrade query performance for joins and lookups. Foreign key columns without indexes cause full table scans.

# Missing index causes slow queries
class CreateOrders < ActiveRecord::Migration[7.0]
  def change
    create_table :orders do |t|
      t.integer :customer_id
    end
  end
end

# Indexed foreign key improves performance
class CreateOrders < ActiveRecord::Migration[7.0]
  def change
    create_table :orders do |t|
      t.references :customer, index: true, foreign_key: true
    end
  end
end

Tools & Ecosystem

ActiveRecord provides the primary ORM implementation in Ruby applications. The framework handles schema definition, relationship management, query generation, and data validation. ActiveRecord follows convention over configuration principles with sensible defaults.

Sequel offers an alternative ORM with different design philosophy. Sequel provides more explicit control over queries and supports advanced database features. The toolkit approach separates concerns more than ActiveRecord.

# Sequel entity definition
DB.create_table :users do
  primary_key :id
  String :username, null: false, unique: true
  String :email, null: false
  DateTime :created_at
end

class User < Sequel::Model
  one_to_many :posts
  many_to_one :account
end

Rails ERD generates entity-relationship diagrams from ActiveRecord models. The gem analyzes model definitions and relationships to produce visual schema documentation.

# Generate ERD diagram
bundle exec erd --attributes=foreign_keys,content

Annotate adds schema information as comments to model files. The gem maintains synchronization between database schema and model documentation.

# == Schema Information
#
# Table name: users
#
#  id         :bigint           not null, primary key
#  username   :string           not null
#  email      :string           not null
#  created_at :datetime         not null
#
class User < ApplicationRecord
end

Strong Migrations detects dangerous schema changes that cause downtime or data loss. The gem analyzes migrations for operations like removing columns, changing column types, or adding non-concurrent indexes.

Schema Plus extends migration capabilities with features like foreign key constraints, indexes, and views. The gem provides cleaner syntax for advanced database features.

create_table :orders do |t|
  t.references :customer, foreign_key: { on_delete: :restrict, on_update: :cascade }
  t.index :created_at
end

Database Design Tools outside Ruby assist with schema planning. Draw.io, Lucidchart, and DBDesigner provide visual modeling interfaces. pgModeler and MySQL Workbench offer database-specific design tools with forward and reverse engineering capabilities.

Reference

Entity Types

Type Description Example
Strong Entity Exists independently with own primary key Customer, Product
Weak Entity Depends on another entity for identification OrderLine, Comment
Associative Entity Represents many-to-many relationship with attributes Enrollment, Assignment
Supertype Abstract entity with common attributes Vehicle, Account
Subtype Specialized entity inheriting from supertype Car, SavingsAccount

Relationship Cardinalities

Cardinality Notation Description Implementation
One-to-One 1:1 Single instance relates to single instance Foreign key with unique constraint
One-to-Many 1:N Single instance relates to multiple instances Foreign key in many side
Many-to-Many M:N Multiple instances relate to multiple instances Junction table with two foreign keys

Participation Constraints

Constraint Notation Description Database Enforcement
Total Double line All instances must participate NOT NULL constraint on foreign key
Partial Single line Instances may not participate Nullable foreign key

Attribute Types

Type Description Storage Approach
Simple Atomic, indivisible value Single column
Composite Divisible into components Multiple columns or separate table
Derived Calculated from other attributes Computed or method
Multivalued Multiple values per entity Separate related table
Key Uniquely identifies entity instance Primary key constraint

Normal Forms

Form Requirement Eliminates
1NF Atomic values, no repeating groups Repeating groups
2NF 1NF plus no partial dependencies Partial dependencies on composite keys
3NF 2NF plus no transitive dependencies Transitive dependencies
BCNF 3NF plus every determinant is candidate key Remaining anomalies in 3NF

ActiveRecord Association Types

Association Purpose Usage
belongs_to Declares many side of 1:N relationship Foreign key in this table
has_one Declares one side of 1:1 relationship Foreign key in other table
has_many Declares one side of 1:N relationship Foreign key in other table
has_many :through Many-to-many with explicit join model Two belongs_to in join model
has_and_belongs_to_many Simple many-to-many without join model Junction table only
has_one :through One-to-one through intermediary Foreign keys in intermediate tables

Association Options

Option Purpose Example
class_name Specify model class when differs from association name class_name: 'User'
foreign_key Override default foreign key column foreign_key: :author_id
primary_key Override default primary key primary_key: :uuid
dependent Define deletion behavior dependent: :destroy
inverse_of Specify inverse association inverse_of: :orders
optional Allow nil belongs_to optional: true
counter_cache Cache associated record count counter_cache: true

Foreign Key Actions

Action Behavior Use Case
restrict Prevent deletion if references exist Prevent orphaning critical data
cascade Delete related records Remove dependent data automatically
nullify Set foreign key to null Preserve records but remove association
no_action No automatic action Handle manually in application