Overview
Logical data models serve as the bridge between business requirements and physical database implementation. A logical data model represents what data the system stores and how different data elements relate to each other, without specifying how the database stores or retrieves that data. This abstraction layer separates business logic from technical implementation details.
The logical data model emerged from the work of Edgar F. Codd in the 1970s with his relational model. Before logical modeling became standard practice, applications directly manipulated physical storage structures, creating tight coupling between business logic and storage mechanisms. Logical models introduced a conceptual layer that represents data from a business perspective rather than a technical one.
A logical data model consists of entities (things the business cares about), attributes (properties of those entities), and relationships (how entities connect to each other). For example, a retail system might have Customer and Order entities. The Customer entity has attributes like name and email. A relationship connects Customer to Order, indicating that customers place orders.
# Conceptual representation in Ruby using a simple class structure
class Customer
attr_accessor :id, :name, :email
def orders
Order.where(customer_id: id)
end
end
class Order
attr_accessor :id, :customer_id, :order_date, :total_amount
def customer
Customer.find(customer_id)
end
end
Logical data models differ from conceptual and physical models. A conceptual model provides a high-level view of major entities and relationships without detailed attributes. A logical model adds complete attribute definitions and relationship rules. A physical model translates the logical model into database-specific structures like tables, columns, indexes, and partitions.
The primary value of logical modeling lies in communication and validation. Business stakeholders can review logical models to verify that the system captures their requirements correctly. Developers use logical models to understand data relationships before writing code. Database administrators translate logical models into physical designs optimized for specific database systems.
Key Principles
Entity Definition: An entity represents a person, place, thing, or concept about which the system stores information. Entities must be distinguishable from each other through unique identifiers. Each entity instance corresponds to one real-world occurrence. The Customer entity represents individual customers, not the concept of customers in general.
Attribute Specification: Attributes describe entity properties. Each attribute has a name, data type, and constraints. The data type defines what values the attribute can hold (text, number, date). Constraints define rules like required attributes, unique values, or valid value ranges. An email attribute typically requires a text type, must contain a valid email format, and often must be unique across all customers.
Relationship Cardinality: Relationships define how entity instances connect. Cardinality specifies how many instances of one entity can relate to instances of another entity. One-to-many relationships occur when one entity instance relates to multiple instances of another entity. A customer can place many orders, but each order belongs to one customer. Many-to-many relationships occur when multiple instances of each entity can relate to multiple instances of the other entity. A product can appear in many orders, and an order can contain many products.
Normalization: Normalization eliminates data redundancy and ensures data integrity through a series of normal forms. First normal form (1NF) requires atomic attribute values—no repeating groups or arrays within attributes. Second normal form (2NF) requires all non-key attributes to depend on the entire primary key, not just part of it. Third normal form (3NF) requires all non-key attributes to depend only on the primary key, not on other non-key attributes.
# First normal form violation - repeating groups
class Order
attr_accessor :id, :product1, :product2, :product3
end
# First normal form corrected - separate entity for line items
class Order
attr_accessor :id, :customer_id
def line_items
LineItem.where(order_id: id)
end
end
class LineItem
attr_accessor :id, :order_id, :product_id, :quantity
end
Primary Keys: Every entity requires a primary key—an attribute or set of attributes that uniquely identifies each entity instance. Primary keys must be unique, required (not null), and immutable. Natural keys use business-meaningful attributes like email addresses or product codes. Surrogate keys use system-generated values like auto-incrementing integers or UUIDs. Composite keys combine multiple attributes to form uniqueness.
Foreign Keys: Foreign keys implement relationships between entities. A foreign key attribute in one entity references the primary key of another entity. Foreign key constraints enforce referential integrity—the database prevents creating relationships to non-existent entities. Cascade behaviors define what happens when the referenced entity is deleted or updated.
Domain Integrity: Domain integrity ensures attribute values conform to defined rules. Data types provide basic constraints. Check constraints enforce complex rules like "order total must be positive" or "end date must follow start date". Default values provide automatic values when not specified. Null constraints specify whether attributes allow missing values.
Design Considerations
Granularity Trade-offs: Fine-grained models create separate entities for detailed concepts, improving flexibility and normalization but increasing complexity. Coarse-grained models combine related concepts into fewer entities, simplifying queries but potentially introducing redundancy. An address could be separate Address entities referenced by Customer, or embedded attributes within Customer. Separate entities support addresses shared across customers and maintain address history. Embedded attributes simplify queries when addresses are customer-specific.
# Fine-grained approach
class Customer
attr_accessor :id, :name
def addresses
Address.where(customer_id: id)
end
end
class Address
attr_accessor :id, :customer_id, :street, :city, :state, :zip
end
# Coarse-grained approach
class Customer
attr_accessor :id, :name, :street, :city, :state, :zip
end
Temporal Data Handling: Logical models must address whether the system tracks data changes over time. Non-temporal models store only current state—updating a customer's address replaces the old address. Temporal models maintain history through effective dating or versioning. Effective dating adds start and end date columns to track when each version was valid. Versioning creates new records for each change while preserving old records.
Temporal models support auditing, compliance, and historical analysis but increase storage requirements and query complexity. Historical queries require filtering by effective dates. Insertions become more complex because the system must close the previous version before creating a new version.
Optional vs Required Relationships: Relationships can be mandatory or optional. A mandatory relationship requires every entity instance to have a related instance. Every order must have a customer. An optional relationship allows entity instances without related instances. A customer might not have any orders yet.
Optional relationships introduce null foreign keys, which complicate queries requiring related data. Mandatory relationships enforce business rules at the model level but reduce flexibility when business rules change. A logical model that requires every employee to have a department prevents hiring employees before assigning departments.
Relationship Directionality: Relationships can be unidirectional or bidirectional. Unidirectional relationships navigate in one direction only—from Order to Customer. Bidirectional relationships navigate both directions—from Order to Customer and from Customer to Orders. Bidirectional relationships require foreign keys or join tables but provide convenient navigation.
ORM frameworks typically implement bidirectional relationships through association declarations, but the database stores only foreign keys. The choice affects query patterns and code ergonomics rather than database structure.
Entity vs Attribute Classification: Determining whether a concept should be an entity or an attribute affects model flexibility and complexity. Phone numbers could be attributes of Customer (phone_number column) or separate PhoneNumber entities. Separate entities support multiple phone numbers per customer, phone number types (mobile, home, work), and independent phone number management. Attributes simplify the model when customers have exactly one phone number.
Implementation Approaches
Top-Down Modeling: Top-down modeling starts with high-level business concepts and progressively refines them into detailed entities and attributes. Analysts identify major subject areas like Sales, Inventory, and Human Resources. Each subject area decomposes into entities, then attributes, then relationships. This approach ensures the model aligns with business organization and terminology.
Top-down modeling works well when building new systems or replacing legacy systems. The approach requires extensive business analysis and stakeholder collaboration. The risk lies in creating overly abstract models disconnected from real implementation constraints. Iterative refinement mitigates this risk by validating intermediate models against actual use cases.
Bottom-Up Modeling: Bottom-up modeling starts with existing data sources and combines them into a coherent logical model. Analysts examine database schemas, file formats, API responses, and spreadsheets to identify entities and attributes. This approach ensures the model accommodates existing data but may perpetuate poor design decisions from legacy systems.
Bottom-up modeling suits data integration projects and migrations. The approach discovers implicit relationships and business rules embedded in existing systems. Normalization applied during bottom-up modeling improves data quality. The challenge involves reconciling inconsistent naming conventions, data types, and business rules across sources.
Hybrid Modeling: Hybrid modeling combines top-down and bottom-up approaches. Analysts create an initial model based on business requirements (top-down), then validate and refine it against existing data (bottom-up). This iterative process balances business needs with technical reality.
Hybrid modeling reduces risk by validating theoretical models against actual data early. The approach identifies missing requirements and implementation challenges before development begins. Frequent validation cycles keep the model grounded in reality while maintaining alignment with business goals.
Pattern-Based Modeling: Pattern-based modeling applies established design patterns to common modeling scenarios. Patterns like Party-Role, Subtype/Supertype, and Associative Entity provide proven solutions to recurring problems. Using patterns accelerates modeling, improves consistency, and reduces errors.
The Party-Role pattern models entities that play multiple roles. Instead of separate Customer and Employee entities, create Party (person or organization) and Role entities. A Party can have Customer and Employee roles simultaneously. This pattern eliminates duplicate data when the same person is both a customer and an employee.
class Party
attr_accessor :id, :name, :type # type = 'Person' or 'Organization'
def roles
PartyRole.where(party_id: id)
end
end
class PartyRole
attr_accessor :id, :party_id, :role_type, :effective_date
# role_type = 'Customer', 'Employee', 'Supplier', etc.
end
class Customer
def self.find_by_party(party_id)
PartyRole.where(party_id: party_id, role_type: 'Customer').first
end
end
Ruby Implementation
ActiveRecord Associations: Ruby on Rails uses ActiveRecord to map logical data models to database schemas and object models. Associations define relationships between models. The belongs_to association creates a many-to-one relationship. The has_many association creates a one-to-many relationship. The has_many :through association implements many-to-many relationships through a join model.
class Customer < ApplicationRecord
has_many :orders
has_many :products, through: :orders
end
class Order < ApplicationRecord
belongs_to :customer
has_many :line_items
has_many :products, through: :line_items
end
class Product < ApplicationRecord
has_many :line_items
has_many :orders, through: :line_items
end
class LineItem < ApplicationRecord
belongs_to :order
belongs_to :product
end
Validations and Constraints: ActiveRecord validations enforce domain integrity at the application level. Validations check data before saving to the database. Presence validations require attributes. Uniqueness validations prevent duplicate values. Format validations enforce patterns like email addresses. Custom validations implement complex business rules.
class Customer < ApplicationRecord
validates :email, presence: true,
uniqueness: true,
format: { with: URI::MailTo::EMAIL_REGEXP }
validates :name, presence: true, length: { minimum: 2, maximum: 100 }
validate :email_domain_allowed
private
def email_domain_allowed
return if email.blank?
domain = email.split('@').last
unless ['example.com', 'company.com'].include?(domain)
errors.add(:email, 'must be from an allowed domain')
end
end
end
Database Migrations: Migrations translate logical models into physical database schemas. Each migration creates or modifies tables, columns, indexes, and constraints. Migrations provide version control for database schemas, enabling teams to evolve the schema alongside application code.
class CreateCustomersAndOrders < ActiveRecord::Migration[7.0]
def change
create_table :customers do |t|
t.string :name, null: false
t.string :email, null: false
t.timestamps
end
add_index :customers, :email, unique: true
create_table :orders do |t|
t.references :customer, null: false, foreign_key: true
t.date :order_date, null: false
t.decimal :total_amount, precision: 10, scale: 2, null: false
t.timestamps
end
create_table :products do |t|
t.string :name, null: false
t.string :sku, null: false
t.decimal :price, precision: 10, scale: 2, null: false
t.timestamps
end
add_index :products, :sku, unique: true
create_table :line_items do |t|
t.references :order, null: false, foreign_key: true
t.references :product, null: false, foreign_key: true
t.integer :quantity, null: false
t.decimal :unit_price, precision: 10, scale: 2, null: false
t.timestamps
end
end
end
Single Table Inheritance: Single Table Inheritance (STI) implements entity subtypes in ActiveRecord. All subtypes share one database table. A type column distinguishes between subtypes. STI works well when subtypes share most attributes but differ in behavior.
class Vehicle < ApplicationRecord
validates :make, :model, :year, presence: true
end
class Car < Vehicle
validates :num_doors, presence: true
end
class Motorcycle < Vehicle
validates :engine_cc, presence: true
end
# Database schema
create_table :vehicles do |t|
t.string :type # Stores 'Car' or 'Motorcycle'
t.string :make
t.string :model
t.integer :year
t.integer :num_doors # Only for Car
t.integer :engine_cc # Only for Motorcycle
t.timestamps
end
Polymorphic Associations: Polymorphic associations allow one model to belong to multiple other models through a single association. A Comment model might belong to Article, Photo, or Video. The polymorphic association stores both the related model's ID and type.
class Comment < ApplicationRecord
belongs_to :commentable, polymorphic: true
validates :content, presence: true
end
class Article < ApplicationRecord
has_many :comments, as: :commentable
end
class Photo < ApplicationRecord
has_many :comments, as: :commentable
end
# Database schema
create_table :comments do |t|
t.references :commentable, polymorphic: true, null: false
t.text :content
t.timestamps
end
# Usage
article = Article.create(title: "Ruby Data Modeling")
article.comments.create(content: "Great article!")
Scopes and Query Objects: Scopes encapsulate common queries within models, providing reusable query logic that reflects business rules from the logical model. Query objects extract complex queries into separate classes for better organization and testability.
class Order < ApplicationRecord
belongs_to :customer
has_many :line_items
scope :completed, -> { where(status: 'completed') }
scope :pending, -> { where(status: 'pending') }
scope :recent, -> { where('created_at > ?', 30.days.ago) }
scope :high_value, -> { where('total_amount > ?', 1000) }
def self.by_customer_and_date_range(customer_id, start_date, end_date)
where(customer_id: customer_id)
.where('order_date BETWEEN ? AND ?', start_date, end_date)
.order(order_date: :desc)
end
end
# Query Object pattern
class CustomerOrderSummary
def initialize(customer)
@customer = customer
end
def total_spent
@customer.orders.completed.sum(:total_amount)
end
def average_order_value
@customer.orders.completed.average(:total_amount)
end
def order_count
@customer.orders.completed.count
end
end
Common Patterns
Supertype and Subtype Pattern: This pattern models entities with shared characteristics and specialized variations. A supertype entity contains common attributes. Subtype entities inherit from the supertype and add specialized attributes. Implementation options include single table inheritance, class table inheritance, or concrete table inheritance.
Single table inheritance stores all subtypes in one table with a discriminator column. Class table inheritance creates separate tables for supertype and each subtype, linked by foreign keys. Concrete table inheritance creates independent tables for each subtype without a shared supertype table.
# Single Table Inheritance approach
class Employee < ApplicationRecord
validates :name, :employee_number, presence: true
end
class FullTimeEmployee < Employee
validates :salary, presence: true
validates :benefits_tier, presence: true
end
class Contractor < Employee
validates :hourly_rate, presence: true
validates :contract_end_date, presence: true
end
# Class Table Inheritance approach (using delegated types in Rails 6.1+)
class Employee < ApplicationRecord
delegated_type :employable, types: %w[FullTimeEmployee Contractor]
validates :name, :employee_number, presence: true
end
class FullTimeEmployee < ApplicationRecord
has_one :employee, as: :employable, touch: true
validates :salary, :benefits_tier, presence: true
end
class Contractor < ApplicationRecord
has_one :employee, as: :employable, touch: true
validates :hourly_rate, :contract_end_date, presence: true
end
Associative Entity Pattern: Associative entities resolve many-to-many relationships and store relationship attributes. A Student can enroll in many Courses, and a Course can have many Students. The Enrollment entity resolves this relationship and stores enrollment-specific attributes like enrollment date and grade.
class Student < ApplicationRecord
has_many :enrollments
has_many :courses, through: :enrollments
end
class Course < ApplicationRecord
has_many :enrollments
has_many :students, through: :enrollments
end
class Enrollment < ApplicationRecord
belongs_to :student
belongs_to :course
validates :enrollment_date, presence: true
validates :student_id, uniqueness: { scope: :course_id }
enum status: { active: 0, completed: 1, dropped: 2 }
end
Lookup Table Pattern: Lookup tables store reference data like status codes, categories, or types. These tables contain relatively static data that other entities reference. Lookup tables centralize reference data management and enforce valid values through foreign key constraints.
class OrderStatus < ApplicationRecord
has_many :orders
validates :code, :name, presence: true
validates :code, uniqueness: true
end
class Order < ApplicationRecord
belongs_to :order_status
validates :order_date, :total_amount, presence: true
end
# Seed data
OrderStatus.create(code: 'PENDING', name: 'Pending', display_order: 1)
OrderStatus.create(code: 'PROCESSING', name: 'Processing', display_order: 2)
OrderStatus.create(code: 'SHIPPED', name: 'Shipped', display_order: 3)
OrderStatus.create(code: 'DELIVERED', name: 'Delivered', display_order: 4)
Party Pattern: The Party pattern provides a flexible way to model people and organizations that play multiple roles. Instead of creating separate entities for Customer, Employee, and Supplier, create a Party entity that can have multiple Role entities. This pattern eliminates redundancy when the same individual or organization plays multiple roles.
class Party < ApplicationRecord
has_many :party_roles
has_many :customer_roles, -> { where(role_type: 'customer') },
class_name: 'PartyRole'
has_many :employee_roles, -> { where(role_type: 'employee') },
class_name: 'PartyRole'
validates :name, presence: true
validates :party_type, inclusion: { in: %w[person organization] }
end
class PartyRole < ApplicationRecord
belongs_to :party
validates :role_type, presence: true
validates :party_id, uniqueness: { scope: :role_type,
conditions: -> { where(active: true) } }
end
# Usage
party = Party.create(name: "Jane Smith", party_type: "person")
party.party_roles.create(role_type: "customer", active: true)
party.party_roles.create(role_type: "employee", active: true)
Audit Trail Pattern: Audit trails track changes to data over time for compliance, debugging, and historical analysis. Implementation approaches include audit tables that mirror entity tables, event sourcing that stores all state changes, or temporal tables that maintain version history.
class Order < ApplicationRecord
has_many :order_audits
after_save :create_audit_entry
private
def create_audit_entry
order_audits.create(
action: persisted? ? 'update' : 'create',
order_date: order_date,
total_amount: total_amount,
status: status,
changed_by: Current.user&.id,
changed_at: Time.current
)
end
end
class OrderAudit < ApplicationRecord
belongs_to :order
validates :action, :changed_at, presence: true
end
Tools & Ecosystem
Schema Visualization Tools: Rails ERD generates entity-relationship diagrams from ActiveRecord models. The gem analyzes model associations and creates graphical representations of the data model. These diagrams help developers understand complex data relationships and communicate model structure to stakeholders.
# Gemfile
gem 'rails-erd', group: :development
# Generate diagram
bundle exec erd --filename=docs/data_model
# Configure output format and styling
bundle exec erd --inheritance --direct --attributes=foreign_keys,content
Database Schema Management: Schema management tools help maintain consistency between logical models and physical databases. Annotate adds schema information as comments to model files. Schema Plus extends migrations with additional database features like foreign key constraints and indexes.
# Gemfile
gem 'annotate', group: :development
gem 'schema_plus'
# Run annotation
bundle exec annotate --models --routes
# Example annotated model
# == Schema Information
#
# Table name: customers
#
# id :bigint not null, primary key
# name :string not null
# email :string not null
# created_at :datetime not null
# updated_at :datetime not null
#
class Customer < ApplicationRecord
has_many :orders
end
Data Validation Gems: Additional validation gems extend ActiveRecord's built-in validations. Validates Email Format Of provides robust email validation. Date Validator validates date ranges and comparisons. PhonyRails validates and normalizes phone numbers.
# Gemfile
gem 'validates_email_format_of'
gem 'date_validator'
gem 'phonelib'
class Customer < ApplicationRecord
validates :email, email_format: { message: 'is not a valid email' }
validates :birth_date, date: { before: -> { 18.years.ago },
message: 'must be at least 18 years old' }
validates :phone, phone: true
end
Query Analysis Tools: Bullet detects N+1 queries and unused eager loading. PgHero provides database performance insights. These tools help ensure logical models translate into efficient physical queries.
# Gemfile
gem 'bullet', group: :development
gem 'pghero'
# config/environments/development.rb
config.after_initialize do
Bullet.enable = true
Bullet.alert = true
Bullet.bullet_logger = true
Bullet.console = true
Bullet.rails_logger = true
end
Data Modeling Patterns Libraries: Gems like Trailblazer provide architectural patterns for domain modeling beyond basic ActiveRecord models. Dry-rb libraries offer functional programming approaches to data validation and transformation.
# Gemfile
gem 'trailblazer'
gem 'dry-validation'
class CustomerContract < Dry::Validation::Contract
params do
required(:name).filled(:string)
required(:email).filled(:string)
required(:birth_date).filled(:date)
end
rule(:email) do
unless value.match?(URI::MailTo::EMAIL_REGEXP)
key.failure('must be a valid email')
end
end
rule(:birth_date) do
if value > 18.years.ago.to_date
key.failure('must be at least 18 years old')
end
end
end
Reference
Entity Relationship Types
| Relationship | Description | Implementation |
|---|---|---|
| One-to-One | Each instance of Entity A relates to at most one instance of Entity B | Foreign key in either table with unique constraint |
| One-to-Many | Each instance of Entity A relates to multiple instances of Entity B | Foreign key in the many side table |
| Many-to-Many | Multiple instances of Entity A relate to multiple instances of Entity B | Join table with foreign keys to both entities |
| Self-Referential | Entity instances relate to other instances of the same entity | Foreign key references same table primary key |
| Hierarchical | Parent-child relationships within same entity | Parent foreign key references same table |
Normal Forms Quick Reference
| Normal Form | Requirements | Violations |
|---|---|---|
| 1NF | Atomic values, no repeating groups, primary key exists | Multiple values in single column, no unique identifier |
| 2NF | 1NF plus all non-key attributes depend on entire primary key | Partial dependency on composite key |
| 3NF | 2NF plus no transitive dependencies | Non-key attributes depend on other non-key attributes |
| BCNF | 3NF plus every determinant is a candidate key | Functional dependency on non-candidate key |
| 4NF | BCNF plus no multi-valued dependencies | Independent multi-valued facts in same table |
| 5NF | 4NF plus no join dependencies | Data requires specific joins to maintain integrity |
ActiveRecord Association Types
| Association | Purpose | Database Structure |
|---|---|---|
| belongs_to | Defines many-to-one relationship from child side | Foreign key column in this table |
| has_one | Defines one-to-one relationship from parent side | Foreign key column in other table |
| has_many | Defines one-to-many relationship from parent side | Foreign key column in other table |
| has_many :through | Defines many-to-many through join model | Join table with two foreign keys |
| has_and_belongs_to_many | Defines many-to-many without join model | Join table without additional columns |
| has_one :through | One-to-one through intermediary model | Foreign keys through multiple tables |
Common Validation Rules
| Validation | Purpose | Example Usage |
|---|---|---|
| presence | Ensures attribute has a value | Required fields, mandatory relationships |
| uniqueness | Ensures attribute value is unique | Email addresses, SKU codes, usernames |
| format | Ensures attribute matches pattern | Email format, phone numbers, postal codes |
| length | Ensures attribute length within range | Name length, description limits |
| numericality | Ensures attribute is valid number | Prices, quantities, percentages |
| inclusion | Ensures attribute in allowed set | Status codes, categories, types |
| exclusion | Ensures attribute not in forbidden set | Reserved words, invalid values |
| comparison | Ensures attribute relates to other attribute | Start date before end date |
Key Selection Criteria
| Key Type | Advantages | Disadvantages | When to Use |
|---|---|---|---|
| Auto-increment Integer | Simple, small size, sequential | Predictable, security concerns | Internal systems, high volume |
| UUID | Globally unique, unpredictable | Large size, not sequential | Distributed systems, public APIs |
| Natural Key | Business meaning, human readable | May change, complex composites | Stable business identifiers |
| Composite Key | Represents natural uniqueness | Complex joins, larger indexes | Junction tables, natural compounds |
Cardinality Notation
| Notation | Minimum | Maximum | Meaning |
|---|---|---|---|
| 1 | 1 | 1 | Exactly one |
| 0..1 | 0 | 1 | Zero or one |
| 1..* | 1 | Many | One or more |
| 0..* | 0 | Many | Zero or more |
| m..n | m | n | Between m and n |
Schema Migration Commands
| Command | Purpose | Usage Pattern |
|---|---|---|
| create_table | Creates new table | Define entity with columns and constraints |
| add_column | Adds column to existing table | Add new attribute to entity |
| remove_column | Removes column from table | Remove obsolete attribute |
| change_column | Modifies column definition | Change data type or constraints |
| add_index | Creates database index | Improve query performance |
| add_foreign_key | Adds foreign key constraint | Enforce referential integrity |
| add_reference | Adds foreign key column and index | Create relationship efficiently |