CrackedRuby CrackedRuby

Overview

Conceptual data models represent the high-level structure of data and relationships in a domain without concern for how the data will be stored or accessed. These models focus on business entities, their attributes, and relationships between entities, providing a shared understanding between stakeholders, developers, and database designers.

A conceptual data model serves as the foundation for logical and physical data models. While logical models introduce implementation concepts like normalization and physical models specify storage details, the conceptual model remains technology-agnostic and describes what data exists rather than how it exists.

The primary notation for conceptual data models is Entity-Relationship (ER) modeling, introduced by Peter Chen in 1976. ER models use entities (things of significance), attributes (properties of entities), and relationships (associations between entities) to describe data structure. Alternative approaches include UML class diagrams and object-role modeling, but ER diagrams remain the dominant standard.

[Customer] ----< places >---- [Order] ----< contains >---- [Product]
    |                            |                              |
attributes:                  attributes:                   attributes:
- customer_id               - order_id                    - product_id
- name                      - order_date                  - name
- email                     - total_amount                - price

Conceptual models differ from implementation code in abstraction level. A conceptual model identifies that customers place orders, while implementation code specifies foreign keys, join tables, validation rules, and API endpoints. This separation allows business logic discussions without technical constraints.

Key Principles

Entity identification forms the core of conceptual modeling. An entity represents a distinct object or concept in the business domain that requires data storage. Entities must have independent existence meaning, clear boundaries, and multiple instances. A customer is an entity because multiple customers exist independently. A customer's address might be an attribute or a separate entity depending on whether addresses have independent significance.

Attribute definition specifies properties that describe entities. Attributes represent atomic facts about an entity that the system needs to track. Each attribute has a name, data type (at conceptual level: text, number, date, boolean), and optionality. The attribute customer_name describes a customer entity, while order_date describes an order. Attributes should not contain derivable information unless business rules require storing the derived value.

Relationship cardinality defines how many instances of one entity relate to instances of another entity. Cardinality types include:

  • One-to-one: Each instance of Entity A relates to exactly one instance of Entity B (Employee to Desk)
  • One-to-many: Each instance of Entity A relates to multiple instances of Entity B (Customer to Orders)
  • Many-to-many: Multiple instances of Entity A relate to multiple instances of Entity B (Students to Courses)

Cardinality also includes participation constraints. Mandatory participation means every entity instance must participate in the relationship. Optional participation allows entity instances without the relationship.

Customer ||--o{ Order
(one customer, zero or more orders)

Order }o--|| Customer  
(each order belongs to exactly one customer)

Student }o--o{ Course
(students enroll in courses, courses contain students)

Relationship attributes describe facts about relationships rather than entities. In a many-to-many relationship between Student and Course, enrollment_date and grade are relationship attributes because they describe the enrollment itself, not the student or course independently. Relationship attributes often indicate the need for an associative entity.

Associative entities resolve many-to-many relationships by creating an entity that represents the relationship itself. The Student-Course relationship becomes an Enrollment entity with its own attributes and relationships. This transformation clarifies the model and simplifies later implementation.

Identifier specification defines how to uniquely distinguish entity instances. Natural identifiers use business-meaningful attributes (social_security_number for Person). Surrogate identifiers introduce artificial keys (customer_id). Composite identifiers combine multiple attributes (country_code + phone_number). At the conceptual level, identifiers establish entity identity without implementation details like auto-increment or UUID generation.

Normalization principles apply at the conceptual level to reduce redundancy. While full normalization occurs in logical modeling, conceptual models benefit from basic principles:

  • Group related attributes with the appropriate entity
  • Avoid repeating groups of attributes
  • Store facts once in the natural entity
  • Represent multi-valued properties as relationships or separate entities

Abstraction and generalization organize entities hierarchically. Generalization combines similar entities into a supertype (Person generalizes Customer and Employee). Specialization divides a general entity into subtypes (Vehicle specializes into Car, Truck, Motorcycle). Subtype relationships inherit attributes and relationships from supertypes while adding specific properties.

Design Considerations

Scope boundaries determine what entities and relationships the model includes. Overly broad scope creates unwieldy models that mix concerns. Narrow scope misses critical relationships and forces awkward workarounds later. Define the business process or domain area first, then identify entities directly involved in that scope. Adjacent domains require interface definitions rather than full modeling.

For an e-commerce system, core scope includes customers, products, orders, and payments. Inventory management might be separate scope with defined touchpoints. Including warehouse operations, shipping logistics, and supplier management in the same conceptual model obscures the primary purchase flow.

Entity versus attribute decisions significantly impact model clarity. An address might be attributes on a Customer entity (street, city, postal_code) or a separate Address entity with a relationship to Customer. Choose entity representation when:

  • The concept has multiple instances per parent entity (Customer has multiple addresses)
  • The concept relates to multiple entity types (both Customer and Warehouse have addresses)
  • The concept has its own complex attributes or relationships (Address relates to TaxJurisdiction)
  • The concept requires independent management (address validation services)

Choose attribute representation for simple, single-valued properties tightly bound to their parent entity without independent relationships.

Relationship directionality affects model comprehension. While relationships are bidirectional at the conceptual level, naming and diagram placement communicate primary navigation. "Customer places Order" reads more naturally than "Order is placed by Customer," though both directions exist. Choose names and placement that match business terminology and primary access patterns.

Weak entity identification applies when entities depend on other entities for identity. An OrderLineItem cannot exist without its parent Order. The line item number identifies items within an order, but globally requires the combination of order_id and line_item_number. Weak entities simplify models by acknowledging dependency relationships but complicate implementation if the dependency chain grows deep.

Temporal modeling choices address how the model represents time. Current-state models store only present values, discarding history. Historical models retain past states, requiring effective dates or versioning. The choice depends on business requirements:

  • Transaction history requires temporal modeling (order status changes over time)
  • Reference data might not need history (current product price is sufficient)
  • Regulatory compliance often mandates temporal tracking (audit trails)
  • Analytical reporting benefits from historical snapshots (trending analysis)

Temporal modeling at the conceptual level identifies which entities need time-based tracking without specifying implementation strategies like slowly changing dimensions or event sourcing.

Denormalization trade-offs sometimes appear at the conceptual level. Strictly normalized models eliminate redundancy but create complex relationship chains. Strategic denormalization stores calculated or frequently accessed values directly. At the conceptual stage, note these decisions as design annotations rather than altering the fundamental model structure. Implementation phases apply denormalization based on performance requirements.

Inheritance versus composition organizes entity hierarchies. Inheritance (is-a relationships) creates subtype hierarchies where subtypes inherit supertype characteristics. Composition (has-a relationships) assembles complex entities from component entities.

A Vehicle supertype with Car and Truck subtypes uses inheritance. A Car entity with references to Engine, Transmission, and Body entities uses composition. Inheritance clarifies taxonomies but can create rigid hierarchies. Composition provides flexibility but increases relationship complexity. Business domain understanding guides this choice.

Implementation Approaches

Entity-Relationship modeling uses three primary notations: Chen notation, Crow's Foot notation, and IDEF1X notation. Chen notation emphasizes relationship diamonds and clear relationship naming. Crow's Foot notation provides visual cardinality indicators that resemble crow's feet at relationship ends. IDEF1X adds more rigorous syntax for key specification.

Chen notation example:

[Customer] ---(places)--< [Order]

Crow's Foot notation example:

Customer ||--o{ Order
Product  }o--|| Category

Crow's Foot notation dominates current practice due to visual clarity and tool support. The notation positions cardinality markers at relationship ends: single lines indicate "one," crow's feet indicate "many," circles indicate "optional," and perpendicular lines indicate "mandatory."

Top-down modeling begins with high-level entities and progressively adds detail. Start by identifying major entities in the domain, then define relationships between them, then add attributes, then refine with subtypes and constraints. This approach works well when business requirements are clear and domain understanding is strong.

Process:

  1. List candidate entities from business process descriptions
  2. Eliminate duplicates and synonyms
  3. Define primary relationships between major entities
  4. Add attributes to entities
  5. Specify cardinality and participation constraints
  6. Identify subtypes and supertypes
  7. Validate with stakeholders
  8. Iterate to add detail

Bottom-up modeling starts with specific data elements and groups them into entities. Analyze existing forms, reports, and data sources to identify attributes, then cluster attributes that describe the same concept into entities. This approach fits situations with extensive existing data but unclear conceptual organization.

Process:

  1. Collect all data elements from sources
  2. Normalize attribute names and eliminate duplicates
  3. Group attributes by subject area
  4. Identify candidate entities from groups
  5. Define relationships based on data dependencies
  6. Validate entity cohesion
  7. Refine and abstract to appropriate level

Inside-out modeling identifies core entities first, then expands outward by exploring relationships. Begin with the most critical entity in the domain, define its attributes and immediate relationships, then follow relationship paths to discover connected entities. This approach focuses effort on the most important areas first.

View integration modeling develops separate models for different user groups or processes, then merges them into a comprehensive model. Each stakeholder group describes their view of the data, creating focused submodels. Integration resolves naming conflicts, combines overlapping entities, and ensures consistency.

Integration challenges include:

  • Naming conflicts: different names for the same entity (Client vs Customer)
  • Entity conflicts: different granularity or scope for similar entities
  • Relationship conflicts: different cardinality or participation constraints
  • Attribute conflicts: different attributes for similar entities

Resolution requires negotiation, standard naming conventions, and often compromise between ideal representations.

Iterative refinement cycles through modeling, validation, and revision. Initial models capture basic structure, validation sessions with stakeholders identify gaps and errors, and revisions improve accuracy. Multiple iterations converge on a stable model that satisfies requirements.

Validation techniques:

  • Walkthrough scenarios: describe business processes using the model
  • Data population: attempt to populate entities with sample instances
  • Query testing: verify the model supports required information retrieval
  • Exception handling: check how the model handles unusual cases

Pattern application reuses standard data modeling patterns for common situations. Party patterns model people and organizations. Product hierarchy patterns structure product catalogs. Transaction patterns capture business events. Applying proven patterns accelerates modeling and reduces errors, but requires adapting patterns to specific business contexts.

Practical Examples

E-commerce order management illustrates fundamental conceptual modeling. The domain includes customers who place orders for products. Orders contain multiple products with specified quantities. Products belong to categories and have pricing information.

Core entities:

  • Customer: individuals who purchase products
  • Order: records of customer purchases
  • Product: items available for sale
  • Category: product groupings
  • OrderItem: line items within orders

Relationships:

  • Customer places Order (one-to-many)
  • Order contains OrderItem (one-to-many)
  • OrderItem references Product (many-to-one)
  • Product belongs to Category (many-to-one)

Conceptual model:

Customer
- customer_id (identifier)
- email
- name
- registration_date

Order
- order_id (identifier)
- customer_id (foreign concept)
- order_date
- status
- total_amount

OrderItem
- order_id (part of composite identifier)
- line_number (part of composite identifier)
- product_id (foreign concept)
- quantity
- unit_price
- line_total

Product
- product_id (identifier)
- category_id (foreign concept)
- name
- description
- base_price
- stock_quantity

Category
- category_id (identifier)
- name
- parent_category_id (for hierarchy)

The OrderItem entity as an associative entity resolves the many-to-many relationship between Order and Product. The line_number combined with order_id forms the identifier, showing weak entity dependency. The unit_price attribute on OrderItem rather than solely on Product captures historical pricing.

Healthcare patient records demonstrates complex relationships and temporal requirements. Patients receive treatments from providers, have diagnoses, and receive prescriptions. The model must track historical information for legal and medical continuity reasons.

Core entities:

  • Patient: individuals receiving care
  • Provider: healthcare professionals
  • Appointment: scheduled interactions
  • Diagnosis: identified conditions
  • Prescription: medication orders
  • Medication: drug inventory

Relationships:

  • Patient schedules Appointment (one-to-many)
  • Provider conducts Appointment (one-to-many)
  • Appointment yields Diagnosis (one-to-many)
  • Diagnosis leads to Prescription (one-to-many)
  • Prescription specifies Medication (many-to-one)

Conceptual model with temporal attributes:

Patient
- patient_id (identifier)
- medical_record_number
- name
- date_of_birth
- blood_type
- allergies

Provider
- provider_id (identifier)
- license_number
- name
- specialty
- credential

Appointment
- appointment_id (identifier)
- patient_id
- provider_id
- scheduled_time
- actual_start_time
- actual_end_time
- appointment_type
- status

Diagnosis
- diagnosis_id (identifier)
- appointment_id
- diagnosis_code
- description
- diagnosed_date
- severity

Prescription
- prescription_id (identifier)
- diagnosis_id
- medication_id
- prescribed_date
- dosage
- frequency
- duration
- refills_allowed
- status

Medication
- medication_id (identifier)
- generic_name
- brand_names
- medication_class
- controlled_substance_schedule

The model separates scheduled_time from actual times in Appointment to handle real-world scheduling variations. Prescription references both Diagnosis (why prescribed) and Medication (what prescribed), creating a dependent relationship chain. Status attributes track temporal states without full history tables at the conceptual level.

Content management system shows inheritance hierarchies and polymorphic relationships. The system manages various content types (articles, videos, images) with common properties and type-specific attributes.

Core entities:

  • User: content creators and consumers
  • Content (supertype): base content properties
  • Article (subtype): text-based content
  • Video (subtype): video content
  • Image (subtype): image content
  • Tag: classification labels
  • Comment: user feedback

Relationships:

  • User creates Content (one-to-many)
  • Content tagged with Tag (many-to-many)
  • User writes Comment on Content (many-to-many associative)
  • Content references related Content (many-to-many self-referential)

Conceptual model with inheritance:

User
- user_id (identifier)
- username
- email
- role
- registration_date

Content (supertype)
- content_id (identifier)
- creator_id
- title
- created_date
- modified_date
- published_date
- status
- view_count

Article (subtype of Content)
- body_text
- word_count
- reading_time
- language

Video (subtype of Content)
- video_file_reference
- duration
- resolution
- format

Image (subtype of Content)
- image_file_reference
- width
- height
- format
- alt_text

Tag
- tag_id (identifier)
- tag_name
- category

ContentTag (associative entity)
- content_id
- tag_id
- assigned_date

Comment
- comment_id (identifier)
- content_id
- user_id
- comment_text
- posted_date
- parent_comment_id (for threading)

The Content supertype captures common attributes inherited by all subtypes. Subtype-specific attributes remain with their respective entities. The self-referential relationship on Comment creates threaded discussions. The many-to-many relationship between Content and Tag uses an associative entity that could later capture additional relationship attributes like assigned_by or confidence_score.

Financial transaction ledger demonstrates temporal tracking and audit requirements. The system records financial transactions, account balances, and transaction history with immutability constraints.

Core entities:

  • Account: financial accounts
  • Transaction: recorded financial events
  • TransactionLine: double-entry line items
  • AccountBalance: point-in-time balance snapshots
  • AuditLog: change tracking

Relationships:

  • Account contains TransactionLine (one-to-many)
  • Transaction composed of TransactionLine (one-to-many)
  • Account has AccountBalance history (one-to-many)

Conceptual model:

Account
- account_id (identifier)
- account_number
- account_type
- account_name
- currency
- opened_date
- status

Transaction
- transaction_id (identifier)
- transaction_date
- transaction_time
- transaction_type
- description
- reference_number
- created_by
- created_timestamp

TransactionLine
- transaction_line_id (identifier)
- transaction_id
- account_id
- line_type (debit/credit)
- amount
- currency
- exchange_rate

AccountBalance
- balance_id (identifier)
- account_id
- balance_date
- opening_balance
- closing_balance
- calculated_timestamp

AuditLog
- audit_id (identifier)
- entity_type
- entity_id
- action_type
- old_values
- new_values
- changed_by
- changed_timestamp

TransactionLine implements double-entry bookkeeping where each Transaction has multiple lines that must balance. AccountBalance provides snapshot history without recalculating from all transactions. AuditLog tracks all changes for compliance and debugging. The model enforces temporal integrity through timestamp attributes and immutable transaction records.

Ruby Implementation

Ruby applications reflect conceptual data models through class definitions, associations, and business logic. While conceptual models remain implementation-agnostic, Ruby code embodies the conceptual structure using object-oriented principles.

ActiveRecord ORM mapping translates conceptual entities to Ruby classes. Each entity becomes a class inheriting from ActiveRecord::Base. Attributes map to database columns accessed as Ruby attributes. Relationships use association macros.

# Customer entity
class Customer < ApplicationRecord
  has_many :orders
  
  validates :email, presence: true, uniqueness: true
  validates :name, presence: true
end

# Order entity  
class Order < ApplicationRecord
  belongs_to :customer
  has_many :order_items
  has_many :products, through: :order_items
  
  validates :order_date, presence: true
  validates :total_amount, numericality: { greater_than_or_equal_to: 0 }
end

# Product entity
class Product < ApplicationRecord
  belongs_to :category
  has_many :order_items
  
  validates :name, presence: true
  validates :base_price, numericality: { greater_than: 0 }
end

# OrderItem associative entity
class OrderItem < ApplicationRecord
  belongs_to :order
  belongs_to :product
  
  validates :quantity, numericality: { greater_than: 0 }
  validates :unit_price, numericality: { greater_than_or_equal_to: 0 }
  
  before_save :calculate_line_total
  
  private
  
  def calculate_line_total
    self.line_total = quantity * unit_price
  end
end

The has_many and belongs_to macros declare relationships matching conceptual cardinality. The through option implements many-to-many relationships via associative entities. Validations enforce conceptual constraints like mandatory attributes and value ranges.

Plain Ruby object modeling represents conceptual models without database coupling. Each entity becomes a Ruby class with attributes and methods. Relationships use instance variables holding references to related objects.

class Customer
  attr_accessor :customer_id, :email, :name, :registration_date
  attr_reader :orders
  
  def initialize(attributes = {})
    @customer_id = attributes[:customer_id]
    @email = attributes[:email]
    @name = attributes[:name]
    @registration_date = attributes[:registration_date] || Date.today
    @orders = []
  end
  
  def place_order(order)
    @orders << order
    order.customer = self
  end
  
  def total_order_value
    @orders.sum(&:total_amount)
  end
end

class Order
  attr_accessor :order_id, :customer, :order_date, :status, :total_amount
  attr_reader :order_items
  
  def initialize(attributes = {})
    @order_id = attributes[:order_id]
    @order_date = attributes[:order_date] || Date.today
    @status = attributes[:status] || 'pending'
    @total_amount = attributes[:total_amount] || 0
    @order_items = []
  end
  
  def add_item(product, quantity, unit_price)
    item = OrderItem.new(
      order: self,
      product: product,
      quantity: quantity,
      unit_price: unit_price
    )
    @order_items << item
    calculate_total
    item
  end
  
  private
  
  def calculate_total
    @total_amount = @order_items.sum(&:line_total)
  end
end

class OrderItem
  attr_accessor :order, :product, :quantity, :unit_price
  
  def initialize(attributes = {})
    @order = attributes[:order]
    @product = attributes[:product]
    @quantity = attributes[:quantity]
    @unit_price = attributes[:unit_price]
  end
  
  def line_total
    quantity * unit_price
  end
end

This approach maintains conceptual relationships through object references without database dependencies. Methods implement business logic derived from the conceptual model. The place_order and add_item methods maintain bidirectional relationships.

Inheritance mapping implements conceptual subtypes using Ruby class inheritance. The supertype becomes a base class with shared attributes and behavior. Subtypes inherit from the supertype and add specific attributes.

# Content supertype
class Content
  attr_accessor :content_id, :creator_id, :title, :created_date,
                :modified_date, :published_date, :status, :view_count
  
  def initialize(attributes = {})
    @content_id = attributes[:content_id]
    @creator_id = attributes[:creator_id]
    @title = attributes[:title]
    @created_date = attributes[:created_date] || Time.now
    @modified_date = attributes[:modified_date]
    @published_date = attributes[:published_date]
    @status = attributes[:status] || 'draft'
    @view_count = attributes[:view_count] || 0
  end
  
  def publish
    @published_date = Time.now
    @status = 'published'
  end
  
  def increment_views
    @view_count += 1
  end
end

# Article subtype
class Article < Content
  attr_accessor :body_text, :word_count, :reading_time, :language
  
  def initialize(attributes = {})
    super
    @body_text = attributes[:body_text]
    @word_count = attributes[:word_count]
    @reading_time = attributes[:reading_time]
    @language = attributes[:language] || 'en'
  end
  
  def calculate_reading_time
    @reading_time = (word_count / 200.0).ceil
  end
end

# Video subtype
class Video < Content
  attr_accessor :video_file_reference, :duration, :resolution, :format
  
  def initialize(attributes = {})
    super
    @video_file_reference = attributes[:video_file_reference]
    @duration = attributes[:duration]
    @resolution = attributes[:resolution]
    @format = attributes[:format]
  end
  
  def formatted_duration
    minutes = duration / 60
    seconds = duration % 60
    "%d:%02d" % [minutes, seconds]
  end
end

Ruby's single inheritance limits direct implementation of multiple inheritance from conceptual models. Modules provide mixins for shared behavior across unrelated classes:

module Taggable
  def add_tag(tag)
    @tags ||= []
    @tags << tag unless @tags.include?(tag)
  end
  
  def remove_tag(tag)
    @tags ||= []
    @tags.delete(tag)
  end
  
  def tags
    @tags ||= []
  end
end

class Article < Content
  include Taggable
end

class Video < Content
  include Taggable
end

Temporal modeling in Ruby requires explicit attribute tracking. Historical records use timestamp attributes and immutability patterns.

class Account
  attr_reader :account_id, :account_number, :balances
  
  def initialize(attributes = {})
    @account_id = attributes[:account_id]
    @account_number = attributes[:account_number]
    @balances = []
  end
  
  def current_balance
    @balances.last&.closing_balance || 0
  end
  
  def balance_at(date)
    @balances
      .select { |b| b.balance_date <= date }
      .max_by(&:balance_date)
      &.closing_balance || 0
  end
  
  def record_balance(date, closing_balance)
    balance = AccountBalance.new(
      account: self,
      balance_date: date,
      closing_balance: closing_balance
    )
    @balances << balance
    @balances.sort_by!(&:balance_date)
    balance
  end
end

class AccountBalance
  attr_reader :account, :balance_date, :closing_balance, :calculated_timestamp
  
  def initialize(attributes = {})
    @account = attributes[:account]
    @balance_date = attributes[:balance_date]
    @closing_balance = attributes[:closing_balance]
    @calculated_timestamp = Time.now
  end
end

The immutable AccountBalance instances preserve historical states. The balance_at method queries history by date. This pattern translates conceptual temporal requirements into Ruby implementation.

Validation and constraints enforce conceptual rules through Ruby code. Validations check attribute values, relationship constraints ensure referential integrity, and business rules implement domain logic.

class Order
  attr_accessor :order_id, :customer, :order_date, :status
  attr_reader :order_items
  
  def initialize(attributes = {})
    @order_items = []
    validate_and_assign(attributes)
  end
  
  def add_item(product, quantity, unit_price)
    raise ArgumentError, "Product required" unless product
    raise ArgumentError, "Quantity must be positive" unless quantity > 0
    raise ArgumentError, "Unit price cannot be negative" unless unit_price >= 0
    
    item = OrderItem.new(
      order: self,
      product: product,
      quantity: quantity,
      unit_price: unit_price
    )
    @order_items << item
    item
  end
  
  def finalize
    raise "Cannot finalize empty order" if @order_items.empty?
    raise "Order already finalized" if @status == 'finalized'
    
    @status = 'finalized'
    @finalized_date = Time.now
  end
  
  private
  
  def validate_and_assign(attributes)
    @customer = attributes[:customer]
    raise ArgumentError, "Customer required" unless @customer
    
    @order_date = attributes[:order_date] || Date.today
    @status = attributes[:status] || 'draft'
  end
end

These validations translate conceptual participation constraints (mandatory customer) and business rules (positive quantities) into executable code.

Tools & Ecosystem

Diagramming tools create visual representations of conceptual data models. Lucidchart, Draw.io, and Microsoft Visio support ER diagram notation with drag-drop entity creation, relationship drawing, and cardinality specification. These tools export diagrams as images for documentation but lack code generation capabilities.

Specialized data modeling tools provide enhanced features. ERwin Data Modeler offers comprehensive modeling with forward and reverse engineering. Oracle SQL Developer Data Modeler integrates with Oracle databases. These tools distinguish between conceptual, logical, and physical models, managing transformations between levels.

Ruby gems for schema definition translate conceptual models into database schemas. ActiveRecord migrations define tables, columns, and constraints:

class CreateEcommerceSchema < ActiveRecord::Migration[7.0]
  def change
    create_table :customers do |t|
      t.string :email, null: false
      t.string :name, null: false
      t.date :registration_date
      t.timestamps
      
      t.index :email, unique: true
    end
    
    create_table :orders do |t|
      t.references :customer, null: false, foreign_key: true
      t.date :order_date, null: false
      t.string :status
      t.decimal :total_amount, precision: 10, scale: 2
      t.timestamps
    end
    
    create_table :products do |t|
      t.references :category, foreign_key: true
      t.string :name, null: false
      t.text :description
      t.decimal :base_price, precision: 10, scale: 2
      t.integer :stock_quantity
      t.timestamps
    end
    
    create_table :order_items do |t|
      t.references :order, null: false, foreign_key: true
      t.references :product, null: false, foreign_key: true
      t.integer :line_number
      t.integer :quantity, null: false
      t.decimal :unit_price, precision: 10, scale: 2, null: false
      t.decimal :line_total, precision: 10, scale: 2
      t.timestamps
    end
  end
end

The migration syntax maps conceptual constructs to physical schema. The t.references method creates foreign keys for relationships. Constraints like null: false enforce mandatory attributes from the conceptual model.

Schema.rb analyzers reverse-engineer existing schemas into conceptual understanding. The rails-erd gem generates ER diagrams from ActiveRecord models:

# Gemfile
gem 'rails-erd', group: :development

# Generate diagram
bundle exec erd --attributes=foreign_keys,content

The generated diagram shows entities (models), attributes, and relationships inferred from ActiveRecord associations and foreign keys. This helps visualize implemented data structures and compare them against conceptual designs.

Documentation generators create reference documentation from conceptual models. Tools like SchemaSpy analyze database schemas and generate HTML documentation with entity descriptions, relationship diagrams, and constraint information. While these tools work with physical schemas, they reveal the underlying conceptual structure.

Model validation tools check conceptual model quality. Database normalization analyzers identify redundancy and anomalies. Relationship consistency checkers verify cardinality matches business rules. Naming convention validators ensure consistent entity and attribute naming.

Version control for models tracks conceptual model evolution. Storing ER diagrams as text-based formats (GraphML, PlantUML) enables diff comparisons. Git tracks changes to model files, providing history and collaboration features:

@startuml
entity Customer {
  * customer_id : int
  * email : string
  * name : string
  registration_date : date
}

entity Order {
  * order_id : int
  * customer_id : int <<FK>>
  * order_date : date
  status : string
  total_amount : decimal
}

Customer ||--o{ Order
@enduml

Text-based formats integrate with development workflows, allowing model reviews in pull requests alongside code changes.

Database design patterns libraries provide reusable conceptual model fragments. Martin Fowler's "Analysis Patterns" and David Hay's "Data Model Patterns" catalog proven solutions for common modeling situations. These references accelerate modeling by providing starting points for party relationships, product hierarchies, transaction patterns, and temporal modeling.

Ruby-specific resources include ActiveRecord pattern documentation and Rails guide database sections. These resources show how conceptual patterns map to ActiveRecord implementations, bridging conceptual design and Ruby code.

Reference

Entity Relationship Notation

Symbol Meaning Usage
Rectangle Entity Represents a thing or concept
Diamond Relationship Shows association between entities
Oval Attribute Property of entity or relationship
Underline Identifier Uniquely identifies instances
Double rectangle Weak entity Depends on parent entity for identity
Double diamond Identifying relationship Links weak entity to parent

Cardinality Markers (Crow's Foot)

Symbol Meaning Description
Single line One Exactly one instance
Crow's foot (three lines) Many Multiple instances
Circle Optional Zero or more
Perpendicular line Mandatory One or more
Double line Identifying Relationship part of identity

Common Cardinality Patterns

Pattern Notation Example
One-to-one mandatory Entity1 mandatory line mandatory line Entity2 Employee to Desk assignment
One-to-many optional Entity1 single line crow's foot with circle Entity2 Customer to Orders (customer may have no orders)
One-to-many mandatory Entity1 single line crow's foot with line Entity2 Order to OrderItems (order must have items)
Many-to-many Entity1 crow's foot crow's foot Entity2 Students to Courses

Entity Types

Type Definition Example
Strong entity Exists independently Customer, Product, Employee
Weak entity Depends on parent entity OrderLineItem, DependentFamilyMember
Associative entity Represents many-to-many relationship Enrollment (Student-Course), Assignment (Employee-Project)
Supertype General entity with subtypes Vehicle, Content, Person
Subtype Specialized entity inheriting from supertype Car (Vehicle), Article (Content), Employee (Person)

Attribute Types

Type Definition Example
Simple Atomic value age, price, email
Composite Multiple components address (street, city, postal_code)
Single-valued One value per instance date_of_birth, customer_id
Multi-valued Multiple values per instance phone_numbers, email_addresses
Derived Calculated from other attributes age (from date_of_birth), total_price
Optional May be null middle_name, fax_number
Mandatory Must have value customer_id, order_date

Identifier Types

Type Description Characteristics
Natural Business-meaningful attribute Social security number, email address, ISBN
Surrogate Artificial key Auto-increment ID, UUID, sequence number
Composite Multiple attributes combined country_code + phone_number, order_id + line_number
Candidate Potential identifier Multiple options exist, one selected as primary

Relationship Types

Type Description Implementation Note
Binary Two entities Most common relationship type
Ternary Three entities Often decomposed into binary relationships
Unary (recursive) Entity to itself Employee manages Employee, Category parent Category
Identifying Determines weak entity identity Order identifies OrderLineItem
Non-identifying Does not affect identity Customer places Order

Modeling Decisions Checklist

Decision Considerations Impact
Entity vs Attribute Independent existence, multiple instances, relationships to other entities Model complexity, query patterns
Natural vs Surrogate Key Stability, meaningfulness, uniqueness guarantees Data integrity, migration complexity
Subtype vs Separate Entities Shared attributes, is-a relationship, behavior inheritance Query complexity, schema flexibility
Normalized vs Denormalized Redundancy, update anomalies, query performance Data integrity, maintenance overhead
Current State vs Historical Audit requirements, reporting needs, compliance Storage requirements, query complexity
Weak vs Strong Entity Identity dependency, existence dependency Implementation complexity, cascade rules

Common Modeling Patterns

Pattern Structure Use Case
Party Person and Organization as subtypes of Party Customer relationship management
Product Hierarchy Product with recursive category relationship E-commerce catalogs
Transaction Transaction with multiple line items Financial systems, order processing
Temporal Entity with effective date and end date Historical tracking, versioning
Audit Trail Change log with old and new values Compliance, debugging
Type-Instance Type entity defining instances Product types, document templates
Role Person in multiple roles Employee who is also customer

ActiveRecord Association Methods

Method Cardinality Usage
belongs_to Many-to-one Order belongs_to Customer
has_one One-to-one User has_one Profile
has_many One-to-many Customer has_many Orders
has_many :through Many-to-many Student has_many Courses through Enrollments
has_and_belongs_to_many Many-to-many Student has_and_belongs_to_many Courses

Normalization Levels (Conceptual Relevance)

Level Requirement Conceptual Impact
1NF Atomic attributes, no repeating groups Entity attribute structure
2NF No partial dependencies on composite keys Weak entity identification
3NF No transitive dependencies Entity responsibility boundaries
BCNF Every determinant is candidate key Relationship refinement

Model Quality Metrics

Metric Good Practice Warning Sign
Entities per model 10-30 for focused domain Over 50 indicates scope issues
Attributes per entity 5-15 typical Over 20 suggests decomposition needed
Relationships per entity 3-7 average Over 12 indicates central entity problems
Many-to-many relationships Use associative entities Direct many-to-many hides attributes
Weak entities Minimal depth Multi-level weakness complicates implementation
Subtype depth 2-3 levels maximum Deep hierarchies become rigid