CrackedRuby

DRY (Don't Repeat Yourself)

Overview

DRY (Don't Repeat Yourself) states that every piece of knowledge must have a single, unambiguous, authoritative representation within a system. Coined by Andy Hunt and Dave Thomas in "The Pragmatic Programmer" (1999), this principle extends beyond code duplication to encompass all forms of knowledge repetition in software systems.

The principle addresses repetition at multiple levels: duplicate code logic, redundant data representations, repeated business rules, duplicated documentation, and parallel test scenarios. When knowledge exists in multiple places, changes require updates across all locations, creating maintenance burden and introducing inconsistency risks.

DRY applies to code structure, data schemas, configuration files, build scripts, documentation, and test suites. A violation occurs when the same knowledge appears in multiple forms or locations, not merely when similar code patterns exist. Two code blocks may look identical but represent different concepts, making their similarity coincidental rather than a DRY violation.

# DRY violation: user validation logic duplicated
class UserController
  def create
    user = User.new(params[:user])
    if user.email.blank? || !user.email.match?(/@/)
      render json: { error: "Invalid email" }, status: 422
    end
  end
end

class RegistrationController
  def register
    user = User.new(params[:user])
    if user.email.blank? || !user.email.match?(/@/)
      render json: { error: "Invalid email" }, status: 422
    end
  end
end

The core insight is that duplication creates coupling across time. When requirements change, developers must identify and update every instance of duplicated knowledge. Missing one location introduces bugs and inconsistencies that compound over time.
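
One way to remove the duplication in the two controllers above is to give the email rule a single home that both call. The following is a minimal sketch in plain Ruby, not a prescribed design: the `EmailRule` module name is hypothetical, and it keeps the same deliberately loose `/@/` check used in the controllers.

```ruby
# Hypothetical module: the single home for the "what is a valid email" rule.
module EmailRule
  PATTERN = /@/ # same loose check as the controllers above

  # Returns true when the value is non-blank and matches PATTERN.
  def self.valid?(email)
    text = email.to_s.strip
    !text.empty? && text.match?(PATTERN)
  end
end

# Both controllers would now ask the same question in the same place, e.g.:
#   render json: { error: "Invalid email" }, status: 422 unless EmailRule.valid?(user.email)
EmailRule.valid?("ada@example.com") # => true
EmailRule.valid?("   ")             # => false
EmailRule.valid?(nil)               # => false
```

When the rule changes, for example to require a domain, only `PATTERN` needs to be edited.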

Key Principles

Knowledge in software systems takes multiple forms: business logic, data structures, algorithms, validation rules, formatting specifications, and integration contracts. Each knowledge piece should exist in exactly one place within the system. This single location becomes the authoritative source for that knowledge.

Knowledge vs. Code Similarity: DRY focuses on knowledge repetition, not syntactic similarity. Two functions might share identical structure while representing distinct business concepts. Conversely, different code implementations might express the same underlying knowledge.

# Not a DRY violation: similar code, different knowledge
def calculate_employee_tax(income)
  income * 0.20
end

def calculate_contractor_tax(income)
  income * 0.20
end

# These represent different tax rules that happen to have the same rate currently.
# When employee or contractor tax rates change independently, these should diverge.

Sources of Truth: A single source of truth means that when specific knowledge changes, only one location requires modification. The system then derives all related behavior, data, or outputs from that single definition.

# Single source of truth for user status definitions
class User
  STATUSES = {
    active: { code: 'ACT', description: 'Active user', ui_class: 'status-active' },
    suspended: { code: 'SUS', description: 'Suspended user', ui_class: 'status-warning' },
    deleted: { code: 'DEL', description: 'Deleted user', ui_class: 'status-danger' }
  }.freeze

  def status_code
    STATUSES[status][:code]
  end

  def status_label
    STATUSES[status][:description]
  end
end

Levels of Abstraction: DRY operates at multiple abstraction levels. At the code level, it prevents duplicate functions or methods. At the data level, it eliminates redundant storage. At the system level, it avoids duplicate services or components. At the architecture level, it prevents multiple systems maintaining the same knowledge.

Derived Information: Information derived from other data should not be stored redundantly. Calculate derived values on demand, or cache them with an explicit invalidation mechanism. Storing derived data without such a mechanism means every write path must keep the copies synchronized by hand, which is exactly the duplication DRY warns against.

# DRY violation: storing derived data
class Order
  attr_accessor :items, :total_price  # total_price is derived from items

  def add_item(item)
    items << item
    self.total_price += item.price  # Manual synchronization required
  end
end

# DRY compliant: calculate on demand
class Order
  attr_reader :items

  def initialize
    @items = [] # start with an empty list so add_item can append
  end

  def add_item(item)
    items << item
  end

  def total_price
    items.sum(&:price)
  end
end
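
The paragraph above also mentions caching a derived value with explicit invalidation. A minimal sketch of that option follows, assuming a hypothetical `CachedOrder` class and a simple `Item` struct that are not part of the original examples. The cached total is discarded whenever the underlying items change, so there is still only one definition of how the total is computed.

```ruby
# Hypothetical item type used only for this illustration.
Item = Struct.new(:name, :price)

# Hypothetical variant of Order that caches the derived total.
class CachedOrder
  def initialize
    @items = []
    @total_price = nil # nil means "not computed yet"
  end

  def add_item(item)
    @items << item
    @total_price = nil # explicit invalidation: the cached value is now stale
    self
  end

  # Recomputed only when the cache is empty.
  def total_price
    @total_price ||= @items.sum(&:price)
  end
end

order = CachedOrder.new
order.add_item(Item.new("pen", 2)).add_item(Item.new("book", 10))
order.total_price # => 12
order.add_item(Item.new("bag", 30))
order.total_price # => 42
```

Caching like this is only worth the extra state when recomputing is measurably expensive; for short item lists, computing on demand is usually simpler.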

Documentation and Code: A comment that merely restates what the code does duplicates knowledge the code already expresses. Comments should explain why a decision was made, not what the code does. When the code changes and the comment does not, the documentation becomes incorrect.

Configuration and Defaults: Default values, connection strings, and environment-specific settings should exist in single configuration locations. Hardcoding these values throughout the codebase creates maintenance nightmares during deployment or infrastructure changes.

Ruby Implementation

Ruby provides multiple mechanisms for implementing DRY principles through its object-oriented features, metaprogramming capabilities, and module system.

Modules for Shared Behavior: Modules encapsulate reusable behavior that multiple classes share. Unlike inheritance, modules allow composition of functionality without rigid class hierarchies.

module Timestampable
  def created_at
    @created_at ||= Time.now
  end

  def updated_at
    @updated_at
  end

  def touch
    @updated_at = Time.now
  end
end

class Article
  include Timestampable
  attr_accessor :title, :content
end

class Comment
  include Timestampable
  attr_accessor :body
end

Metaprogramming for Pattern Elimination: Ruby's metaprogramming capabilities eliminate repetitive patterns through code generation at runtime. Methods like define_method, method_missing, and class_eval create behavior dynamically.

class ApiClient
  %w[users posts comments].each do |resource|
    define_method("get_#{resource}") do |id|
      request(:get, "/#{resource}/#{id}")
    end

    define_method("list_#{resource}") do
      request(:get, "/#{resource}")
    end

    define_method("create_#{resource}") do |data|
      request(:post, "/#{resource}", data)
    end
  end

  private

  def request(method, path, data = nil)
    # HTTP request implementation
  end
end
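
The example above uses `define_method`. For comparison, here is a minimal sketch of the `method_missing` approach named in the paragraph, paired with `respond_to_missing?` so that `respond_to?` stays truthful. The class name `DynamicApiClient` and the stubbed `request` method are assumptions made for illustration only.

```ruby
# Hypothetical client that resolves get_<resource> calls lazily.
class DynamicApiClient
  RESOURCES = %w[users posts comments].freeze
  PATTERN = /\Aget_(#{RESOURCES.join('|')})\z/

  def method_missing(name, *args)
    match = PATTERN.match(name.to_s)
    return super unless match # unknown names still raise NoMethodError

    request(:get, "/#{match[1]}/#{args.first}")
  end

  # Keeps respond_to? consistent with what method_missing accepts.
  def respond_to_missing?(name, include_private = false)
    PATTERN.match?(name.to_s) || super
  end

  private

  # Stub standing in for a real HTTP call; returns what would be sent.
  def request(method, path)
    [method, path]
  end
end

client = DynamicApiClient.new
client.get_users(42)            # => [:get, "/users/42"]
client.respond_to?(:get_posts)  # => true
client.respond_to?(:get_orders) # => false
```

When the set of method names is known up front, as it is here, `define_method` is generally the easier option to debug because the generated methods show up in `public_methods` and in stack traces under their own names.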

Class Inheritance for Specialization: Inheritance extracts common behavior into superclasses while allowing subclasses to specialize or override specific behaviors.

class Document
  attr_accessor :title, :author, :created_at

  def initialize(title, author)
    @title = title
    @author = author
    @created_at = Time.now
  end

  def metadata
    {
      title: title,
      author: author,
      created_at: created_at,
      type: self.class.name
    }
  end
end

class Report < Document
  attr_accessor :data

  def generate
    # Report-specific generation logic
  end
end

class Invoice < Document
  attr_accessor :line_items, :total

  def calculate_total
    # Invoice-specific calculation
  end
end

Blocks and Procs for Behavior Parameterization: Blocks allow methods to accept behavior as parameters, eliminating the need for multiple similar methods.

class DataProcessor
  def process_records(records, &block)
    records.map do |record|
      transformed = transform(record)
      block.call(transformed) if block_given?
      transformed
    end
  end

  private

  def transform(record)
    # Base transformation logic
  end
end

# Usage with different behaviors
processor = DataProcessor.new
processor.process_records(data) { |record| log_record(record) }
processor.process_records(data) { |record| validate_record(record) }

Constants for Shared Values: Constants define values used across multiple locations from a single definition point.

class HttpStatus
  OK = 200
  CREATED = 201
  BAD_REQUEST = 400
  UNAUTHORIZED = 401
  NOT_FOUND = 404
  SERVER_ERROR = 500

  def self.successful?(code)
    code >= 200 && code < 300
  end

  def self.client_error?(code)
    code >= 400 && code < 500
  end
end

# Usage throughout application
def handle_response(response)
  case response.code
  when HttpStatus::OK
    process_success(response)
  when HttpStatus::NOT_FOUND
    handle_not_found
  end
end

Method Delegation: Ruby's delegation mechanisms forward behavior without duplicating implementation details.

require 'forwardable'

class UserProfile
  extend Forwardable

  def initialize(user)
    @user = user
  end

  def_delegators :@user, :email, :username, :created_at

  def display_name
    @user.full_name || @user.username
  end
end
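
`Forwardable` forwards a named list of methods. When a wrapper should forward everything it does not define itself, the standard library's `SimpleDelegator` is an alternative. The sketch below assumes a hypothetical `PlainUser` struct and `UserPresenter` class standing in for the objects in the example above.

```ruby
require 'delegate'

# Hypothetical stand-in for the user object wrapped above.
PlainUser = Struct.new(:email, :username, :full_name)

# SimpleDelegator forwards every method the wrapper does not define itself.
class UserPresenter < SimpleDelegator
  def display_name
    full_name || username
  end
end

presenter = UserPresenter.new(PlainUser.new("ada@example.com", "ada", nil))
presenter.email        # => "ada@example.com"
presenter.display_name # => "ada"
```

The trade-off is explicitness: `def_delegators` documents exactly which methods are exposed, while `SimpleDelegator` exposes the whole interface of the wrapped object.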

Practical Examples

Database Schema and Model Validation: Business rules enforced in multiple layers create synchronization challenges. Define validation logic once and reference it across layers.

# Before: Validation logic duplicated
class User < ApplicationRecord
  validates :email, format: { with: /@/, message: "Invalid email" }
end

# Controller also validates
class UsersController
  def create
    if params[:email] !~ /@/
      render json: { error: "Invalid email" }
      return
    end
    User.create(params)
  end
end

# After: Single source of validation
class User < ApplicationRecord
  EMAIL_REGEX = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

  validates :email, format: { with: EMAIL_REGEX }

  def self.valid_email?(email)
    EMAIL_REGEX.match?(email.to_s) # to_s guards against nil input
  end
end

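class UsersController
  def create
    user = User.new(user_params)
    if user.save # save runs the model validations, including EMAIL_REGEX
      render json: user
    else
      render json: { errors: user.errors }, status: 422
    end
  end

  private

  # Strong parameters; the permitted attribute list here is illustrative
  def user_params
    params.permit(:email)
  end
end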

API Response Formatting: Consistent response structures across endpoints often lead to duplicated formatting code.

# Before: Response formatting duplicated
class UsersController
  def show
    user = User.find(params[:id])
    render json: {
      data: {
        id: user.id,
        type: 'user',
        attributes: {
          name: user.name,
          email: user.email
        }
      }
    }
  end
end

class PostsController
  def show
    post = Post.find(params[:id])
    render json: {
      data: {
        id: post.id,
        type: 'post',
        attributes: {
          title: post.title,
          body: post.body
        }
      }
    }
  end
end

# After: Centralized serialization
class JsonApiSerializer
  def self.serialize(resource, type)
    {
      data: {
        id: resource.id,
        type: type,
        attributes: resource.attributes.except('id', 'created_at', 'updated_at')
      }
    }
  end
end

class UsersController
  def show
    user = User.find(params[:id])
    render json: JsonApiSerializer.serialize(user, 'user')
  end
end

Configuration Management: Application configuration scattered across files creates deployment and environment management problems.

# Before: Database configuration duplicated
class DatabaseConnection
  def self.connect
    PG.connect(
      host: 'localhost',
      port: 5432,
      dbname: 'myapp_production'
    )
  end
end

class CacheConnection
  def self.connect
    Redis.new(
      host: 'localhost',
      port: 6379
    )
  end
end

# After: Centralized configuration
class Config
  def self.database
    {
      host: ENV.fetch('DB_HOST', 'localhost'),
      port: ENV.fetch('DB_PORT', 5432).to_i,
      dbname: ENV.fetch('DB_NAME', "myapp_#{environment}")
    }
  end

  def self.redis
    {
      host: ENV.fetch('REDIS_HOST', 'localhost'),
      port: ENV.fetch('REDIS_PORT', 6379).to_i
    }
  end

  def self.environment
    ENV.fetch('RAILS_ENV', 'development')
  end
end

class DatabaseConnection
  def self.connect
    PG.connect(Config.database)
  end
end

Query Building: Complex database queries repeated across service objects or repositories violate DRY when the same filtering or joining logic appears multiple times.

# Before: Query logic duplicated
class ReportService
  def active_users_this_month
    User.where(status: 'active')
        .where('created_at >= ?', 1.month.ago)
  end
end

class UserAnalytics
  def growth_rate
    current = User.where(status: 'active')
                  .where('created_at >= ?', 1.month.ago).count
    # calculation logic
  end
end

# After: Centralized query scopes
class User < ApplicationRecord
  scope :active, -> { where(status: 'active') }
  scope :created_since, ->(date) { where('created_at >= ?', date) }
  scope :this_month, -> { created_since(1.month.ago) }

  def self.active_this_month
    active.this_month
  end
end

class ReportService
  def active_users_this_month
    User.active_this_month
  end
end

Design Considerations

Abstraction Timing: Premature abstraction creates unnecessary complexity before patterns emerge clearly. Extract commonality after recognizing genuine duplication, not after seeing code twice. The "rule of three" suggests waiting until knowledge appears three times before abstracting.

Creating abstractions too early produces incorrect abstractions that fit initial cases but constrain future requirements. These wrong abstractions become harder to modify than duplication. Wait until the common pattern stabilizes before extracting shared code.

# Premature abstraction
class FormValidator
  def validate(form, rules)
    # Complex generic validation that tries to handle all cases
  end
end

# Better: Wait for pattern to emerge
class LoginForm
  def validate
    errors = []
    errors << "Email required" if email.blank?
    errors << "Password required" if password.blank?
    errors
  end
end

class RegistrationForm
  def validate
    errors = []
    errors << "Email required" if email.blank?
    errors << "Password required" if password.blank?
    # present? guard avoids calling length on a nil password
    errors << "Password too short" if password.present? && password.length < 8
    errors
  end
end

# After third similar form, extract common pattern
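
Once a third form repeats the same "required field" rule, the shared part could be extracted along these lines. This is a sketch in plain Ruby, without ActiveSupport's `blank?`; the `RequiredFields` module and `PasswordResetForm` class are hypothetical names introduced for illustration.

```ruby
# Hypothetical module extracted once three forms share the "required field" rule.
module RequiredFields
  # Returns one "<Field> required" message per blank field.
  def required_field_errors(*fields)
    fields.filter_map do |field|
      value = public_send(field)
      "#{field.to_s.capitalize} required" if value.nil? || value.to_s.strip.empty?
    end
  end
end

class PasswordResetForm
  include RequiredFields
  attr_reader :email, :password

  def initialize(email:, password:)
    @email = email
    @password = password
  end

  def validate
    errors = required_field_errors(:email, :password)
    # Form-specific rules stay in the form instead of leaking into the module.
    errors << "Password too short" if password && !password.empty? && password.length < 8
    errors
  end
end

PasswordResetForm.new(email: "", password: "secret").validate
# => ["Email required", "Password too short"]
```

Only the rule that all three forms genuinely share moves into the module; the length check remains local because not every form needs it.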

Wrong Abstraction Cost: A wrong abstraction damages codebases more than duplication. Incorrect abstractions force developers to work around limitations, add conditional logic, and create coupling between unrelated features. Removing wrong abstractions and returning to duplication often improves codebases.

When an abstraction requires frequent conditional checks or special cases, it signals incorrect factoring. The abstraction assumed similarity that doesn't exist at the knowledge level.

# Wrong abstraction: Combining unrelated concepts
class PaymentProcessor
  def process(payment, type)
    case type
    when :credit_card
      # Credit card specific logic
    when :bank_transfer
      # Bank transfer specific logic
    when :cryptocurrency
      # Crypto specific logic with completely different flow
    end
  end
end

# Better: Separate implementations
class CreditCardProcessor
  def process(payment)
    # Credit card logic
  end
end

class BankTransferProcessor
  def process(payment)
    # Bank transfer logic
  end
end

Coupling vs. Duplication Trade-off: DRY creates coupling between code locations that share knowledge. When shared knowledge changes, all dependent locations must adapt. Sometimes controlled duplication provides better flexibility than tight coupling.

Services or modules with different change frequencies benefit from isolation despite code similarity. Microservices often duplicate code intentionally to maintain service independence.

Change Frequency Alignment: Code that changes together should live together. Code that changes independently should remain separate even when initially similar. Analyze change patterns when deciding whether to eliminate duplication.

# Different change frequencies suggest separation
class UserEmailValidator
  # Changes when user requirements change
  def valid?(email)
    email.match?(/@/)
  end
end

class MarketingEmailValidator
  # Changes when marketing rules change
  def valid?(email)
    email.match?(/@/) # Same implementation, different knowledge
  end
end

Domain Boundaries: Duplication across domain boundaries often represents separate concepts that happen to share implementation. Microservices, bounded contexts, or separate applications should duplicate code rather than share implementations that couple domains.

Code Size vs. Abstraction Complexity: Small duplications sometimes create less complexity than the abstractions needed to eliminate them. A three-line method duplicated twice may be clearer than introducing new classes or modules.

Common Patterns

Template Method Pattern: Define algorithm structure in base class while allowing subclasses to customize specific steps.

class DocumentGenerator
  def generate
    prepare_data
    format_content
    apply_styling
    export
  end

  def prepare_data
    raise NotImplementedError
  end

  def format_content
    # Common formatting logic
  end

  def apply_styling
    raise NotImplementedError
  end

  def export
    # Common export logic
  end
end

class PdfGenerator < DocumentGenerator
  def prepare_data
    # PDF-specific data preparation
  end

  def apply_styling
    # PDF styling
  end
end

class HtmlGenerator < DocumentGenerator
  def prepare_data
    # HTML-specific data preparation
  end

  def apply_styling
    # HTML styling
  end
end

Strategy Pattern: Encapsulate interchangeable algorithms, eliminating conditional logic duplication.

class PricingStrategy
  def calculate(base_price)
    raise NotImplementedError
  end
end

class StandardPricing < PricingStrategy
  def calculate(base_price)
    base_price
  end
end

class DiscountPricing < PricingStrategy
  def initialize(discount_percent)
    @discount_percent = discount_percent
  end

  def calculate(base_price)
    base_price * (1 - @discount_percent / 100.0)
  end
end

class Product
  attr_accessor :base_price, :pricing_strategy

  def price
    pricing_strategy.calculate(base_price)
  end
end

Concern Pattern: Extract cross-cutting concerns into modules that multiple classes include.

module Searchable
  extend ActiveSupport::Concern

  included do
    scope :search, ->(query) {
      where("#{table_name}.searchable_text ILIKE ?", "%#{query}%")
    }
  end

  def update_search_index
    update_column(:searchable_text, searchable_content)
  end

  private

  def searchable_content
    raise NotImplementedError
  end
end

class Article < ApplicationRecord
  include Searchable

  private

  def searchable_content
    [title, body].join(' ')
  end
end

class Product < ApplicationRecord
  include Searchable

  private

  def searchable_content
    [name, description, category].join(' ')
  end
end

Builder Pattern: Construct complex objects through fluent interfaces, centralizing construction logic.

class QueryBuilder
  def initialize(base_query)
    @query = base_query
  end

  def where(conditions)
    @query = @query.where(conditions)
    self
  end

  def order_by(column, direction = :asc)
    @query = @query.order(column => direction)
    self
  end

  def limit(count)
    @query = @query.limit(count)
    self
  end

  def execute
    @query.to_a
  end
end

# Usage eliminates repeated query construction
results = QueryBuilder.new(User.all)
  .where(status: 'active')
  .order_by(:created_at, :desc)
  .limit(10)
  .execute

Decorator Pattern: Add behavior to objects dynamically without modifying class definitions.

class MessageDecorator
  def initialize(message)
    @message = message
  end

  def content
    @message.content
  end
end

class HtmlDecorator < MessageDecorator
  def content
    "<p>#{@message.content}</p>"
  end
end

class MarkdownDecorator < MessageDecorator
  def content
    Markdown.render(@message.content)
  end
end

# Usage
plain_message = Message.new("Hello world")
html_message = HtmlDecorator.new(plain_message)
markdown_message = MarkdownDecorator.new(plain_message)

Common Pitfalls

Abstraction for Similarity: Code that looks similar but represents different concepts should remain separate. Combining code based on syntactic similarity rather than conceptual unity creates inappropriate coupling.

# Bad: Combined because code looks similar
class Formatter
  def format(value, type)
    case type
    when :currency
      "$#{value}"  # Financial display
    when :percentage
      "#{value}%"  # Statistical display
    end
  end
end

# These represent different domains and will evolve independently
# Better to keep separate despite similar structure
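
A sketch of the separated version follows. The module names and the specific decimal-place rules are assumptions chosen to show how the two formats can diverge without affecting each other.

```ruby
# Hypothetical separate formatters; each can change without touching the other.
module CurrencyFormatter
  # Two decimal places, e.g. 5 -> "$5.00"
  def self.format(amount)
    # Kernel.format is called explicitly because this method is also named format.
    Kernel.format("$%.2f", amount)
  end
end

module PercentageFormatter
  # One decimal place, e.g. 12.34 -> "12.3%"
  def self.format(value)
    Kernel.format("%.1f%%", value)
  end
end

CurrencyFormatter.format(5)       # => "$5.00"
PercentageFormatter.format(12.34) # => "12.3%"
```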

Over-DRYing: Extracting every tiny duplication creates excessive indirection. Methods with a single call site, or abstractions more complex than the duplication they replace, harm readability.

# Over-DRY: Excessive extraction
def process_user
  set_name
  set_email
  validate_data
  save_to_database
end

def set_name
  @name = params[:name]
end

def set_email
  @email = params[:email]
end

# Better: Simple direct code
def process_user
  @name = params[:name]
  @email = params[:email]
  validate_data
  save_to_database
end

Shared Mutable State: DRY implementations that share mutable state between callers create hidden dependencies and bugs.

# Dangerous: Shared mutable state
class RequestCounter
  @@count = 0

  def self.increment
    @@count += 1
  end

  def self.count
    @@count
  end
end

# Multiple threads or requests interfere with each other
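
One possible safer shape, sketched under the assumption that each caller can be handed its own counter instance, keeps the state in an object and guards it with a `Mutex`. The class name `SafeRequestCounter` is hypothetical.

```ruby
# Hypothetical replacement: state lives in an instance and is guarded by a Mutex.
class SafeRequestCounter
  def initialize
    @count = 0
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { @count += 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

counter = SafeRequestCounter.new
threads = 4.times.map do
  Thread.new { 1_000.times { counter.increment } }
end
threads.each(&:join)
counter.count # => 4000
```

Because the counter is an ordinary object, tests and unrelated requests can each create their own instance instead of sharing one global value.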

Generic Parameter Proliferation: Methods accepting multiple parameters or configuration hashes to handle all cases become harder to use than specialized duplicates.

# Over-parameterized to avoid duplication
def send_notification(user, type, urgent, include_attachment, template, ...)
  # Complex branching logic based on parameters
end

# Better: Specific methods
def send_urgent_email(user, template)
  # Simple focused implementation
end

def send_routine_notification(user)
  # Simple focused implementation
end

Inheritance Depth: Deep inheritance hierarchies created to eliminate duplication become fragile and difficult to understand. Favor composition over inheritance for DRY implementations.
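
As a rough illustration of favoring composition, the sketch below gives a report its export and audit behavior through collaborator objects rather than through a chain of superclasses. All class names here (`CsvExporter`, `AuditLog`, `SalesReport`) are hypothetical.

```ruby
# Hypothetical collaborators: each holds one piece of shared knowledge.
class CsvExporter
  def export(rows)
    rows.map { |row| row.join(",") }.join("\n")
  end
end

class AuditLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  def record(message)
    @entries << message
  end
end

# The report composes the behavior it needs instead of inheriting it
# from a deep chain of base classes.
class SalesReport
  attr_reader :audit_log

  def initialize(rows, exporter: CsvExporter.new, audit_log: AuditLog.new)
    @rows = rows
    @exporter = exporter
    @audit_log = audit_log
  end

  def export
    @audit_log.record("exported #{@rows.size} rows")
    @exporter.export(@rows)
  end
end

report = SalesReport.new([["widget", 3], ["gadget", 5]])
report.export            # => "widget,3\ngadget,5"
report.audit_log.entries # => ["exported 2 rows"]
```

Swapping in a different exporter only requires passing another object that responds to `export`; no class hierarchy has to change.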

False Duplication Detection: Similar-looking code serving different purposes gets incorrectly unified. Two functions calculating percentages for different business purposes should remain separate even when implementation matches.

Documentation Duplication: Comments that restate code rather than explain intent duplicate knowledge. When code changes, comments become outdated lies.

# Bad: Comment duplicates code
def calculate_total
  # Calculate total by summing item prices
  @items.sum(&:price)
end

# Better: Comment explains why
def calculate_total
  # Excludes promotional items per billing requirements
  @items.reject(&:promotional?).sum(&:price)
end

Testing Duplication: Test setup duplicated across test files creates maintenance burden. Extract common setup into shared helpers or fixtures while keeping test assertions explicit.

# Before: Duplicated test setup
class UserTest < Minitest::Test
  def test_validation
    user = User.new(name: "Test", email: "test@example.com")
    # test code
  end

  def test_persistence
    user = User.new(name: "Test", email: "test@example.com")
    # test code
  end
end

# After: Extracted setup
class UserTest < Minitest::Test
  def setup
    @user = create_test_user
  end

  def test_validation
    # test code using @user
  end

  private

  def create_test_user
    User.new(name: "Test", email: "test@example.com")
  end
end

Reference

DRY Principle Checklist

Aspect | Question | Action
Knowledge Location | Does this knowledge exist elsewhere? | Search codebase for similar logic
Change Impact | How many places change when requirements change? | Trace knowledge dependencies
Source of Truth | Is there a single authoritative source? | Identify or create canonical source
Abstraction Readiness | Has pattern appeared three times? | Wait or extract based on confidence
Coupling Cost | Does sharing increase coupling excessively? | Evaluate independence vs. duplication

DRY Violation Indicators

Indicator | Description | Severity
Multi-location Changes | Single requirement changes multiple files | High
Copy-Paste Patterns | Similar code blocks with minor variations | Medium
Parallel Hierarchies | Two class hierarchies that mirror each other | High
Shotgun Surgery | Small changes ripple through many locations | High
Divergent Implementation | Same concept implemented differently | Medium
Magic Numbers | Same constants defined in multiple places | Low

Ruby DRY Techniques

Technique | Use Case | Implementation
Modules | Shared behavior across classes | include or extend modules
Class Inheritance | Specialization of common base | Subclass with super calls
Metaprogramming | Pattern-based method generation | define_method, method_missing
Delegation | Forwarding to contained objects | Forwardable, SimpleDelegator
Blocks | Parameterizing behavior | yield, block.call
Constants | Shared values | Class or module constants
Configuration | Environment-specific values | Config files, ENV variables
Concerns | Cross-cutting functionality | ActiveSupport::Concern

Abstraction Decision Matrix

Factor | Favor Abstraction | Favor Duplication
Change Frequency | Changes together frequently | Changes independently
Pattern Stability | Pattern well understood | Pattern still emerging
Coupling Tolerance | Low coupling cost | High coupling risk
Abstraction Complexity | Simpler than duplication | More complex than duplication
Domain Boundaries | Same bounded context | Different bounded contexts
Team Knowledge | Pattern well known to team | Novel pattern requiring learning

Common DRY Patterns

Pattern | Purpose | Key Benefit
Template Method | Define algorithm skeleton | Consistent structure, variable steps
Strategy | Interchangeable algorithms | Eliminates conditional logic
Decorator | Add behavior dynamically | Flexible enhancement without subclassing
Builder | Complex object construction | Centralized construction logic
Concern | Cross-cutting functionality | Reusable behavior across classes
Service Object | Complex business logic | Single location for business rules

Anti-Patterns to Avoid

Anti-Pattern | Problem | Solution
Premature Abstraction | Abstracts before pattern clear | Wait for third occurrence
Wrong Abstraction | Forces unrelated code together | Split into separate implementations
Over-parameterization | Too many configuration options | Create specific methods
Deep Inheritance | Fragile hierarchy | Favor composition
Shared Mutable State | Hidden dependencies | Use immutable data or isolation
Generic Base Classes | Everything inherits from God object | Split into focused components