CrackedRuby CrackedRuby

Views and Materialized Views

Overview

Database views are named queries stored in the database that appear as virtual tables. A view executes its underlying query each time it is accessed, providing a consistent abstraction layer over base tables without storing data physically. Views serve multiple purposes: simplifying complex queries, enforcing security by limiting table access, maintaining compatibility when schema changes occur, and providing consistent interfaces to data.

Materialized views store query results physically on disk. Unlike regular views that execute their query on each access, materialized views compute the result once and cache it. This trade-off exchanges storage space and data freshness for query performance. Materialized views refresh on demand or on schedule, making them appropriate for queries that run frequently but can tolerate slightly stale data.

The distinction between views and materialized views determines application architecture choices. Views add minimal overhead for simple queries but may cause performance issues when the underlying query is complex or involves multiple joins. Materialized views deliver consistent fast query performance but require refresh mechanisms and additional storage.

-- Regular view: executes query on each access
CREATE VIEW active_users AS
SELECT id, name, email, last_login
FROM users
WHERE status = 'active';

-- Materialized view: stores results physically
CREATE MATERIALIZED VIEW user_statistics AS
SELECT 
  DATE_TRUNC('day', created_at) AS signup_date,
  COUNT(*) AS user_count,
  COUNT(CASE WHEN verified THEN 1 END) AS verified_count
FROM users
GROUP BY DATE_TRUNC('day', created_at);

Key Principles

Views function as stored SELECT statements that the database treats as virtual tables. When a query references a view, the database substitutes the view's definition into the query, creating a combined query plan. This substitution happens at query planning time, allowing the database optimizer to optimize the entire query as a unit. The view itself stores no data—it merely provides an abstraction layer.

View definitions can include SELECT statements with joins, aggregations, subqueries, and computed columns. The database validates the view definition when created but executes it only when accessed. Views can reference other views, creating layered abstractions, though excessive nesting impacts query planning performance.

Materialized views store the query result as a physical table. The database populates the materialized view when created or refreshed, executing the underlying query and storing the results. Subsequent queries against the materialized view read from this stored data rather than executing the original query. The database may create indexes on materialized views to optimize access patterns.

The refresh mechanism for materialized views operates in two modes. Complete refresh re-executes the entire query and replaces all data. Incremental refresh (when supported) identifies changes to base tables and updates only affected rows. Incremental refresh requires specific conditions: the query must use certain patterns, base tables need primary keys, and the database must track changes.

View updateability depends on the complexity of the view definition. Simple views selecting from a single table with no computed columns or aggregations can support INSERT, UPDATE, and DELETE operations. The database translates modifications to the view into modifications to the underlying table. Complex views with joins, aggregations, or DISTINCT clauses are read-only.

-- Simple updatable view
CREATE VIEW recent_orders AS
SELECT order_id, customer_id, order_date, total
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days';

-- Update operates on underlying table
UPDATE recent_orders SET total = 150.00 WHERE order_id = 12345;

-- Complex non-updatable view
CREATE VIEW customer_order_summary AS
SELECT 
  c.customer_id,
  c.name,
  COUNT(o.order_id) AS order_count,
  SUM(o.total) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.name;

Database systems apply different strategies when querying views. Query rewrite substitutes the view definition into the outer query. Materialization executes the view query first and stores temporary results. The optimizer chooses based on cost estimates. For materialized views, the database reads from stored results, bypassing the original query entirely.

Dependency tracking maintains relationships between views and base tables. When a base table changes structure, the database identifies affected views. Some databases automatically invalidate views when dependencies change; others require manual recompilation. Circular dependencies between views are prohibited—the database detects cycles during view creation.

Ruby Implementation

Ruby database libraries interact with views as if they were regular tables. ActiveRecord models can map to views, enabling the same query interface and associations used with table-backed models. The distinction between views and tables is transparent to application code in most cases.

# Migration creating a view
class CreateActiveUsersView < ActiveRecord::Migration[7.0]
  def up
    execute <<-SQL
      CREATE VIEW active_users AS
      SELECT id, name, email, last_login, status
      FROM users
      WHERE status = 'active'
        AND last_login >= CURRENT_DATE - INTERVAL '90 days'
    SQL
  end

  def down
    execute "DROP VIEW IF EXISTS active_users"
  end
end

# Model backed by view
class ActiveUser < ApplicationRecord
  self.table_name = 'active_users'
  self.primary_key = 'id'
  
  # View is read-only
  def readonly?
    true
  end
end

# Use like any ActiveRecord model
active_count = ActiveUser.count
recent_users = ActiveUser.where('last_login > ?', 1.week.ago)

The scenic gem provides Rails-native view management. Scenic generates SQL files for view definitions and creates migrations that apply them. This approach keeps view definitions under version control as plain SQL files, separate from migration code.

# Using scenic gem
# Run: rails generate scenic:view active_users

# Generates migration
class CreateActiveUsers < ActiveRecord::Migration[7.0]
  def change
    create_view :active_users
  end
end

# Generates db/views/active_users_v01.sql
# SELECT id, name, email, last_login
# FROM users
# WHERE status = 'active';

# Model definition
class ActiveUser < ApplicationRecord
  # Scenic handles readonly? automatically
end

# Update view definition
# Run: rails generate scenic:view active_users --version 2
# Edit db/views/active_users_v02.sql with new definition

class UpdateActiveUsersToVersion2 < ActiveRecord::Migration[7.0]
  def change
    update_view :active_users, version: 2, revert_to_version: 1
  end
end

PostgreSQL materialized views require manual refresh management. Rails applications typically refresh materialized views through scheduled jobs. The application must handle the staleness window between refreshes.

# Migration for materialized view
class CreateUserStatisticsMaterializedView < ActiveRecord::Migration[7.0]
  def up
    execute <<-SQL
      CREATE MATERIALIZED VIEW user_statistics AS
      SELECT 
        DATE_TRUNC('day', created_at) AS signup_date,
        COUNT(*) AS user_count,
        COUNT(CASE WHEN verified THEN 1 END) AS verified_count,
        COUNT(CASE WHEN status = 'active' THEN 1 END) AS active_count
      FROM users
      GROUP BY DATE_TRUNC('day', created_at)
    SQL
    
    # Add index for fast querying
    execute "CREATE UNIQUE INDEX idx_user_stats_date ON user_statistics (signup_date)"
  end

  def down
    execute "DROP MATERIALIZED VIEW IF EXISTS user_statistics"
  end
end

# Model with refresh methods
class UserStatistic < ApplicationRecord
  self.table_name = 'user_statistics'
  self.primary_key = 'signup_date'
  
  def readonly?
    true
  end
  
  def self.refresh!
    connection.execute("REFRESH MATERIALIZED VIEW user_statistics")
  end
  
  def self.refresh_concurrently!
    connection.execute("REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics")
  end
end

# Scheduled refresh job
class RefreshUserStatisticsJob < ApplicationJob
  queue_as :default
  
  def perform
    UserStatistic.refresh_concurrently!
  end
end

# Schedule in whenever or sidekiq-scheduler
# every 1.hour do
#   RefreshUserStatisticsJob.perform_later
# end

The scenic gem also supports materialized views with similar patterns. Scenic tracks view versions and handles both regular and materialized views through the same interface.

# Generate materialized view
# rails generate scenic:view user_statistics --materialized

class CreateUserStatistics < ActiveRecord::Migration[7.0]
  def change
    create_view :user_statistics, materialized: true
  end
end

# Refresh through scenic
class UserStatistic < ApplicationRecord
  def self.refresh
    Scenic.database.refresh_materialized_view(table_name, concurrently: false)
  end
  
  def self.refresh_concurrently
    Scenic.database.refresh_materialized_view(table_name, concurrently: true)
  end
end

Sequel handles views similarly to ActiveRecord. Sequel datasets work with views transparently, and the materialized view refresh methods integrate through raw SQL execution.

# Sequel view usage
DB.create_view(:active_users, 
  DB[:users].where(status: 'active').select(:id, :name, :email))

class ActiveUser < Sequel::Model(:active_users)
  # Sequel automatically handles readonly status
end

# Materialized view refresh
DB.run("REFRESH MATERIALIZED VIEW user_statistics")

# Concurrent refresh (requires unique index)
DB.run("REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics")

Design Considerations

Choose regular views when query results must reflect current data immediately. Views add negligible overhead for simple queries but may cause performance issues when the underlying query is complex. Views excel at providing security boundaries, as they limit column and row access without duplicating security logic across application code.

Materialized views suit scenarios where query performance matters more than data freshness. Analytics dashboards, reporting systems, and summary tables benefit from materialized views. The application must tolerate the staleness window between refreshes. For time-series data or historical analysis, staleness rarely matters since the underlying data changes infrequently.

The refresh frequency determines materialized view practicality. High-frequency refreshes consume database resources and may cause locking issues. Low-frequency refreshes increase data staleness. The balance depends on how quickly base data changes and how current the view data must be.

# Decision matrix example

# Use regular view: data must be current
class CurrentInventory < ApplicationRecord
  self.table_name = 'current_inventory_view'
  # View shows real-time stock levels
  # SELECT p.id, p.name, p.sku, 
  #        COALESCE(SUM(i.quantity), 0) as available
  # FROM products p
  # LEFT JOIN inventory i ON p.id = i.product_id
  # GROUP BY p.id, p.name, p.sku
end

# Use materialized view: data changes slowly, queries are expensive
class MonthlyRevenue < ApplicationRecord
  self.table_name = 'monthly_revenue_mv'
  # View aggregates millions of transactions
  # Refresh nightly when traffic is low
  # SELECT DATE_TRUNC('month', order_date) as month,
  #        SUM(total) as revenue,
  #        COUNT(*) as order_count
  # FROM orders
  # GROUP BY DATE_TRUNC('month', order_date)
end

Index strategy differs between view types. Regular views cannot have indexes—the database uses indexes on base tables. Materialized views support indexes like regular tables. Index selection for materialized views depends on query patterns, not the refresh query.

View composition creates layered abstractions. Building views on top of other views simplifies complex logic but may hinder query optimization. The database optimizer must expand all view definitions, potentially creating suboptimal plans. Two or three levels of nesting typically works well; deeper nesting often causes problems.

# Layered views with scenic

# Base view: db/views/verified_users_v01.sql
# SELECT id, name, email, verified_at
# FROM users
# WHERE verified = true

# Derived view: db/views/active_verified_users_v01.sql
# SELECT id, name, email, verified_at, last_login
# FROM verified_users
# WHERE last_login >= CURRENT_DATE - INTERVAL '90 days'

class VerifiedUser < ApplicationRecord; end
class ActiveVerifiedUser < ApplicationRecord; end

# Query expansion can create complex plans
# Consider flattening if performance suffers

Security isolation represents a primary view use case. Views restrict access to sensitive columns or filter rows based on criteria. Grant permissions on the view rather than base tables, preventing direct table access.

# Security-focused view
class CreatePublicUserProfileView < ActiveRecord::Migration[7.0]
  def up
    execute <<-SQL
      CREATE VIEW public_user_profiles AS
      SELECT id, username, bio, avatar_url, created_at
      FROM users
      WHERE privacy_setting = 'public'
        AND account_status = 'active'
    SQL
    
    # Grant read access to app role
    execute "GRANT SELECT ON public_user_profiles TO app_readonly"
  end
  
  def down
    execute "DROP VIEW IF EXISTS public_user_profiles"
  end
end

Performance Considerations

Regular views add overhead proportional to the complexity of the underlying query. Simple views selecting columns from a single table perform nearly as fast as querying the base table directly. Views with joins, subqueries, or aggregations execute those operations on every access. The database optimizer can sometimes push predicates from the outer query into the view definition, improving performance.

# Simple view: minimal overhead
create_view :active_users, <<-SQL
  SELECT id, name, email FROM users WHERE status = 'active'
SQL

ActiveUser.where(email: 'user@example.com')
# Optimizes to: SELECT * FROM users 
#               WHERE status = 'active' AND email = 'user@example.com'

# Complex view: expensive on each access
create_view :user_order_summary, <<-SQL
  SELECT 
    u.id, u.name,
    COUNT(o.id) as order_count,
    SUM(o.total) as total_spent,
    MAX(o.created_at) as last_order_date
  FROM users u
  LEFT JOIN orders o ON u.id = o.user_id
  GROUP BY u.id, u.name
SQL

# Every query re-aggregates all orders
UserOrderSummary.where('order_count > ?', 10)

Materialized views provide consistent query performance regardless of underlying query complexity. Query response time depends on materialized view size and index strategy, not the original query. This predictability makes materialized views valuable for user-facing features with strict latency requirements.

Refresh performance determines materialized view viability. Complete refresh reads all base table data and rewrites the entire materialized view. For large datasets, refresh time may exceed acceptable windows. Concurrent refresh allows queries during refresh but requires a unique index and takes longer than complete refresh.

# Measuring refresh performance
class UserStatistic < ApplicationRecord
  def self.refresh_with_timing!
    start_time = Time.current
    connection.execute("REFRESH MATERIALIZED VIEW user_statistics")
    duration = Time.current - start_time
    
    Rails.logger.info("Materialized view refresh completed in #{duration}s")
    duration
  end
  
  def self.concurrent_refresh_with_timing!
    start_time = Time.current
    connection.execute("REFRESH MATERIALIZED VIEW CONCURRENTLY user_statistics")
    duration = Time.current - start_time
    
    Rails.logger.info("Concurrent refresh completed in #{duration}s")
    duration
  end
end

# Monitor refresh performance
# Complete refresh: 2.3s, blocks reads
# Concurrent refresh: 8.7s, allows reads

Storage costs for materialized views include the table data plus all indexes. Large materialized views with multiple indexes consume significant disk space. The trade-off is faster queries against higher storage costs and refresh overhead.

Incremental refresh reduces refresh time by updating only changed rows. PostgreSQL lacks native incremental refresh, but custom triggers or change data capture patterns can achieve similar results. The application tracks changes to base tables and applies corresponding updates to the materialized view.

# Pseudo-incremental refresh pattern
class CreateIncrementalUpdateTracking < ActiveRecord::Migration[7.0]
  def change
    create_table :user_statistics_updates do |t|
      t.date :signup_date, null: false
      t.boolean :needs_refresh, default: true
      t.timestamps
    end
    
    add_index :user_statistics_updates, :signup_date, unique: true
    add_index :user_statistics_updates, :needs_refresh, where: 'needs_refresh = true'
  end
end

class User < ApplicationRecord
  after_commit :mark_statistics_for_refresh, on: [:create, :update]
  
  private
  
  def mark_statistics_for_refresh
    date = created_at.to_date
    UserStatisticsUpdate.find_or_create_by(signup_date: date)
      .update(needs_refresh: true)
  end
end

# Selective refresh job
class IncrementalRefreshUserStatisticsJob < ApplicationJob
  def perform
    dates_to_refresh = UserStatisticsUpdate.where(needs_refresh: true)
      .pluck(:signup_date)
    
    dates_to_refresh.each do |date|
      refresh_date(date)
      UserStatisticsUpdate.find_by(signup_date: date)
        .update(needs_refresh: false)
    end
  end
  
  private
  
  def refresh_date(date)
    connection.execute(<<-SQL)
      DELETE FROM user_statistics WHERE signup_date = '#{date}'
    SQL
    
    connection.execute(<<-SQL)
      INSERT INTO user_statistics
      SELECT 
        DATE_TRUNC('day', created_at) AS signup_date,
        COUNT(*) AS user_count,
        COUNT(CASE WHEN verified THEN 1 END) AS verified_count
      FROM users
      WHERE DATE_TRUNC('day', created_at) = '#{date}'
      GROUP BY DATE_TRUNC('day', created_at)
    SQL
  end
end

View materialization in query plans occurs when the optimizer chooses to execute the view query separately and store temporary results. This strategy can improve performance for complex views referenced multiple times in a query. The optimizer bases this decision on cost estimates.

Common Pitfalls

View definitions become stale when base table schemas change. Adding or removing columns from base tables does not automatically update view definitions. Views may break or return incorrect results. Use explicit column lists in view definitions rather than SELECT * to avoid unexpected behavior.

# Fragile view definition
create_view :user_summary, "SELECT * FROM users"

# users table changes: add 'password_hash' column
# View now exposes sensitive data unintentionally

# Explicit column list prevents issues
create_view :user_summary, <<-SQL
  SELECT id, name, email, created_at, updated_at
  FROM users
SQL

Forgetting readonly status on view-backed models causes confusing errors. ActiveRecord attempts INSERT, UPDATE, or DELETE operations that fail because the view is not updatable. Always override readonly? to return true for view-backed models.

class OrderSummary < ApplicationRecord
  self.table_name = 'order_summary_view'
  
  # Missing readonly? causes errors
end

# This fails with cryptic error
OrderSummary.first.update(total: 100)
# PG::Error: cannot update view "order_summary_view"

# Correct implementation
class OrderSummary < ApplicationRecord
  self.table_name = 'order_summary_view'
  
  def readonly?
    true
  end
end

Materialized view refresh locking blocks queries during non-concurrent refresh. Applications experience timeout errors or slow queries during refresh windows. Concurrent refresh avoids blocking but requires a unique index and takes longer. Schedule refreshes during low-traffic periods or use concurrent refresh for user-facing views.

Circular view dependencies create invalid states. PostgreSQL detects cycles at creation time and rejects the view. Less obvious cycles occur through multiple levels of indirection. Track view dependencies explicitly in documentation.

Missing indexes on materialized views cause slow queries. Materialized views support indexes like regular tables, but they must be created manually. Analyze query patterns and add appropriate indexes.

class CreateProductSummaryMaterializedView < ActiveRecord::Migration[7.0]
  def up
    execute <<-SQL
      CREATE MATERIALIZED VIEW product_summary AS
      SELECT 
        p.id, p.name, p.category_id,
        COUNT(o.id) as order_count,
        SUM(oi.quantity) as units_sold
      FROM products p
      LEFT JOIN order_items oi ON p.id = oi.product_id
      LEFT JOIN orders o ON oi.order_id = o.id
      GROUP BY p.id, p.name, p.category_id
    SQL
    
    # Add indexes for common query patterns
    execute "CREATE INDEX idx_product_summary_category ON product_summary (category_id)"
    execute "CREATE INDEX idx_product_summary_order_count ON product_summary (order_count)"
    execute "CREATE UNIQUE INDEX idx_product_summary_id ON product_summary (id)"
  end
  
  def down
    execute "DROP MATERIALIZED VIEW IF EXISTS product_summary"
  end
end

View query optimization depends on database version and configuration. The same view definition may perform differently across database versions due to optimizer changes. Test view performance in environments matching production.

Association definitions referencing views require careful consideration. ActiveRecord associations work with views, but inverse associations may fail. Foreign key constraints cannot reference views, affecting referential integrity.

class Order < ApplicationRecord
  belongs_to :active_user, class_name: 'ActiveUser'
  # This association works for queries
  # But database-level foreign keys cannot reference the view
end

# Database constraint fails
add_foreign_key :orders, :active_users
# PG::Error: there is no unique constraint matching given keys for referenced table "active_users"

Refresh scheduling for materialized views must account for base table update patterns. Refreshing too frequently wastes resources; refreshing too infrequently serves stale data. Monitor base table change rates and adjust refresh intervals accordingly.

Overly complex view definitions create maintenance burdens. When view definitions exceed 50-100 lines, consider refactoring into smaller views or moving logic into application code. Long view definitions are difficult to test and debug.

Reference

View Operations

Operation Syntax Description
Create View CREATE VIEW name AS query Define a new view
Create Materialized View CREATE MATERIALIZED VIEW name AS query Create a materialized view
Drop View DROP VIEW name Remove a view
Drop Materialized View DROP MATERIALIZED VIEW name Remove a materialized view
Replace View CREATE OR REPLACE VIEW name AS query Update view definition
Refresh Materialized View REFRESH MATERIALIZED VIEW name Complete refresh
Refresh Concurrent REFRESH MATERIALIZED VIEW CONCURRENTLY name Refresh without blocking

View Characteristics Comparison

Characteristic Regular View Materialized View
Data Storage None (virtual) Physical storage on disk
Query Performance Depends on underlying query Consistent fast performance
Data Freshness Always current Stale until refreshed
Storage Cost Negligible Significant
Indexes Uses base table indexes Supports own indexes
Refresh Overhead None Refresh time and locking
Update Operations Sometimes supported Not supported

ActiveRecord View Integration

Feature Implementation Notes
Define Model class Name < ApplicationRecord Map to view like a table
Set Table Name self.table_name = 'view_name' Specify view name
Readonly Status def readonly?; true; end Prevent write operations
Primary Key self.primary_key = 'column' Define key column
Associations belongs_to, has_many Work normally with caveats
Scopes scope :name, -> { where(...) } Apply additional filtering

Scenic Gem Commands

Command Purpose Example
Generate View rails generate scenic:view name Create new view
Generate Materialized rails generate scenic:view name --materialized Create materialized view
Update Version rails generate scenic:view name --version N Create new version
Refresh Method Scenic.database.refresh_materialized_view Refresh from code

Refresh Strategies

Strategy Use Case Trade-offs
Complete Refresh Small to medium views, low query load Fast, blocks reads
Concurrent Refresh Large views, high query load Slower, requires unique index, allows reads
Scheduled Refresh Batch updates, off-peak processing Simple, predictable staleness
Trigger-based Near real-time updates Complex, higher overhead
Event-driven Change data capture integration Most current, most complex

Performance Guidelines

Scenario Recommendation Rationale
Simple filter views Use regular view Minimal overhead, always current
Complex aggregations Use materialized view Expensive query, acceptable staleness
Real-time dashboards Use materialized view with frequent refresh Balance performance and freshness
Historical reports Use materialized view with daily refresh Data rarely changes
Security boundaries Use regular view Must reflect current permissions
Join-heavy queries Evaluate both options Test with production data volume

Index Design for Materialized Views

Query Pattern Index Type Example
Equality lookups Single column B-tree CREATE INDEX idx_name ON view (user_id)
Range queries Single column B-tree CREATE INDEX idx_name ON view (created_at)
Multiple filters Multi-column B-tree CREATE INDEX idx_name ON view (status, created_at)
Concurrent refresh Unique index required CREATE UNIQUE INDEX idx_name ON view (id)
Text search GIN or GiST CREATE INDEX idx_name ON view USING GIN (search_vector)

Common View Patterns

Pattern Description Use Case
Security View Filters sensitive data Expose subset of columns to roles
Aggregation View Pre-computed summaries Dashboards, reporting
Denormalization View Joins related tables Simplify application queries
Compatibility View Maintains old schema Support legacy code during migration
Partitioning View Unions partition tables Transparent partition access