CrackedRuby

Overview

Index optimization involves designing, implementing, and maintaining database indexes to maximize query performance while minimizing storage overhead and write operation costs. Indexes function as data structures that databases use to locate records without scanning entire tables, similar to how a book's index allows readers to find topics without reading every page.

Database queries without appropriate indexes require full table scans, examining every row to find matching records. A table with one million rows might require one million disk reads to find a single record. An optimized index reduces this to logarithmic complexity, often requiring fewer than twenty disk reads for the same operation.
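The gap can be sketched in plain Ruby by counting comparisons: a linear pass models the full table scan, and a binary search over sorted keys models the index lookup (the core mechanism inside a B-tree). This is an illustration, not how a database executes queries.

```ruby
# Count comparisons for a linear scan versus a binary search over
# one million sorted keys -- a rough model of a table scan versus
# a B-tree index lookup.
def linear_scan(keys, target)
  comparisons = 0
  keys.each { |k| comparisons += 1; break if k == target }
  comparisons
end

def binary_search(keys, target)
  comparisons = 0
  low, high = 0, keys.size - 1
  while low <= high
    comparisons += 1
    mid = (low + high) / 2
    case keys[mid] <=> target
    when 0  then break
    when -1 then low = mid + 1
    else high = mid - 1
    end
  end
  comparisons
end

keys = (1..1_000_000).to_a
puts linear_scan(keys, 999_999)    # ~one million comparisons
puts binary_search(keys, 999_999)  # ~20 comparisons
```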

The optimization process balances competing concerns. Indexes accelerate read operations but slow write operations. Each index consumes disk space and requires maintenance during inserts, updates, and deletes. A table with ten indexes might experience acceptable read performance but unacceptable write performance. Index optimization identifies the minimal set of indexes that satisfies performance requirements.

Database systems store indexes separately from table data, typically using B-tree or hash data structures. The database query planner analyzes available indexes and selects execution strategies. Query performance depends on the planner's ability to use indexes effectively, which depends on index design matching query patterns.

# Without index: full table scan
User.where(email: 'user@example.com')
# SQL: SELECT * FROM users WHERE email = 'user@example.com'
# Execution: Scans all rows (Seq Scan on users)

# With index: direct lookup
# add_index :users, :email
User.where(email: 'user@example.com')
# SQL: SELECT * FROM users WHERE email = 'user@example.com'
# Execution: Index scan on users_email_idx

Index optimization addresses both database design decisions and runtime query patterns. Design decisions include which columns to index, whether to create single-column or composite indexes, and which index types to apply. Runtime considerations include query plan analysis, index usage statistics, and maintenance scheduling.

Key Principles

Indexes store sorted copies of column data with pointers to complete table rows. When a query filters on indexed columns, the database searches the index structure rather than scanning the entire table. The fundamental principle involves trading storage space and write performance for read performance.

Index Selectivity measures how many distinct values exist in an indexed column relative to total rows. High selectivity means most values are unique. A column containing user email addresses typically has high selectivity because each user has a unique email. A column containing user account status with values 'active' or 'inactive' has low selectivity because only two distinct values exist.

Selectivity directly affects index usefulness. Indexes on high-selectivity columns reduce search space dramatically. An index on a primary key with one million distinct values divides search space by roughly one million. An index on a binary status column divides search space by two. Database query planners may ignore low-selectivity indexes and perform table scans instead.

# High selectivity: email addresses are unique
add_index :users, :email
# Index highly effective for: WHERE email = 'specific@email.com'

# Low selectivity: few distinct values
add_index :users, :account_type
# Index less effective for: WHERE account_type = 'premium'
# May trigger table scan if 'premium' represents 40% of rows
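Selectivity itself is simple to compute as a sketch: distinct values divided by total rows.

```ruby
# Selectivity = distinct values / total rows. Values near 1.0 make an
# index highly effective; values near 0 make a table scan more likely.
def selectivity(values)
  values.uniq.size.to_f / values.size
end

emails   = ["a@x.com", "b@x.com", "c@x.com", "d@x.com"]
statuses = ["active", "inactive", "active", "active"]

selectivity(emails)    # => 1.0 (every value unique)
selectivity(statuses)  # => 0.5 (two distinct values in four rows)
```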

B-tree Structure organizes index data in balanced tree format. Each tree node contains sorted key values and pointers to child nodes or table rows. Search operations start at the root node and traverse downward, following pointers based on key comparisons. Tree depth remains logarithmic relative to total rows, ensuring predictable performance.

B-tree indexes support range queries, equality queries, and sorted result sets. A query filtering on created_at > '2024-01-01' can use a B-tree index to find the starting point and scan forward. The index maintains sorted order, so the database can return sorted results without additional sorting operations.

Composite Indexes contain multiple columns in specified order. The index sorts first by the first column, then by the second column within matching first-column values, and so on. Column order determines which queries can use the index effectively.

A composite index on (last_name, first_name) supports queries filtering on last name alone or both last name and first name. It cannot support queries filtering only on first name because first names are not sorted globally, only within matching last names.

add_index :users, [:last_name, :first_name]

# Can use index: filters on leftmost column
User.where(last_name: 'Smith')

# Can use index: filters on leftmost columns in order
User.where(last_name: 'Smith', first_name: 'John')

# Cannot use index effectively: skips leftmost column
User.where(first_name: 'John')

Index Coverage occurs when an index contains all columns required by a query. The database can satisfy the query entirely from the index without accessing table rows. Covered queries avoid random disk seeks to table data, reducing I/O operations significantly.

add_index :orders, [:customer_id, :created_at, :total_amount]

# Covered query: all columns in index
Order.where(customer_id: 123)
     .select(:customer_id, :created_at, :total_amount)

# Not covered: requires 'status' from table
Order.where(customer_id: 123)
     .select(:customer_id, :created_at, :total_amount, :status)

Cardinality represents the number of distinct values in a column. High cardinality columns make better index candidates than low cardinality columns. A universally unique identifier column has cardinality equal to row count. A boolean column has cardinality of two or three (true, false, null).

Query planners estimate result set size using cardinality statistics. Accurate statistics enable optimal plan selection. Stale statistics cause poor plan choices, degrading query performance even with appropriate indexes.
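The planner's equality estimate can be modeled in one line: expected matching rows is roughly total rows divided by the column's distinct-value count. This is a simplification of what real planners do with histograms and most-common-value lists.

```ruby
# Planner-style estimate: expected rows matching an equality filter
# is roughly total_rows / number_of_distinct_values.
def estimated_rows(total_rows, distinct_values)
  (total_rows.to_f / distinct_values).round
end

estimated_rows(1_000_000, 1_000_000)  # primary key lookup => 1
estimated_rows(1_000_000, 2)          # boolean filter => 500000
```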

Ruby Implementation

In Ruby, database interactions go through ActiveRecord, which provides a migration DSL for index management. Migrations define indexes using declarative syntax, and ActiveRecord translates the definitions into database-specific SQL.

class AddIndexes < ActiveRecord::Migration[7.0]
  def change
    # Single column index
    add_index :users, :email
    
    # Composite index
    add_index :orders, [:user_id, :created_at]
    
    # Unique constraint index
    add_index :users, :username, unique: true
    
    # Named index for easier identification
    add_index :posts, :published_at, name: 'index_published_posts'
    
    # Partial index with condition
    add_index :orders, :created_at, where: "status = 'pending'"
  end
end

ActiveRecord exposes database-specific index features through the options hash. PostgreSQL extensions require explicit enabling, and the available options vary between PostgreSQL, MySQL, and SQLite.

# PostgreSQL: expression indexes
add_index :users, 'lower(email)', name: 'index_users_on_lower_email'

# PostgreSQL: GIN index for arrays
add_index :posts, :tags, using: :gin

# PostgreSQL: partial unique index
add_index :users, :email, unique: true, where: 'deleted_at IS NULL'

# MySQL: index length for text columns
add_index :articles, :content, length: 100

# PostgreSQL: concurrent index creation (requires disable_ddl_transaction!)
add_index :large_table, :column, algorithm: :concurrently

The algorithm: :concurrently option prevents table locking during index creation on large tables. Without this option, index creation acquires exclusive locks, blocking writes until completion. Concurrent creation takes longer but maintains table availability.

ActiveRecord provides remove_index for dropping indexes. Specify either column names or explicit index name. Removing unused indexes reduces write overhead and storage consumption.

class RemoveUnusedIndexes < ActiveRecord::Migration[7.0]
  def change
    # Remove by column name
    remove_index :users, :temporary_field
    
    # Remove by explicit name
    remove_index :posts, name: 'index_published_posts'
    
    # Remove composite index
    remove_index :orders, [:user_id, :status]
  end
end

ActiveRecord schema introspection exposes index metadata. The connection.indexes method returns an array of index objects containing names, columns, uniqueness, and other properties.

# Inspect existing indexes
indexes = ActiveRecord::Base.connection.indexes('users')
indexes.each do |index|
  puts "Name: #{index.name}"
  puts "Columns: #{index.columns.join(', ')}"
  puts "Unique: #{index.unique}"
  puts "Where clause: #{index.where}" if index.where
end

# Check if specific index exists
has_email_index = ActiveRecord::Base.connection.index_exists?(
  :users, 
  :email
)

Database adapters expose query execution plans through the explain method. Examining plans reveals whether queries use indexes and identifies optimization opportunities.

# View query execution plan
User.where(email: 'user@example.com').explain

# Example PostgreSQL output:
# Index Scan using index_users_on_email on users
# Index Cond: (email = 'user@example.com')
# Planning Time: 0.123 ms
# Execution Time: 0.456 ms

# Example of query not using index:
User.where("email LIKE '%@example.com'").explain

# Seq Scan on users
# Filter: (email ~~ '%@example.com')
# Planning Time: 0.089 ms
# Execution Time: 145.678 ms

Leading wildcard patterns prevent index usage because indexes store values in sorted order. A search for %@example.com cannot use the index efficiently since matching values scatter throughout the sorted structure.
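The effect can be reproduced with a sorted Ruby array standing in for the index: a known prefix lets a binary search find a starting point and scan forward, while a suffix pattern forces examination of every entry.

```ruby
# A sorted array models the B-tree's sorted key order.
emails = ["alice@a.com", "bob@b.com", "carol@example.com",
          "dave@example.com", "erin@z.com"].sort

# 'carol%' -- binary-search to the first entry with the prefix, then
# scan forward while the prefix still matches (a B-tree range scan).
prefix = "carol"
start = emails.bsearch_index { |e| e >= prefix } || emails.size
prefix_matches = emails[start..].take_while { |e| e.start_with?(prefix) }

# '%@example.com' -- sorted order gives no starting point, so every
# entry must be examined, just like a sequential scan.
suffix_matches = emails.select { |e| e.end_with?("@example.com") }

prefix_matches  # => ["carol@example.com"]
suffix_matches  # => ["carol@example.com", "dave@example.com"]
```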

ActiveRecord includes specialized index types for full-text search, spatial data, and other use cases. PostgreSQL provides additional index types through extensions.

# Full-text search with PostgreSQL
enable_extension 'pg_trgm'
add_index :articles, :title, using: :gin, opclass: :gin_trgm_ops

# Spatial indexes with PostGIS
enable_extension 'postgis'
add_index :locations, :coordinates, using: :gist

# Array overlap queries
add_index :posts, :tag_ids, using: :gin

# JSONB indexes
add_index :settings, :preferences, using: :gin

Foreign key indexes improve join performance and referential integrity checking. In modern Rails versions, t.references creates an index by default, but declaring it explicitly makes coverage obvious.

class CreateOrdersWithIndexes < ActiveRecord::Migration[7.0]
  def change
    create_table :orders do |t|
      t.references :user, foreign_key: true, index: true
      t.references :product, foreign_key: true, index: true
      t.timestamps
    end
    
    # Additional composite index for common query
    add_index :orders, [:user_id, :created_at]
  end
end

Performance Considerations

Index creation time increases with table size. A table with one hundred million rows might require hours to build an index. During creation, the database reads all rows, sorts extracted values, and writes the new index structure. This process consumes significant I/O bandwidth and CPU resources.

# Problematic: blocks writes during index creation
add_index :huge_table, :column

# Better: concurrent creation (PostgreSQL)
# disable_ddl_transaction! must be declared at the migration class level
disable_ddl_transaction!
add_index :huge_table, :column, algorithm: :concurrently

Write operations incur index maintenance overhead. Each insert, update, or delete affecting indexed columns requires updating all relevant indexes. A table with five indexes performs six write operations per insert: one table write and five index writes. Write-heavy workloads benefit from minimizing index count.

Measure the impact through database performance metrics. Monitor transaction throughput, query latency, and lock contention before and after adding indexes. A new index that reduces query time from five seconds to fifty milliseconds but decreases write throughput by thirty percent requires careful evaluation.

Query performance depends on index selectivity for the filtered data. An index on status column performs well when querying for rare status values but poorly for common values. The query planner estimates result set size and may choose table scans over indexes when expected results exceed threshold percentages.

# Assuming 'cancelled' represents 1% of orders
Order.where(status: 'cancelled').explain
# Uses index: Index Scan on orders_status_idx

# Assuming 'delivered' represents 85% of orders
Order.where(status: 'delivered').explain
# May use table scan: Seq Scan on orders
# Reason: Reading 85% of rows makes index overhead wasteful
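The planner's choice can be captured in a toy cost model: an index lookup pays a random-access penalty per matching row, while a sequential scan pays a cheaper per-row cost for every row. The cost constants below are illustrative assumptions, not PostgreSQL's actual parameters.

```ruby
# Toy cost model for plan selection. seq_cost and random_cost are
# assumed illustrative constants, not real planner settings.
def cheaper_plan(total_rows, match_fraction, seq_cost: 1.0, random_cost: 4.0)
  seq = total_rows * seq_cost
  idx = total_rows * match_fraction * random_cost
  idx < seq ? :index_scan : :seq_scan
end

cheaper_plan(1_000_000, 0.01)  # => :index_scan (rare status value)
cheaper_plan(1_000_000, 0.85)  # => :seq_scan  (common status value)
```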

Index bloat occurs when deleted or updated rows leave empty space in index structures. B-tree indexes maintain logical order but physical storage becomes fragmented. Bloated indexes consume excessive space and degrade performance as the database reads more pages to find the same data.

-- Monitor index size and usage (PostgreSQL)
SELECT
  schemaname,
  relname AS table_name,
  indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
  idx_scan AS index_scans,
  idx_tup_read AS tuples_read,
  idx_tup_fetch AS tuples_fetched
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

Database maintenance operations rebuild indexes to eliminate bloat. The REINDEX operation (PostgreSQL) or OPTIMIZE TABLE (MySQL) reconstructs indexes from scratch. Schedule maintenance during low-traffic periods since operations may lock tables.

Buffer pool sizing affects index performance. Databases cache frequently accessed index pages in memory. Small buffer pools cause excessive disk I/O as the database repeatedly loads the same pages. Large buffer pools keep working set in memory, reducing physical reads.

Composite index column order dramatically affects query performance. Place high-selectivity columns first to minimize search space quickly. An index on (status, created_at) works poorly for queries filtering only on creation date but works well for status-based filters.

# Poor column order: low selectivity first
add_index :orders, [:status, :customer_id]

# Query filtering on customer_id cannot use index effectively
Order.where(customer_id: 123).explain
# Seq Scan on orders (index not used)

# Better column order: high selectivity first
add_index :orders, [:customer_id, :status]

# Now queries on customer_id alone, or on both columns, use the index
Order.where(customer_id: 123).explain
# Index Scan using index_orders_on_customer_id_and_status

Benchmark queries before and after index changes using realistic data volumes. Development databases with thousands of rows exhibit different behavior than production databases with millions of rows. Query plans change based on table statistics, and the planner may select different strategies at different scales.

Design Considerations

Index design balances query performance against storage costs and write overhead. Each index accelerates specific queries but consumes disk space and slows writes. The optimal index set depends on workload characteristics.

Read-heavy applications tolerate more indexes. A reporting database serving analytical queries benefits from extensive indexing. Each report runs faster with appropriate indexes, and occasional data loads accept the write overhead. Write-heavy applications require selective indexing, prioritizing the most critical queries.

Query analysis identifies index candidates. Examine slow query logs to find frequently executed expensive queries. Focus on queries with high execution time multiplied by execution frequency. A query running once per minute for two seconds consumes more resources than a query running once per hour for ten seconds.

-- Identify slow queries (PostgreSQL, pg_stat_statements extension)
-- (columns are named total_time/mean_time before PostgreSQL 13)
SELECT
  query,
  calls,
  total_exec_time,
  mean_exec_time,
  max_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;

Composite indexes require careful column ordering. The leftmost prefix rule determines which queries can use the index. An index on (a, b, c) supports queries filtering on a, (a, b), or (a, b, c) but not on b alone or (b, c).
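The leftmost prefix rule can be expressed as a small predicate, a sketch assuming equality filters only (range filters and sorts add further constraints):

```ruby
# An index on index_cols can serve an equality query when the filtered
# columns form a leftmost prefix of the index. Order within the WHERE
# clause itself does not matter, so compare as sets.
def usable_index?(index_cols, filter_cols)
  index_cols.take(filter_cols.size).sort == filter_cols.sort
end

index = [:a, :b, :c]
usable_index?(index, [:a])        # => true
usable_index?(index, [:b, :a])    # => true  (both leftmost columns)
usable_index?(index, [:b])        # => false (skips :a)
usable_index?(index, [:b, :c])    # => false
```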

Design composite indexes matching common query patterns. A search interface filtering by category and price range needs an index on (category_id, price). The category filter provides high selectivity, and price enables range scanning within category.

# Query pattern requiring composite index
Product.where(category_id: 5)
       .where('price >= ? AND price <= ?', 100, 200)
       .order(:price)

# Optimal index supports category filter, price range, and sort
add_index :products, [:category_id, :price]
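Why this index serves the whole query can be seen in a small model: tuples sorted by category first, then price. Rows for one category form a contiguous, price-ordered run, so both the range filter and the ORDER BY come free from index order.

```ruby
# Model of the (category_id, price) index as sorted [category, price]
# tuples. Array#sort orders elementwise: category first, then price.
index = [
  [3, 250], [5, 90], [5, 120], [5, 180], [5, 210], [7, 50]
].sort

# Category 5, price 100..200: the matching entries are contiguous and
# already ordered by price, so no separate sort step is needed.
matches = index.select { |cat, price| cat == 5 && (100..200).cover?(price) }
matches  # => [[5, 120], [5, 180]]
```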

Unique indexes enforce data integrity while providing query performance benefits. Databases use unique indexes to prevent duplicate values and to optimize equality searches. Unique constraints automatically create unique indexes.

# Unique index prevents duplicates and accelerates lookups
add_index :users, :email, unique: true

# Composite unique index for compound constraints
add_index :enrollments, [:student_id, :course_id], unique: true

Partial indexes reduce storage and maintenance costs by indexing subsets of rows. A table tracking tasks might index only incomplete tasks since queries rarely filter completed tasks. The partial index remains small despite table growth.

# Index only pending orders
add_index :orders, :created_at, where: "status = 'pending'"

# Queries matching the condition use the index
Order.where(status: 'pending')
     .where('created_at > ?', 1.week.ago)
     .explain
# Index Scan using index_orders_on_created_at

Expression indexes enable queries on computed values. Searching case-insensitive email addresses requires indexing lower(email) rather than the raw column. Without expression indexes, case-insensitive searches perform full table scans.

# PostgreSQL expression index
add_index :users, 'lower(email)', name: 'index_users_on_lower_email'

# Query using the expression
User.where('lower(email) = ?', 'user@example.com')

# Without expression index, this requires table scan:
User.where('email ILIKE ?', 'user@example.com')

Covering indexes include query columns beyond filter criteria. Adding frequently selected columns to composite indexes enables index-only scans. The database retrieves all required data from the index, avoiding table access entirely.

# Non-covering index
add_index :orders, :customer_id

# Query requires table access for total
Order.where(customer_id: 123).select(:id, :total)

# Covering index includes selected columns
add_index :orders, [:customer_id, :id, :total]

# Query satisfied entirely from index (index-only scan)
Order.where(customer_id: 123).select(:id, :total)

Functional redundancy wastes resources. Multiple similar indexes provide diminishing returns. An index on customer_id makes a separate index on (customer_id, created_at) partially redundant for customer-based queries without date filters. Evaluate whether queries justify the additional index.
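A simple prefix check can flag candidates for removal, a sketch of the idea: an index whose columns are a leftmost prefix of a wider index can usually be served by the wider one (though the narrower index may still be worth keeping for its smaller size).

```ruby
# Flag indexes whose column list is a leftmost prefix of another
# index's columns -- candidates for consolidation.
def redundant_indexes(indexes)
  indexes.select do |idx|
    indexes.any? { |other| other != idx && other.take(idx.size) == idx }
  end
end

indexes = [[:customer_id], [:customer_id, :created_at], [:status]]
redundant_indexes(indexes)  # => [[:customer_id]]
```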

Common Patterns

Single Column Indexes accelerate queries filtering or sorting on individual columns. Primary keys receive automatic indexes, but foreign keys and frequently filtered columns require explicit indexes.

# Foreign key index for join performance
add_index :orders, :customer_id

# Search field index
add_index :products, :sku

# Sort field index
add_index :posts, :published_at

Single column indexes work well for equality comparisons and range queries on the indexed column. Keep table statistics current so the planner can estimate costs accurately.

Composite Indexes optimize queries filtering on multiple columns. Design composite indexes matching WHERE clause patterns, ordering columns by selectivity.

# Time-series data access pattern
add_index :metrics, [:device_id, :recorded_at]

# Multi-field search
add_index :products, [:category_id, :brand, :price]

# Filtered reporting
add_index :transactions, [:account_id, :type, :created_at]

Composite index column order follows the leftmost prefix rule. Place columns appearing in most queries first, then additional filtering columns, then sorting columns.

Covering Indexes incorporate all columns needed by queries, enabling index-only scans. Add frequently selected columns to composite indexes after filter and sort columns.

# Basic composite index
add_index :orders, [:customer_id, :status]

# Covering index adds selected columns
add_index :orders, [:customer_id, :status, :total, :created_at]

# Query benefits from index-only scan
Order.where(customer_id: 123, status: 'pending')
     .select(:id, :total, :created_at)

Balance covering index benefits against increased storage and maintenance costs. Wide indexes consume more space and require more updates when covered columns change.

Partial Indexes reduce size by indexing filtered subsets. Apply partial indexes when queries consistently filter on specific conditions.

# Index only active records
add_index :users, :email, where: 'deleted_at IS NULL'

# Index only recent data
add_index :logs, :created_at, where: "created_at > '2024-01-01'"

# Index only high-value transactions
add_index :transactions, [:customer_id, :amount], 
          where: 'amount > 1000'

Partial indexes remain small and fast even as total table size grows. Queries matching the partial condition use the smaller index while other queries fall back to alternative strategies.
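The structure can be modeled as a filtered, sorted subset: only rows matching the predicate enter the index, so it stays small regardless of total table size.

```ruby
# A partial index on created_at WHERE status = 'pending', modeled as
# a filtered subset sorted by the indexed column.
orders = [
  { id: 1, status: "pending",   created_at: 3 },
  { id: 2, status: "delivered", created_at: 1 },
  { id: 3, status: "pending",   created_at: 9 },
  { id: 4, status: "delivered", created_at: 7 }
]

partial_index = orders.select { |o| o[:status] == "pending" }
                      .sort_by { |o| o[:created_at] }
                      .map { |o| [o[:created_at], o[:id]] }

partial_index  # => [[3, 1], [9, 3]] -- only pending rows are indexed
```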

Unique Indexes enforce constraints and optimize equality lookups. Databases can stop scanning after finding the first match because uniqueness guarantees no additional matches exist.

# Natural key uniqueness
add_index :users, :email, unique: true

# Composite natural key
add_index :enrollments, [:student_id, :course_id], unique: true

# Conditional uniqueness with partial index
add_index :users, :username, unique: true, where: 'deleted_at IS NULL'

Unique indexes serve dual purposes: data integrity and query performance. The database maintains uniqueness automatically and optimizes queries using the constraint knowledge.

Text Search Indexes accelerate pattern matching and full-text search. Standard B-tree indexes cannot optimize pattern searches with leading wildcards or complex text matching.

# PostgreSQL trigram index for pattern matching
enable_extension 'pg_trgm'
add_index :articles, :title, using: :gin, opclass: :gin_trgm_ops

# Queries such as title ILIKE '%search%' can use the index
Article.where('title ILIKE ?', '%search%')

# PostgreSQL full-text search
add_index :documents, :content_tsvector, using: :gin

# Full-text search query
Document.where("content_tsvector @@ plainto_tsquery('search terms')")

Text search indexes consume significant storage but enable efficient text queries. Evaluate search requirements to select appropriate index types.

Common Pitfalls

Missing Indexes on Foreign Keys cause expensive joins. Rails generates foreign key indexes in modern versions, but legacy databases or manual schema modifications may omit them.

# Problem: foreign key without index
create_table :orders do |t|
  t.integer :customer_id
  t.timestamps
end

# Expensive join query
Customer.joins(:orders).where(customers: {country: 'US'})

# Solution: add foreign key index
add_index :orders, :customer_id

Verify foreign key indexes exist for all associations. Missing indexes degrade join performance as table size grows.

Over-Indexing slows write operations and wastes storage. Each index adds overhead to inserts, updates, and deletes. Tables with dozens of indexes experience severe write performance degradation.

# Problematic: redundant indexes
add_index :users, :email
add_index :users, [:email, :created_at]
add_index :users, [:email, :status]
add_index :users, [:email, :name]

# Better: single composite index covering most queries
add_index :users, [:email, :created_at, :status]

Audit indexes periodically using database statistics. Remove unused indexes consuming resources without providing query benefits.

Incorrect Composite Index Column Order prevents index usage. Placing low-selectivity columns first or omitting commonly filtered columns renders composite indexes ineffective.

# Poor: low selectivity column first
add_index :orders, [:status, :customer_id]

# Better: high selectivity column first
add_index :orders, [:customer_id, :status]

# Query pattern determines optimal order
Order.where(customer_id: 123, status: 'pending')

Analyze query patterns before designing composite indexes. Match column order to WHERE clause filter sequence and selectivity.

Function Calls Preventing Index Usage occur when queries transform indexed columns. Applying functions to indexed columns forces full table scans.

# Cannot use index on created_at
Order.where('DATE(created_at) = ?', Date.today)

# Can use index
Order.where(created_at: Date.today.beginning_of_day..Date.today.end_of_day)

# Cannot use index on email
User.where('LOWER(email) = ?', 'user@example.com')

# Can use expression index
add_index :users, 'LOWER(email)', name: 'index_users_on_lower_email'
User.where('LOWER(email) = ?', 'user@example.com')

Structure queries to reference indexed columns directly. Create expression indexes when computed values require indexing.

Index Bloat from High Update Frequency degrades performance over time. B-tree indexes retain dead space after updates and deletes, causing fragmentation and increased I/O.

-- Monitor bloat (PostgreSQL, requires the pgstattuple extension;
-- pgstatindex works on B-tree indexes only)
SELECT
  schemaname || '.' || relname AS table_name,
  indexrelname AS index_name,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size,
  idx_scan AS scans,
  (pgstatindex(indexrelid)).avg_leaf_density AS leaf_density_pct
FROM pg_stat_user_indexes
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(indexrelid) DESC;

Schedule regular maintenance to rebuild bloated indexes. Use REINDEX CONCURRENTLY in PostgreSQL or OPTIMIZE TABLE in MySQL during maintenance windows.

Leading Wildcards in LIKE Queries prevent index usage. Pattern %search_term requires scanning the entire index because matches occur anywhere in the value.

# Cannot use standard B-tree index
Product.where('name LIKE ?', '%widget%')

# Can use index: leading characters known
Product.where('name LIKE ?', 'widget%')

# For arbitrary pattern matching, use trigram index
enable_extension 'pg_trgm'
add_index :products, :name, using: :gin, opclass: :gin_trgm_ops
Product.where('name ILIKE ?', '%widget%')

Redesign search functionality to support prefix matching or implement specialized text search indexes.

Ignoring NULL Values in Indexes causes issues with nullable columns. Standard B-tree indexes include NULL values, but partial indexes and unique indexes handle NULLs differently.

# Unexpected behavior: unique index allows multiple NULLs
add_index :users, :middle_name, unique: true

# Multiple users with middle_name = NULL are allowed
# Unique constraint applies only to non-NULL values

# Solution: partial unique index excluding NULLs
add_index :users, :middle_name, unique: true, where: 'middle_name IS NOT NULL'

Consider NULL handling when designing indexes for optional columns. Some databases treat multiple NULLs as distinct for uniqueness constraints.

Forgetting to Drop Unused Indexes wastes resources. Feature changes and query pattern evolution make indexes obsolete, but indexes persist until explicitly removed.

-- Identify unused indexes (PostgreSQL)
SELECT
  schemaname || '.' || relname AS table_name,
  indexrelname AS index_name,
  idx_scan AS scans,
  pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(indexrelid) DESC;

# Remove unused indexes
remove_index :table_name, name: 'unused_index_name'

Review index usage statistics quarterly. Remove indexes with zero or minimal scans to reduce storage and maintenance overhead.

Reference

Index Types

| Type | Description | Use Case | Storage Impact |
|------|-------------|----------|----------------|
| B-tree | Balanced tree structure | Equality and range queries | Moderate |
| Hash | Hash table structure | Equality queries only | Low |
| GiST | Generalized search tree | Geometric and custom data | High |
| GIN | Generalized inverted index | Full-text search, arrays | Very high |
| BRIN | Block range index | Very large tables with natural clustering | Very low |

Index Creation Syntax

| Pattern | Ruby/ActiveRecord | PostgreSQL | MySQL |
|---------|-------------------|------------|-------|
| Single column | add_index :table, :column | CREATE INDEX ON table(column) | CREATE INDEX idx ON table(column) |
| Composite | add_index :table, [:col1, :col2] | CREATE INDEX ON table(col1, col2) | CREATE INDEX idx ON table(col1, col2) |
| Unique | add_index :table, :column, unique: true | CREATE UNIQUE INDEX ON table(column) | CREATE UNIQUE INDEX idx ON table(column) |
| Partial | add_index :table, :column, where: 'condition' | CREATE INDEX ON table(column) WHERE condition | Not supported |
| Concurrent | add_index :table, :column, algorithm: :concurrently | CREATE INDEX CONCURRENTLY ON table(column) | Not required |
| Expression | add_index :table, 'expression', name: 'idx' | CREATE INDEX ON table(expression) | Limited support |

Performance Guidelines

| Scenario | Recommendation | Rationale |
|----------|----------------|-----------|
| Foreign keys | Always index | Accelerates joins and foreign key checks |
| WHERE clauses | Index filtered columns | Enables index scans instead of table scans |
| ORDER BY | Index sort columns | Eliminates sort operations |
| High selectivity | Prioritize for indexing | Greater search space reduction |
| Low selectivity | Consider partial or skip | May not improve performance |
| Write-heavy tables | Minimize indexes | Reduces write operation overhead |
| Read-heavy tables | More indexes acceptable | Query performance priority |
| Covering index | Add if query hot path | Enables index-only scans |
| Large tables | Use concurrent creation | Prevents table locking |

Composite Index Column Order Priority

| Priority | Factor | Example |
|----------|--------|---------|
| 1 | Equality filters | customer_id = 123 |
| 2 | High selectivity | Unique or near-unique values |
| 3 | Range filters | created_at > date |
| 4 | Sort columns | ORDER BY updated_at |
| 5 | Additional SELECT columns | Covering index optimization |

Maintenance Operations

| Operation | PostgreSQL | MySQL | Frequency |
|-----------|------------|-------|-----------|
| Rebuild indexes | REINDEX TABLE tablename | OPTIMIZE TABLE tablename | Monthly or when bloated |
| Update statistics | ANALYZE tablename | ANALYZE TABLE tablename | After bulk changes |
| Check bloat | pgstattuple extension queries | INFORMATION_SCHEMA queries | Weekly monitoring |
| Concurrent reindex | REINDEX INDEX CONCURRENTLY idxname | Not available | When uptime required |

Query Plan Analysis Commands

| Database | Command | Purpose |
|----------|---------|---------|
| PostgreSQL | EXPLAIN query | Show execution plan |
| PostgreSQL | EXPLAIN ANALYZE query | Execute and show actual timings |
| PostgreSQL | EXPLAIN (ANALYZE, BUFFERS) query | Include buffer usage statistics |
| MySQL | EXPLAIN query | Show execution plan |
| MySQL | EXPLAIN FORMAT=JSON query | Detailed JSON format plan |
| SQLite | EXPLAIN QUERY PLAN query | Show execution plan |

Index Size Estimation

| Row Count | Indexed Column Size | Estimated Index Size |
|-----------|---------------------|----------------------|
| 100,000 | 8 bytes | 4-8 MB |
| 1,000,000 | 8 bytes | 40-80 MB |
| 10,000,000 | 8 bytes | 400-800 MB |
| 100,000 | 50 bytes | 8-15 MB |
| 1,000,000 | 50 bytes | 80-150 MB |
| 10,000,000 | 50 bytes | 800 MB - 1.5 GB |

Actual sizes vary with database engine, page size, fill factor, and data distribution. For composite indexes, sum the per-column estimates and add per-entry overhead.
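A back-of-envelope estimator consistent with the figures above can be sketched in Ruby. The per-entry overhead and fill-factor constants are assumptions chosen to land in the table's ranges, not engine-measured values.

```ruby
# Rough model: each B-tree entry stores the key plus per-entry overhead
# (tuple pointer, alignment -- assumed 24 bytes), and pages are assumed
# ~70% full. Both constants are illustrative assumptions.
def estimated_index_bytes(rows, key_bytes, overhead: 24, fill: 0.7)
  (rows * (key_bytes + overhead) / fill).round
end

estimated_index_bytes(1_000_000, 8)   # falls in the 40-80 MB band above
estimated_index_bytes(1_000_000, 50)  # falls in the 80-150 MB band above
```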