CrackedRuby

Overview

SQL (Structured Query Language) provides a standardized method for interacting with relational database management systems. The language defines operations for creating, reading, updating, and deleting data through declarative statements that specify the desired result rather than the procedural steps to achieve it.

Basic SQL operations fall into several categories defined by their purpose. Data Query Language (DQL) retrieves information from tables using SELECT statements. Data Manipulation Language (DML) modifies table contents through INSERT, UPDATE, and DELETE commands. Data Definition Language (DDL) manages database structure with CREATE, ALTER, and DROP statements. Data Control Language (DCL) handles permissions through GRANT and REVOKE commands.

The relational model underlying SQL organizes data into tables with rows representing individual records and columns defining attributes. Each table is defined by a schema specifying column names, data types, and constraints. Primary keys uniquely identify rows, while foreign keys establish relationships between tables.

-- Basic SELECT retrieves all columns from a table
SELECT * FROM users;

-- Filtering with WHERE clause
SELECT name, email FROM users WHERE active = true;

SQL operates on sets of data rather than individual records. A single statement can affect multiple rows simultaneously, and the database engine determines the optimal execution plan for each query. This set-based approach differs from procedural programming where operations execute sequentially on one item at a time.

Key Principles

SQL statements follow a consistent structure with clauses appearing in a defined order. SELECT queries begin with the SELECT keyword followed by column specifications, then FROM to indicate the source table, WHERE for filtering conditions, GROUP BY for aggregation grouping, HAVING for aggregate filtering, and ORDER BY for result sorting. The database engine executes these clauses in a logical order that differs from their written sequence.

Query execution follows this logical order: FROM identifies source tables, WHERE filters individual rows, GROUP BY aggregates rows into groups, HAVING filters aggregated groups, SELECT determines output columns, and ORDER BY sorts the final result set. Understanding this execution model explains why certain column references work in some clauses but not others.

-- Logical execution order demonstration
SELECT department, COUNT(*) as employee_count
FROM employees
WHERE salary > 50000
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY employee_count DESC;

-- Execution: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY

Data types define the kind of information each column stores. Numeric types include integers (INT, BIGINT) and decimals (NUMERIC, DECIMAL, FLOAT). Character types store text as fixed-length (CHAR) or variable-length (VARCHAR, TEXT) strings. Temporal types handle dates (DATE), times (TIME), and timestamps (TIMESTAMP). Boolean types represent true/false values, though implementation varies across database systems.

NULL represents missing or undefined data, distinct from empty strings or zero values. NULL behaves differently in comparisons and requires special handling with IS NULL and IS NOT NULL operators. Standard comparison operators (=, <, >) return NULL when comparing against NULL values, requiring explicit NULL checks in conditional logic.
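This three-valued behavior matters when building queries from Ruby values: a nil must become IS NULL, never = NULL. A minimal sketch of a condition builder (the where_fragment helper is hypothetical, written for illustration, not part of any gem):

```ruby
# Build one WHERE fragment from a column/value pair, emitting IS NULL for nil
# values, since `column = NULL` evaluates to NULL and never matches a row.
def where_fragment(column, value, index)
  if value.nil?
    ["#{column} IS NULL", []]           # no placeholder needed for a NULL check
  else
    ["#{column} = $#{index}", [value]]  # placeholder for a real value
  end
end

fragment, params = where_fragment('middle_name', nil, 1)
# fragment => "middle_name IS NULL", params => []

fragment, params = where_fragment('email', 'a@b.com', 1)
# fragment => "email = $1", params => ["a@b.com"]
```

A builder like this keeps nil handling in one place instead of scattering IS NULL special cases across every query site.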

Constraints enforce data integrity rules at the database level. PRIMARY KEY ensures unique identification for each row and implicitly creates a NOT NULL constraint. FOREIGN KEY maintains referential integrity between related tables. UNIQUE prevents duplicate values in specified columns. CHECK validates data against custom conditions. DEFAULT provides values when none is specified during insertion.

-- Table creation with constraints
CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  user_id INTEGER NOT NULL REFERENCES users(id),
  order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  total_amount NUMERIC(10,2) CHECK (total_amount >= 0),
  order_number VARCHAR(50) UNIQUE
);

Transactions group multiple SQL statements into atomic units that either complete entirely or roll back completely. The ACID properties (Atomicity, Consistency, Isolation, Durability) guarantee reliable transaction processing. BEGIN starts a transaction, COMMIT saves changes permanently, and ROLLBACK undoes all modifications since the transaction began.
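The all-or-nothing behavior can be sketched in plain Ruby: stage changes on a copy of the state and publish them only if the block succeeds. This is an illustration of atomicity only, not how a database engine actually implements transactions:

```ruby
# Atomicity sketch: mutate a working copy; replace the original state only
# if the block finishes without raising, otherwise discard the copy.
def with_transaction(state)
  working = state.dup
  yield working
  state.replace(working)  # "COMMIT": all changes become visible at once
  state
rescue StandardError
  state                   # "ROLLBACK": the working copy is simply discarded
end

accounts = { 'alice' => 100, 'bob' => 50 }

with_transaction(accounts) do |t|
  t['alice'] -= 150
  raise 'insufficient funds' if t['alice'] < 0  # triggers the rollback path
  t['bob'] += 150
end

accounts  # unchanged: the partial debit was never published
```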

Joins combine data from multiple tables based on related columns. INNER JOIN returns only matching rows from both tables. LEFT JOIN includes all rows from the left table plus matching rows from the right table, filling non-matches with NULL. RIGHT JOIN reverses this behavior. FULL OUTER JOIN includes all rows from both tables. CROSS JOIN produces the cartesian product of both tables.
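Join semantics can be illustrated with plain Ruby arrays of hashes. This is an in-memory sketch of what the result sets contain, not how a database executes joins:

```ruby
users  = [{ id: 1, name: 'Ann' }, { id: 2, name: 'Bo' }]
orders = [{ user_id: 1, total: 20 }, { user_id: 1, total: 5 }]

# INNER JOIN: only rows where a user has at least one matching order
inner = users.flat_map do |u|
  orders.select { |o| o[:user_id] == u[:id] }.map { |o| u.merge(o) }
end
# => two rows, both for Ann; Bo has no orders and is excluded

# LEFT JOIN: every user appears; missing order columns become NULL (nil)
left = users.flat_map do |u|
  matches = orders.select { |o| o[:user_id] == u[:id] }
  matches.empty? ? [u.merge(total: nil)] : matches.map { |o| u.merge(o) }
end
# => three rows: two for Ann, one for Bo with total: nil
```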

Ruby Implementation

Ruby connects to SQL databases through several libraries with different abstraction levels. The pg gem provides direct PostgreSQL access with raw SQL execution. The sqlite3 gem interfaces with SQLite databases for development and testing. ActiveRecord abstracts database operations into object-relational mapping with query generation and model management.

Direct database connections using the pg gem require explicit connection management and query execution. This approach offers maximum control over SQL statements and connection handling but requires manual parameter binding and result processing.

require 'pg'

# Establish connection
conn = PG.connect(dbname: 'myapp_production', host: 'localhost', port: 5432)

# Execute query with parameters
result = conn.exec_params(
  'SELECT * FROM users WHERE email = $1 AND active = $2',
  ['user@example.com', true]
)

# Process results
result.each do |row|
  puts "#{row['name']}: #{row['email']}"
end

# Clean up
result.clear
conn.close

Parameter binding prevents SQL injection by separating query structure from user-provided values. The exec_params method uses placeholder variables ($1, $2) that the database driver substitutes safely. Never concatenate user input directly into SQL strings.

ActiveRecord provides a higher-level interface that generates SQL from Ruby methods. Models inherit from ActiveRecord::Base and map to database tables by convention. Query methods return relation objects that compose SQL clauses through method chaining.

class User < ActiveRecord::Base
  has_many :orders
  validates :email, presence: true, uniqueness: true
end

# Relations are lazy; this generates roughly:
# SELECT "users".* FROM "users" WHERE "users"."active" = TRUE
active_users = User.where(active: true)

# Method chaining builds complex queries
User.where(active: true)
    .where('created_at > ?', 30.days.ago)
    .order(created_at: :desc)
    .limit(10)

# Eager loading prevents N+1 queries
users_with_orders = User.includes(:orders).where(active: true)

The Sequel gem offers a middle ground between raw SQL and full ORM functionality. It provides dataset objects for query construction while maintaining closer proximity to SQL semantics than ActiveRecord.

require 'sequel'

DB = Sequel.connect('postgres://localhost/myapp')

# Dataset construction
users = DB[:users]
active_users = users.where(active: true)

# SQL generation with placeholders
users.where(email: 'user@example.com').sql
# => roughly: SELECT * FROM "users" WHERE ("email" = 'user@example.com')

# Raw SQL with parameter binding
DB.fetch('SELECT * FROM users WHERE id = ?', user_id) do |row|
  puts row[:name]
end

Connection pooling manages multiple database connections for concurrent request handling. Ruby web applications typically configure connection pools in database.yml or through explicit pool size parameters. The pool maintains a set of reusable connections, blocking requests when all connections are busy.

# ActiveRecord connection pool configuration
ActiveRecord::Base.establish_connection(
  adapter: 'postgresql',
  database: 'myapp_production',
  pool: 5,
  timeout: 5000
)

# Sequel connection pool
DB = Sequel.connect(
  'postgres://localhost/myapp',
  max_connections: 10,
  pool_timeout: 5
)

Transaction management in Ruby wraps SQL transactions in block syntax. ActiveRecord provides transaction methods on models and the connection class. Exceptions within transaction blocks trigger automatic rollback.

ActiveRecord::Base.transaction do
  user = User.create!(name: 'John Doe', email: 'john@example.com')
  account = Account.create!(user: user, balance: 1000)
  
  # If this raises an exception, both inserts rollback
  Payment.create!(account: account, amount: 1000)
end

# Manual transaction control with the pg gem: the block yields the
# connection, and raising PG::Rollback aborts the transaction silently
conn.transaction do |c|
  c.exec_params('INSERT INTO users (name) VALUES ($1)', ['Alice'])
  c.exec_params('INSERT INTO accounts (user_id) VALUES ($1)', [user_id])
  
  # Explicit rollback
  raise PG::Rollback if some_condition
end

Practical Examples

User authentication queries verify credentials and retrieve account information. The query selects user records matching provided credentials, typically comparing email and hashed password values.

# Authentication with parameterized query
def authenticate_user(email, password)
  result = conn.exec_params(
    'SELECT id, name, email, password_digest FROM users WHERE email = $1 LIMIT 1',
    [email]
  )
  
  return nil if result.ntuples == 0
  
  user_data = result[0]
  if BCrypt::Password.new(user_data['password_digest']) == password
    {
      id: user_data['id'].to_i,
      name: user_data['name'],
      email: user_data['email']
    }
  else
    nil
  end
end

Pagination splits large result sets into manageable pages. The OFFSET clause skips a specified number of rows while LIMIT restricts the result count. Calculating offset values from page numbers enables sequential page navigation.

def paginated_users(page: 1, per_page: 20)
  offset = (page - 1) * per_page
  
  # Get paginated results
  users = conn.exec_params(
    'SELECT id, name, email FROM users ORDER BY created_at DESC LIMIT $1 OFFSET $2',
    [per_page, offset]
  )
  
  # Get total count for pagination metadata
  total = conn.exec('SELECT COUNT(*) FROM users')[0]['count'].to_i
  
  {
    users: users.to_a,
    page: page,
    per_page: per_page,
    total_pages: (total / per_page.to_f).ceil,
    total_count: total
  }
end

Search functionality combines multiple conditions with partial matching. The LIKE operator performs pattern matching with wildcard characters, while ILIKE provides case-insensitive matching in PostgreSQL.

def search_products(query, category: nil, min_price: nil, max_price: nil)
  sql = 'SELECT * FROM products WHERE name ILIKE $1'
  params = ["%#{query}%"]
  param_index = 2
  
  if category
    sql += " AND category = $#{param_index}"
    params << category
    param_index += 1
  end
  
  if min_price
    sql += " AND price >= $#{param_index}"
    params << min_price
    param_index += 1
  end
  
  if max_price
    sql += " AND price <= $#{param_index}"
    params << max_price
  end
  
  sql += ' ORDER BY name'
  
  conn.exec_params(sql, params)
end

Aggregation queries calculate summary statistics across groups of rows. GROUP BY collects rows with matching values while aggregate functions compute totals, averages, counts, and other metrics.

def sales_by_category
  sql = <<-SQL
    SELECT 
      category,
      COUNT(*) as product_count,
      SUM(quantity_sold) as total_quantity,
      AVG(price) as average_price,
      MAX(price) as highest_price
    FROM products
    WHERE active = true
    GROUP BY category
    HAVING SUM(quantity_sold) > 100
    ORDER BY total_quantity DESC
  SQL
  
  conn.exec(sql).map do |row|
    {
      category: row['category'],
      product_count: row['product_count'].to_i,
      total_quantity: row['total_quantity'].to_i,
      average_price: row['average_price'].to_f,
      highest_price: row['highest_price'].to_f
    }
  end
end

Bulk insert operations improve performance when adding multiple rows simultaneously. Single INSERT statements with multiple value sets reduce network roundtrips and transaction overhead compared to individual inserts.

def bulk_create_users(user_data)
  return if user_data.empty?
  
  # Build parameterized query
  values_clauses = []
  params = []
  param_index = 1
  
  user_data.each do |user|
    values_clauses << "($#{param_index}, $#{param_index + 1}, $#{param_index + 2})"
    params += [user[:name], user[:email], user[:created_at]]
    param_index += 3
  end
  
  sql = <<-SQL
    INSERT INTO users (name, email, created_at)
    VALUES #{values_clauses.join(', ')}
    RETURNING id
  SQL
  
  conn.exec_params(sql, params)
end

Security Implications

SQL injection attacks exploit improperly sanitized user input to execute arbitrary SQL commands. Attackers manipulate query structure by injecting SQL syntax through input fields, potentially accessing unauthorized data, modifying records, or executing administrative commands.

# VULNERABLE: String concatenation allows injection
def vulnerable_search(username)
  # If username = "admin' OR '1'='1"
  # Query becomes: SELECT * FROM users WHERE username = 'admin' OR '1'='1'
  query = "SELECT * FROM users WHERE username = '#{username}'"
  conn.exec(query)
end

# SECURE: Parameterized queries separate data from structure
def secure_search(username)
  conn.exec_params('SELECT * FROM users WHERE username = $1', [username])
end

Parameterized queries treat user input as data values rather than executable SQL code. Database drivers handle proper escaping and type conversion, eliminating injection vulnerabilities. All user-provided values must pass through parameter binding regardless of perceived safety.

Dynamic query construction requires careful handling when building SQL strings programmatically. Whitelist valid column names and table names rather than accepting arbitrary input. Validate and sanitize all dynamic components before inclusion in queries.

def search_with_dynamic_column(search_term, sort_column, sort_direction)
  # Whitelist valid columns and directions
  valid_columns = %w[name email created_at]
  valid_directions = %w[ASC DESC]
  
  unless valid_columns.include?(sort_column) && valid_directions.include?(sort_direction.upcase)
    raise ArgumentError, 'Invalid sort parameters'
  end
  
  # Safe to interpolate whitelisted values
  sql = "SELECT * FROM users WHERE name ILIKE $1 ORDER BY #{sort_column} #{sort_direction}"
  conn.exec_params(sql, ["%#{search_term}%"])
end

Least privilege principle restricts database user permissions to minimum required operations. Application database users should not have DROP, TRUNCATE, or administrative privileges. Separate read-only users for reporting queries prevent accidental data modification.

-- Create restricted application user
CREATE USER app_user WITH PASSWORD 'secure_password';
GRANT CONNECT ON DATABASE myapp TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO app_user;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO app_user;

-- Create read-only reporting user
CREATE USER reporting_user WITH PASSWORD 'different_password';
GRANT CONNECT ON DATABASE myapp TO reporting_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO reporting_user;

Sensitive data exposure occurs when queries return unnecessary columns or rows. SELECT * retrieves all table columns including potentially sensitive information. Explicitly specify required columns and implement row-level filtering based on user permissions.

# BAD: Returns password digests and other sensitive fields
def get_users
  conn.exec('SELECT * FROM users')
end

# GOOD: Returns only necessary public information
def get_users
  conn.exec('SELECT id, name, email, created_at FROM users')
end

# GOOD: Filters based on user permissions
def get_accessible_records(current_user_id)
  conn.exec_params(
    'SELECT id, title, content FROM documents WHERE owner_id = $1 OR public = true',
    [current_user_id]
  )
end

Connection string security protects database credentials. Store connection parameters in environment variables or encrypted configuration files rather than hardcoding credentials in source code. Rotate passwords regularly and use separate credentials for each environment.

# BAD: Hardcoded credentials
conn = PG.connect(
  dbname: 'production',
  user: 'admin',
  password: 'password123'
)

# GOOD: Environment variable configuration
conn = PG.connect(
  dbname: ENV['DB_NAME'],
  user: ENV['DB_USER'],
  password: ENV['DB_PASSWORD'],
  host: ENV['DB_HOST']
)

Performance Considerations

Index usage accelerates data retrieval by creating auxiliary data structures that enable fast lookups. Indexes work best on columns used in WHERE clauses, JOIN conditions, and ORDER BY clauses. Database engines automatically use indexes when query patterns match indexed columns.

-- Create index for frequently queried columns
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- Composite index for multiple-column queries
CREATE INDEX idx_products_category_price ON products(category, price);

Query execution plans reveal how the database processes SQL statements. The EXPLAIN command shows the query planner's chosen strategy, including index usage, join methods, and estimated costs. EXPLAIN ANALYZE executes the query and reports actual performance metrics.

def analyze_query_performance(sql, params)
  # Get query plan without execution
  explain = conn.exec_params("EXPLAIN #{sql}", params)
  puts "Query Plan:"
  explain.each { |row| puts row['QUERY PLAN'] }
  
  # Get actual execution metrics
  explain_analyze = conn.exec_params("EXPLAIN ANALYZE #{sql}", params)
  puts "\nExecution Analysis:"
  explain_analyze.each { |row| puts row['QUERY PLAN'] }
end

N+1 query problems occur when code executes one query per record instead of batching operations. Loading associated records in loops generates excessive database roundtrips. Eager loading retrieves related data in a single query through JOIN operations or separate batched queries.

# BAD: N+1 queries - one query per user
users = conn.exec('SELECT * FROM users LIMIT 10')
users.each do |user|
  # Executes 10 additional queries
  orders = conn.exec_params('SELECT * FROM orders WHERE user_id = $1', [user['id']])
  puts "#{user['name']}: #{orders.ntuples} orders"
end

# GOOD: Single JOIN query
sql = <<-SQL
  SELECT 
    users.id,
    users.name,
    COUNT(orders.id) as order_count
  FROM users
  LEFT JOIN orders ON orders.user_id = users.id
  GROUP BY users.id, users.name
  LIMIT 10
SQL

users = conn.exec(sql)
users.each do |user|
  puts "#{user['name']}: #{user['order_count']} orders"
end

Connection pooling reduces the overhead of establishing database connections. Opening new connections requires authentication handshakes and resource allocation. Connection pools maintain reusable connections, immediately providing available connections to waiting requests.
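The mechanism can be sketched with a thread-safe queue: connections are checked out, used, and returned, and a request blocks when every connection is busy. A simplified sketch only, not the ActiveRecord or Sequel implementation; Object.new stands in for a real PG.connect call:

```ruby
# Minimal pool sketch: pre-created "connections" live in a SizedQueue;
# pop blocks when every connection is checked out.
class TinyPool
  def initialize(size, &factory)
    @queue = SizedQueue.new(size)
    size.times { @queue.push(factory.call) }
  end

  # Check a connection out, yield it, and always return it to the pool
  def with_connection
    conn = @queue.pop
    yield conn
  ensure
    @queue.push(conn) if conn
  end
end

pool = TinyPool.new(2) { Object.new }  # stand-in for PG.connect
pool.with_connection { |conn| conn }   # checkout, use, return
```

Because the queue is bounded, at most two "connections" ever exist no matter how many threads compete for them.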

SELECT query optimization targets frequently executed queries and slow-running operations. Reduce returned row counts with appropriate WHERE filters and LIMIT clauses. Minimize column retrieval by specifying needed columns rather than using SELECT *. Consider materialized views for complex aggregations executed repeatedly.

# BAD: Retrieves unnecessary data
def get_recent_orders
  # Returns all columns and processes 10000 rows in application
  all_orders = conn.exec('SELECT * FROM orders ORDER BY created_at DESC')
  all_orders.first(100)
end

# GOOD: Filters and limits at database level
def get_recent_orders
  conn.exec('SELECT id, user_id, total, created_at FROM orders ORDER BY created_at DESC LIMIT 100')
end

Batch operations process multiple records in single database roundtrips. UPDATE and DELETE statements with WHERE clauses affect multiple rows simultaneously. Bulk inserts combine multiple VALUE sets in one INSERT statement.

# Update multiple records in single query
def activate_users(user_ids)
  placeholders = user_ids.each_with_index.map { |_, i| "$#{i + 1}" }.join(',')
  sql = "UPDATE users SET active = true WHERE id IN (#{placeholders})"
  conn.exec_params(sql, user_ids)
end

# Delete with complex conditions
def cleanup_old_records(days_ago)
  cutoff_date = Time.now - (days_ago * 24 * 60 * 60)
  conn.exec_params('DELETE FROM logs WHERE created_at < $1', [cutoff_date])
end

Common Pitfalls

NULL comparison errors occur when using standard equality operators with NULL values. The expression column = NULL always evaluates to NULL rather than true or false. Use IS NULL and IS NOT NULL for NULL checks.

# WRONG: This condition never matches
result = conn.exec_params(
  'SELECT * FROM users WHERE middle_name = $1',
  [nil]
)
# Returns no rows even if middle_name is NULL

# CORRECT: Use IS NULL
result = conn.exec('SELECT * FROM users WHERE middle_name IS NULL')

# Handling NULL in Ruby
def find_users_without_middle_name
  users = conn.exec('SELECT * FROM users WHERE middle_name IS NULL')
  users.map { |row| row['name'] }
end

Character encoding mismatches cause data corruption and comparison failures. Ensure consistent encoding across database, connection, and application. PostgreSQL defaults to UTF-8 but connections may use different encodings without explicit configuration.

# Set connection encoding explicitly
conn = PG.connect(
  dbname: 'myapp',
  client_encoding: 'UTF8'
)

# Verify encoding
encoding = conn.exec('SHOW client_encoding')[0]['client_encoding']
puts "Connection encoding: #{encoding}"

Transaction isolation level misunderstandings lead to unexpected behavior in concurrent operations. READ COMMITTED allows other transactions to modify rows between reads. REPEATABLE READ prevents this but may cause serialization failures. SERIALIZABLE provides strongest isolation but impacts performance.

# Default READ COMMITTED can produce inconsistent results
def transfer_funds(from_account, to_account, amount)
  conn.transaction do
    # Another transaction might modify balances between these queries;
    # locking the row with SELECT ... FOR UPDATE would prevent this
    balance = conn.exec_params(
      'SELECT balance FROM accounts WHERE id = $1',
      [from_account]
    )[0]['balance'].to_f
    
    raise 'Insufficient funds' if balance < amount
    
    conn.exec_params(
      'UPDATE accounts SET balance = balance - $1 WHERE id = $2',
      [amount, from_account]
    )
    conn.exec_params(
      'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
      [amount, to_account]
    )
  end
end

String comparison case sensitivity varies between database systems. PostgreSQL treats standard comparison operators as case-sensitive while MySQL defaults to case-insensitive comparisons. Use LOWER() or UPPER() functions for consistent case-insensitive matching.

# Case-sensitive search (PostgreSQL default)
users = conn.exec_params('SELECT * FROM users WHERE email = $1', ['User@Example.com'])
# May not match 'user@example.com'

# Case-insensitive search
users = conn.exec_params('SELECT * FROM users WHERE LOWER(email) = LOWER($1)', ['User@Example.com'])

# Or use ILIKE for pattern matching (PostgreSQL)
users = conn.exec_params('SELECT * FROM users WHERE email ILIKE $1', ['user@example.com'])

Date and time zone handling requires explicit zone awareness. TIMESTAMP WITHOUT TIME ZONE stores local time without zone information. TIMESTAMP WITH TIME ZONE stores UTC values with automatic conversion. Application code must handle timezone conversions consistently.

# Store as UTC, convert in application
require 'time'

def create_event(name, event_time_string)
  # Parse the user's local time (Time.parse applies the system zone
  # unless the string carries its own offset)
  local_time = Time.parse(event_time_string)
  
  # Convert to UTC for storage
  utc_time = local_time.getutc
  
  conn.exec_params(
    'INSERT INTO events (name, scheduled_at) VALUES ($1, $2)',
    [name, utc_time]
  )
end

def get_user_events(utc_offset)
  events = conn.exec('SELECT id, name, scheduled_at FROM events')
  
  events.map do |event|
    # scheduled_at comes back as text without zone info; parse it as UTC
    utc_time = Time.parse("#{event['scheduled_at']} UTC")
    {
      id: event['id'],
      name: event['name'],
      # Time#getlocal expects a UTC offset such as '+09:00' or seconds
      local_time: utc_time.getlocal(utc_offset)
    }
  end
end

Numeric precision loss happens when using inappropriate data types. FLOAT and REAL types use approximate storage causing rounding errors. Use NUMERIC or DECIMAL for financial calculations requiring exact precision.

# BAD: FLOAT causes precision errors
conn.exec('CREATE TABLE accounts (balance FLOAT)')
conn.exec_params('INSERT INTO accounts VALUES ($1)', [10.15])
result = conn.exec('SELECT balance FROM accounts')[0]['balance'].to_f
# May return 10.149999... instead of 10.15

# GOOD: NUMERIC preserves precision
conn.exec('CREATE TABLE accounts (balance NUMERIC(10,2))')
conn.exec_params('INSERT INTO accounts VALUES ($1)', [10.15])
result = conn.exec('SELECT balance FROM accounts')[0]['balance']
# Returns exactly '10.15'

Reference

Common SQL Commands

Command      | Purpose                    | Example Syntax
SELECT       | Retrieve data from tables  | SELECT columns FROM table WHERE condition
INSERT       | Add new rows to table      | INSERT INTO table (columns) VALUES (values)
UPDATE       | Modify existing rows       | UPDATE table SET column = value WHERE condition
DELETE       | Remove rows from table     | DELETE FROM table WHERE condition
CREATE TABLE | Define new table structure | CREATE TABLE name (column type constraints)
ALTER TABLE  | Modify table structure     | ALTER TABLE name ADD COLUMN column type
DROP TABLE   | Remove table and data      | DROP TABLE name
CREATE INDEX | Add performance index      | CREATE INDEX name ON table (columns)
TRUNCATE     | Remove all rows quickly    | TRUNCATE TABLE name

WHERE Clause Operators

Operator    | Description                         | Example
=           | Equality comparison                 | WHERE age = 25
<> or !=    | Inequality comparison               | WHERE status <> 'deleted'
<           | Less than                           | WHERE price < 100
>           | Greater than                        | WHERE quantity > 0
<=          | Less than or equal                  | WHERE age <= 65
>=          | Greater than or equal               | WHERE score >= 80
BETWEEN     | Range comparison (inclusive)        | WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31'
IN          | List membership                     | WHERE status IN ('active', 'pending')
LIKE        | Pattern matching (case-sensitive)   | WHERE name LIKE 'John%'
ILIKE       | Pattern matching (case-insensitive) | WHERE email ILIKE '%@example.com'
IS NULL     | NULL value check                    | WHERE deleted_at IS NULL
IS NOT NULL | Non-NULL value check                | WHERE email IS NOT NULL
AND         | Logical conjunction                 | WHERE active = true AND verified = true
OR          | Logical disjunction                 | WHERE role = 'admin' OR role = 'moderator'
NOT         | Logical negation                    | WHERE NOT archived = true

Aggregate Functions

Function      | Purpose                             | Returns
COUNT(*)      | Total number of rows                | Integer count including rows with NULLs
COUNT(column) | Non-NULL values in column           | Integer count excluding NULLs
SUM(column)   | Total of numeric column             | Sum of all values
AVG(column)   | Mean of numeric column              | Average value
MIN(column)   | Smallest value                      | Minimum value in column
MAX(column)   | Largest value                       | Maximum value in column
STRING_AGG    | Concatenate strings with separator  | Comma-separated or custom-delimited string

JOIN Types

Join Type       | Description                        | Result Set
INNER JOIN      | Matching rows from both tables     | Only rows with matches in both tables
LEFT JOIN       | All left table rows with matches   | All left rows, NULLs for non-matching right rows
RIGHT JOIN      | All right table rows with matches  | All right rows, NULLs for non-matching left rows
FULL OUTER JOIN | All rows from both tables          | All rows from both tables, NULLs for non-matches
CROSS JOIN      | Cartesian product                  | Every left row paired with every right row

Common Data Types

Type         | Storage  | Range/Format                                | Use Case
INTEGER      | 4 bytes  | -2147483648 to 2147483647                   | Standard integer values
BIGINT       | 8 bytes  | -9223372036854775808 to 9223372036854775807 | Large integer values
NUMERIC(p,s) | Variable | Exact decimal with precision                | Financial calculations
VARCHAR(n)   | Variable | Up to n characters                          | Variable-length text
TEXT         | Variable | Unlimited length                            | Large text content
BOOLEAN      | 1 byte   | true, false, NULL                           | Binary state
DATE         | 4 bytes  | YYYY-MM-DD format                           | Calendar dates
TIMESTAMP    | 8 bytes  | Date and time with microseconds             | Event timestamps
JSON/JSONB   | Variable | JSON formatted data                         | Semi-structured data

Ruby Database Gem Methods

Gem     | Connection Method               | Query Execution
pg      | PG.connect(options)             | conn.exec(sql) or conn.exec_params(sql, params)
sqlite3 | SQLite3::Database.new(filename) | db.execute(sql, params)
mysql2  | Mysql2::Client.new(options)     | client.query(sql)

Transaction Control

Command   | Ruby Implementation         | Purpose
BEGIN     | conn.transaction { block }  | Start transaction
COMMIT    | Automatic at block end      | Save changes permanently
ROLLBACK  | Raise exception in block    | Undo all transaction changes
SAVEPOINT | conn.exec('SAVEPOINT name') | Create rollback point within transaction

Index Types

Index Type | Syntax                                              | Best For
B-tree     | CREATE INDEX name ON table (column)                 | Equality and range queries
Hash       | CREATE INDEX name ON table USING HASH (column)      | Equality comparisons only
GiST       | CREATE INDEX name ON table USING GIST (column)      | Full-text search, geometric data
GIN        | CREATE INDEX name ON table USING GIN (column)       | Array and JSON searches
Partial    | CREATE INDEX name ON table (column) WHERE condition | Subset of rows
Composite  | CREATE INDEX name ON table (col1, col2)             | Multi-column queries