CrackedRuby

SDBM

Overview

SDBM (Substitute DBM) is a file-based key-value store with a Hash-like interface. Ruby shipped the SDBM class in its standard library through Ruby 2.7; since Ruby 3.0 it is distributed as the separate sdbm gem. It offers persistent storage for string keys and string values and creates its files automatically.

The SDBM class exposes much the same Hash-like API as Ruby's DBM and GDBM classes, so code written against one ports easily to the others, although the on-disk formats are not interchangeable. The extension bundles its own SDBM source code, so it needs no external library. SDBM stores data directly in binary files and requires no database server or configuration.

SDBM databases consist of two files: a data file containing the actual key-value pairs and a directory file that maintains hash bucket information for efficient lookups. Ruby creates both files automatically when a database is opened. SDBM performs no file locking, so coordinating access between processes is the application's responsibility.

require 'sdbm'

# Open existing database or create new one
db = SDBM.open('mydata.sdbm')
db['key'] = 'value'
db.close

The database accepts only string keys and values; other objects must be converted explicitly (for example with to_s), otherwise the write raises TypeError. SDBM performs well for small to medium datasets but lacks the transactions, safe concurrent writes, and query features of larger database systems.

require 'sdbm'

db = SDBM.open('config.sdbm')
db['server_host'] = 'localhost'
db['server_port'] = '8080'
db['max_connections'] = '100'

# Retrieve values
host = db['server_host']  # => "localhost"
port = db['server_port']  # => "8080"

db.close
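
Because only strings can be stored, structured values need an encoding step. The sketch below is one possible approach, not part of SDBM: a hypothetical JsonStore wrapper that serializes values with the json library on write and parses them on read. It accepts any object with Hash-style [] and []=, so a plain Hash stands in for the database here.

```ruby
require 'json'

# Hypothetical wrapper (not part of SDBM): encodes values as JSON strings
# on write and decodes them on read. `store` is any object with Hash-style
# [] and []=, such as an SDBM handle.
class JsonStore
  def initialize(store)
    @store = store
  end

  def []=(key, value)
    @store[key.to_s] = JSON.generate(value)
  end

  def [](key)
    raw = @store[key.to_s]
    raw.nil? ? nil : JSON.parse(raw)
  end
end

# A plain Hash stands in for the database so the sketch runs anywhere.
settings = JsonStore.new({})
settings[:max_connections] = 100
settings['hosts'] = ['a.example', 'b.example']

settings['max_connections']  # => 100
settings['hosts']            # => ["a.example", "b.example"]
```

With the sdbm gem installed, the same wrapper should work around a real handle, for example JsonStore.new(SDBM.open('config.sdbm')).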

SDBM works particularly well for configuration storage, caching, session data, and other scenarios requiring simple persistent key-value access without complex relational operations.

Basic Usage

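SDBM.open creates or opens a database. It accepts a filename and an optional mode. The mode is the Unix permission value applied to newly created files (0666 by default), not an access flag; passing nil prevents SDBM.open from creating a missing database. When given a block, SDBM.open yields the database and closes it when the block returns.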

require 'sdbm'

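# Open (creating the files if needed); the default mode is 0666
db = SDBM.open('data.sdbm')
db['key'] = 'value'
db.close

# The mode argument sets the permission bits of newly created files.
# It does not select read-only access.
private_db = SDBM.open('private.sdbm', 0600)
private_db.close

# The block form closes the database automatically
SDBM.open('data.sdbm') do |store|
  puts store['key']
end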

SDBM provides hash-like access methods for storing and retrieving data. The []= and [] operators handle the most common operations, while methods like store and fetch offer additional functionality.

require 'sdbm'

db = SDBM.open('inventory.sdbm')

# Basic storage and retrieval
db['item001'] = 'Widget A'
db['item002'] = 'Widget B'
db['qty001'] = '50'

name = db['item001']        # => "Widget A"
quantity = db['qty001']     # => "50"

# Check key existence
if db.key?('item003')
  puts "Item 003 exists"
else
  puts "Item 003 not found"  # This executes
end

db.close

The store method behaves like []= and returns the stored value. fetch accepts a default value or a block for missing keys; with neither, it raises IndexError when the key is absent. The block's result is returned but not written to the database.

require 'sdbm'

db = SDBM.open('cache.sdbm')

# store returns the stored value
result = db.store('timestamp', Time.now.utc.to_s)
# => e.g. "2025-08-30 14:30:25 UTC"

# Fetch with default value
cached_data = db.fetch('expensive_calculation', 'not_computed')
# => "not_computed"

# Fetch with block for missing keys
cached_result = db.fetch('complex_query') do |key|
  "Computing result for #{key}..."
end
# => "Computing result for complex_query..."

db.close
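
Since fetch returns the block's result without storing it, a cache needs a small write-back step. The helper below is a hypothetical sketch of that pattern (fetch_or_compute is not an SDBM method); it relies only on [] and []=, so a plain Hash stands in for the database.

```ruby
# Hypothetical helper: return the cached string for `key`, or compute it
# with the block, write it back, and return it. `store` is any object
# with Hash-style [] and []=, such as an SDBM handle.
def fetch_or_compute(store, key)
  cached = store[key]
  return cached unless cached.nil?

  store[key] = yield(key).to_s
end

cache = {}   # plain Hash standing in for an SDBM handle
calls = 0
2.times do
  fetch_or_compute(cache, 'complex_query') do |key|
    calls += 1
    "result for #{key}"
  end
end

calls                   # => 1 (the second call is served from the cache)
cache['complex_query']  # => "result for complex_query"
```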

SDBM supports iteration over keys, values, and key-value pairs using standard enumerable methods. The each_key, each_value, and each_pair methods provide specific iteration patterns.

require 'sdbm'

db = SDBM.open('settings.sdbm')
db['debug'] = 'true'
db['log_level'] = 'info'
db['max_memory'] = '1024'

# Iterate over all keys
db.each_key do |key|
  puts "Setting: #{key}"
end

# Iterate over key-value pairs
db.each_pair do |key, value|
  puts "#{key} = #{value}"
end

# Convert to hash for complex operations (pair order is not guaranteed)
settings_hash = db.to_h
# => {"debug"=>"true", "log_level"=>"info", "max_memory"=>"1024"}

db.close
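
SDBM has no range or prefix queries, so selecting a group of related keys means scanning every pair. A minimal sketch of such a scan follows; pairs_with_prefix is a hypothetical helper that works with anything providing each_pair, and a Hash stands in for the database.

```ruby
# Hypothetical helper: collect the pairs whose key starts with `prefix`.
# Works with any object that provides each_pair, such as an SDBM handle.
def pairs_with_prefix(store, prefix)
  matches = {}
  store.each_pair do |key, value|
    matches[key] = value if key.start_with?(prefix)
  end
  matches
end

settings = { 'log_level' => 'info', 'log_file' => 'app.log', 'debug' => 'true' }
log_settings = pairs_with_prefix(settings, 'log_')
log_settings.keys  # => ["log_level", "log_file"]
```

The scan is O(n) in the number of stored pairs, so it suits small databases or infrequent queries.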

Deletion operations use delete for single keys or clear to remove all data. The delete method returns the deleted value or nil for non-existent keys.

require 'sdbm'

db = SDBM.open('temp.sdbm')
db['temp1'] = 'value1'
db['temp2'] = 'value2'
db['temp3'] = 'value3'

# Delete single key
deleted_value = db.delete('temp2')  # => "value2"
missing_value = db.delete('temp9')  # => nil

# Check remaining keys
db.keys  # => ["temp1", "temp3"]

# Clear all data
db.clear
db.keys  # => []

db.close

Performance & Memory

SDBM performance characteristics depend heavily on file system speed, key distribution, and database size. The hash-based storage provides O(1) average-case lookup times, but file I/O operations create performance bottlenecks compared to in-memory storage.

Database file sizes grow based on the number of keys and total data volume. SDBM creates two files: a .dir directory file containing hash bucket information and a .pag data file storing actual key-value pairs. The directory file size relates to the number of unique keys, while the data file grows with total content volume.

require 'sdbm'
require 'benchmark'

# Performance comparison: small vs large datasets
def benchmark_operations(size)
  db = SDBM.open("bench_#{size}.sdbm")
  
  # Write performance
  write_time = Benchmark.realtime do
    size.times { |i| db["key#{i}"] = "value#{i}" * 10 }
  end
  
  # Read performance
  read_time = Benchmark.realtime do
    size.times { |i| db["key#{i}"] }
  end
  
  puts "Size: #{size}, Write: #{write_time.round(4)}s, Read: #{read_time.round(4)}s"
  
  db.close
  File.delete("bench_#{size}.sdbm.dir", "bench_#{size}.sdbm.pag")
end

[1000, 10000, 50000].each { |size| benchmark_operations(size) }

Memory usage stays roughly constant regardless of database size because SDBM reads one page at a time into a fixed-size buffer instead of loading the whole dataset. The operating system may still cache recently used pages of the database files, which speeds up repeated access without growing the Ruby process.

require 'sdbm'

# Stream records from one database into another. Writing into the
# database that is being iterated may skip or repeat entries, so the
# results go to a separate output database.
def process_large_dataset(source, target, report_every = 1000)
  processed = 0

  source.each_pair do |key, value|
    # Process one record at a time; only the current pair is in memory
    target["processed_#{key}"] = expensive_processing(value)

    processed += 1
    puts "Processed #{processed} records" if (processed % report_every).zero?
  end

  processed
end

def expensive_processing(value)
  # Stand-in for computation-heavy processing
  value.reverse.upcase
end

SDBM.open('large_dataset.sdbm') do |source|
  SDBM.open('processed_dataset.sdbm') do |target|
    process_large_dataset(source, target)
  end
end

Key length and value size directly impact performance. Shorter keys provide faster hash computation and reduced file I/O, while large values increase disk usage and transfer times. SDBM handles variable-length data efficiently but benefits from consistent sizing patterns.

require 'sdbm'
require 'benchmark'

def compare_key_strategies
  # Strategy 1: Long descriptive keys
  db1 = SDBM.open('long_keys.sdbm')
  long_key_time = Benchmark.realtime do
    1000.times do |i|
      key = "user_profile_data_for_customer_#{i}_with_metadata"
      db1[key] = "data#{i}"
    end
  end
  
  # Strategy 2: Short hash-based keys
  db2 = SDBM.open('short_keys.sdbm')
  short_key_time = Benchmark.realtime do
    1000.times do |i|
      key = "usr#{i}"
      db2[key] = "data#{i}"
    end
  end
  
  puts "Long keys: #{long_key_time.round(4)}s"
  puts "Short keys: #{short_key_time.round(4)}s"
  
  db1.close
  db2.close
end

# compare_key_strategies
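
Size matters for correctness as well as speed. The sketch below assumes the limit of the reference sdbm implementation, which rejects any pair whose key and value together exceed 1008 bytes (its PAIRMAX constant); treat the number as an assumption and verify it against the version in use. fits_sdbm_pair? is a hypothetical helper for checking a pair before writing it.

```ruby
# Assumption: the reference sdbm implementation rejects any pair whose
# key and value together exceed 1008 bytes (its PAIRMAX constant).
SDBM_PAIR_LIMIT = 1008

# Hypothetical helper: true when key and value together stay within the
# assumed limit. Sizes are measured in bytes, not characters.
def fits_sdbm_pair?(key, value, limit = SDBM_PAIR_LIMIT)
  key.bytesize + value.bytesize <= limit
end

fits_sdbm_pair?('usr1', 'x' * 1004)         # => true  (4 + 1004 = 1008 bytes)
fits_sdbm_pair?('usr1', 'x' * 1005)         # => false (4 + 1005 = 1009 bytes)
fits_sdbm_pair?("caf\u00e9", 'x' * 1004)    # => false (5 bytes of key, though only 4 characters)
```

Values that do not fit can be split across several keys or stored in a file whose path is kept in the database.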

File system performance significantly affects SDBM operations. Solid-state drives provide better random access performance than traditional hard drives, while network file systems introduce additional latency. Database location on fast local storage improves overall application performance.

require 'sdbm'

# Batch writes in memory and flush them to the database in groups
# (reads through @db will not see values that are still buffered)
class OptimizedSDBM
  def initialize(filename)
    @db = SDBM.open(filename)
    @write_buffer = {}
    @buffer_size = 100
  end
  
  # Buffered writes reduce file I/O frequency
  def buffered_store(key, value)
    @write_buffer[key] = value
    
    if @write_buffer.size >= @buffer_size
      flush_buffer
    end
  end
  
  def flush_buffer
    @write_buffer.each { |key, value| @db[key] = value }
    @write_buffer.clear
  end
  
  def close
    flush_buffer
    @db.close
  end
end

# Usage example
optimized_db = OptimizedSDBM.new('optimized.sdbm')
1000.times { |i| optimized_db.buffered_store("key#{i}", "value#{i}") }
optimized_db.close

Error Handling & Debugging

SDBM operations can raise various exceptions related to file system access, permission issues, and database corruption. Understanding these error patterns helps build robust applications that gracefully handle database failures.

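Permission errors occur when the process lacks read or write access to the database files or their directory, and surface as Errno::EACCES. Errno::ENOENT usually means the directory that should contain the files does not exist, since SDBM.open creates missing database files by default. When nil is passed as the mode, SDBM.open creates nothing and returns nil if the database is missing.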

require 'sdbm'

def safe_database_open(filename, mode = nil)
  begin
    db = mode ? SDBM.open(filename, mode) : SDBM.open(filename)
    return db
  rescue Errno::EACCES => e
    puts "Permission denied accessing #{filename}: #{e.message}"
    return nil
  rescue Errno::ENOENT => e
    puts "Database file not found: #{filename}"
    return nil
  rescue StandardError => e
    puts "Unexpected error opening database: #{e.class} - #{e.message}"
    return nil
  end
end

# Usage with error handling
db = safe_database_open('protected.sdbm')
if db
  db['test'] = 'value'
  db.close
else
  puts "Failed to open database, using alternative storage"
end

Database corruption can follow improper shutdowns, file system errors, or uncoordinated concurrent writes. Failed operations raise SDBMError, a StandardError subclass defined by the library, while low-level failures surface as SystemCallError subclasses. A damaged database may also open without error and simply return missing or wrong data.

require 'sdbm'

def recover_corrupted_database(filename)
  backup_filename = "#{filename}.backup"
  
  begin
    # Attempt to open and validate database
    db = SDBM.open(filename)
    
    # Simple validation: try to read all keys
    key_count = db.keys.length
    puts "Database contains #{key_count} keys"
    
    # Create backup of working database
    create_backup(db, backup_filename)
    db.close
    
  rescue SDBMError, SystemCallError => e
    puts "Database corruption detected: #{e.message}"
    
    # Attempt recovery from backup
    if File.exist?("#{backup_filename}.dir") && File.exist?("#{backup_filename}.pag")
      puts "Restoring from backup..."
      restore_from_backup(filename, backup_filename)
    else
      puts "No backup available, manual recovery required"
    end
  end
end

def create_backup(source_db, backup_filename)
  backup_db = SDBM.open(backup_filename)
  source_db.each_pair { |key, value| backup_db[key] = value }
  backup_db.close
end

def restore_from_backup(target_filename, backup_filename)
  # Remove corrupted files
  ["#{target_filename}.dir", "#{target_filename}.pag"].each do |file|
    File.delete(file) if File.exist?(file)
  end
  
  # Copy (rather than move) the backup files so the backup survives
  %w[dir pag].each do |ext|
    IO.copy_stream("#{backup_filename}.#{ext}", "#{target_filename}.#{ext}")
  end
  
  puts "Recovery completed"
end

Memory and disk space exhaustion can cause SDBM operations to fail with system-level exceptions. Monitoring available resources and implementing graceful degradation prevents application crashes.

require 'sdbm'

class MonitoredSDBM
  def initialize(filename, max_size_mb = 100)
    @filename = filename
    @max_size_bytes = max_size_mb * 1024 * 1024
    @db = SDBM.open(filename)
  end
  
  def store_with_monitoring(key, value)
    # Check database size before writing
    current_size = database_size
    estimated_new_size = current_size + key.bytesize + value.bytesize
    
    if estimated_new_size > @max_size_bytes
      raise "Database size limit exceeded: #{estimated_new_size} bytes"
    end
    
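    attempts = 0
    begin
      @db[key] = value
    rescue SDBMError, Errno::ENOSPC => e
      # Failed stores usually surface as SDBMError; clean up and retry once
      attempts += 1
      raise if attempts > 1
      puts "Write failed (#{e.class}): #{e.message}"
      cleanup_old_entries
      retry
    rescue SystemCallError => e
      puts "System error during write: #{e.class} - #{e.message}"
      raise
    end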
  end
  
  def database_size
    dir_size = File.size("#{@filename}.dir") if File.exist?("#{@filename}.dir")
    pag_size = File.size("#{@filename}.pag") if File.exist?("#{@filename}.pag")
    (dir_size || 0) + (pag_size || 0)
  end
  
  def cleanup_old_entries
    # Remove oldest entries (implementation specific to use case)
    keys_to_remove = @db.keys.first(10)  # Remove first 10 keys
    keys_to_remove.each { |key| @db.delete(key) }
    puts "Cleaned up #{keys_to_remove.length} entries"
  end
  
  def close
    @db.close
  end
end

Debugging SDBM issues requires understanding the underlying file structure and access patterns. Common problems include uncoordinated access from several processes or handles, partially written pages after abrupt process termination, and permission mismatches between the .dir and .pag files.

require 'sdbm'

def diagnose_database_issues(filename)
  puts "Diagnosing SDBM database: #{filename}"
  
  # Check file existence and permissions
  dir_file = "#{filename}.dir"
  pag_file = "#{filename}.pag"
  
  [dir_file, pag_file].each do |file|
    if File.exist?(file)
      stat = File.stat(file)
      puts "#{file}: #{stat.size} bytes, mode: #{stat.mode.to_s(8)}"
      puts "  Readable: #{File.readable?(file)}"
      puts "  Writable: #{File.writable?(file)}"
      puts "  Modified: #{stat.mtime}"
    else
      puts "#{file}: Missing"
    end
  end
  
  # Attempt to open and gather statistics
  begin
    db = SDBM.open(filename)
    puts "Database opened successfully"
    puts "Key count: #{db.keys.length}"
    puts "Sample keys: #{db.keys.first(5)}"
    
    # Check for common corruption indicators
    db.each_key do |key|
      begin
        value = db[key]
        if value.nil?
          puts "Warning: Key '#{key}' returns nil value"
        end
      rescue => e
        puts "Error reading key '#{key}': #{e.message}"
      end
    end
    
    db.close
  rescue => e
    puts "Failed to open database: #{e.class} - #{e.message}"
  end
end

Reference

Core Classes and Methods

| Class/Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| SDBM.open(filename, mode = 0666) | filename (String), mode (Integer or nil) | SDBM | Opens the database, creating its files unless mode is nil |
| SDBM#[](key) | key (String) | String or nil | Retrieves the value for key |
| SDBM#[]=(key, value) | key (String), value (String) | String | Stores a key-value pair |
| SDBM#store(key, value) | key (String), value (String) | String | Stores a key-value pair and returns the value |
| SDBM#fetch(key, default = nil) | key (String), default (Object), optional block | String or default | Retrieves the value; raises IndexError if the key is missing and neither a default nor a block is given |
| SDBM#delete(key) | key (String) | String or nil | Removes a key-value pair |
| SDBM#key?(key) | key (String) | Boolean | Tests key existence |
| SDBM#keys | None | Array<String> | Returns all keys |
| SDBM#values | None | Array<String> | Returns all values |
| SDBM#clear | None | SDBM | Removes all data |
| SDBM#close | None | nil | Closes the database files |
| SDBM#closed? | None | Boolean | Tests whether the database is closed |

Iteration Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| SDBM#each_key | Block yielding key | SDBM | Iterates over all keys |
| SDBM#each_value | Block yielding value | SDBM | Iterates over all values |
| SDBM#each_pair | Block yielding key, value | SDBM | Iterates over key-value pairs |
| SDBM#each | Block yielding key, value | SDBM | Alias for each_pair |
| SDBM#select | Block yielding key, value | Array | Returns matching [key, value] pairs |
| SDBM#reject | Block yielding key, value | Hash | Returns non-matching pairs |

Conversion and Utility Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| SDBM#to_h | None | Hash | Converts to a Hash object |
| SDBM#to_a | None | Array | Converts to an array of [key, value] pairs |
| SDBM#length | None | Integer | Returns the number of key-value pairs |
| SDBM#size | None | Integer | Alias for length |
| SDBM#empty? | None | Boolean | Tests whether the database contains no data |

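File Mode Values

The mode argument is a Unix permission value for newly created files, not an access flag.

| Mode | Effect |
| --- | --- |
| 0666 (default) | Creates missing files readable and writable, subject to the process umask |
| 0600 | Creates missing files accessible only to the owning user |
| 0444 | Creates missing files with read-only permission bits; it does not open an existing database read-only |
| nil | Creates nothing; SDBM.open returns nil if the database does not exist |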

Common Exceptions

| Exception | Trigger Conditions | Recovery Strategy |
| --- | --- | --- |
| Errno::EACCES | Insufficient file permissions | Check file ownership and permissions |
| Errno::ENOENT | Path to the database does not exist | Verify the file path or create the directory |
| Errno::ENOSPC | Disk space exhausted | Free disk space or clean up old data |
| SDBMError | Failed store or delete, or use of a closed database | Restore from backup or recreate the database |
| IndexError | fetch on a missing key with no default or block | Supply a default value or a block |
| IOError | General I/O failures | Check file system integrity |
| SystemCallError | Low-level system call failures | Investigate system-level issues |

Configuration Options

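| Option | Type | Default | Description |
| --- | --- | --- | --- |
| Filename | String | Required | Base name for the database files (.dir and .pag are appended) |
| File mode | Integer or nil | 0666 | Unix permissions for newly created files |
| Automatic creation | n/a | Enabled | Missing files are created unless the mode is nil |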

File Structure

| File Extension | Purpose | Content |
| --- | --- | --- |
| .dir | Directory file | Hash bucket index information |
| .pag | Page file | Actual key-value data storage |

Performance Characteristics

| Operation | Time Complexity | Notes |
| --- | --- | --- |
| Key lookup | O(1) average | Hash-based access |
| Key insertion | O(1) average | May trigger file expansion |
| Key deletion | O(1) average | Leaves holes in data file |
| Full iteration | O(n) | Reads all key-value pairs |
| Database open | O(1) | File system dependent |

Thread Safety

SDBM is not thread-safe for concurrent writes. Multiple threads reading through the same database handle generally work, but writes require external synchronization such as a Mutex. Opening a separate handle per thread does not help: SDBM performs no file locking, so independent handles writing to the same files can corrupt the database.
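
One way to provide that external synchronization is to route every access through a single Mutex. The sketch below is an illustration under that assumption; SynchronizedStore and its update method are hypothetical names, and a plain Hash stands in for the shared database handle.

```ruby
# Hypothetical wrapper: serializes every access to a shared store
# through one Mutex. `store` is any object with Hash-style [] and []=.
class SynchronizedStore
  def initialize(store)
    @store = store
    @lock = Mutex.new
  end

  def [](key)
    @lock.synchronize { @store[key] }
  end

  def []=(key, value)
    @lock.synchronize { @store[key] = value }
  end

  # Read-modify-write as one atomic step, e.g. for counters.
  # The block receives the current value (or `default`) and returns the new one.
  def update(key, default)
    @lock.synchronize { @store[key] = yield(@store[key] || default) }
  end
end

counter = SynchronizedStore.new({})   # Hash standing in for an SDBM handle
threads = 4.times.map do
  Thread.new do
    250.times { counter.update('hits', '0') { |v| (v.to_i + 1).to_s } }
  end
end
threads.each(&:join)

counter['hits']  # => "1000"
```

The Mutex only coordinates threads inside one process; separate processes sharing the same files would need their own coordination, such as a lock file.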