Overview
Caching strategies define systematic approaches to storing and retrieving frequently accessed data in temporary storage locations. These strategies determine what data gets cached, where it gets stored, how long it remains valid, and when it gets invalidated or refreshed. The primary objective is reducing latency and computational overhead by avoiding repeated expensive operations such as database queries, API calls, or complex calculations.
Cache strategies operate at multiple levels of the application stack. Client-side caching stores data in browsers or mobile applications. Application-level caching maintains computed results in memory. Database query caching stores result sets. Content Delivery Networks (CDNs) cache static assets geographically close to users. Each level requires different invalidation policies, storage mechanisms, and consistency guarantees.
The choice of caching strategy impacts system performance, consistency, scalability, and complexity. A read-heavy application might benefit from aggressive caching with longer Time-To-Live (TTL) values, while a write-heavy system requires careful cache invalidation to maintain data accuracy. Distributed systems face additional challenges when multiple cache instances need coordination to ensure consistency across nodes.
# Basic cache example
class ProductService
  def find_product(id)
    cache_key = "product:#{id}"
    cached = Cache.read(cache_key)
    return cached if cached

    product = Database.query("SELECT * FROM products WHERE id = ?", id)
    Cache.write(cache_key, product, expires_in: 1.hour)
    product
  end
end
Key Principles
Cache Hit and Miss Ratio measures effectiveness. A cache hit occurs when requested data exists in the cache, returning results without accessing the origin. A cache miss requires fetching from the source and optionally storing in cache. The hit ratio (hits / total requests) indicates caching effectiveness. High ratios (above 80%) suggest well-tuned strategies, while low ratios indicate poor key selection or inappropriate TTL values.
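The ratio above can be computed directly from raw counters; `hit_ratio` below is a hypothetical helper for illustration, not part of any cache library:

```ruby
# Hit ratio = hits / total requests; guard against division by zero
def hit_ratio(hits, misses)
  total = hits + misses
  return 0.0 if total.zero?
  hits.to_f / total
end
```

For example, `hit_ratio(800, 200)` returns `0.8`, right at the 80% rule-of-thumb threshold.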
Temporal and Spatial Locality form the theoretical foundation. Temporal locality assumes recently accessed data will likely be accessed again soon. Spatial locality suggests data near recently accessed items will be accessed next. Effective caching strategies exploit these patterns by keeping recently used data in fast storage and pre-loading related data.
Cache Coherence addresses consistency in distributed environments. When multiple cache instances exist, updates to source data must propagate to all cached copies or invalidate them. Strong coherence guarantees all caches reflect the latest data immediately, while eventual consistency allows temporary staleness for performance. The CAP theorem constrains distributed caches to choose between consistency and availability during network partitions.
Eviction Policies determine what gets removed when cache capacity fills. Least Recently Used (LRU) removes the items accessed longest ago, assuming data untouched for the longest time is least likely to be needed again. Least Frequently Used (LFU) tracks access counts, removing the items accessed least often. First In First Out (FIFO) removes the oldest entries regardless of access patterns. Random Replacement (RR) selects arbitrary items for removal. Each policy suits different access patterns and carries different bookkeeping costs.
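To make the LRU policy concrete, here is a minimal sketch that relies on Ruby hashes preserving insertion order; it is illustrative only, since production stores such as Redis and Memcached implement eviction natively:

```ruby
# Minimal LRU cache sketch (illustrative, not production code)
class LRUCache
  def initialize(capacity)
    @capacity = capacity
    @store = {} # insertion-ordered: the first key is the least recently used
  end

  def read(key)
    return nil unless @store.key?(key)

    # Re-insert the entry so it becomes the most recently used
    value = @store.delete(key)
    @store[key] = value
  end

  def write(key, value)
    @store.delete(key) # refresh position if the key already exists
    @store[key] = value
    # Evict the least recently used entry when over capacity
    @store.delete(@store.keys.first) if @store.size > @capacity
    value
  end
end
```

Reading a key promotes it, so a later write that overflows capacity evicts the entry that has gone untouched longest rather than the oldest one written.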
Time-To-Live (TTL) Management controls cache validity duration. Fixed TTL assigns identical expiration times to all entries. Sliding TTL extends expiration on each access, keeping active items cached longer. Adaptive TTL adjusts based on access patterns or update frequency. Infinite TTL requires explicit invalidation, suitable for immutable data.
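Sliding TTL can be sketched in plain Ruby; a Hash stands in for the store, and the `now:` parameter is injected only so expiry is testable without real waiting:

```ruby
# Sliding-TTL sketch: each read pushes the expiry forward
class SlidingTTLCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(ttl_seconds)
    @ttl = ttl_seconds
    @store = {}
  end

  def write(key, value, now: Time.now)
    @store[key] = Entry.new(value, now + @ttl)
  end

  def read(key, now: Time.now)
    entry = @store[key]
    return nil if entry.nil? || entry.expires_at <= now

    entry.expires_at = now + @ttl # extend expiry on access
    entry.value
  end
end
```

Active keys stay cached indefinitely as long as reads keep arriving within the window; idle keys expire after one full TTL of silence.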
Cache Aside and Write Strategies define data flow patterns. Cache-aside (lazy loading) populates cache on read misses, giving applications control over what gets cached. Write-through updates cache and database simultaneously, ensuring consistency but increasing write latency. Write-behind (write-back) updates cache immediately and asynchronously persists to database, improving write performance but risking data loss. Write-around bypasses cache on writes, preventing cache pollution from write-once data.
# Cache-aside pattern
def get_user(user_id)
  cache_key = "user:#{user_id}"
  user = Rails.cache.read(cache_key)
  return user if user

  user = User.find(user_id)
  Rails.cache.write(cache_key, user, expires_in: 30.minutes)
  user
end

# Write-through pattern
def update_user(user_id, attributes)
  user = User.find(user_id)
  user.update!(attributes)
  cache_key = "user:#{user_id}"
  Rails.cache.write(cache_key, user, expires_in: 30.minutes)
  user
end
Cache Warming and Priming prepopulate caches before requests arrive. Cold starts, where empty caches cause slow initial responses, can be avoided by loading frequently accessed data during application startup or at scheduled intervals. This technique particularly benefits predictable access patterns such as homepage data or popular products.
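A minimal warming sketch, assuming a Hash-like store and a hypothetical loader callable; in a Rails application this would typically run from an initializer or a scheduled background job:

```ruby
# Cache-warming sketch: preload a known-hot set of keys before traffic arrives.
# The store and loader are hypothetical stand-ins.
class CacheWarmer
  def initialize(cache, loader)
    @cache = cache   # Hash-like store responding to []=
    @loader = loader # callable: id -> value
  end

  def warm(ids)
    ids.each { |id| @cache["product:#{id}"] = @loader.call(id) }
    ids.size
  end
end
```

Warming effectiveness hinges on the id list actually matching what users request first, which is why it pairs well with access-log analysis.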
Design Considerations
Consistency Requirements dictate strategy selection. Strong consistency demands synchronous cache invalidation across all instances, preventing stale reads but increasing latency and complexity. Eventual consistency tolerates temporary staleness in exchange for availability and performance. Financial transactions typically require strong consistency, while product catalogs may accept eventual consistency with periodic refreshes.
Data Volatility influences TTL configuration and invalidation approaches. Highly volatile data changes frequently, requiring shorter TTL values or event-driven invalidation. Static data like historical records can use infinite TTL with explicit invalidation only when corrections occur. Semi-static data like configuration settings benefits from longer TTL with active invalidation on updates.
Access Patterns determine which data deserves caching. Zipfian distributions where small subsets account for most requests benefit significantly from caching hot items. Uniform access patterns provide less benefit since no particular items dominate traffic. Analyzing request logs reveals which entities, endpoints, or queries consume most resources, guiding cache key selection.
Cache Size and Memory Constraints limit what can be stored. In-memory caches face RAM limitations, requiring careful selection of high-value items and aggressive eviction policies. Disk-based caches provide larger capacity with higher latency. Multi-tier architectures place hot data in memory and warm data on disk, balancing speed and capacity.
Invalidation Complexity affects maintainability and correctness. Time-based expiration is simple but may serve stale data or invalidate prematurely. Event-based invalidation requires careful tracking of dependencies and update propagation. Tag-based invalidation groups related entries for bulk invalidation but adds metadata overhead. The maxim that cache invalidation is one of the two hard problems in computer science (the other being naming things) reflects the inherent difficulty.
Thundering Herd Problem occurs when many requests simultaneously encounter a cache miss for the same key, causing multiple identical expensive operations. Strategies include request coalescing where only the first miss triggers computation while others wait, probabilistic early expiration where TTL varies slightly to distribute recomputation, and cache warming to prevent popular keys from expiring during high traffic.
# Request coalescing to prevent thundering herd
class CacheWithLock
  def fetch(key, expires_in:, &block)
    cached = Rails.cache.read(key)
    return cached if cached

    lock_key = "lock:#{key}"
    acquired = Rails.cache.write(lock_key, true, expires_in: 10.seconds, unless_exist: true)
    if acquired
      begin
        result = block.call
        Rails.cache.write(key, result, expires_in: expires_in)
        result
      ensure
        Rails.cache.delete(lock_key)
      end
    else
      # Another process is computing; wait briefly and retry
      sleep 0.1
      fetch(key, expires_in: expires_in, &block)
    end
  end
end
Negative Caching stores the absence of data to prevent repeated queries for non-existent entities. Database lookups for invalid IDs or deleted records should cache the negative result with a shorter TTL. This also prevents attackers from overloading systems by repeatedly requesting non-existent data.
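A sketch of the idea using a sentinel value to distinguish "cached absence" from "not cached at all"; the Hash store and the block-based finder are stand-ins, and the shorter TTL the paragraph recommends is not modeled here:

```ruby
NOT_FOUND = :__not_found__ # sentinel marking a cached miss

# Fetch through the cache, recording absence so repeated lookups for
# missing entities never hit the source twice
def fetch_or_cache_absence(cache, key)
  cached = cache[key]
  return nil if cached == NOT_FOUND # known-missing: skip the lookup
  return cached unless cached.nil?

  value = yield # expensive lookup, may legitimately return nil
  cache[key] = value.nil? ? NOT_FOUND : value
  value
end
```

Without the sentinel, a `nil` result is indistinguishable from a cache miss, so every request for a deleted record would fall through to the database.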
Cache Hierarchy Trade-offs balance latency, capacity, and consistency. L1 caches (application memory) provide microsecond access but limited size and no sharing. L2 caches (Redis, Memcached) offer millisecond access with gigabytes of capacity and multi-instance sharing. L3 caches (CDN edge locations) provide geographic distribution with second-scale propagation delays. Choosing the appropriate level depends on access frequency, data size, and consistency needs.
Implementation Approaches
Client-Side Caching stores data in browsers or mobile applications. HTTP cache headers (Cache-Control, ETag, Last-Modified) instruct clients how long to cache responses. Service Workers enable sophisticated caching strategies for Progressive Web Apps. localStorage and IndexedDB provide persistent storage across sessions. This approach minimizes server load and network traffic but requires careful invalidation strategies since clients control cache lifetime.
Application-Level Caching maintains computed results in server memory. Full-page caching stores entire rendered HTML responses, maximizing speed for static content. Fragment caching stores partial page sections, reusing components across different pages. Object caching stores deserialized database records or API responses. Query result caching stores database result sets. Ruby applications commonly use in-process caching for single-server deployments and external stores for distributed systems.
Database Query Caching stores result sets from frequent queries. The database engine maintains an internal cache of recently executed queries and their results. When identical queries execute, the engine returns cached results without re-executing. This transparent caching requires no application changes but has limited effectiveness for parameterized queries since each unique parameter combination creates a new cache entry.
Content Delivery Network (CDN) Caching distributes static assets to edge locations near users. HTML, CSS, JavaScript, images, and videos get cached at geographic points of presence. Origin servers set cache headers controlling edge behavior. Cache purging APIs allow explicit invalidation when content updates. CDNs drastically reduce latency for global users and protect origin servers from traffic spikes.
Distributed Cache Patterns coordinate multiple cache instances. Cache-aside with eventual consistency allows each instance to independently cache data, tolerating temporary inconsistencies. Centralized invalidation uses pub/sub messaging to propagate invalidation events to all instances. Distributed locks coordinate access to prevent inconsistencies during updates. Consistent hashing distributes keys across nodes, maintaining stability when nodes join or leave.
# Distributed cache with Redis pub/sub invalidation
class DistributedCache
  def initialize
    @local_cache = ActiveSupport::Cache::MemoryStore.new
    @redis = Redis.new
    # SUBSCRIBE blocks its connection, so the listener needs a dedicated one
    Thread.new { subscribe_to_invalidations(Redis.new) }
  end

  def read(key)
    local = @local_cache.read(key)
    return local unless local.nil?

    raw = @redis.get(key)
    raw && Marshal.load(raw)
  end

  def write(key, value, expires_in:)
    @local_cache.write(key, value, expires_in: expires_in)
    # Redis stores strings, so serialize; setex expects integer seconds
    @redis.setex(key, expires_in.to_i, Marshal.dump(value))
  end

  def invalidate(key)
    @local_cache.delete(key)
    @redis.del(key)
    @redis.publish('cache:invalidate', key)
  end

  private

  def subscribe_to_invalidations(subscriber)
    subscriber.subscribe('cache:invalidate') do |on|
      on.message do |_channel, key|
        @local_cache.delete(key)
      end
    end
  end
end
Materialized View Pattern precomputes and stores expensive query results. Database materialized views store aggregated or joined data, refreshing periodically or on demand. Application-level materialized views compute complex business logic results during low-traffic periods. This approach trades storage space and update complexity for query performance.
Cache Sharding distributes cache across multiple instances based on key ranges or hash values. Consistent hashing assigns keys to nodes while minimizing remapping when nodes change. Range-based sharding assigns contiguous key ranges to nodes, enabling range queries but risking uneven distribution. Hash-based sharding distributes keys evenly but complicates range operations. Sharding scales cache capacity and throughput beyond single-instance limits.
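A stripped-down consistent-hash ring sketch; real implementations add virtual nodes (many ring positions per server) for smoother key distribution, which is omitted here:

```ruby
require 'digest'

# Minimal consistent-hash ring: each node owns the arc of hash space
# between its predecessor and itself
class HashRing
  def initialize(nodes)
    @ring = nodes.map { |n| [hash_of(n), n] }.sort_by(&:first)
  end

  def node_for(key)
    h = hash_of(key)
    # First node clockwise from the key's position, wrapping around
    entry = @ring.find { |pos, _| pos >= h } || @ring.first
    entry.last
  end

  private

  def hash_of(value)
    # 32-bit position derived from MD5 (choice of hash is arbitrary here)
    Digest::MD5.hexdigest(value.to_s)[0, 8].to_i(16)
  end
end
```

Because each key's position is fixed, adding or removing a node remaps only the keys on that node's arc, instead of reshuffling everything as naive modulo hashing does.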
Ruby Implementation
Rails Cache API provides a unified interface across multiple cache stores. The framework includes built-in support for memory stores, file stores, Redis, and Memcached. Configuration in config/environments/*.rb selects the appropriate store for each environment. The API offers read, write, fetch, delete, and increment operations with consistent behavior across backends.
# Configure cache store in Rails
# config/environments/production.rb
config.cache_store = :redis_cache_store, {
  url: ENV['REDIS_URL'],
  expires_in: 1.hour,
  namespace: 'myapp',
  pool_size: 5,
  pool_timeout: 5
}

# Using Rails cache
class ArticlesController < ApplicationController
  def show
    @article = Rails.cache.fetch("article:#{params[:id]}", expires_in: 12.hours) do
      Article.find(params[:id])
    end
  end
end
Memory Store keeps cache in application memory, suitable for single-process applications or development. Data persists only while the process runs, and each process maintains separate caches in multi-process deployments. Configuration options include size limits and cleanup intervals.
# Memory store configuration
config.cache_store = :memory_store, {
  size: 64.megabytes,
  expires_in: 30.minutes,
  compress: true
}

# Direct usage
cache = ActiveSupport::Cache::MemoryStore.new(size: 32.megabytes)
cache.write('key', 'value', expires_in: 5.minutes)
cache.read('key') # => 'value'
Redis Integration, via the built-in :redis_cache_store (Rails 5.2+) or the older redis-rails and redis-store gems, provides distributed caching. Redis offers persistence options, atomic operations, and data structure support beyond simple key-value storage. Connection pooling prevents exhaustion under high concurrency. Namespace support isolates multiple applications sharing the same Redis instance.
# Redis with Sidekiq integration
class ExpensiveJob
  include Sidekiq::Worker

  def perform(user_id)
    cache_key = "expensive_calculation:#{user_id}"
    result = $redis.get(cache_key)
    return JSON.parse(result) if result

    result = perform_expensive_calculation(user_id)
    $redis.setex(cache_key, 1.hour.to_i, result.to_json)
    result
  end
end
Fragment Caching caches portions of rendered views in Rails templates. Cache keys include view dependencies to automatically invalidate when underlying data changes. Russian doll caching nests cache fragments, invalidating outer caches when inner data changes through touch cascades.
<!-- app/views/products/show.html.erb -->
<% cache @product do %>
  <h1><%= @product.name %></h1>
  <p><%= @product.description %></p>

  <% cache @product.reviews do %>
    <h2>Reviews</h2>
    <% @product.reviews.each do |review| %>
      <% cache review do %>
        <%= render review %>
      <% end %>
    <% end %>
  <% end %>
<% end %>
Low-Level Caching provides manual control over cache operations. The Rails.cache.fetch method combines read and write operations, executing a block only on cache miss. This pattern reduces boilerplate while maintaining flexibility.
class UserService
  def user_statistics(user_id)
    Rails.cache.fetch("user_stats:#{user_id}", expires_in: 1.hour) do
      orders = Order.where(user_id: user_id)
      {
        total_orders: orders.count,
        total_spent: orders.sum(:total),
        average_order: orders.average(:total),
        last_order: orders.order(created_at: :desc).first
      }
    end
  end
end
HTTP Caching uses Rack middleware and controller methods to set cache headers. fresh_when and stale? methods compare request ETags or modification times with resource state, returning 304 Not Modified responses when appropriate. expires_in sets Cache-Control headers for client-side caching.
class ArticlesController < ApplicationController
  def show
    @article = Article.find(params[:id])
    # ETags for conditional GET
    fresh_when(@article, public: true)
    # Or explicit cache control
    expires_in 1.hour, public: true
  end

  def index
    @articles = Article.published
    # Use last modified timestamp
    fresh_when last_modified: @articles.maximum(:updated_at)
  end
end
Dalli Gem provides high-performance Memcached client for Ruby applications. It supports connection pooling, compression, and failover across multiple Memcached servers. Configuration includes server addresses, namespace, and timeout values.
# Gemfile
gem 'dalli'

# config/environments/production.rb
config.cache_store = :mem_cache_store, 'cache1.example.com', 'cache2.example.com', {
  namespace: 'myapp',
  compress: true,
  pool_size: 5,
  expires_in: 1.day
}

# Direct usage for advanced operations
dalli = Dalli::Client.new(['localhost:11211'])
dalli.set('key', 'value', 3600)          # TTL in seconds
dalli.get_multi('key1', 'key2', 'key3')  # Batch fetch
dalli.cas('key') do |value|              # Compare-and-swap
  value.to_i + 1
end
Custom Cache Stores implement the ActiveSupport::Cache::Store interface for specialized backends. The interface requires read_entry, write_entry, delete_entry, and clear methods. This abstraction allows seamless switching between different cache backends.
class CustomCacheStore < ActiveSupport::Cache::Store
  def initialize(options = nil)
    super(options)
    @storage = SomeCustomBackend.new # hypothetical backend client
  end

  # clear is part of the public Store interface
  def clear(options = nil)
    @storage.flush
  end

  private

  def read_entry(key, **options)
    deserialize(@storage.get(key))
  end

  def write_entry(key, entry, **options)
    @storage.set(key, serialize(entry), options[:expires_in])
  end

  def delete_entry(key, **options)
    @storage.delete(key)
  end
end
Performance Considerations
Cache Effectiveness Metrics quantify performance improvements. Hit rate (successful reads / total reads) indicates how often cache contains requested data. Miss rate (cache misses / total reads) shows unsuccessful lookups. Eviction rate (items evicted / items written) reveals capacity issues. Latency distribution comparing cached versus uncached requests demonstrates actual performance benefits. Monitoring these metrics identifies configuration problems and optimization opportunities.
Memory Overhead impacts cache capacity and performance. Serialization adds storage overhead beyond raw data size. Ruby object marshaling includes class metadata and structure information. JSON serialization creates string representations with syntax characters. MessagePack provides compact binary encoding with faster serialization. Compression reduces size but increases CPU usage. Balancing compression, serialization format, and memory usage depends on data characteristics and access patterns.
# Comparing serialization formats
require 'benchmark'
require 'json'
require 'msgpack'

data = { users: Array.new(1000) { |i| { id: i, name: "User #{i}", email: "user#{i}@example.com" } } }

Benchmark.bm do |x|
  x.report("Marshal:")     { 1000.times { Marshal.dump(data) } }
  x.report("JSON:")        { 1000.times { JSON.generate(data) } }
  x.report("MessagePack:") { 1000.times { MessagePack.pack(data) } }
end

# Approximate serialized sizes for this payload:
# Marshal:     ~250KB, fastest Ruby-native serialization
# JSON:        ~180KB, cross-language compatibility
# MessagePack: ~140KB, compact binary format
Network Latency dominates distributed cache performance. Redis typically adds 1-2ms per operation over localhost. Remote Redis adds round-trip network latency. Connection pooling amortizes connection overhead across requests. Pipelining batches multiple operations into single network round-trips. Multi-get operations fetch multiple keys simultaneously, reducing latency compared to sequential reads.
# Inefficient: N+1 cache reads
user_ids.each do |id|
  user = Rails.cache.read("user:#{id}")
  process(user)
end

# Efficient: batch read with read_multi
cache_keys = user_ids.map { |id| "user:#{id}" }
cached_users = Rails.cache.read_multi(*cache_keys)
user_ids.each do |id|
  user = cached_users["user:#{id}"]
  process(user) if user
end
TTL Configuration balances freshness and efficiency. Shorter TTL reduces stale data risk but increases cache misses and database load. Longer TTL maximizes hit rates but prolongs staleness. Analyzing update frequency and read patterns guides appropriate TTL selection. Hot data benefits from longer TTL since high read volumes justify occasional staleness. Cold data can use shorter TTL since low access frequency limits miss impact.
Compression Trade-offs reduce memory usage at CPU cost. Rails cache stores support automatic compression for entries exceeding configurable thresholds. Compression ratios vary by data type: structured JSON compresses well, binary data poorly. CPU overhead matters more for high-throughput applications. Network transfer benefits more in remote cache scenarios where bandwidth costs exceed compression CPU time.
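A hedged configuration sketch of the threshold-based compression described above, using the compress and compress_threshold options listed in the Reference tables; the 4 KB threshold is an arbitrary example, not a recommendation:

```ruby
# Compress only entries larger than 4 KB; smaller values are stored raw,
# avoiding CPU overhead where compression gains little
config.cache_store = :redis_cache_store, {
  url: ENV['REDIS_URL'],
  compress: true,
  compress_threshold: 4.kilobytes
}
```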
Cache Warming Strategies eliminate cold start penalties. Background jobs preload frequently accessed keys during deployment or scheduled intervals. Application initialization triggers loading critical data before accepting requests. Request-triggered warming populates cache during first access with locked computation preventing duplicate work. Warm-up effectiveness depends on accurately predicting which data users will request.
Monitoring and Instrumentation tracks cache performance in production. Rails ActiveSupport::Notifications instruments cache operations with events capturing operation type, key, hit/miss status, and duration. Custom instrumentation can aggregate metrics by key pattern, controller action, or business operation. Real-time dashboards visualize hit rates, latency percentiles, and capacity utilization, enabling proactive optimization.
# Subscribe to cache instrumentation events
ActiveSupport::Notifications.subscribe('cache_read.active_support') do |name, start, finish, id, payload|
  duration = (finish - start) * 1000
  hit = payload[:hit] ? 'hit' : 'miss'
  Metrics.measure('cache.read.duration', duration, tags: { hit: hit })
  Metrics.increment('cache.read.count', tags: { hit: hit })
end

# Custom instrumentation wrapper
class InstrumentedCache
  def fetch(key, **options, &block)
    start = Time.now
    result = Rails.cache.fetch(key, **options, &block)
    duration = Time.now - start
    Metrics.measure('business.cache.duration', duration, tags: { key_pattern: extract_pattern(key) })
    result
  end
end
Common Pitfalls
Cache Stampede occurs when a popular key expires and concurrent requests simultaneously trigger recomputation. Hundreds of processes may execute the same expensive query simultaneously. Solutions include probabilistic early expiration where some requests refresh slightly before expiration, distributed locking allowing only one process to compute while others wait, or pre-emptive refresh with background jobs updating cache before expiration.
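Probabilistic early expiration can be as simple as jittering each TTL so cached copies of a hot key do not all expire at the same instant; a minimal sketch:

```ruby
# Add random jitter (default +/-10%) to a base TTL so recomputation of
# hot keys spreads out instead of stampeding at one moment
def jittered_ttl(base_seconds, jitter_fraction: 0.1)
  jitter = base_seconds * jitter_fraction
  base_seconds + rand(-jitter..jitter)
end
```

For a one-hour base TTL this yields expirations spread across roughly a twelve-minute window, which is usually enough to decorrelate concurrent refreshes.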
Stale Data Serving happens when cached data outlives source updates. Time-based expiration cannot guarantee freshness since updates occur asynchronously to TTL countdown. Event-based invalidation introduces complexity and potential race conditions. Versioned cache keys incorporating update timestamps or version numbers ensure new data supersedes old entries. Checking source modification times before serving cached responses catches staleness at read time.
# Versioned cache keys prevent stale data
# (assumes an integer cache_version column on articles)
class Article < ApplicationRecord
  after_save :bump_cache_version

  def versioned_cache_key
    "article:#{id}:v#{cache_version}"
  end

  private

  def bump_cache_version
    # increment! skips callbacks in modern Rails, so this
    # does not re-trigger after_save
    increment!(:cache_version)
  end
end

# Old versions simply stop being requested and age out of the cache
Rails.cache.fetch(@article.versioned_cache_key) do
  render_article(@article)
end
Cascading Failures amplify cache unavailability impact. Applications crashing when cache becomes unavailable create complete service outages. Circuit breaker patterns detect cache failures and fail gracefully to source systems. Timeouts prevent indefinite blocking on unresponsive cache services. Fallback logic bypasses cache during degraded performance, accepting slower response times over complete failure.
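A graceful-degradation sketch: cache errors fall through to the source instead of failing the request. A full circuit breaker would additionally count failures and skip the cache entirely during a cooldown period:

```ruby
# Serve from cache when possible; on any cache error, fall back to the
# source computation rather than propagating the failure
def fetch_with_fallback(cache, key)
  begin
    cached = cache.read(key)
    return cached if cached
  rescue StandardError
    # Cache unavailable: accept the latency hit and serve from source
  end
  yield
end
```

Pairing this with short timeouts on the cache client keeps the fallback path fast when the cache service is down rather than merely slow.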
Over-Caching wastes memory on low-value data. Caching everything indiscriminately fills memory with rarely-accessed entries, reducing space for hot data. Analyzing access patterns identifies high-frequency operations worth caching. Data accessed infrequently or computed cheaply gains little from caching overhead. Large objects with infrequent access consume disproportionate cache space relative to benefit.
Cache Pollution occurs when one-time reads evict frequently accessed data. Bulk operations processing entire datasets can flush hot data from cache. Database migrations scanning all records populate cache with cold data. Separating cache namespaces isolates transient operations from normal traffic. Using different cache instances or TTL policies for batch operations preserves interactive response caching.
Inconsistent Invalidation creates data corruption in distributed systems. Write-through caching updates cache and database but failures can leave inconsistent state. Write-behind caching may lose uncommitted cache entries on crashes. Distributed transactions coordinate cache and database updates atomically but impose significant performance penalties. Eventual consistency accepts temporary inconsistencies for availability and performance.
# Handling inconsistent invalidation with retry logic
class ConsistentCacheWriter
  MAX_RETRIES = 3

  def write_through(key, value)
    retries = 0
    begin
      ActiveRecord::Base.transaction do
        database_write(value)
        begin
          Rails.cache.write(key, value)
        rescue => cache_error
          # Log but don't roll back the transaction for a cache failure
          Rails.logger.error("Cache write failed: #{cache_error}")
          # Schedule background invalidation so stale data gets cleared
          InvalidateCacheJob.perform_later(key)
        end
      end
    rescue => db_error
      retries += 1
      raise if retries >= MAX_RETRIES
      sleep(0.1 * retries)
      retry
    end
  end
end
Memory Leaks from unbounded cache growth crash applications. Missing TTL configuration allows infinite accumulation. Unique cache keys generated per request (including timestamps or random values) prevent cache hits while consuming memory. Monitoring cache size and entry counts detects unbounded growth. Configuring maximum cache sizes with appropriate eviction policies prevents exhaustion.
Serialization Failures break cache functionality. Caching objects containing unclosed file handles or database connections fails during deserialization. Caching objects with circular references may exceed serialization limits. Complex objects with many associations serialize inefficiently. Caching simplified data structures or pre-rendered output avoids serialization complexity. Version mismatches between serialized format and application code cause deserialization errors after deployments.
Ineffective Key Design reduces hit rates and complicates invalidation. Over-specific keys including irrelevant parameters prevent cache reuse across similar requests. Under-specific keys cause inappropriate sharing between distinct requests. Keys lacking structure complicate pattern-based invalidation. Hierarchical key patterns (namespace:type:id:version) enable efficient invalidation and monitoring. Including relevant parameters while excluding irrelevant details maximizes reuse.
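A small, hypothetical key-builder enforcing the hierarchical layout described above; keeping the segment order fixed makes prefix-based invalidation (for example with delete_matched) predictable:

```ruby
# Hypothetical helper producing namespace:type:id:version keys
def cache_key_for(namespace, type, id, version: 1)
  [namespace, type, id, "v#{version}"].join(':')
end
```

For example, `cache_key_for("myapp", "user", 123)` yields `"myapp:user:123:v1"`, so all user entries share the `myapp:user:` prefix.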
Reference
Cache Store Comparison
| Store Type | Persistence | Distribution | Latency | Capacity | Use Case |
|---|---|---|---|---|---|
| Memory | None | Single process | Microseconds | MB-GB | Development, single server |
| Redis | Optional | Multi-instance | 1-2ms | GB-TB | Production, distributed systems |
| Memcached | None | Multi-instance | 1-2ms | GB-TB | High-throughput caching |
| File | Disk | Single server | 10-100ms | TB+ | Large objects, shared hosting |
| CDN | Edge locations | Global | 10-100ms | PB+ | Static assets, global distribution |
Eviction Policies
| Policy | Description | Best For | Worst For |
|---|---|---|---|
| LRU | Remove least recently used | Temporal locality patterns | Scan operations |
| LFU | Remove least frequently used | Hotspot patterns | Changing access patterns |
| FIFO | Remove oldest entries | Simple implementation | Any access pattern |
| Random | Remove random entries | Uniform access | Predictable patterns |
| TTL-based | Remove expired entries | Time-sensitive data | Long-lived data |
Cache Patterns
| Pattern | Write Behavior | Read Behavior | Consistency | Complexity |
|---|---|---|---|---|
| Cache-Aside | Application writes to DB, invalidates cache | Application reads cache, fetches on miss | Eventual | Low |
| Write-Through | Application writes to cache and DB synchronously | Application reads from cache | Strong | Medium |
| Write-Behind | Application writes to cache, async DB update | Application reads from cache | Eventual | High |
| Write-Around | Application writes to DB only | Application reads cache, fetches on miss | Eventual | Low |
| Read-Through | Application writes to DB directly | Cache loads from DB on miss | Configurable | Medium |
Rails Cache Configuration
| Option | Values | Purpose | Default |
|---|---|---|---|
| expires_in | Duration | Time until expiration | Never |
| namespace | String | Key prefix for isolation | nil |
| compress | Boolean | Enable compression | false |
| compress_threshold | Bytes | Minimum size to compress | 1024 |
| race_condition_ttl | Duration | Extended TTL during refresh | 0 |
| pool_size | Integer | Connection pool size | 5 |
| pool_timeout | Seconds | Pool acquisition timeout | 5 |
Cache Operation Methods
| Method | Purpose | Returns | Notes |
|---|---|---|---|
| read | Retrieve value | Value or nil | Returns nil on miss |
| write | Store value | Boolean | Overwrites existing |
| fetch | Read or compute | Value | Executes block on miss |
| delete | Remove entry | Boolean | Returns true if existed |
| exist? | Check presence | Boolean | Checks without reading |
| increment | Atomic add | Integer | Requires numeric value |
| decrement | Atomic subtract | Integer | Requires numeric value |
| read_multi | Batch read | Hash | More efficient than multiple reads |
| write_multi | Batch write | Boolean | Atomic multi-key write |
| delete_matched | Pattern delete | Integer | Returns count deleted |
HTTP Cache Headers
| Header | Purpose | Example Values |
|---|---|---|
| Cache-Control | Caching directives | public, private, no-cache, max-age=3600 |
| ETag | Response version identifier | "686897696a7c876b7e" |
| Last-Modified | Resource modification time | Mon, 01 Jan 2024 00:00:00 GMT |
| Expires | Absolute expiration time | Mon, 01 Jan 2024 12:00:00 GMT |
| Vary | Header-based variations | Accept-Encoding, User-Agent |
Common Cache Key Patterns
| Pattern | Example | Use Case |
|---|---|---|
| Entity | user:123 | Single database record |
| Collection | users:page:1 | Paginated lists |
| Computed | user:123:stats | Expensive calculations |
| Composite | product:456:reviews:recent | Nested relationships |
| Versioned | article:789:v12 | Version-aware caching |
| User-specific | cart:user:123 | Per-user data |
| Query | search:ruby:page:1 | Query results |
| Fragment | views/products/show:123 | Rendered fragments |
Performance Metrics
| Metric | Formula | Target | Indicates |
|---|---|---|---|
| Hit Rate | Hits / Total Requests | >80% | Cache effectiveness |
| Miss Rate | Misses / Total Requests | <20% | Cache coverage |
| Eviction Rate | Evictions / Writes | <10% | Capacity adequacy |
| Average Latency | Sum(Duration) / Count | <5ms | Performance impact |
| Memory Usage | Used / Capacity | <80% | Capacity planning |
| Throughput | Operations / Second | Varies | Scalability limit |