Overview
Amortized analysis provides a method for analyzing the average performance of operations in a data structure over a sequence of operations, rather than considering the worst-case cost of individual operations. This technique proves particularly valuable when occasional expensive operations occur within a sequence of predominantly cheap operations, as it captures the average cost more accurately than worst-case analysis alone.
The technique originated from the need to analyze data structures where operations have variable costs. Consider a dynamic array that doubles its capacity when full: most insertions complete in constant time, but occasional resize operations require copying all elements to a new array. Analyzing each insertion as O(n) misrepresents the actual performance, while amortized analysis shows that insertions average O(1) time.
Amortized analysis differs from average-case analysis in a critical way. Average-case analysis assumes a probability distribution over inputs, while amortized analysis makes no assumptions about input distribution. Instead, it guarantees that for any sequence of n operations, the total cost remains bounded, regardless of the specific sequence chosen.
Three primary methods exist for performing amortized analysis: the aggregate method, the accounting method, and the potential method. Each offers different insights and applies more naturally to different problems, though all three produce equivalent results when correctly applied.
The concept applies across numerous data structures and algorithms in computer science. Dynamic arrays, hash tables with resizing, disjoint-set data structures, and splay trees all rely on amortized analysis to accurately characterize their performance. Without this technique, these data structures would appear less efficient than they actually are in practice.
# Dynamic array demonstrating amortized constant-time append
class DynamicArray
def initialize
@capacity = 4
@size = 0
@data = Array.new(@capacity)
end
def append(value)
resize if @size == @capacity
@data[@size] = value
@size += 1
end
private
def resize
@capacity *= 2
new_data = Array.new(@capacity)
@size.times { |i| new_data[i] = @data[i] }
@data = new_data
end
end
# Most appends are O(1), occasional resizes are O(n)
# Amortized analysis shows average O(1) per append
Key Principles
Amortized analysis rests on the principle that the total cost of n operations provides a more accurate picture than the maximum cost of any single operation. The amortized cost per operation equals the total cost divided by the number of operations, establishing an upper bound on the average cost per operation for any sequence.
The aggregate method calculates the total cost T(n) for n operations, then divides by n to find the amortized cost. This straightforward approach works well when the total cost can be directly computed or bounded. For a sequence of n operations where T(n) represents the total cost, the amortized cost per operation equals T(n)/n.
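The aggregate method can be sketched directly. The helper below (a hypothetical `aggregate_append_cost`, modeling the textbook doubling array rather than Ruby's actual Array internals) counts one unit per write plus one per element copied at a resize, then divides by n:

```ruby
# Aggregate-method sketch: total actual cost of n appends under a
# simple doubling model, divided by n
def aggregate_append_cost(n, initial_capacity = 4)
  capacity = initial_capacity
  size = 0
  total_cost = 0
  n.times do
    if size == capacity
      total_cost += size # resize copies every existing element
      capacity *= 2
    end
    total_cost += 1 # write the new element
    size += 1
  end
  total_cost.to_f / n # amortized cost = T(n) / n
end

puts aggregate_append_cost(10_000) # stays below 3 for any n
```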
The accounting method assigns different charges to different operations. Some operations are charged more than their actual cost, storing the extra charge as credit. Other operations use this stored credit to pay for their execution. The key constraint: credit must never go negative, ensuring that enough credit exists to cover all operations. Each operation receives an amortized cost (the charge), which may differ from its actual cost.
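The accounting method can be checked mechanically. The sketch below (a hypothetical `accounting_append`) charges 3 units per append for the same doubling model: 1 pays for the write, 2 are banked toward the next resize, and an assertion verifies that credit never goes negative:

```ruby
# Accounting-method sketch: charge 3 per append — 1 for the write, 2
# banked toward copying this element and one older element at resize
def accounting_append(n, charge = 3, initial_capacity = 4)
  capacity = initial_capacity
  size = 0
  credit = 0
  n.times do
    credit += charge
    if size == capacity
      credit -= size # pay the resize's copy cost from banked credit
      capacity *= 2
    end
    credit -= 1 # pay for the write itself
    raise 'credit went negative' if credit < 0
    size += 1
  end
  credit
end

accounting_append(10_000) # completes without raising: a charge of 3 suffices
```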
The potential method defines a potential function Φ that maps each state of the data structure to a real number. The potential represents stored work that can pay for future expensive operations. For operation i with actual cost c_i, the amortized cost equals c_i + Φ(D_i) - Φ(D_{i-1}), where D_i represents the data structure state after operation i. The potential function must satisfy Φ(D_n) ≥ Φ(D_0) for all sequences, typically achieved by ensuring Φ(D_i) ≥ 0 for all i when Φ(D_0) = 0.
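The potential method can be traced concretely with the classic choice Φ = 2·size − capacity for a doubling table that starts at capacity 0, so Φ(D_0) = 0 and Φ never goes negative. The hypothetical helper below computes c_i + Φ(D_i) − Φ(D_{i-1}) for each insert:

```ruby
# Potential-method sketch using phi = 2*size - capacity
def potential_method_costs(n)
  capacity = 0
  size = 0
  phi = 0
  amortized = []
  n.times do
    actual = 1
    if size == capacity
      actual += size # copy cost on resize
      capacity = capacity.zero? ? 1 : capacity * 2
    end
    size += 1
    new_phi = 2 * size - capacity
    amortized << actual + new_phi - phi # c_i + phi(D_i) - phi(D_{i-1})
    phi = new_phi
  end
  amortized
end

puts potential_method_costs(100).max # never exceeds 3
```

Cheap inserts cost 1 and raise the potential by 2; at a resize, the copy cost is exactly offset by the drop in potential, so every insert's amortized cost is at most 3.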
These methods produce equivalent results because they all bound the total cost. In the aggregate method, the sum of amortized costs equals T(n). In the accounting method, the sum of amortized costs equals the sum of actual costs plus remaining credit. In the potential method, the sum of amortized costs equals the sum of actual costs plus Φ(D_n) - Φ(D_0).
The critical insight: amortized analysis provides guarantees about sequences of operations, not about individual operations. An operation with O(1) amortized cost might occasionally take O(n) time, but over any sequence of n operations, the average cost remains constant.
# Stack with multipop demonstrating amortized analysis
class Stack
def initialize
@items = []
end
def push(value)
@items.push(value)
# Actual cost: O(1)
end
def pop
@items.pop
# Actual cost: O(1)
end
def multipop(k)
[k, @items.size].min.times { @items.pop }
# Actual cost: O(min(k, size))
# Amortized cost: O(1) - each element pushed and popped once
end
end
# Analysis: n operations on initially empty stack
# Each element pushed once (cost 1), popped at most once (cost 1)
# Total cost ≤ 2n, amortized O(1) per operation
Understanding the relationship between actual and amortized costs proves essential. An operation's amortized cost may exceed its actual cost, banking credit, or fall below it, spending credit banked earlier; the guarantee is only that the total amortized cost bounds the total actual cost over any sequence. In particular, an operation with low amortized cost may occasionally have a high actual cost, compensated by many operations whose actual costs fall below their charges.
Ruby Implementation
Ruby provides several data structures whose performance characteristics benefit from amortized analysis. Understanding these implementations helps developers make informed decisions about data structure selection and performance expectations.
Ruby's Array class implements dynamic array behavior with amortized constant-time appends. The internal implementation maintains a capacity larger than the current size, doubling capacity when the array fills. This doubling strategy ensures that resize operations become increasingly rare as the array grows.
# Demonstrating Ruby Array's amortized append behavior
def measure_append_times(max_size)
arr = []
times = []
max_size.times do |i|
start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
arr << i
finish = Process.clock_gettime(Process::CLOCK_MONOTONIC)
times << (finish - start)
end
# Occasional spikes in times correspond to resize operations
# Most appends complete in roughly constant time
times
end
# Analysis of the pattern:
# Times mostly constant, with spikes at powers of 2
# Each spike represents a resize operation
# Amortized cost remains O(1) despite occasional O(n) resizes
Ruby Hash implements a hash table with automatic resizing when the load factor exceeds a threshold. The implementation maintains bins (buckets) for collision resolution and resizes the table when it becomes too full, rehashing all entries into the new table.
# Hash table with explicit resizing demonstrates amortized analysis
class SimpleHashTable
DEFAULT_CAPACITY = 8
LOAD_FACTOR_THRESHOLD = 0.75
def initialize
@capacity = DEFAULT_CAPACITY
@size = 0
@bins = Array.new(@capacity) { [] }
end
def insert(key, value)
resize if load_factor > LOAD_FACTOR_THRESHOLD
index = hash(key) % @capacity
pair = @bins[index].find { |k, _| k == key }
if pair
pair[1] = value
else
@bins[index] << [key, value]
@size += 1
end
end
def get(key)
index = hash(key) % @capacity
pair = @bins[index].find { |k, _| k == key }
pair ? pair[1] : nil
end
private
def hash(key)
key.hash.abs
end
def load_factor
@size.to_f / @capacity
end
def resize
old_bins = @bins
@capacity *= 2
@bins = Array.new(@capacity) { [] }
@size = 0
old_bins.each do |bin|
bin.each { |key, value| insert(key, value) }
end
end
end
# Insert n items: most O(1), occasional O(n) resize
# Total cost: O(n), amortized O(1) per insert
Ruby's standard library includes data structures that demonstrate amortized performance characteristics. The Set class, built on Hash, inherits amortized constant-time insertion. Heap-based priority queues, such as those provided by third-party gems like algorithms, offer O(log n) operations, though for a binary heap these bounds hold even in the worst case rather than only amortized.
require 'set'
# Set operations demonstrate amortized constant-time behavior
class BatchProcessor
def initialize
@seen = Set.new
@duplicates = 0
end
def process_items(items)
items.each do |item|
if @seen.add?(item)
# Process new item - amortized O(1) insert
handle_new_item(item)
else
@duplicates += 1
end
end
end
private
def handle_new_item(item)
# Process the item
end
end
# Over n items, total cost O(n) despite occasional Set resizes
Implementing a disjoint-set (union-find) data structure in Ruby demonstrates how path compression and union by rank produce near-constant amortized time operations. The implementation uses these techniques to keep trees shallow.
# Disjoint-set with path compression and union by rank
class DisjointSet
def initialize(n)
@parent = Array.new(n) { |i| i }
@rank = Array.new(n, 0)
end
def find(x)
# Path compression: make all nodes point directly to root
if @parent[x] != x
@parent[x] = find(@parent[x])
end
@parent[x]
end
def union(x, y)
root_x = find(x)
root_y = find(y)
return if root_x == root_y
# Union by rank: attach smaller tree under larger tree
if @rank[root_x] < @rank[root_y]
@parent[root_x] = root_y
elsif @rank[root_x] > @rank[root_y]
@parent[root_y] = root_x
else
@parent[root_y] = root_x
@rank[root_x] += 1
end
end
def connected?(x, y)
find(x) == find(y)
end
end
# m operations on n elements: O(m α(n))
# α(n) is inverse Ackermann function, effectively constant
# Amortized O(1) per operation for practical n
Practical Examples
Dynamic array resizing provides the canonical example of amortized analysis. The array maintains a capacity and size, doubling capacity when insertions would exceed current capacity. Analyzing a sequence of n insertions starting from an empty array reveals the amortized cost.
# Detailed dynamic array with operation counting
class InstrumentedArray
attr_reader :copies, :size, :capacity
def initialize(initial_capacity = 4)
@capacity = initial_capacity
@size = 0
@data = Array.new(@capacity)
@copies = 0 # Track total element copies
end
def push(value)
if @size == @capacity
resize_and_copy
end
@data[@size] = value
@size += 1
end
def resize_and_copy
new_capacity = @capacity * 2
new_data = Array.new(new_capacity)
@size.times do |i|
new_data[i] = @data[i]
@copies += 1
end
@capacity = new_capacity
@data = new_data
end
def amortized_copies_per_insert
@copies.to_f / @size
end
end
# Demonstrate amortized constant-time behavior
[8, 16, 32, 64, 128].each do |n|
array = InstrumentedArray.new(4)
n.times { |i| array.push(i) }
puts "n=#{n}: total_copies=#{array.copies}, amortized=#{array.amortized_copies_per_insert.round(2)}"
end
# Output shows amortized copies approaching 1 per insert (always below 2)
Binary counter increment demonstrates amortized analysis through bit operations. Incrementing a counter requires flipping bits, where the number of flips varies by value but averages to constant time per increment.
# Binary counter with bit flip counting
class BinaryCounter
def initialize(bits)
@bits = Array.new(bits, 0)
@flips = 0
end
def increment
i = 0
while i < @bits.length && @bits[i] == 1
@bits[i] = 0
@flips += 1
i += 1
end
if i < @bits.length
@bits[i] = 1
@flips += 1
end
end
def value
@bits.reverse.join.to_i(2)
end
def amortized_flips
@flips.to_f / value if value > 0
end
end
# Aggregate analysis of n increments:
# Bit 0 flips n times
# Bit 1 flips n/2 times
# Bit 2 flips n/4 times
# Total flips: n(1 + 1/2 + 1/4 + ...) < 2n
# Amortized O(1) per increment
[10, 100, 1000].each do |n|
counter = BinaryCounter.new(10)
n.times { counter.increment }
puts "n=#{n}: total_flips=#{counter.instance_variable_get(:@flips)}, amortized=#{counter.amortized_flips.round(2)}"
end
Table resizing with incremental rehashing demonstrates how spreading expensive operations across multiple cheap operations maintains amortized bounds. Instead of rehashing all elements during a single resize, incremental rehashing moves a few elements per operation.
# Hash table with incremental rehashing
class IncrementalHashTable
REHASH_ITEMS_PER_OP = 2
def initialize
@old_table = nil
@new_table = Array.new(8) { [] }
@size = 0
@rehash_index = 0
end
def insert(key, value)
# Perform incremental rehashing if in progress
incremental_rehash if @old_table
# Trigger a resize only if one is not already underway
start_resize if @old_table.nil? && load_factor > 0.75
# Insert into new table
index = hash(key, @new_table.length)
@new_table[index] << [key, value]
@size += 1
end
def get(key)
# Check new table
index = hash(key, @new_table.length)
pair = @new_table[index].find { |k, _| k == key }
return pair[1] if pair
# Check old table if rehashing in progress
if @old_table
index = hash(key, @old_table.length)
pair = @old_table[index].find { |k, _| k == key }
return pair[1] if pair
end
nil
end
private
def hash(key, table_size)
key.hash.abs % table_size
end
def load_factor
@size.to_f / @new_table.length
end
def start_resize
@old_table = @new_table
@new_table = Array.new(@old_table.length * 2) { [] }
@rehash_index = 0
end
def incremental_rehash
items_moved = 0
# Move whole bins until at least REHASH_ITEMS_PER_OP items have moved
while @rehash_index < @old_table.length && items_moved < REHASH_ITEMS_PER_OP
@old_table[@rehash_index].each do |key, value|
index = hash(key, @new_table.length)
@new_table[index] << [key, value]
items_moved += 1
end
@old_table[@rehash_index] = []
@rehash_index += 1
end
# Rehashing is complete once every old bin has been emptied
@old_table = nil if @rehash_index >= @old_table.length
end
end
# Each operation performs constant work
# Rehashing distributed across many operations
# Amortized O(1) per operation with bounded latency
Splay tree operations demonstrate amortized logarithmic time through tree rotations that bring frequently accessed elements closer to the root. While individual operations can take linear time, the amortized cost per operation remains logarithmic.
# Simplified splay tree with amortized analysis
class SplayTree
class Node
attr_accessor :key, :left, :right, :parent
def initialize(key)
@key = key
@left = @right = @parent = nil
end
end
def initialize
@root = nil
end
def insert(key)
@root = insert_node(@root, key, nil)
splay(@root, key)
end
def find(key)
node = find_node(@root, key)
splay(@root, key) if node
node
end
private
def insert_node(node, key, parent)
return Node.new(key).tap { |n| n.parent = parent } unless node
if key < node.key
node.left = insert_node(node.left, key, node)
else
node.right = insert_node(node.right, key, node)
end
node
end
def find_node(node, key)
return nil unless node
return node if node.key == key
key < node.key ? find_node(node.left, key) : find_node(node.right, key)
end
def splay(root, key)
# Simplified splay operation
# Real implementation performs zig, zig-zig, zig-zag rotations
# Moves accessed node to root, restructuring tree
# Amortized O(log n) per operation
end
end
# Splay operations: O(n) worst case, O(log n) amortized
# Potential function measures tree balance
# Splaying improves balance, storing potential for future ops
Performance Considerations
Amortized analysis reveals performance characteristics invisible to worst-case analysis alone. Data structures with occasional expensive operations appear impractical under worst-case analysis, yet amortized analysis shows they perform efficiently in practice.
The choice between worst-case and amortized analysis depends on application requirements. Real-time systems requiring bounded latency for individual operations cannot tolerate occasional expensive operations, making worst-case guarantees essential. Batch processing systems caring about total execution time benefit from amortized analysis, as occasional expensive operations have negligible impact on overall throughput.
Dynamic array resizing demonstrates the practical impact of doubling versus fixed-increment growth strategies. Doubling capacity achieves O(1) amortized insertion time, while increasing capacity by a fixed amount k produces O(n) amortized time per insertion.
# Compare doubling versus fixed-increment growth
# (copy-counting models; no elements are actually stored)
class DoublingArray
def initialize
@capacity = 4
@size = 0
@total_copies = 0
end
def push(value)
if @size == @capacity
@total_copies += @size
@capacity *= 2
end
@size += 1
end
def amortized_cost
@total_copies.to_f / @size
end
end
class FixedIncrementArray
def initialize
@capacity = 4
@size = 0
@total_copies = 0
@increment = 4
end
def push(value)
if @size == @capacity
@total_copies += @size
@capacity += @increment
end
@size += 1
end
def amortized_cost
@total_copies.to_f / @size
end
end
# Comparison over n insertions
n = 10000
doubling = DoublingArray.new
fixed = FixedIncrementArray.new
n.times do
doubling.push(1)
fixed.push(1)
end
puts "Doubling amortized cost: #{doubling.amortized_cost.round(2)}"
puts "Fixed increment amortized cost: #{fixed.amortized_cost.round(2)}"
# Doubling: under 2 copies per insert (constant)
# Fixed increment: roughly 1250 copies per insert (grows linearly with n)
Memory overhead interacts with amortized time complexity. Aggressive capacity doubling reduces resize frequency but increases memory waste. Growth factors smaller than 2 (such as 1.5) reduce memory overhead while maintaining amortized constant-time insertions, though with more frequent resizes.
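This trade-off can be simulated with a copy-counting model (a hypothetical `copies_and_resizes` helper tracking capacities only, not Ruby's Array internals):

```ruby
# Compare growth factors: copies stay linear in n for both, but the
# smaller factor resizes more often while wasting less memory
def copies_and_resizes(n, factor)
  capacity = 4
  size = 0
  copies = 0
  resizes = 0
  n.times do
    if size == capacity
      copies += size # every element moves at a resize
      resizes += 1
      capacity = [(capacity * factor).floor, capacity + 1].max
    end
    size += 1
  end
  [copies, resizes]
end

[1.5, 2.0].each do |f|
  copies, resizes = copies_and_resizes(100_000, f)
  puts "factor #{f}: #{resizes} resizes, #{copies.to_f / 100_000} copies per insert"
end
```

Both factors keep copies per element bounded by roughly f/(f−1), so factor 1.5 trades more frequent resizes for tighter memory use.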
Load factor thresholds in hash tables balance time and space efficiency. Lower thresholds reduce collision chains, improving average access time at the cost of memory. Higher thresholds conserve memory but increase collision resolution cost. The choice affects both worst-case and amortized performance.
# Hash table with configurable load factor
class ConfigurableHashTable
attr_reader :collisions
def initialize(load_factor_threshold)
@capacity = 8
@size = 0
@bins = Array.new(@capacity) { [] }
@threshold = load_factor_threshold
@collisions = 0
end
def insert(key, value)
resize if load_factor > @threshold
index = key.hash.abs % @capacity
@collisions += 1 if @bins[index].any?
@bins[index] << [key, value]
@size += 1
end
def load_factor
@size.to_f / @capacity
end
def resize
old_bins = @bins
@capacity *= 2
@bins = Array.new(@capacity) { [] }
@size = 0
@collisions = 0 # reported collisions reflect the final table only
old_bins.each do |bin|
bin.each { |key, value| insert(key, value) }
end
end
end
# Compare different load factor thresholds
n = 1000
[0.5, 0.75, 1.0].each do |threshold|
table = ConfigurableHashTable.new(threshold)
n.times { |i| table.insert(i, i) }
puts "Threshold #{threshold}: capacity=#{table.instance_variable_get(:@capacity)}, collisions=#{table.collisions}"
end
# Lower threshold: more capacity, fewer collisions
# Higher threshold: less capacity, more collisions
Amortized bounds do not guarantee performance for individual operations. Applications requiring predictable latency must either avoid data structures with high worst-case costs or use incremental techniques to bound per-operation work. Incremental rehashing, incremental garbage collection, and batch processing with work limits spread expensive operations across multiple cheap operations.
Cache behavior affects amortized analysis in practice. Resize operations touching many elements may experience cache misses, making the actual cost higher than simple operation counts suggest. Array doubling benefits from spatial locality during copying, while scattered pointer-based structures suffer from poor cache utilization during rebalancing.
Common Patterns
Several patterns recur across data structures whose bounds come from amortized analysis. Recognizing these patterns helps identify opportunities for amortized analysis and guides implementation choices.
The doubling/halving pattern appears in dynamic arrays, hash tables, and similar structures. Capacity grows geometrically when full and shrinks geometrically when occupancy drops below a threshold. This pattern ensures O(1) amortized insertion and deletion.
# Dynamic array with both growth and shrinkage
class ResizingArray
MIN_CAPACITY = 4
GROWTH_FACTOR = 2
SHRINK_THRESHOLD = 0.25
def initialize
@capacity = MIN_CAPACITY
@size = 0
@data = Array.new(@capacity)
end
def push(value)
resize(@capacity * GROWTH_FACTOR) if @size == @capacity
@data[@size] = value
@size += 1
end
def pop
return nil if @size == 0
value = @data[@size - 1]
@size -= 1
new_capacity = @capacity / GROWTH_FACTOR
if @size < @capacity * SHRINK_THRESHOLD && new_capacity >= MIN_CAPACITY
resize(new_capacity)
end
value
end
private
def resize(new_capacity)
new_data = Array.new(new_capacity)
@size.times { |i| new_data[i] = @data[i] }
@data = new_data
@capacity = new_capacity
end
end
# Both push and pop maintain O(1) amortized time
# Shrink threshold prevents oscillating resizes
The lazy evaluation pattern defers expensive operations until necessary, distributing cost across subsequent cheap operations. Incremental rehashing and lazy deletion exemplify this pattern.
# Lazy deletion in hash table
class LazyDeletionHashTable
DELETED = Object.new.freeze
def initialize
@bins = Array.new(8)
@size = 0
@deleted_count = 0
end
def insert(key, value)
clean_deleted_if_needed
index = find_slot(key)
if @bins[index].nil? || @bins[index] == DELETED
@bins[index] = [key, value]
@size += 1
else
@bins[index][1] = value
end
end
def delete(key)
index = find_slot(key)
if @bins[index] && @bins[index] != DELETED && @bins[index][0] == key
@bins[index] = DELETED
@size -= 1
@deleted_count += 1
true
else
false
end
end
private
def find_slot(key)
# Linear probing to find key or empty slot
index = key.hash.abs % @bins.length
while @bins[index] && @bins[index] != DELETED && @bins[index][0] != key
index = (index + 1) % @bins.length
end
index
end
def clean_deleted_if_needed
# Compact when tombstones dominate; also compact (growing if needed)
# before the table fills, since linear probing on a full table would
# never find an empty slot
if @deleted_count > @bins.length / 2 || @size + @deleted_count >= @bins.length - 1
compact_table
end
end
def compact_table
old_bins = @bins
# Double the table when live entries alone fill more than half of it
new_length = @size > old_bins.length / 2 ? old_bins.length * 2 : old_bins.length
@bins = Array.new(new_length)
@size = 0
@deleted_count = 0
old_bins.each do |entry|
insert(entry[0], entry[1]) if entry && entry != DELETED
end
end
end
# Deletions O(1), cleanup amortized across operations
The cascading updates pattern performs a sequence of updates where each update may trigger subsequent updates. Fibonacci heaps demonstrate this pattern during decrease-key operations, and disjoint-set structures exhibit it during path compression.
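Path compression illustrates the cascading pattern directly. The sketch below (a hypothetical `CountingDisjointSet`, instrumented to count parent-pointer updates) builds a worst-case chain: the first find cascades through every node on the path, and repeating the same find then touches no pointers at all:

```ruby
# Path compression instrumented to count cascading pointer updates
class CountingDisjointSet
  attr_reader :updates
  def initialize(n)
    @parent = Array.new(n) { |i| i }
    @updates = 0
  end
  def link_chain
    # Worst case: a single chain 0 <- 1 <- 2 <- ... <- n-1
    (1...@parent.length).each { |i| @parent[i] = i - 1 }
  end
  def find(x)
    return x if @parent[x] == x
    root = find(@parent[x])
    @updates += 1 if @parent[x] != root
    @parent[x] = root # cascade: every node on the path now points at the root
    root
  end
end

ds = CountingDisjointSet.new(1024)
ds.link_chain
ds.find(1023)       # first find cascades through over 1000 updates
first = ds.updates
ds.find(1023)       # second find performs no updates
ds.updates == first # => true
```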
The potential-based rebalancing pattern maintains invariants through operations that improve structure quality. Splay trees, AVL trees, and red-black trees rebalance after modifications, with the potential function measuring tree balance. Operations that worsen balance store potential, while rebalancing operations consume stored potential.
# Simplified AVL tree showing rotation pattern
class AVLNode
attr_accessor :key, :left, :right, :height
def initialize(key)
@key = key
@left = @right = nil
@height = 1
end
def balance_factor
left_height = @left ? @left.height : 0
right_height = @right ? @right.height : 0
left_height - right_height
end
def update_height
left_height = @left ? @left.height : 0
right_height = @right ? @right.height : 0
@height = [left_height, right_height].max + 1
end
end
class AVLTree
def insert(root, key)
return AVLNode.new(key) unless root
if key < root.key
root.left = insert(root.left, key)
else
root.right = insert(root.right, key)
end
root.update_height
balance(root)
end
def balance(node)
factor = node.balance_factor
# Left-heavy tree
if factor > 1
if node.left.balance_factor < 0
node.left = rotate_left(node.left)
end
return rotate_right(node)
end
# Right-heavy tree
if factor < -1
if node.right.balance_factor > 0
node.right = rotate_right(node.right)
end
return rotate_left(node)
end
node
end
def rotate_left(node)
# Standard left rotation: the right child becomes the subtree root
# Decreases potential by improving balance
new_root = node.right
node.right = new_root.left
new_root.left = node
node.update_height
new_root.update_height
new_root
end
def rotate_right(node)
# Standard right rotation: the left child becomes the subtree root
# Decreases potential by improving balance
new_root = node.left
node.left = new_root.right
new_root.right = node
node.update_height
new_root.update_height
new_root
end
end
# Insertions trigger rotations that improve balance
# O(log n) per operation; for AVL trees this bound holds even in the worst case
The credit-based pattern assigns costs to operations such that cheap operations accumulate credit and expensive operations spend accumulated credit. This pattern applies naturally when different operations have varying costs and certain sequences of operations guarantee sufficient credit accumulation.
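Applied to the multipop stack from earlier, the credit-based pattern can be made explicit. The sketch below (a hypothetical `CreditStack`) charges 2 units per push, 1 for the push itself and 1 banked on the element for its eventual pop, and asserts that pops never outrun the banked credit:

```ruby
# Accounting scheme for the multipop stack made explicit
class CreditStack
  attr_reader :credit
  def initialize
    @items = []
    @credit = 0
  end
  def push(value)
    @credit += 2 - 1 # charge 2, actual cost 1, bank the difference
    @items.push(value)
  end
  def multipop(k)
    popped = @items.pop([k, @items.size].min)
    @credit -= popped.size # each popped element spends its banked unit
    raise 'credit went negative' if @credit < 0
    popped
  end
end

s = CreditStack.new
100.times { |i| s.push(i) }
s.multipop(40)
s.multipop(1000) # pops the remaining 60 elements; credit ends at 0
```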
Common Pitfalls
Confusing amortized and average-case complexity represents the most common misunderstanding. Amortized analysis provides worst-case guarantees for sequences of operations, independent of input distribution. Average-case analysis assumes a probability distribution over inputs and provides expected performance for random inputs.
# Amortized vs average-case: dynamic array example
class AnalysisComparison
def self.demonstrate_difference
# Amortized analysis: worst-case sequence
# Even adversarial sequence of n inserts is O(n) total
array = []
n = 1000
n.times { |i| array << i }
# Total: O(n), amortized O(1) per insert
# Holds for ANY sequence, including worst-case
# Average-case would assume random operations
# e.g., random mix of push/pop
# Different analysis, different guarantees
end
end
Assuming amortized bounds guarantee bounds on individual operations leads to incorrect conclusions. An operation with O(1) amortized cost may still have O(n) worst-case cost for individual invocations. Real-time systems requiring per-operation bounds cannot rely on amortized analysis alone.
Forgetting to account for resize costs in both growth and shrinkage creates problems. Dynamic arrays that shrink capacity when occupancy drops must avoid oscillating between growing and shrinking around a threshold. Setting the shrink threshold significantly below the growth threshold prevents oscillation.
# Incorrect: oscillating resize
class OscillatingArray
def initialize
@capacity = 4
@size = 0
end
def push(value)
resize_up if @size == @capacity
@size += 1
end
def pop
@size -= 1
resize_down if @size == @capacity / 2 # PROBLEM
end
def resize_up
@capacity *= 2
end
def resize_down
@capacity /= 2
end
end
# Sequence: push n, pop 1, push 1, pop 1, push 1, ...
# Oscillates between n and n-1, resizing repeatedly
# Amortized O(n) per operation, not O(1)
# Correct: use a lower shrink threshold
# (fragment: push and the resize helpers follow the earlier pattern)
class StableArray
SHRINK_THRESHOLD = 0.25 # Shrink at 25%, grow at 100%
def pop
@size -= 1
resize_down if @size < @capacity * SHRINK_THRESHOLD
end
end
Misapplying potential functions by choosing functions that can go negative violates the method's requirements. The potential function must satisfy Φ(D_i) ≥ Φ(D_0) for all sequences when Φ(D_0) = 0, ensuring that the sum of amortized costs bounds the sum of actual costs.
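The failure mode shows up in a telescoping check over a toy sequence of n unit-cost appends (a hypothetical `total_amortized` helper): with a valid potential the amortized total bounds the actual total, while a potential that sinks below Φ(D_0) makes the amortized total understate it:

```ruby
# Sum of amortized costs = n + phi(D_n) - phi(D_0) by telescoping
def total_amortized(n, &phi)
  total = 0
  prev = phi.call(0)
  (1..n).each do |size|
    total += 1 + phi.call(size) - prev # actual cost 1 plus potential change
    prev = phi.call(size)
  end
  total
end

total_amortized(100) { |s| s }  # 200: valid potential, bounds the actual 100
total_amortized(100) { |s| -s } # 0: potential sinks below phi(D_0), hiding cost
```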
Ignoring cache effects and memory hierarchy can make theoretical amortized bounds misleading in practice. A doubling strategy that appears optimal in the RAM model may perform poorly due to cache misses during copying. Understanding both theoretical bounds and practical performance requires considering memory hierarchy.
# Cache-aware resizing strategies
class CacheAwareArray
# Smaller growth factor improves cache behavior
GROWTH_FACTOR = 1.5 # vs 2.0
def initialize
@capacity = 4
@size = 0
end
# With a growth factor below roughly 1.6, the sum of capacities freed
# by earlier resizes eventually exceeds the next allocation, so the
# allocator can reuse that freed space for the new array
end
# Growth factor 1.5: better memory reuse, more frequent resizes
# Growth factor 2.0: less frequent resizes, worse memory reuse
# Both maintain O(1) amortized time
Applying amortized analysis to concurrent data structures without considering synchronization costs produces incorrect results. Lock acquisition, memory barriers, and atomic operations add costs not captured by sequential amortized analysis. Concurrent dynamic arrays require additional synchronization during resize operations, affecting both actual and amortized costs.
Assuming tighter amortized bounds than analysis supports causes performance surprises. Careful analysis may show O(log n) amortized cost where intuition suggests O(1). Splay trees provide O(log n) amortized access, not O(1), despite frequently accessed elements moving near the root.
Reference
Amortization Methods
| Method | Approach | Best For | Key Requirement |
|---|---|---|---|
| Aggregate | Calculate total cost T(n) for n operations, divide by n | Direct total cost calculation available | Accurate upper bound on T(n) |
| Accounting | Assign charges to operations, store credit for future use | Intuitive operation-by-operation analysis | Credit never goes negative |
| Potential | Define potential function measuring data structure state | Mathematical elegance, complex scenarios | Φ(D_n) ≥ Φ(D_0) for all sequences |
Common Data Structure Amortized Costs
| Data Structure | Operation | Worst-Case | Amortized | Method |
|---|---|---|---|---|
| Dynamic Array | Push | O(n) | O(1) | Aggregate/Potential |
| Dynamic Array | Pop with shrinking | O(n) | O(1) | Potential |
| Hash Table | Insert with resizing | O(n) | O(1) | Potential |
| Binary Counter | Increment | O(k) | O(1) | Aggregate |
| Disjoint Set | Union/Find with path compression | O(log n) | O(α(n)) | Potential |
| Splay Tree | Search/Insert/Delete | O(n) | O(log n) | Potential |
| Fibonacci Heap | Decrease-Key | O(n) | O(1) | Potential |
| Multipop Stack | Push/Pop/Multipop | O(n) | O(1) | Accounting |
Growth Strategy Comparison
| Strategy | Growth Pattern | Amortized Insert | Memory Overhead | Resize Frequency |
|---|---|---|---|---|
| Doubling | Capacity × 2 | O(1) | Up to 2× | Low |
| Factor 1.5 | Capacity × 1.5 | O(1) | Up to 1.5× | Medium |
| Fixed Increment | Capacity + k | O(n) | Minimal | High |
| Exact Size | Capacity = Size | O(n) | None | Every operation |
Potential Function Properties
| Property | Requirement | Purpose |
|---|---|---|
| Non-negativity | Φ(D_i) ≥ 0 when Φ(D_0) = 0 | Ensures sum of amortized costs bounds actual costs |
| Sensitivity | Φ changes reflect structure quality | Expensive operations decrease potential |
| Measurability | Computable from data structure state | Enables amortized cost calculation |
| Invariant preservation | Maintains structural properties | Guarantees amortized bounds hold |
Amortized Cost Calculation
| Analysis Type | Cost Formula | Interpretation |
|---|---|---|
| Aggregate | T(n) / n | Total cost divided by operations |
| Accounting | Assigned charge to operation | May differ from actual cost |
| Potential | c_i + Φ(D_i) - Φ(D_{i-1}) | Actual cost plus potential change |
When to Use Amortized Analysis
| Scenario | Use Amortized Analysis | Use Worst-Case Analysis |
|---|---|---|
| Batch processing | Yes - total time matters | No - individual ops less critical |
| Real-time systems | No - need per-op bounds | Yes - must bound every operation |
| Throughput optimization | Yes - average perf important | No - occasional spikes acceptable |
| Latency-sensitive | Conditional - depends on tolerance | Yes - predictable response needed |
| Resource scheduling | Yes - total resource usage matters | Yes - must prevent resource exhaustion |
Ruby-Specific Implementation Notes
| Structure | Ruby Class | Key Method | Amortized Behavior |
|---|---|---|---|
| Dynamic Array | Array | push, pop | O(1) with automatic resizing |
| Hash Table | Hash | insert, delete | O(1) with automatic rehashing |
| Set | Set | add, delete | O(1) inherited from Hash |
| Priority Queue | Use gem: algorithms | push, pop | O(log n) heap operations |
Load Factor Thresholds
| Threshold | Resize Frequency | Collision Rate | Memory Usage | Recommended For |
|---|---|---|---|---|
| 0.5 | High | Very Low | High | Fast lookups critical |
| 0.75 | Medium | Low | Medium | Balanced performance |
| 1.0 | Low | Medium | Low | Memory-constrained systems |
| 1.5 | Very Low | High | Very Low | Rare insertions |