Overview
Time complexity analysis measures how the runtime of an algorithm grows as the input size increases. This analysis focuses on the algorithmic behavior rather than absolute execution time, making it independent of hardware, language, or implementation details.
The primary tool for expressing time complexity is Big O notation, which describes the upper bound of growth rate. An algorithm with O(n) complexity takes time proportional to the input size, while O(n²) grows quadratically. This notation omits constants and lower-order terms, focusing on dominant factors that matter at scale.
Time complexity analysis answers critical questions: Will this algorithm handle 1 million records? How does doubling the input affect runtime? Which of two approaches scales better? These questions directly impact system design decisions, from choosing data structures to architecting distributed systems.
# O(1) - Constant time, independent of input size
def get_first(array)
  array[0]
end

# O(n) - Linear time, scales with input size
def sum_all(array)
  total = 0
  array.each { |num| total += num }
  total
end
# => For array of size n, performs n iterations
Asymptotic notation predates computing: Paul Bachmann and Edmund Landau introduced Big O in the late nineteenth century, and computer scientists, most influentially Donald Knuth, adopted it for algorithm analysis in the 1960s. The notation system provides a common language for discussing algorithmic efficiency across platforms and implementations.
Key Principles
Asymptotic analysis examines algorithm behavior as input size approaches infinity. This approach eliminates noise from constant factors and focuses on growth rate. An algorithm taking 100n + 50 operations has the same asymptotic complexity as one taking n operations—both are O(n).
Three notations describe different bounds. Big O (O) gives an upper bound on growth rate, Big Omega (Ω) a lower bound, and Big Theta (Θ) a tight bound when the two coincide. These bounds are distinct from case analysis: best, average, and worst cases can each be described with any of the three. In practice, most analysis quotes the Big O of the worst case, because worst-case guarantees matter for reliability.
# Analyzing nested loops
def find_duplicates(array)
  duplicates = []
  array.each_with_index do |item, i|      # Outer loop: n iterations
    array.each_with_index do |other, j|   # Inner loop: n iterations
      if i != j && item == other
        duplicates << item unless duplicates.include?(item)
      end
    end
  end
  duplicates
end
# Nested loops: n * n = O(n²) comparisons; the duplicates.include?
# check is itself O(n) in the worst case, adding up to another factor
Growth rates form a hierarchy. From fastest to slowest: O(1) constant, O(log n) logarithmic, O(n) linear, O(n log n) linearithmic, O(n²) quadratic, O(n³) cubic, O(2^n) exponential, O(n!) factorial. Each jump represents a significant performance difference at scale.
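The jumps in this hierarchy are easier to feel as raw operation counts. A quick sketch that tabulates them for a single n (the helper name growth_table is illustrative):

```ruby
# Rough operation counts for each growth rate at a given input size
def growth_table(n)
  {
    "O(1)"       => 1,
    "O(log n)"   => Math.log2(n).ceil,
    "O(n)"       => n,
    "O(n log n)" => (n * Math.log2(n)).ceil,
    "O(n^2)"     => n**2,
    "O(2^n)"     => 2**n
  }
end

growth_table(16)
# => {"O(1)"=>1, "O(log n)"=>4, "O(n)"=>16, "O(n log n)"=>64, "O(n^2)"=>256, "O(2^n)"=>65536}
```

At n = 16 the exponential class already needs four thousand times the work of the linear one; doubling n to 32 barely moves the logarithmic row while quadrupling the quadratic one.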
Loop analysis follows straightforward rules. Sequential loops add their complexities: O(n) + O(m) = O(n + m). Nested loops multiply: O(n) * O(m) = O(n * m). When one loop depends on another's variable, complexity often becomes polynomial.
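Counting iterations directly makes the addition and multiplication rules concrete; a small sketch (method names are illustrative):

```ruby
# Sequential loops add their iteration counts
def sequential_work(n, m)
  count = 0
  n.times { count += 1 }  # O(n)
  m.times { count += 1 }  # O(m)
  count                   # O(n + m) total
end

# Nested loops multiply their iteration counts
def nested_work(n, m)
  count = 0
  n.times do
    m.times { count += 1 }  # inner O(m) body runs n times
  end
  count                     # O(n * m) total
end

sequential_work(100, 50)  # => 150
nested_work(100, 50)      # => 5000
```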
Recursive algorithms require solving recurrence relations. A function calling itself twice on half the input, like merge sort, produces T(n) = 2T(n/2) + O(n), which solves to O(n log n). The Master Theorem provides formulas for common recurrence patterns.
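The merge sort recurrence can be checked numerically by evaluating it directly; a sketch for powers of two, taking T(1) = 0:

```ruby
# T(n) = 2T(n/2) + n with T(1) = 0, evaluated for powers of two
def t(n)
  return 0 if n <= 1
  2 * t(n / 2) + n
end

t(8)   # => 24, matching 8 * log2(8)   = 8 * 3
t(16)  # => 64, matching 16 * log2(16) = 16 * 4
```

The counts track n log₂ n exactly, which is what the Master Theorem predicts for this recurrence pattern.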
Space complexity measures memory usage growth, analyzed with the same notation. An algorithm can trade time for space or vice versa. Hash tables use O(n) space to achieve O(1) lookup time, while binary search uses O(1) space but requires O(log n) time.
Amortized analysis considers the average cost over a sequence of operations. Dynamic arrays demonstrate this: most insertions are O(1), but occasional resizing is O(n). Averaged over many operations, amortized cost remains O(1) per insertion.
# Demonstrating amortized analysis
class DynamicArray
  def initialize
    @capacity = 2
    @size = 0
    @array = Array.new(@capacity)
  end

  def push(value)
    resize if @size == @capacity  # O(n) operation, but rare
    @array[@size] = value
    @size += 1
  end

  def resize
    @capacity *= 2
    new_array = Array.new(@capacity)
    @size.times { |i| new_array[i] = @array[i] }
    @array = new_array
  end
end
# Individual push: occasionally O(n), usually O(1)
# Amortized over n pushes: O(1) per push
Ruby Implementation
Ruby's built-in methods have documented complexity characteristics. Array access by index is O(1), but searching with include? is O(n). Hash lookup is O(1) average case, but depends on hash function quality. Understanding these characteristics prevents accidental performance degradation.
# Different approaches, different complexity
require 'set'

names = ["Alice", "Bob", "Charlie", "David"]

# O(n) - Linear search through array
names.include?("Charlie")    # => true

# O(1) average - Hash-based lookup
name_set = names.to_set
name_set.include?("Charlie") # => true
Enumerable methods vary in complexity. Methods like map, each, and select are O(n), processing each element once. Methods like sort use O(n log n) comparison sorts. Methods combining operations, like flatten on nested arrays, have complexity based on total elements, not just top-level array size.
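The flatten point is easy to verify: its cost tracks the number of leaf elements, not the top-level length.

```ruby
nested = [[1, 2], [3, [4, 5]], [6]]

nested.length          # => 3, top-level size only
nested.flatten         # => [1, 2, 3, 4, 5, 6]
nested.flatten.length  # => 6, every leaf element is visited
```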
# Analyzing method chains
users.select { |u| u.active? }  # O(n)
     .map { |u| u.name }        # O(n)
     .sort                      # O(n log n)
     .first(10)                 # O(1)
# Total: O(n log n) - dominated by sort
Block execution affects analysis. A simple block like { |x| x * 2 } is O(1), but complex blocks increase per-iteration cost. A block calling a method with its own complexity multiplies that into the loop's complexity.
# Hidden complexity in blocks
def process_users(users, projects)
  users.map do |user|                      # O(n) users
    {
      name: user.name,
      project_count: projects.count do |p| # O(m) projects per user
        p.user_id == user.id
      end
    }
  end
end
# Total: O(n * m) - every user/project pair is examined
Recursion in Ruby requires stack space consideration. Each recursive call adds a stack frame, limiting practical recursion depth. Tail-call optimization exists in Ruby but isn't enabled by default, making deep recursion risky.
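The risk is observable directly: recursion deep enough to exhaust the VM stack raises SystemStackError. The exact depth at which this happens depends on the interpreter and stack size settings, so the figure below is only illustrative.

```ruby
# Deep recursion exhausts the VM stack rather than returning
def countdown(n)
  return 0 if n.zero?
  countdown(n - 1)
end

begin
  countdown(10_000_000)
rescue SystemStackError
  puts "stack exhausted before reaching the base case"
end
```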
# Recursive complexity analysis
def factorial(n)
  return 1 if n <= 1
  n * factorial(n - 1)
end
# Time: O(n) - n recursive calls
# Space: O(n) - n stack frames

# Iterative alternative
def factorial_iterative(n)
  result = 1
  (2..n).each { |i| result *= i }
  result
end
# Time: O(n) - same time complexity
# Space: O(1) - constant space
Ruby's garbage collector adds real-world performance considerations not captured by asymptotic analysis. Creating many temporary objects can trigger collection, adding overhead. This matters in practice despite theoretical complexity remaining unchanged.
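One way to see this in CRuby is to compare allocation counts via GC.stat: string += builds a new object every iteration, while << mutates in place. Counts are approximate, since other VM activity also allocates.

```ruby
# Count approximate object allocations around a block (CRuby-specific)
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

copying = allocations do
  s = ""
  1_000.times { s += "x" }  # a fresh string object each iteration
end

mutating = allocations do
  s = ""
  1_000.times { s << "x" }  # one string, grown in place
end

copying > mutating  # => true: both loops are O(n), but one churns the GC
```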
Lazy enumerables defer computation, affecting when complexity manifests. array.lazy.map { ... }.first(5) evaluates only five elements, not the entire array. This transforms O(n) operations into O(1) when taking constant-size results.
# Lazy evaluation changes practical complexity
huge_array = (1..1_000_000).to_a
# Eager: processes all 1M elements
huge_array.map { |n| n * 2 }.first(5) # O(n) for full map
# Lazy: processes only 5 elements
huge_array.lazy.map { |n| n * 2 }.first(5) # O(1) - constant work
Practical Examples
Linear search demonstrates O(n) complexity by examining each element until it finds the target or exhausts the collection. The worst case requires checking every element.
def linear_search(array, target)
  array.each_with_index do |element, index|
    return index if element == target
  end
  nil
end
# Best case: O(1) - target at first position
# Worst case: O(n) - target at end or absent
# Average case: O(n) - expected to check n/2 elements
Binary search achieves O(log n) by halving the search space each iteration. Requires sorted input. Each comparison eliminates half the remaining elements.
def binary_search(sorted_array, target)
  left = 0
  right = sorted_array.length - 1
  while left <= right
    mid = (left + right) / 2
    if sorted_array[mid] == target
      return mid
    elsif sorted_array[mid] < target
      left = mid + 1
    else
      right = mid - 1
    end
  end
  nil
end
# Each iteration: halves search space
# After k iterations: n / 2^k elements remain
# Terminates when n / 2^k = 1
# Solving: 2^k = n means k = log₂(n)
# Complexity: O(log n)
Bubble sort illustrates O(n²) quadratic complexity through nested iteration. Outer loop runs n times, inner loop runs up to n times per outer iteration.
def bubble_sort(array)
  n = array.length
  (n - 1).times do |i|        # Outer: n-1 iterations
    (n - i - 1).times do |j|  # Inner: decreasing iterations
      if array[j] > array[j + 1]
        array[j], array[j + 1] = array[j + 1], array[j]
      end
    end
  end
  array
end
# Total comparisons: (n-1) + (n-2) + ... + 1 = n(n-1)/2
# Simplified: O(n²)
Merge sort demonstrates O(n log n) divide-and-conquer efficiency. Dividing takes O(log n) levels, merging takes O(n) per level.
def merge_sort(array)
  return array if array.length <= 1
  mid = array.length / 2
  left = merge_sort(array[0...mid])   # Divide: log n levels
  right = merge_sort(array[mid..-1])
  merge(left, right)                  # Conquer: O(n) per level
end

def merge(left, right)
  result = []
  until left.empty? || right.empty?
    if left.first <= right.first
      result << left.shift
    else
      result << right.shift
    end
  end
  result + left + right
end
# Recurrence: T(n) = 2T(n/2) + O(n)
# Solution: O(n log n)
Hash table operations demonstrate constant average-case complexity with linear worst-case scenarios. Good hash functions distribute keys evenly, minimizing collisions.
# Using hash for O(1) lookups
def find_pair_sum(array, target)
  seen = {}
  array.each do |num|                # O(n) iterations
    complement = target - num
    return true if seen[complement]  # O(1) hash lookup
    seen[num] = true                 # O(1) hash insertion
  end
  false
end
# Total: O(n) with O(n) space
# Without hash: O(n²) with nested loops
Fibonacci calculation shows exponential recursion inefficiency versus linear dynamic programming.
# Naive recursion: O(2^n)
def fib_recursive(n)
  return n if n <= 1
  fib_recursive(n - 1) + fib_recursive(n - 2)
end
# Recalculates same values exponentially many times

# Dynamic programming: O(n)
def fib_dp(n)
  return n if n <= 1
  prev, curr = 0, 1
  (2..n).each do
    prev, curr = curr, prev + curr
  end
  curr
end
# Each value calculated exactly once
Common Patterns
Constant time O(1) operations execute in fixed time regardless of input size. Array indexing, hash table lookups, variable assignment, and arithmetic operations all demonstrate constant complexity.
# All O(1) operations
array[5] # Direct index access
hash[key] # Hash lookup
x = y + z # Arithmetic
array.length # Stored value access
Logarithmic time O(log n) appears in algorithms repeatedly halving the problem space. Binary search, balanced tree operations, and certain divide-and-conquer algorithms exhibit logarithmic growth.
# Finding position in sorted array: O(log n)
def find_insert_position(sorted_array, value)
  left, right = 0, sorted_array.length
  while left < right
    mid = (left + right) / 2
    if sorted_array[mid] < value
      left = mid + 1
    else
      right = mid
    end
  end
  left
end
Linear time O(n) processes each input element a constant number of times. Single loops, linear search, and many array operations follow this pattern.
# Finding maximum: O(n)
def find_max(array)
  max = array.first
  array.each { |num| max = num if num > max }
  max
end

# Filtering: O(n)
def filter_even(array)
  array.select { |num| num.even? }
end
Linearithmic time O(n log n) characterizes efficient sorting algorithms. Merge sort, heap sort, and quicksort average case all achieve this complexity.
# Quicksort average case: O(n log n)
def quicksort(array)
  return array if array.length <= 1
  pivot = array[array.length / 2]
  left = array.select { |x| x < pivot }
  middle = array.select { |x| x == pivot }
  right = array.select { |x| x > pivot }
  quicksort(left) + middle + quicksort(right)
end
Quadratic time O(n²) results from nested iteration over the same input. Double loops, naive duplicate detection, and bubble sort demonstrate quadratic complexity.
# Checking all pairs: O(n²)
def has_duplicate(array)
  array.each_with_index do |item, i|
    array.each_with_index do |other, j|
      return true if i != j && item == other
    end
  end
  false
end
Exponential time O(2^n) grows extremely rapidly, often from exploring all subsets or combinations. Many brute-force solutions and naive recursive algorithms exhibit exponential complexity.
# Generating all subsets: O(2^n)
def all_subsets(array)
  return [[]] if array.empty?
  first = array[0]
  rest_subsets = all_subsets(array[1..-1])
  with_first = rest_subsets.map { |subset| [first] + subset }
  rest_subsets + with_first
end
# Each element: include or exclude = 2^n subsets
Performance Considerations
Asymptotic analysis ignores constants, but constants matter in practice. An O(n) algorithm with high constant overhead may perform worse than O(n log n) on realistic input sizes. Profiling reveals where theoretical analysis diverges from observed performance.
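Ruby's Benchmark module from the standard library makes these crossover effects measurable. A sketch comparing repeated membership tests (absolute timings vary by machine; only the relative gap matters):

```ruby
require 'benchmark'

array = (1..50_000).to_a
lookup = array.each_with_object({}) { |x, h| h[x] = true }  # O(n) to build

array_time = Benchmark.realtime { 100.times { array.include?(49_999) } }
hash_time  = Benchmark.realtime { 100.times { lookup.key?(49_999) } }

array_time > hash_time  # => true on typical hardware, usually by orders of magnitude
```

Building the hash is itself O(n), so it only pays off when lookups are repeated, which is exactly the kind of constant-factor reasoning asymptotic analysis omits.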
Input size determines which complexity class dominates. For small n, O(n²) may outperform O(n log n) due to lower overhead. Sorting 10 elements doesn't benefit from sophisticated algorithms. Sorting 1 million elements makes algorithmic choice critical.
# Simple insertion sort outperforms quicksort on tiny arrays
def hybrid_sort(array)
  if array.length < 10
    insertion_sort(array)  # O(n²) but low overhead
  else
    quicksort(array)       # O(n log n) with higher overhead
  end
end
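hybrid_sort above leans on an insertion_sort helper that isn't defined in this section; a minimal non-mutating sketch might look like this:

```ruby
# O(n²) worst case, O(n) on nearly sorted input, very low per-step overhead
def insertion_sort(array)
  result = array.dup
  (1...result.length).each do |i|
    value = result[i]
    j = i - 1
    while j >= 0 && result[j] > value
      result[j + 1] = result[j]  # shift larger elements right
      j -= 1
    end
    result[j + 1] = value
  end
  result
end

insertion_sort([5, 2, 4, 1, 3])  # => [1, 2, 3, 4, 5]
```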
Space-time tradeoffs frequently arise. Memoization exchanges O(n) memory for exponential time savings. Hash tables use O(n) space for O(1) lookup versus O(n) search in arrays. Compression trades CPU time for reduced memory and bandwidth.
# Memoized fibonacci: trades space for time
def fib_memo(n, memo = {})
  return n if n <= 1
  return memo[n] if memo[n]
  memo[n] = fib_memo(n - 1, memo) + fib_memo(n - 2, memo)
end
# Time: O(n) instead of O(2^n)
# Space: O(n) for memoization hash
Cache behavior affects real-world performance beyond asymptotic analysis. Algorithms accessing memory sequentially benefit from CPU cache, while random access patterns suffer cache misses. This makes array iteration faster than linked list traversal despite identical O(n) complexity.
Real-world input characteristics influence practical performance. Sorted or nearly-sorted input makes insertion sort O(n) instead of O(n²). Random pivot selection in quicksort avoids O(n²) worst case on sorted input. Knowing input patterns guides algorithm selection.
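Applying that idea to the quicksort shown earlier: sampling the pivot at random (one mitigation among several) makes pathological inputs vanishingly unlikely rather than impossible.

```ruby
# Quicksort with a random pivot: expected O(n log n) even on sorted input
def quicksort_random(array)
  return array if array.length <= 1
  pivot  = array.sample
  left   = array.select { |x| x < pivot }
  middle = array.select { |x| x == pivot }
  right  = array.select { |x| x > pivot }
  quicksort_random(left) + middle + quicksort_random(right)
end

quicksort_random((1..100).to_a)  # sorted input, no pathological recursion depth
```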
Database queries demonstrate how complexity analysis scales to distributed systems. A query joining two tables without index support can cost O(n * m). Adding an index reduces each lookup from O(n) to O(log n), dramatically improving performance at large scale.
# Database-backed duplicate detection
class User
  def self.find_duplicates_naive
    # Fetches all users: O(n) network + memory
    users = User.all.to_a
    users.select do |user|
      # For each user, queries the database: n queries total
      User.where(email: user.email).count > 1
    end
  end
  # Total: O(n²) work across n queries - unacceptable

  def self.find_duplicates_optimized
    # Single query with GROUP BY: O(n log n) in the database
    User.select(:email)
        .group(:email)
        .having('COUNT(*) > 1')
        .pluck(:email)
  end
  # Total: O(n log n) - one database operation
end
Common Pitfalls
Ignoring hidden complexity in library methods leads to unexpectedly slow code. Methods appearing constant-time may perform linear operations internally. Ruby's Array#uniq is O(n), String#reverse is O(n), and include? on arrays is O(n).
# Looks innocent, actually O(n²)
def remove_duplicates_bad(array)
  result = []
  array.each do |item|
    result << item unless result.include?(item)  # O(n) check
  end
  result
end
# Each iteration: O(n) check, n iterations = O(n²)

# Correct approach: O(n)
def remove_duplicates_good(array)
  array.uniq  # Uses hash internally
end
Analyzing only the dominant term without considering real-world input sizes causes poor decisions. An O(n²) algorithm may outperform O(n log n) for n < 100. Premature optimization based solely on asymptotic analysis wastes effort.
Confusing best, average, and worst cases leads to unrealistic expectations. Quicksort averages O(n log n) but degrades to O(n²) on pathological input. Hash tables average O(1) but worst-case O(n) with poor hash functions or adversarial input.
Overlooking space complexity while optimizing time complexity creates memory problems. Recursive solutions may crash from stack overflow despite good time complexity. Building large intermediate data structures may exhaust memory.
# Time-efficient but space-wasteful
def generate_primes_wasteful(n)
  # Creates an array of n + 1 boolean slots: O(n) space
  sieve = Array.new(n + 1, true)
  sieve[0] = sieve[1] = false
  (2..Math.sqrt(n)).each do |i|
    if sieve[i]
      (i * i..n).step(i) { |j| sieve[j] = false }
    end
  end
  sieve.each_with_index.select { |prime, _| prime }.map { |_, i| i }
end
# Time: O(n log log n), Space: O(n)
# For n = 1 billion, a Ruby array of n slots needs roughly 8 GB
# (one machine word per element), versus ~125 MB for a packed bit array
Assuming Big O applies to small inputs causes optimization in the wrong places. Asymptotic analysis describes behavior as n approaches infinity. For bounded input sizes, different factors dominate.
Forgetting that complexity is expressed in terms of input size, not fixed counts, misleads analysis. A loop that always runs 1000 times with an O(1) body is O(1) for that program, because 1000 is a constant; only when the iteration count depends on input size n does the complexity become O(n).
Misunderstanding amortized complexity leads to incorrect worst-case assumptions. Dynamic array insertion is O(1) amortized but occasionally O(n). Applications requiring guaranteed O(1) per operation cannot use dynamic arrays.
Reference
Complexity Classes
| Notation | Name | Example Operations | Growth Rate |
|---|---|---|---|
| O(1) | Constant | Array index, hash lookup, variable assignment | Unchanged |
| O(log n) | Logarithmic | Binary search, balanced tree operations | Doubles when n squares |
| O(n) | Linear | Array iteration, linear search | Doubles when n doubles |
| O(n log n) | Linearithmic | Merge sort, heap sort, efficient sorting | Slightly more than linear |
| O(n²) | Quadratic | Nested loops, bubble sort, selection sort | Quadruples when n doubles |
| O(n³) | Cubic | Triple nested loops, naive matrix multiplication | 8x when n doubles |
| O(2^n) | Exponential | Recursive fibonacci, subset generation | Doubles when n increases by 1 |
| O(n!) | Factorial | Permutation generation, traveling salesman brute force | Extremely rapid growth |
Common Data Structure Operations
| Data Structure | Access | Search | Insert | Delete | Space |
|---|---|---|---|---|---|
| Array | O(1) | O(n) | O(n) | O(n) | O(n) |
| Hash Table | N/A | O(1) avg | O(1) avg | O(1) avg | O(n) |
| Binary Search Tree | O(log n) avg | O(log n) avg | O(log n) avg | O(log n) avg | O(n) |
| Balanced Tree | O(log n) | O(log n) | O(log n) | O(log n) | O(n) |
| Stack | O(n) | O(n) | O(1) | O(1) | O(n) |
| Queue | O(n) | O(n) | O(1) | O(1) | O(n) |
| Linked List | O(n) | O(n) | O(1) | O(1) | O(n) |
Common Algorithm Complexities
| Algorithm | Time Complexity | Space Complexity | Notes |
|---|---|---|---|
| Binary Search | O(log n) | O(1) | Requires sorted input |
| Linear Search | O(n) | O(1) | Works on unsorted data |
| Bubble Sort | O(n²) | O(1) | Simple but inefficient |
| Selection Sort | O(n²) | O(1) | Inefficient, few swaps |
| Insertion Sort | O(n²) avg, O(n) best | O(1) | Efficient for small or nearly sorted |
| Merge Sort | O(n log n) | O(n) | Stable, predictable |
| Quick Sort | O(n log n) avg, O(n²) worst | O(log n) | Fast in practice |
| Heap Sort | O(n log n) | O(1) | In-place, not stable |
| Breadth-First Search | O(V + E) | O(V) | Graph traversal |
| Depth-First Search | O(V + E) | O(V) | Graph traversal |
| Dijkstra's Algorithm | O((V + E) log V) | O(V) | Shortest path |
Analysis Checklist
| Step | Question | Action |
|---|---|---|
| 1 | Identify input size variable | Define what n represents |
| 2 | Count simple operations | Each statement: O(1) |
| 3 | Analyze sequential blocks | Add complexities |
| 4 | Analyze loops | Multiply iterations by body complexity |
| 5 | Analyze nested loops | Multiply loop complexities |
| 6 | Analyze recursion | Solve recurrence relation |
| 7 | Identify dominant term | Keep highest-order term |
| 8 | Drop constants and coefficients | Simplify to Big O notation |
| 9 | Consider best, average, worst | Document all three cases |
| 10 | Verify with examples | Test analysis on sample inputs |
Ruby Method Complexity Reference
| Method | Complexity | Notes |
|---|---|---|
| Array#[] | O(1) | Direct index access |
| Array#push | O(1) amortized | May resize array |
| Array#pop | O(1) | Removes last element |
| Array#shift | O(1) in CRuby | Moves start pointer; O(n) in naive implementations |
| Array#unshift | O(n) | Adds at start, shifts remaining |
| Array#include? | O(n) | Linear search |
| Array#sort | O(n log n) | Comparison sort |
| Array#reverse | O(n) | Creates new array |
| Hash#[] | O(1) avg | Depends on hash quality |
| Hash#[]= | O(1) avg | Insertion or update |
| Hash#delete | O(1) avg | Remove key-value pair |
| Set#include? | O(1) avg | Hash-based membership |
| Set#add | O(1) avg | Hash-based insertion |
| String#[] | O(1) for single-byte encodings | Multibyte strings may need an O(n) scan to find a character index |
| String#include? | O(n*m) | Substring search |
| String#reverse | O(n) | Creates new string |