CrackedRuby

Overview

Merge Sort divides an unsorted array into smaller subarrays until each subarray contains a single element, then merges these subarrays back together in sorted order. This approach guarantees consistent performance regardless of input distribution.

The algorithm operates through two distinct phases: the division phase recursively splits the array into halves until reaching base cases of single elements, and the merge phase combines these sorted subarrays back together. During merging, the algorithm compares elements from two sorted subarrays and places them in correct order within a temporary array.

Merge Sort appears in production systems where predictable performance matters more than minimizing memory usage. Database systems use Merge Sort variants for external sorting when data exceeds available memory. Version control systems apply merge algorithms based on similar principles for combining branches. The algorithm's stability property makes it appropriate for multi-level sorting operations where the original order of equal elements must be preserved.

The algorithm requires additional memory proportional to the input size, distinguishing it from in-place sorting algorithms. This memory requirement represents the primary trade-off: guaranteed O(n log n) time complexity in exchange for O(n) space complexity.

def merge_sort(array)
  return array if array.length <= 1
  
  mid = array.length / 2
  left = merge_sort(array[0...mid])
  right = merge_sort(array[mid..-1])
  
  merge(left, right)
end

def merge(left, right)
  result = []
  
  until left.empty? || right.empty?
    result << (left.first <= right.first ? left.shift : right.shift)
  end
  
  result + left + right
end

# Example usage
array = [38, 27, 43, 3, 9, 82, 10]
sorted = merge_sort(array)
# => [3, 9, 10, 27, 38, 43, 82]

Key Principles

Merge Sort implements the divide-and-conquer paradigm through three fundamental operations: divide, conquer, and combine. The divide operation splits the problem into smaller subproblems, the conquer operation solves each subproblem recursively, and the combine operation merges solutions back together.

The division phase recursively partitions the array at its midpoint. Each recursive call receives a subarray half the size of its parent's input. This continues until reaching subarrays containing zero or one element, which are inherently sorted and serve as base cases for the recursion.

The merge phase performs the actual sorting work. The merge operation receives two sorted arrays and produces a single sorted array containing all elements from both inputs. The algorithm maintains two pointers, one for each input array, comparing elements at the current positions and advancing the pointer whose element was selected. This process continues until one array is exhausted, at which point the remaining elements from the other array are appended to the result.

Stability represents a critical property of Merge Sort. When two elements compare as equal, the algorithm preserves their original relative order. The merge operation achieves stability by selecting from the left subarray when elements are equal, maintaining the left-to-right ordering from the original array.
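A minimal sketch of this tie-breaking rule, using a key-based merge over small hashes (`merge_by_key` and the `:key`/`:tag` fields are illustrative names, not part of the implementations in this article):

```ruby
# Minimal key-based merge showing why ties must favor the left subarray.
def merge_by_key(left, right)
  result = []
  until left.empty? || right.empty?
    # `<=` (not `<`) on the keys is what preserves stability
    result << (left.first[:key] <= right.first[:key] ? left.shift : right.shift)
  end
  result + left + right
end

a = [{ key: 1, tag: :first }, { key: 2, tag: :x }]
b = [{ key: 1, tag: :second }]
p merge_by_key(a, b).map { |h| h[:tag] }
# => [:first, :second, :x] (the key-1 tie keeps :first ahead of :second)
```

With `<` instead of `<=`, the tie would be taken from the right subarray and the original left-to-right order of equal elements would be lost.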

The recursion depth equals the number of times the array can be divided in half, which is log₂(n) for an array of size n. At each level of recursion, the algorithm performs O(n) work merging subarrays. The total time complexity results from multiplying the number of levels by the work per level: O(n) × O(log n) = O(n log n).
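The depth claim can be checked directly. This throwaway helper (`depth_of_merge_sort` is an illustrative name) splits exactly the way merge sort does and reports the deepest level reached:

```ruby
# Throwaway check of the recursion depth: splits like merge_sort and
# returns the deepest level reached, which is ceil(log2(n)) for n elements.
def depth_of_merge_sort(array, depth = 0)
  return depth if array.length <= 1

  mid = array.length / 2
  [depth_of_merge_sort(array[0...mid], depth + 1),
   depth_of_merge_sort(array[mid..-1], depth + 1)].max
end

puts depth_of_merge_sort((1..16).to_a)    # 4  (log2(16))
puts depth_of_merge_sort((1..1000).to_a)  # 10 (ceil(log2(1000)))
```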

The memory footprint consists of two components: the call stack and merge buffers. The call stack requires O(log n) space for the recursion depth. The merge operation requires O(n) space for temporary arrays holding merged results. Different implementation strategies can modify these space requirements while maintaining the core algorithm structure.

Comparison semantics determine sorting order. The merge operation relies on a comparison function that defines the ordering relationship between elements. Ruby's spaceship operator (<=>) provides three-way comparison, returning a negative value, zero, or a positive value (-1, 0, or 1 for the built-in types) to indicate less than, equal to, or greater than relationships.
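A quick sketch of how such comparators behave (the `cmp`/`desc` lambdas are illustrative, in the style of a default `->(a, b) { a <=> b }`):

```ruby
# Illustrative comparator lambdas built on the spaceship operator.
cmp = ->(a, b) { a <=> b }
p cmp.call(1, 2)        # -1
p cmp.call(2.0, 2)      # 0 (mixed numeric types still compare)
p cmp.call("b", "a")    # 1

# Swapping the arguments yields a descending comparator
desc = ->(a, b) { b <=> a }
p [3, 1, 2].sort(&desc) # [3, 2, 1]
```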

The algorithm's deterministic nature means identical inputs always produce identical execution paths. The number of recursive calls and element moves depends only on array size, and the comparison count varies only within a narrow band depending on how the subarrays interleave. This predictability distinguishes Merge Sort from adaptive algorithms that optimize for partially sorted inputs.

Ruby Implementation

Ruby provides multiple approaches for implementing Merge Sort, each with different trade-offs between clarity, performance, and memory usage. The standard implementation follows the classical recursive structure with explicit merge logic.

class MergeSort
  def self.sort(array, &comparator)
    comparator ||= ->(a, b) { a <=> b }
    merge_sort_recursive(array.dup, comparator)
  end
  
  def self.merge_sort_recursive(array, comparator)
    return array if array.length <= 1
    
    mid = array.length / 2
    left = merge_sort_recursive(array[0...mid], comparator)
    right = merge_sort_recursive(array[mid..-1], comparator)
    
    merge(left, right, comparator)
  end
  
  def self.merge(left, right, comparator)
    result = []
    left_idx = 0
    right_idx = 0
    
    while left_idx < left.length && right_idx < right.length
      if comparator.call(left[left_idx], right[right_idx]) <= 0
        result << left[left_idx]
        left_idx += 1
      else
        result << right[right_idx]
        right_idx += 1
      end
    end
    
    result.concat(left[left_idx..-1]) if left_idx < left.length
    result.concat(right[right_idx..-1]) if right_idx < right.length
    
    result
  end
  
  # A bare `private` has no effect on `def self.` methods;
  # private_class_method actually hides the helpers
  private_class_method :merge_sort_recursive, :merge
end

# Usage with default comparison
numbers = [64, 34, 25, 12, 22, 11, 90]
sorted = MergeSort.sort(numbers)
# => [11, 12, 22, 25, 34, 64, 90]

# Custom comparison for descending order
descending = MergeSort.sort(numbers) { |a, b| b <=> a }
# => [90, 64, 34, 25, 22, 12, 11]

The buffer-reusing variant (often loosely called in-place, although it still needs one O(n) auxiliary buffer) reduces memory allocation by allocating a single temporary buffer once and reusing it across all merge operations. This approach requires passing the buffer and index boundaries through the recursion:

class InPlaceMergeSort
  def self.sort(array)
    return array if array.length <= 1
    temp = Array.new(array.length)
    merge_sort_helper(array, temp, 0, array.length - 1)
    array
  end
  
  def self.merge_sort_helper(array, temp, left, right)
    return if left >= right
    
    mid = (left + right) / 2
    merge_sort_helper(array, temp, left, mid)
    merge_sort_helper(array, temp, mid + 1, right)
    merge_inplace(array, temp, left, mid, right)
  end
  
  def self.merge_inplace(array, temp, left, mid, right)
    left_idx = left
    right_idx = mid + 1
    temp_idx = left
    
    while left_idx <= mid && right_idx <= right
      if array[left_idx] <= array[right_idx]
        temp[temp_idx] = array[left_idx]
        left_idx += 1
      else
        temp[temp_idx] = array[right_idx]
        right_idx += 1
      end
      temp_idx += 1
    end
    
    while left_idx <= mid
      temp[temp_idx] = array[left_idx]
      left_idx += 1
      temp_idx += 1
    end
    
    while right_idx <= right
      temp[temp_idx] = array[right_idx]
      right_idx += 1
      temp_idx += 1
    end
    
    (left..right).each { |i| array[i] = temp[i] }
  end
  
  private_class_method :merge_sort_helper, :merge_inplace
end

Sorting custom objects requires defining comparison logic. Ruby objects can implement the <=> operator or provide comparison blocks:

class Person
  include Comparable
  attr_reader :name, :age
  
  def initialize(name, age)
    @name = name
    @age = age
  end
  
  def <=>(other)
    age <=> other.age
  end
  
  def to_s
    "#{name}(#{age})"
  end
end

people = [
  Person.new("Alice", 30),
  Person.new("Bob", 25),
  Person.new("Carol", 35),
  Person.new("Dave", 25)
]

# Sort by age (default comparison)
by_age = MergeSort.sort(people)
# => [Bob(25), Dave(25), Alice(30), Carol(35)]

# Sort by name using custom comparator
by_name = MergeSort.sort(people) { |a, b| a.name <=> b.name }
# => [Alice(30), Bob(25), Carol(35), Dave(25)]

The iterative implementation eliminates recursion overhead by processing subarrays bottom-up. This approach merges adjacent pairs at each level before proceeding to larger groupings:

class IterativeMergeSort
  def self.sort(array)
    return array if array.length <= 1
    result = array.dup
    temp = Array.new(array.length)
    
    size = 1
    while size < array.length
      left = 0
      while left < array.length
        mid = [left + size - 1, array.length - 1].min
        right = [left + 2 * size - 1, array.length - 1].min
        
        merge_iterative(result, temp, left, mid, right) if mid < right
        left += 2 * size
      end
      size *= 2
    end
    
    result
  end
  
  def self.merge_iterative(array, temp, left, mid, right)
    i = left
    j = mid + 1
    k = left
    
    while i <= mid && j <= right
      if array[i] <= array[j]
        temp[k] = array[i]
        i += 1
      else
        temp[k] = array[j]
        j += 1
      end
      k += 1
    end
    
    while i <= mid
      temp[k] = array[i]
      i += 1
      k += 1
    end
    
    while j <= right
      temp[k] = array[j]
      j += 1
      k += 1
    end
    
    (left..right).each { |idx| array[idx] = temp[idx] }
  end
  
  private_class_method :merge_iterative
end

Ruby's Array class provides sort and sort_by methods implemented in C for performance, but understanding the Merge Sort implementation remains valuable for custom sorting requirements and educational purposes.

Performance Considerations

Merge Sort exhibits O(n log n) time complexity for all input cases: best, average, and worst. This consistency distinguishes it from algorithms like Quicksort, which degrades to O(n²) on certain inputs. The predictable performance makes Merge Sort appropriate for systems requiring guaranteed response times.

The time complexity analysis breaks down into two components. The division phase splits the array log₂(n) times, creating a recursion tree with log₂(n) levels. At each level, the merge operations collectively process every element exactly once, requiring O(n) work. The total complexity equals the number of levels multiplied by work per level: O(n) × O(log n) = O(n log n).

Space complexity requires O(n) auxiliary memory for merge buffers plus O(log n) for the call stack. The auxiliary space holds temporary arrays during merging. Each merge operation needs space equal to the combined size of the subarrays being merged. At the top level of recursion, this equals the entire array size.

The comparison count ranges from roughly n/2 × log₂(n) to n × log₂(n) comparisons. The minimum occurs when one subarray exhausts early at each merge, as on already sorted input. The maximum occurs when the subarrays interleave completely, requiring a comparison for nearly every element placement.
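A rough way to observe this range is to tally comparisons while sorting (`counting_sort_comparisons` is a throwaway instrument adapted from the simple merge_sort in the overview, not part of the implementations above):

```ruby
# Throwaway instrumentation of the comparison count.
def counting_sort_comparisons(array)
  comparisons = 0
  msort = lambda do |a|
    next a if a.length <= 1
    mid = a.length / 2
    left = msort.call(a[0...mid])
    right = msort.call(a[mid..-1])
    out = []
    until left.empty? || right.empty?
      comparisons += 1
      out << (left.first <= right.first ? left.shift : right.shift)
    end
    out + left + right
  end
  msort.call(array)
  comparisons
end

n = 64
sorted      = (1..n).to_a                               # left run exhausts first at every merge
interleaved = (1..n).step(2).to_a + (2..n).step(2).to_a # halves interleave at the final merge
puts counting_sort_comparisons(sorted)       # 192 = (n/2) * log2(n)
puts counting_sort_comparisons(interleaved)  # 223 (a fully interleaved final merge costs n - 1 comparisons)
```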

Memory allocation patterns significantly impact performance in garbage-collected languages like Ruby. Creating new arrays for each merge operation generates substantial garbage. The in-place variant reduces allocations by reusing a single temporary buffer, decreasing garbage collection pressure:

# Allocation-heavy approach
def merge_allocating(left, right)
  result = []
  # Creates new array on each merge
  until left.empty? || right.empty?
    result << (left.first <= right.first ? left.shift : right.shift)
  end
  result + left + right
end

# Reduced allocation approach (see merge_inplace above for the full body)
def merge_efficient(array, temp, left, mid, right)
  # Reuses the preallocated temp buffer passed in by the caller
  # No intermediate array creation during the merge
  # Copies the merged range from temp back into array at the end
end

Cache performance affects practical runtime. Merge Sort exhibits good spatial locality during the merge phase because it accesses arrays sequentially. However, the recursive structure can degrade cache performance when subarrays exceed cache size. The iterative implementation provides better cache behavior by processing data in a more predictable pattern.

Small array optimization improves performance by switching to insertion sort for small subarrays. When subarrays reach a threshold size (typically 10-20 elements), the overhead of recursion and merging exceeds the benefit of divide-and-conquer. Insertion sort performs well on small arrays due to lower constant factors:

def merge_sort_optimized(array)
  return insertion_sort(array) if array.length <= 15
  
  mid = array.length / 2
  left = merge_sort_optimized(array[0...mid])
  right = merge_sort_optimized(array[mid..-1])
  
  merge(left, right)
end

def insertion_sort(array)
  (1...array.length).each do |i|
    key = array[i]
    j = i - 1
    
    while j >= 0 && array[j] > key
      array[j + 1] = array[j]
      j -= 1
    end
    
    array[j + 1] = key
  end
  array
end

Parallel Merge Sort exploits multiple cores by sorting subarrays concurrently. Ruby's threading model limits the benefit for CPU-bound work due to the Global VM Lock (GVL, historically called the GIL), so thread-based futures overlap scheduling rather than computation; process-based or Ractor-based approaches can achieve true parallelism:

require 'concurrent'

def parallel_merge_sort(array, threshold = 1000)
  return merge_sort(array) if array.length < threshold
  
  mid = array.length / 2
  
  left_future = Concurrent::Future.execute { parallel_merge_sort(array[0...mid], threshold) }
  right_future = Concurrent::Future.execute { parallel_merge_sort(array[mid..-1], threshold) }
  
  left = left_future.value
  right = right_future.value
  
  merge(left, right)
end

Comparison operation cost dominates runtime when sorting complex objects. Expensive comparison functions multiply the O(n log n) comparison count by the per-comparison overhead. Caching comparison keys reduces repeated computation:

# Expensive comparison repeated many times
items.sort { |a, b| expensive_key(a) <=> expensive_key(b) }

# Compute keys once using Schwartzian Transform
items.map { |item| [expensive_key(item), item] }
     .sort_by(&:first)
     .map(&:last)

The external sorting variant handles datasets exceeding memory by sorting chunks that fit in memory, writing them to disk, then merging the sorted chunks. This approach scales to arbitrarily large datasets while maintaining O(n log n) complexity with respect to data size.

Design Considerations

Merge Sort competes with other O(n log n) algorithms, primarily Quicksort and Heapsort. Each algorithm offers different trade-offs between time complexity guarantees, space requirements, and practical performance characteristics.

Merge Sort vs Quicksort represents the most common comparison. Quicksort typically runs faster in practice due to better cache performance and lower constant factors, despite its O(n²) worst case. Merge Sort guarantees O(n log n) in all cases and maintains stability. Choose Merge Sort when:

  • Predictable performance matters more than average-case speed
  • Stability is required for the sorting operation
  • The dataset is linked-list-based rather than array-based
  • External sorting is needed for datasets exceeding memory

Choose Quicksort when:

  • Average-case performance is the priority
  • In-place sorting minimizes memory usage requirements
  • The implementation can use randomization to avoid worst cases
  • Cache performance significantly impacts overall system speed

# Quicksort excels with random access structures
array = (1..1_000_000).to_a.shuffle
# Quicksort: ~150ms, Merge Sort: ~200ms

# Merge Sort better for linked structures
list = LinkedList.from_array((1..1_000_000).to_a.shuffle)
# Merge Sort: ~180ms, Quicksort: ~300ms

Stability requirements dictate algorithm selection when sorting maintains multiple sort keys. Databases frequently perform multi-column sorts by sorting less significant columns first, then more significant columns, relying on stability to preserve prior ordering:

# Sort employees by department, then by hire date within department.
# The second (department) pass must be stable, or the hire_date
# ordering within each department is lost.
# Ruby's built-in sort_by is NOT guaranteed stable, so a stable
# sort such as Merge Sort is required here:
employees = MergeSort.sort(employees) { |a, b| a.hire_date <=> b.hire_date }
employees = MergeSort.sort(employees) { |a, b| a.department <=> b.department }

# Merge Sort maintains this property automatically
# Unstable algorithms like Quicksort or Heapsort do not

Memory constraints favor in-place algorithms when heap memory is limited. Embedded systems, memory-constrained environments, or very large datasets may prohibit the O(n) auxiliary space requirement. Heapsort provides O(n log n) guaranteed performance with O(1) space. In-place Merge Sort variants exist but increase complexity significantly.

Dataset characteristics influence algorithm selection. Nearly sorted data benefits from adaptive algorithms like Timsort, which combines Merge Sort and insertion sort while detecting sorted runs. Completely random data sees less benefit from adaptivity. Merge Sort maintains consistent performance regardless of initial ordering.

Linked list sorting strongly favors Merge Sort over array-based algorithms. Merge Sort requires only pointer manipulation during merging, avoiding the expensive element shifting of array-based sorts:

class Node
  attr_accessor :value, :next
  
  def initialize(value)
    @value = value
    @next = nil
  end
end

def merge_sort_list(head)
  return head if head.nil? || head.next.nil?
  
  # Split list into halves
  slow = head
  fast = head.next
  
  while fast && fast.next
    slow = slow.next
    fast = fast.next.next
  end
  
  mid = slow.next
  slow.next = nil
  
  left = merge_sort_list(head)
  right = merge_sort_list(mid)
  
  merge_lists(left, right)
end

def merge_lists(left, right)
  dummy = Node.new(0)
  current = dummy
  
  while left && right
    if left.value <= right.value
      current.next = left
      left = left.next
    else
      current.next = right
      right = right.next
    end
    current = current.next
  end
  
  current.next = left || right
  dummy.next
end

Parallel processing opportunities differ between algorithms. Merge Sort naturally parallelizes by sorting subarrays independently. Quicksort also parallelizes well, while Heapsort's sequential nature limits parallel optimization.

Implementation complexity affects maintainability and correctness. Merge Sort implements straightforwardly with clear separation between divide and merge phases. Quicksort requires careful pivot selection and partitioning logic. Heapsort needs heap maintenance operations. Simpler implementations reduce bug introduction risk.

The natural Merge Sort variant optimizes for partially sorted input by identifying existing sorted runs rather than dividing mechanically. This approach bridges the gap between standard Merge Sort and adaptive algorithms:

def natural_merge_sort(array)
  runs = identify_runs(array)
  
  while runs.length > 1
    merged_runs = []
    runs.each_slice(2) do |left, right|
      if right
        merged_runs << merge_runs(array, left, right)
      else
        merged_runs << left
      end
    end
    runs = merged_runs
  end
  
  array
end

def identify_runs(array)
  runs = []
  start = 0
  
  while start < array.length
    finish = start
    finish += 1 while finish < array.length - 1 && array[finish] <= array[finish + 1]
    runs << (start..finish)
    start = finish + 1
  end
  
  runs
end
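The merge_runs helper referenced by natural_merge_sort is not shown above. One possible implementation for range-based runs, merging two adjacent sorted runs in place and returning the combined range, might be:

```ruby
# Possible merge_runs for range-based runs: merges two adjacent sorted
# runs of `array` in place and returns the combined range.
def merge_runs(array, left, right)
  merged = []
  i = left.first
  j = right.first

  while i <= left.last && j <= right.last
    if array[i] <= array[j]
      merged << array[i]
      i += 1
    else
      merged << array[j]
      j += 1
    end
  end

  merged.concat(array[i..left.last]) if i <= left.last
  merged.concat(array[j..right.last]) if j <= right.last

  # Write the merged data back over the two adjacent runs
  array[left.first..right.last] = merged
  (left.first..right.last)
end

data = [3, 5, 9, 2, 4, 8]   # two sorted runs: 0..2 and 3..5
merge_runs(data, 0..2, 3..5)
# data is now [2, 3, 4, 5, 8, 9]
```

The in-place write is valid because identify_runs always produces adjacent runs, and merging adjacent pairs keeps the combined runs adjacent.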

Practical Examples

Sorting file records demonstrates Merge Sort's stability property. When processing log files or CSV data with multiple sort keys, stability preserves secondary ordering:

class LogEntry
  attr_reader :timestamp, :severity, :message
  
  def initialize(timestamp, severity, message)
    @timestamp = timestamp
    @severity = severity
    @message = message
  end
  
  def to_s
    "[#{timestamp}] #{severity}: #{message}"
  end
end

logs = [
  LogEntry.new("2025-01-15 10:30:00", "ERROR", "Connection timeout"),
  LogEntry.new("2025-01-15 10:30:05", "WARNING", "Slow query"),
  LogEntry.new("2025-01-15 10:30:02", "ERROR", "Invalid token"),
  LogEntry.new("2025-01-15 10:30:08", "INFO", "Request completed"),
  LogEntry.new("2025-01-15 10:30:03", "ERROR", "Database unavailable")
]

# First sort by timestamp to establish chronological order
sorted_by_time = MergeSort.sort(logs) { |a, b| a.timestamp <=> b.timestamp }

# Then sort by severity (ERROR > WARNING > INFO)
severity_order = { "ERROR" => 3, "WARNING" => 2, "INFO" => 1 }
sorted_by_severity = MergeSort.sort(sorted_by_time) do |a, b|
  severity_order[b.severity] <=> severity_order[a.severity]
end

# Result: All ERRORs appear first, chronologically ordered
# Then WARNINGs chronologically ordered
# Then INFOs chronologically ordered

Processing large datasets in chunks demonstrates external sorting. This approach handles files larger than available memory:

class ExternalMergeSort
  CHUNK_SIZE = 10_000
  
  def self.sort_file(input_path, output_path)
    chunk_files = create_sorted_chunks(input_path)
    merge_chunks(chunk_files, output_path)
    cleanup_chunks(chunk_files)
  end
  
  def self.create_sorted_chunks(input_path)
    chunk_files = []
    chunk_index = 0
    
    # File.foreach avoids leaking an open handle for the input file
    File.foreach(input_path).each_slice(CHUNK_SIZE) do |chunk|
      sorted_chunk = MergeSort.sort(chunk.map(&:to_i))
      chunk_file = "chunk_#{chunk_index}.tmp"
      
      File.open(chunk_file, 'w') do |f|
        sorted_chunk.each { |value| f.puts(value) }
      end
      
      chunk_files << chunk_file
      chunk_index += 1
    end
    
    chunk_files
  end
  
  def self.merge_chunks(chunk_files, output_path)
    readers = chunk_files.map { |f| File.open(f, 'r') }
    current_values = readers.map { |r| r.gets&.to_i }
    
    File.open(output_path, 'w') do |output|
      until current_values.all?(&:nil?)
        # Find minimum value among current positions
        min_idx = current_values.each_with_index
                                .reject { |val, _| val.nil? }
                                .min_by { |val, _| val }
                                &.last
        
        break unless min_idx
        
        output.puts(current_values[min_idx])
        current_values[min_idx] = readers[min_idx].gets&.to_i
      end
    end
    
    readers.each(&:close)
  end
  
  def self.cleanup_chunks(chunk_files)
    chunk_files.each { |f| File.delete(f) if File.exist?(f) }
  end
  
  # A bare `private` has no effect on `def self.` methods
  private_class_method :create_sorted_chunks, :merge_chunks, :cleanup_chunks
end

# Usage
ExternalMergeSort.sort_file('large_unsorted.txt', 'sorted_output.txt')

Merging sorted sequences appears frequently in data processing pipelines. Database query engines merge sorted index scans, and log aggregation systems merge time-ordered streams:

class SortedStreamMerger
  include Enumerable  # provides to_a, map, etc. on top of each
  
  def initialize(*streams)
    @streams = streams.map { |s| s.each }
    @current_values = @streams.map { |s| peek(s) }
  end
  
  def each
    return enum_for(:each) unless block_given?
    
    until @current_values.all?(&:nil?)
      min_idx = @current_values.each_with_index
                              .reject { |val, _| val.nil? }
                              .min_by { |val, _| val }
                              &.last
      
      break unless min_idx
      
      yield @current_values[min_idx]
      advance_stream(min_idx)
    end
  end
  
  private
  
  def peek(enumerator)
    enumerator.next
  rescue StopIteration
    nil
  end
  
  def advance_stream(index)
    @current_values[index] = peek(@streams[index])
  end
end

# Merge multiple sorted log files
stream1 = [1, 5, 9, 13].each
stream2 = [2, 6, 10, 14].each
stream3 = [3, 7, 11, 15].each

merger = SortedStreamMerger.new(stream1, stream2, stream3)
merged_result = merger.to_a
# => [1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15]

Common Patterns

The recursive divide-and-conquer pattern forms Merge Sort's foundation. This pattern applies to numerous algorithmic problems beyond sorting:

def merge_sort_pattern(problem)
  # Base case: problem is trivial to solve
  return trivial_solution(problem) if problem.trivial?
  
  # Divide: split problem into subproblems
  subproblems = divide(problem)
  
  # Conquer: solve each subproblem recursively
  subsolutions = subproblems.map { |subproblem| merge_sort_pattern(subproblem) }
  
  # Combine: merge subsolutions into final solution
  combine(subsolutions)
end

The sentinel value pattern simplifies merge logic by eliminating boundary checks. Appending infinity values to the subarrays lets the merge loop run without checking for exhausted arrays (this works for numeric data; other element types need a comparable maximum sentinel):

def merge_with_sentinels(left, right)
  left = left + [Float::INFINITY]
  right = right + [Float::INFINITY]
  result = []
  i = j = 0
  
  (left.length + right.length - 2).times do
    if left[i] <= right[j]
      result << left[i]
      i += 1
    else
      result << right[j]
      j += 1
    end
  end
  
  result
end

The three-way merge pattern extends binary merging to three subarrays simultaneously. This reduces the recursion depth from log₂(n) to log₃(n), potentially improving cache performance:

def three_way_merge_sort(array)
  return array if array.length <= 1
  
  # Guard against a zero-length split (and infinite recursion) for length-2 arrays
  third = [array.length / 3, 1].max
  
  first = three_way_merge_sort(array[0...third])
  second = three_way_merge_sort(array[third...2*third])
  third_part = three_way_merge_sort(array[2*third..-1])
  
  three_way_merge(first, second, third_part)
end

def three_way_merge(a, b, c)
  result = []
  i = j = k = 0
  
  while i < a.length && j < b.length && k < c.length
    if a[i] <= b[j] && a[i] <= c[k]
      result << a[i]
      i += 1
    elsif b[j] <= c[k]
      result << b[j]
      j += 1
    else
      result << c[k]
      k += 1
    end
  end
  
  # Combine leftovers: the loop exits when ANY array is exhausted, so up to
  # two arrays may still hold elements. Merge the a/b remainders, then
  # merge that result with the c remainder.
  result.concat(merge(merge(a[i..-1], b[j..-1]), c[k..-1]))
  result
end

The buffer reuse pattern minimizes allocations by maintaining a single temporary buffer throughout the sort. This pattern trades code complexity for reduced garbage collection overhead:

class BufferReusingMergeSort
  def self.sort(array)
    return array if array.length <= 1
    
    working = array.dup
    buffer = working.dup
    sort_range(buffer, working, 0, array.length - 1)
    working
  end
  
  private_class_method def self.sort_range(source, dest, left, right)
    # Sorts source[left..right], leaving the result in dest. Source and
    # destination swap roles at each level, so every merge reads from the
    # arrays its children just wrote and no extra buffers are allocated.
    return if left >= right
    
    mid = (left + right) / 2
    
    # Children sort into `source`, which this level then merges into `dest`
    sort_range(dest, source, left, mid)
    sort_range(dest, source, mid + 1, right)
    
    merge_range(source, dest, left, mid, right)
  end
  
  private_class_method def self.merge_range(source, dest, left, mid, right)
    i = left
    j = mid + 1
    k = left
    
    while i <= mid && j <= right
      if source[i] <= source[j]
        dest[k] = source[i]
        i += 1
      else
        dest[k] = source[j]
        j += 1
      end
      k += 1
    end
    
    while i <= mid
      dest[k] = source[i]
      i += 1
      k += 1
    end
    
    while j <= right
      dest[k] = source[j]
      j += 1
      k += 1
    end
  end
end

The natural run detection pattern identifies existing sorted sequences before dividing, reducing unnecessary work on partially sorted input:

def find_runs(array)
  return [] if array.empty?
  
  runs = []
  run_start = 0
  
  (1...array.length).each do |i|
    if array[i] < array[i - 1]
      runs << [run_start, i - 1]
      run_start = i
    end
  end
  
  runs << [run_start, array.length - 1]
  runs
end

def adaptive_merge_sort(array)
  runs = find_runs(array)
  
  while runs.length > 1
    new_runs = []
    runs.each_slice(2) do |run1, run2|
      if run2
        merged_run = merge_runs(array, run1, run2)
        new_runs << merged_run
      else
        new_runs << run1
      end
    end
    runs = new_runs
  end
  
  array
end
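As with the natural variant above, merge_runs is referenced but not defined. A possible version for the [start, finish] run pairs produced by find_runs might be:

```ruby
# Possible merge_runs for [start, finish] run pairs: merges two adjacent
# sorted runs of `array` in place and returns the combined run.
def merge_runs(array, run1, run2)
  s1, e1 = run1
  s2, e2 = run2
  merged = []
  i = s1
  j = s2

  while i <= e1 && j <= e2
    if array[i] <= array[j]
      merged << array[i]
      i += 1
    else
      merged << array[j]
      j += 1
    end
  end

  merged.concat(array[i..e1]) if i <= e1
  merged.concat(array[j..e2]) if j <= e2

  # Write the merged data back over the two adjacent runs
  array[s1..e2] = merged
  [s1, e2]
end

data = [4, 7, 1, 3, 9]   # runs: [0, 1] and [2, 4]
merge_runs(data, [0, 1], [2, 4])
# data is now [1, 3, 4, 7, 9]
```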

Reference

Time Complexity

Case Comparisons Array Accesses Overall
Best ~(n/2) log n n log n O(n log n)
Average n log n n log n O(n log n)
Worst n log n n log n O(n log n)

Space Complexity

Component Space Required Notes
Auxiliary Array O(n) For merge operations
Call Stack O(log n) Recursion depth
Total O(n) Dominated by auxiliary array
Buffer-Reusing Variant O(n) Single buffer allocated once and reused

Algorithm Characteristics

Property Value Explanation
Stable Yes Preserves relative order of equal elements
Adaptive No Same performance regardless of input order
Online No Requires entire input before starting
In-Place No Requires additional memory
Comparison-Based Yes Only uses element comparisons
Parallelizable Yes Subproblems independent

Comparison with Other Algorithms

Algorithm Time (Avg) Time (Worst) Space Stable In-Place
Merge Sort O(n log n) O(n log n) O(n) Yes No
Quick Sort O(n log n) O(n²) O(log n) No Yes
Heap Sort O(n log n) O(n log n) O(1) No Yes
Tim Sort O(n log n) O(n log n) O(n) Yes No
Insertion Sort O(n²) O(n²) O(1) Yes Yes

Implementation Variants

Variant Space Time Use Case
Top-Down Recursive O(n) O(n log n) Standard implementation
Bottom-Up Iterative O(n) O(n log n) Avoid recursion overhead
In-Place O(log n) O(n log n) Memory-constrained environments
Natural O(n) O(n) best, O(n log n) worst Partially sorted data
Three-Way O(n) O(n log₃ n) Better cache locality
External O(k) O(n log n) Data exceeds memory

Common Operations

Operation Implementation Complexity
Find Middle mid = (left + right) / 2 O(1)
Merge Two Arrays Sequential comparison O(n)
Split Array Create subarrays O(n)
Base Case Check length <= 1 O(1)

Ruby Method Signatures

Method Parameters Returns Purpose
merge_sort array sorted array Main sorting method
merge left, right merged array Combine sorted subarrays
merge_sort_recursive array, comparator sorted array Recursive helper
merge_inplace array, temp, left, mid, right void In-place merge

Performance Factors

Factor Impact Mitigation
Array Size Linear with merge operations Use threshold for small arrays
Memory Allocation Garbage collection overhead Reuse buffers
Comparison Cost Multiplies base complexity Cache comparison keys
Cache Misses Memory access latency Use iterative implementation
Recursion Depth Call stack overhead Switch to iterative for deep recursion

When to Use Merge Sort

Scenario Reason
Guaranteed performance needed O(n log n) worst case
Sorting linked lists No random access penalty
Stable sort required Preserves equal element order
External sorting Efficient with disk I/O
Parallel processing Independent subproblems
Merging sorted sequences Natural merge operation

When Not to Use Merge Sort

Scenario Better Alternative
Memory constrained Quick Sort or Heap Sort
In-place requirement Quick Sort or Heap Sort
Small arrays Insertion Sort
Nearly sorted data Tim Sort or Insertion Sort
Average case priority Quick Sort