CrackedRuby - Minimum Spanning Tree (Kruskal, Prim)

Overview

A minimum spanning tree (MST) is a subset of edges in a connected, weighted, undirected graph that connects all vertices with the minimum total edge weight and contains no cycles. The spanning tree includes all vertices from the original graph but uses exactly V-1 edges, where V represents the number of vertices. The "minimum" designation refers to the sum of edge weights being as small as possible among all possible spanning trees.

Two primary algorithms solve the MST problem: Kruskal's algorithm and Prim's algorithm. Both produce optimal solutions but differ in their approach. Kruskal's algorithm builds the MST by sorting edges and adding them in ascending weight order while avoiding cycles. Prim's algorithm grows the MST from a starting vertex, repeatedly adding the minimum-weight edge that connects a vertex in the tree to a vertex outside it.

MST algorithms apply to network design problems where connecting all nodes with minimum cost is required. These include designing telecommunications networks, planning road systems, constructing electrical grids, and creating computer network topologies. The algorithms also support clustering analysis, image segmentation, and approximation algorithms for NP-hard problems like the traveling salesman problem.

Both algorithms guarantee finding an optimal MST through greedy approaches. They make locally optimal choices at each step without backtracking, yet still achieve global optimality. This property makes MST algorithms both efficient and reliable for practical applications.

Key Principles

The fundamental principle underlying MST algorithms is the cut property of graphs. A cut divides graph vertices into two disjoint sets. The cut property states that for any cut of the graph, the minimum-weight edge crossing the cut belongs to some MST. This principle provides the theoretical foundation for both Kruskal's and Prim's algorithms, which repeatedly apply this property in different ways.

Kruskal's algorithm operates on the edge set directly. The algorithm maintains a forest of trees that gradually merges into a single tree. Initially, each vertex forms its own tree. The algorithm examines edges in order of increasing weight, adding an edge to the MST if it connects vertices in different trees. This edge selection process requires efficient detection of cycles, typically implemented using a union-find (disjoint-set) data structure. The union-find structure tracks which vertices belong to the same connected component, enabling constant-time queries about whether adding an edge would create a cycle.

The union-find data structure maintains a partition of vertices into disjoint sets. Two key operations define its interface: find determines which set contains a given element, and union merges two sets. Path compression during find operations and union by rank during merge operations provide nearly constant amortized time complexity. Without these optimizations, the structure would degrade to linear time performance.

Prim's algorithm grows a single tree from a starting vertex. The algorithm maintains two sets: vertices included in the MST and vertices not yet included. At each step, it selects the minimum-weight edge connecting a vertex in the MST to a vertex outside it. This selection process requires efficiently finding minimum-weight edges, typically implemented using a priority queue (min-heap). The priority queue stores vertices with their current minimum connection cost to the growing tree.

The greedy property ensures optimality for both algorithms. At each step, both algorithms make the choice that appears best at that moment without reconsidering previous decisions. The cut property guarantees this greedy approach produces an optimal result. When Kruskal's algorithm adds an edge, it adds the minimum-weight edge crossing some cut. When Prim's algorithm adds an edge, it adds the minimum-weight edge crossing the cut between selected and unselected vertices.

MST algorithms require connected graphs. In disconnected graphs, no spanning tree exists because some vertices cannot be reached from others. The algorithms produce a minimum spanning forest in such cases, creating a separate tree for each connected component. The total edge count equals V - C, where V represents vertices and C represents connected components.

Edge weight handling differs between variations. Standard MST algorithms assume distinct edge weights, though the algorithms work correctly with duplicate weights. When weights are equal, multiple MSTs may exist, and the algorithm's tie-breaking rules determine which one it produces. Some applications require finding all possible MSTs or counting them, which requires modified algorithms.

Implementation Approaches

Kruskal's algorithm follows an edge-centric approach. The implementation requires sorting all edges by weight, then iterating through them in order. For each edge, the algorithm checks whether its endpoints belong to different connected components. If they do, the edge joins two separate trees and gets added to the MST. The union-find structure tracks component membership and performs the merging operation.

The algorithm's structure involves three main phases: edge sorting, union-find initialization, and edge processing. Edge sorting dominates the time complexity when using comparison-based sorting, requiring O(E log E) time where E represents the number of edges. The union-find operations contribute nearly O(E α(V)) time, where α represents the inverse Ackermann function, which grows extremely slowly. For practical purposes, α(V) remains less than 5 for any reasonable graph size.

Prim's algorithm follows a vertex-centric approach. The implementation maintains a priority queue of vertices, ordered by their minimum connection cost to the current MST. Starting from an arbitrary vertex, the algorithm repeatedly extracts the minimum-cost vertex from the queue and adds its connecting edge to the MST. After adding a vertex, the algorithm updates the connection costs of its neighbors if the new vertex provides a cheaper path to the tree.

Implementation varies based on the priority queue structure. A binary heap provides O(log V) extraction and update operations, resulting in O(E log V) total time complexity. A Fibonacci heap improves this to O(E + V log V) through O(1) amortized decrease-key operations, though the constant factors make Fibonacci heaps impractical for most applications. Array-based implementations work well for dense graphs, achieving O(V²) time without the overhead of heap operations.

The choice between Kruskal's and Prim's algorithms depends on graph characteristics. Kruskal's algorithm performs better on sparse graphs where E approaches V, as the edge sorting cost remains manageable. Prim's algorithm excels on dense graphs where E approaches V², as it avoids sorting all edges and instead focuses on vertex processing. For graphs stored as adjacency lists, Prim's algorithm accesses memory more efficiently by following edges from vertices.

Parallel implementations differ significantly between algorithms. Kruskal's algorithm parallelizes poorly because the union-find structure requires synchronization to maintain consistency. Concurrent union-find implementations exist but add substantial overhead. Prim's algorithm parallelizes more naturally by maintaining multiple growing trees simultaneously, though merging them requires care to ensure optimality.

Ruby Implementation

Ruby implementations of MST algorithms use arrays, hashes, and priority queues. The standard library lacks a built-in heap structure, requiring either a custom implementation or a gem. The algorithms gem provides priority queue implementations, though manual binary heap code offers more control over the data structure.

class UnionFind
  def initialize(size)
    @parent = (0...size).to_a
    @rank = Array.new(size, 0)
  end
  
  def find(x)
    if @parent[x] != x
      @parent[x] = find(@parent[x])  # Path compression
    end
    @parent[x]
  end
  
  def union(x, y)
    root_x = find(x)
    root_y = find(y)
    return false if root_x == root_y
    
    # Union by rank
    if @rank[root_x] < @rank[root_y]
      @parent[root_x] = root_y
    elsif @rank[root_x] > @rank[root_y]
      @parent[root_y] = root_x
    else
      @parent[root_y] = root_x
      @rank[root_x] += 1
    end
    true
  end
  
  def connected?(x, y)
    find(x) == find(y)
  end
end

This union-find implementation includes path compression and union by rank optimizations. Path compression flattens the tree structure during find operations, making subsequent finds faster. Union by rank attaches the shorter tree under the taller tree's root, preventing degeneration into linear chains.

def kruskal_mst(vertices, edges)
  # edges = [[u, v, weight], ...]
  sorted_edges = edges.sort_by { |u, v, w| w }
  uf = UnionFind.new(vertices)
  mst = []
  total_weight = 0
  
  sorted_edges.each do |u, v, weight|
    if uf.union(u, v)
      mst << [u, v, weight]
      total_weight += weight
      break if mst.size == vertices - 1
    end
  end
  
  [mst, total_weight]
end

# Usage
vertices = 4
edges = [
  [0, 1, 10],
  [0, 2, 6],
  [0, 3, 5],
  [1, 3, 15],
  [2, 3, 4]
]

mst, weight = kruskal_mst(vertices, edges)
# => [[[2, 3, 4], [0, 3, 5], [0, 1, 10]], 19]

Kruskal's algorithm implementation sorts edges once, then processes them sequentially. The early termination when mst.size == vertices - 1 prevents unnecessary iterations after finding all MST edges. The union operation returns false when both vertices already belong to the same component, indicating the edge would create a cycle.

class PriorityQueue
  def initialize
    @heap = []
  end
  
  def insert(priority, item)
    @heap << [priority, item]
    bubble_up(@heap.size - 1)
  end
  
  def extract_min
    return nil if @heap.empty?
    swap(0, @heap.size - 1)
    min = @heap.pop
    bubble_down(0)
    min[1]
  end
  
  def decrease_key(item, new_priority)
    index = @heap.index { |p, i| i == item }
    return unless index
    
    @heap[index][0] = new_priority
    bubble_up(index)
  end
  
  def empty?
    @heap.empty?
  end
  
  private
  
  def bubble_up(index)
    return if index == 0
    parent = (index - 1) / 2
    
    if @heap[index][0] < @heap[parent][0]
      swap(index, parent)
      bubble_up(parent)
    end
  end
  
  def bubble_down(index)
    left = 2 * index + 1
    right = 2 * index + 2
    smallest = index
    
    smallest = left if left < @heap.size && @heap[left][0] < @heap[smallest][0]
    smallest = right if right < @heap.size && @heap[right][0] < @heap[smallest][0]
    
    if smallest != index
      swap(index, smallest)
      bubble_down(smallest)
    end
  end
  
  def swap(i, j)
    @heap[i], @heap[j] = @heap[j], @heap[i]
  end
end

This priority queue uses a binary heap stored in an array. The bubble_up and bubble_down methods maintain the heap property after insertions and extractions. The decrease_key operation searches for the item linearly, which is inefficient but simpler than maintaining a position map.

def prim_mst(vertices, adjacency_list)
  # adjacency_list = {vertex => [[neighbor, weight], ...]}
  visited = Array.new(vertices, false)
  mst = []
  total_weight = 0
  pq = PriorityQueue.new
  
  # Start from vertex 0
  visited[0] = true
  adjacency_list[0].each { |neighbor, weight| pq.insert(weight, [0, neighbor]) }
  
  while !pq.empty?
    edge = pq.extract_min
    u, v = edge
    
    next if visited[v]
    
    visited[v] = true
    mst << [u, v]
    total_weight += adjacency_list[u].find { |n, w| n == v }[1]
    
    adjacency_list[v].each do |neighbor, weight|
      pq.insert(weight, [v, neighbor]) unless visited[neighbor]
    end
  end
  
  [mst, total_weight]
end

# Usage
adjacency_list = {
  0 => [[1, 10], [2, 6], [3, 5]],
  1 => [[0, 10], [3, 15]],
  2 => [[0, 6], [3, 4]],
  3 => [[0, 5], [1, 15], [2, 4]]
}

mst, weight = prim_mst(4, adjacency_list)
# => [[[0, 3], [3, 2], [0, 1]], 19]

Prim's algorithm implementation maintains a visited array to track vertices in the MST. The priority queue stores edges with their weights as priorities. When extracting an edge, the algorithm skips it if the destination vertex was already visited. After adding a vertex, all its unvisited neighbors get added to the queue.

def prim_mst_optimized(vertices, adjacency_list)
  visited = Array.new(vertices, false)
  min_cost = Array.new(vertices, Float::INFINITY)
  parent = Array.new(vertices, -1)
  
  min_cost[0] = 0
  total_weight = 0
  
  vertices.times do
    # Find minimum cost unvisited vertex
    u = -1
    (0...vertices).each do |v|
      if !visited[v] && (u == -1 || min_cost[v] < min_cost[u])
        u = v
      end
    end
    
    visited[u] = true
    total_weight += min_cost[u]
    
    # Update costs for neighbors
    adjacency_list[u].each do |v, weight|
      if !visited[v] && weight < min_cost[v]
        min_cost[v] = weight
        parent[v] = u
      end
    end
  end
  
  # Reconstruct MST edges
  mst = []
  (1...vertices).each do |v|
    mst << [parent[v], v] if parent[v] != -1
  end
  
  [mst, total_weight]
end

This array-based Prim's implementation avoids the priority queue overhead. The min_cost array tracks the minimum weight edge connecting each vertex to the current MST. The linear search for the minimum cost vertex makes this approach O(V²), which works well for dense graphs where this complexity matches or beats the heap-based version.

Performance Considerations

Kruskal's algorithm time complexity depends primarily on edge sorting. Using comparison-based sorting algorithms like mergesort or heapsort requires O(E log E) time. Since E cannot exceed V² in a simple graph, this simplifies to O(E log V). The union-find operations contribute O(E α(V)) time, where α represents the inverse Ackermann function. This function grows so slowly that α(V) remains less than 5 for any practical graph size, making these operations nearly constant time.

The space complexity of Kruskal's algorithm is O(V) for the union-find structure plus O(E) if edges need to be copied for sorting. In-place sorting reduces the edge storage requirement. The MST itself requires O(V) space to store V-1 edges. For sparse graphs where E is close to V, the total space complexity remains O(V).

Prim's algorithm time complexity varies by implementation. Using a binary heap for the priority queue results in O((V + E) log V) time. Each vertex gets extracted from the heap once (V extractions at O(log V) each), and each edge causes at most one priority update (E updates at O(log V) each). For dense graphs where E approaches V², this becomes O(V² log V).

The array-based Prim's implementation achieves O(V²) time complexity by performing a linear search for the minimum cost vertex at each iteration. Dense graphs make this approach competitive because the number of edges approaches V², matching the V² iterations. The reduced constant factors from avoiding heap maintenance often make array-based Prim faster in practice for dense graphs.

require 'benchmark'

def generate_graph(vertices, density)
  edges = []
  (0...vertices).each do |u|
    ((u+1)...vertices).each do |v|
      edges << [u, v, rand(1..100)] if rand < density
    end
  end
  edges
end

# Sparse graph comparison
vertices = 1000
sparse_edges = generate_graph(vertices, 0.01)

Benchmark.bm(20) do |x|
  x.report("Kruskal (sparse):") { kruskal_mst(vertices, sparse_edges) }
  x.report("Prim (sparse):") { 
    adj_list = Hash.new { |h, k| h[k] = [] }
    sparse_edges.each { |u, v, w| adj_list[u] << [v, w]; adj_list[v] << [u, w] }
    prim_mst(vertices, adj_list)
  }
end

# Dense graph comparison
dense_edges = generate_graph(vertices, 0.5)

Benchmark.bm(20) do |x|
  x.report("Kruskal (dense):") { kruskal_mst(vertices, dense_edges) }
  x.report("Prim (dense):") {
    adj_list = Hash.new { |h, k| h[k] = [] }
    dense_edges.each { |u, v, w| adj_list[u] << [v, w]; adj_list[v] << [u, w] }
    prim_mst_optimized(vertices, adj_list)
  }
end

Benchmark results vary by graph structure. Kruskal's algorithm performs consistently across different densities because edge sorting dominates the runtime regardless of graph structure. Prim's algorithm shows greater variation, with the heap-based version excelling on sparse graphs and the array-based version performing better on dense graphs.

Memory access patterns affect performance significantly. Kruskal's algorithm accesses edges in sorted order, which provides poor cache locality if edges are scattered in memory. Prim's algorithm follows graph structure, accessing neighbors of each vertex sequentially. For graphs stored as adjacency lists, this sequential access pattern improves cache performance.

Edge representation impacts memory usage. Storing edges as arrays of three elements requires 24 bytes per edge on 64-bit Ruby (three 8-byte pointers). For graphs with millions of edges, this memory consumption becomes significant. Packed structures or integer arrays reduce memory usage but sacrifice Ruby's readability and flexibility.

The choice between algorithms depends on graph properties and constraints. Sparse graphs with E close to V favor Kruskal's algorithm, as the sorting cost remains low. Dense graphs with E approaching V² favor array-based Prim's algorithm, which avoids sorting overhead and maintains better cache locality. Real-world graphs often exhibit small-world properties with E = O(V), making Kruskal's algorithm a reliable default choice.

Practical Examples

Network design represents the canonical MST application. A telecommunications company needs to connect offices in multiple cities with fiber optic cables. Each potential cable connection has an associated cost based on distance and terrain. The MST provides the minimum-cost network connecting all offices.

# City network design
cities = {
  0 => "New York",
  1 => "Boston", 
  2 => "Philadelphia",
  3 => "Washington DC",
  4 => "Pittsburgh"
}

# Connections with costs in thousands of dollars
connections = [
  [0, 1, 215],  # NY-Boston
  [0, 2, 95],   # NY-Philadelphia
  [0, 3, 225],  # NY-DC
  [0, 4, 360],  # NY-Pittsburgh
  [1, 2, 270],  # Boston-Philadelphia
  [2, 3, 140],  # Philadelphia-DC
  [2, 4, 305],  # Philadelphia-Pittsburgh
  [3, 4, 245]   # DC-Pittsburgh
]

mst, cost = kruskal_mst(cities.size, connections)

puts "Minimum cost network: $#{cost}k"
mst.each do |u, v, weight|
  puts "Connect #{cities[u]} to #{cities[v]}: $#{weight}k"
end
# Output:
# Minimum cost network: $615k
# Connect New York to Philadelphia: $95k
# Connect Philadelphia to Washington DC: $140k
# Connect New York to Boston: $215k
# Connect Philadelphia to Pittsburgh: $305k

The MST reduces total network cost by eliminating redundant connections while maintaining connectivity. The solution uses only four cables to connect five cities, saving hundreds of thousands compared to connecting every city pair.

Clustering analysis uses MST algorithms to identify natural groupings in data. The approach builds an MST on data points using distance as edge weights, then removes the longest edges to create separate clusters. This single-linkage clustering method assumes clusters connect through their closest points.

def euclidean_distance(p1, p2)
  Math.sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2)
end

def mst_clustering(points, num_clusters)
  # Build complete graph with distances
  edges = []
  points.size.times do |i|
    ((i+1)...points.size).each do |j|
      dist = euclidean_distance(points[i], points[j])
      edges << [i, j, dist]
    end
  end
  
  # Build MST
  mst, _ = kruskal_mst(points.size, edges)
  
  # Sort MST edges by weight and remove longest ones
  mst_sorted = mst.sort_by { |u, v, w| -w }
  edges_to_remove = num_clusters - 1
  
  # Use union-find to identify clusters
  uf = UnionFind.new(points.size)
  mst_sorted[edges_to_remove..-1].each do |u, v, _|
    uf.union(u, v)
  end
  
  # Group points by cluster
  clusters = Hash.new { |h, k| h[k] = [] }
  points.size.times do |i|
    clusters[uf.find(i)] << i
  end
  
  clusters.values
end

# Data points (x, y coordinates)
points = [
  [1, 2], [2, 1], [1.5, 1.5],      # Cluster 1
  [8, 8], [9, 9], [8.5, 8.5],      # Cluster 2
  [15, 2], [16, 3], [15.5, 2.5]    # Cluster 3
]

clusters = mst_clustering(points, 3)
clusters.each_with_index do |cluster, i|
  puts "Cluster #{i + 1}: #{cluster.map { |idx| points[idx] }}"
end

This clustering approach identifies three distinct groups in the data. By removing the two longest MST edges, the algorithm partitions the data into three connected components. Each component represents a cluster of nearby points.

Image segmentation applies MST algorithms to divide images into regions. Pixels become graph vertices, and edge weights represent the dissimilarity between adjacent pixels based on color or intensity. The MST groups similar pixels together, and removing high-weight edges creates segments.

class ImageSegmentation
  def initialize(image_data, width, height)
    @data = image_data  # Flat array of pixel values
    @width = width
    @height = height
  end
  
  def segment(num_segments)
    # Build graph of adjacent pixels
    edges = []
    
    @height.times do |y|
      @width.times do |x|
        idx = y * @width + x
        
        # Right neighbor
        if x < @width - 1
          right_idx = idx + 1
          weight = (@data[idx] - @data[right_idx]).abs
          edges << [idx, right_idx, weight]
        end
        
        # Bottom neighbor
        if y < @height - 1
          bottom_idx = idx + @width
          weight = (@data[idx] - @data[bottom_idx]).abs
          edges << [idx, bottom_idx, weight]
        end
      end
    end
    
    # Build MST and remove largest edges
    vertices = @width * @height
    mst, _ = kruskal_mst(vertices, edges)
    mst_sorted = mst.sort_by { |u, v, w| -w }
    
    # Create segments
    uf = UnionFind.new(vertices)
    mst_sorted[(num_segments - 1)..-1].each do |u, v, _|
      uf.union(u, v)
    end
    
    # Map each pixel to its segment
    segments = Hash.new { |h, k| h[k] = [] }
    vertices.times { |i| segments[uf.find(i)] << i }
    
    segments.values
  end
end

# Example with simple grayscale image
image = [
  10, 12, 11,  50, 52, 51,
  11, 10, 12,  51, 50, 52,
  12, 11, 10,  52, 51, 50
]

segmenter = ImageSegmentation.new(image, 6, 3)
segments = segmenter.segment(2)

puts "Found #{segments.size} segments"
segments.each_with_index do |segment, i|
  puts "Segment #{i + 1}: #{segment.size} pixels"
end

The segmentation groups pixels with similar intensities. The example image has two distinct regions with different gray levels, and the MST-based approach successfully separates them by cutting the high-weight edges between dissimilar pixels.

Common Pitfalls

Forgetting to handle disconnected graphs causes runtime errors or incorrect results. MST algorithms assume graph connectivity, and disconnected graphs lack spanning trees. The algorithms produce a minimum spanning forest instead, creating separate trees for each component. Detecting disconnection requires checking whether the MST contains V-1 edges.

def kruskal_with_validation(vertices, edges)
  mst, total_weight = kruskal_mst(vertices, edges)
  
  if mst.size < vertices - 1
    components = vertices - mst.size
    raise "Graph is disconnected: #{components} components"
  end
  
  [mst, total_weight]
end

The validation checks edge count after MST construction. A connected graph's MST contains exactly V-1 edges. Fewer edges indicate disconnected components. Some applications require spanning forests rather than failing on disconnected graphs.

Incorrect union-find implementation breaks Kruskal's algorithm. Path compression and union by rank optimizations are optional but provide the nearly constant amortized time complexity that makes the algorithm efficient. Without these optimizations, union-find operations degrade to O(V) time, making the overall algorithm O(EV).

# Inefficient union-find without optimizations
class NaiveUnionFind
  def initialize(size)
    @parent = (0...size).to_a
  end
  
  def find(x)
    # No path compression - follows chain to root
    x = @parent[x] while @parent[x] != x
    x
  end
  
  def union(x, y)
    # No union by rank - arbitrary attachment
    root_x = find(x)
    root_y = find(y)
    return false if root_x == root_y
    
    @parent[root_x] = root_y
    true
  end
end

This naive implementation works correctly but performs poorly on large graphs. Repeated unions can create long chains where find operations require traversing many links. The optimized version flattens these chains and balances tree heights.

Mishandling edge weights leads to incorrect MST construction. Negative weights work correctly in MST algorithms because the greedy approach still selects minimum-weight edges. However, zero weights require attention because multiple MSTs may exist, and the algorithm's tie-breaking determines which one gets selected.

# Graph with zero-weight edges
edges = [
  [0, 1, 0],
  [1, 2, 0],
  [2, 3, 1],
  [0, 3, 2]
]

mst1, _ = kruskal_mst(4, edges)
# Result depends on edge ordering in input
# Multiple valid MSTs exist

The MST is not unique when edges have equal weights. Different edge orderings in the input can produce different but equally valid MSTs with the same total weight. Applications requiring deterministic output should sort edges by weight and then by vertex indices.

Confusing directed and undirected graphs causes errors. MST algorithms operate on undirected graphs where edges work in both directions. Applying MST algorithms to directed graphs produces incorrect results because the undirected assumption breaks. Directed graphs require different algorithms like Edmonds' algorithm for minimum spanning arborescence.

# Incorrect: treating directed graph as undirected
directed_edges = [
  [0, 1, 5],   # 0 -> 1
  [1, 2, 3],   # 1 -> 2
  [2, 0, 4]    # 2 -> 0 (back to start)
]

# This produces wrong results - edges treated as bidirectional
mst, _ = kruskal_mst(3, directed_edges)

MST algorithms assume edges connect vertices in both directions. Directed graphs have edges that only work in one direction, requiring specialized algorithms that respect edge directionality. The minimum spanning arborescence problem finds a directed tree rooted at a source vertex with minimum total edge weight.

Priority queue implementation errors affect Prim's algorithm correctness. Forgetting to update vertex priorities when finding shorter connections prevents the algorithm from finding the optimal MST. The decrease-key operation maintains the invariant that each vertex's priority reflects its minimum connection cost to the current tree.

# Incorrect Prim - missing priority updates
def broken_prim(vertices, adjacency_list)
  visited = Array.new(vertices, false)
  pq = PriorityQueue.new
  
  pq.insert(0, 0)
  
  while !pq.empty?
    v = pq.extract_min
    next if visited[v]
    
    visited[v] = true
    
    adjacency_list[v].each do |neighbor, weight|
      # Bug: always inserting without checking if neighbor
      # already has a better path
      pq.insert(weight, neighbor) unless visited[neighbor]
    end
  end
end

This broken implementation inserts duplicate entries for vertices into the priority queue without updating existing entries. The queue fills with redundant entries, wasting memory and time. The correct approach updates a vertex's priority when a better connection is found or tracks the minimum cost in a separate array.

Reference

Algorithm Comparison

Aspect	Kruskal's Algorithm	Prim's Algorithm
Approach	Edge-based, sorts all edges	Vertex-based, grows from start
Data Structure	Union-Find	Priority Queue or Array
Time Complexity	O(E log E) or O(E log V)	O(E log V) with heap, O(V²) with array
Space Complexity	O(V) for union-find	O(V) for priority queue
Best For	Sparse graphs (E ≈ V)	Dense graphs (E ≈ V²)
Graph Representation	Edge list	Adjacency list or matrix
Parallelization	Difficult	More feasible

Time Complexity Components

Component	Kruskal	Prim (Heap)	Prim (Array)
Initialization	O(V)	O(V)	O(V)
Sorting/Heap Build	O(E log E)	O(V)	O(1)
Main Loop	O(E α(V))	O(E log V)	O(V²)
Total	O(E log E)	O(E log V)	O(V²)

Union-Find Operations

Operation	Without Optimization	With Path Compression	With Both Optimizations
Find	O(V) worst case	O(log V) amortized	O(α(V)) amortized
Union	O(V) worst case	O(log V) amortized	O(α(V)) amortized
Space	O(V)	O(V)	O(V)

Implementation Decision Matrix

Graph Property	Recommended Algorithm	Reason
Sparse (E ≈ V)	Kruskal	Lower sorting cost
Dense (E ≈ V²)	Prim with array	Avoids heap overhead
Edge list available	Kruskal	Direct edge processing
Adjacency list available	Prim	Better memory access pattern
Need online algorithm	Prim	Can start before all edges known
Parallel processing	Prim	More parallelizable

Common Edge Cases

Case	Handling	Impact
Disconnected graph	Returns spanning forest	Fewer than V-1 edges
Equal edge weights	Multiple valid MSTs	Result depends on tie-breaking
Negative weights	Works correctly	Greedy approach still optimal
Self-loops	Ignore during processing	Never part of MST
Duplicate edges	Keep only minimum weight	Others never selected

Ruby Implementation Patterns

Pattern	Code Example	Use Case
Edge list input	[[u, v, w], ...]	Kruskal's algorithm
Adjacency list	{v => [[n, w], ...]}	Prim's algorithm
Union-Find	UnionFind class	Cycle detection
Priority Queue	PriorityQueue class	Minimum edge selection
Array-based	min_cost array	Dense graphs

Minimum Spanning Tree (Kruskal, Prim)