Overview
A spanning tree represents a subgraph that connects all vertices in an undirected graph using the minimum number of edges without creating cycles. Given a graph with n vertices, a spanning tree contains exactly n-1 edges. The concept originated from electrical network analysis and now applies to network design, clustering algorithms, and optimization problems.
For a graph to have a spanning tree, it must be connected. Disconnected graphs cannot have spanning trees but can have spanning forests, where each connected component has its own spanning tree.
A minimum spanning tree (MST) extends this concept to weighted graphs by finding the spanning tree with the smallest total edge weight. Multiple spanning trees may exist for a single graph, but the MST is unique when all edge weights are distinct.
# Simple graph representation
graph = {
'A' => [['B', 4], ['C', 2]],
'B' => [['A', 4], ['C', 1], ['D', 5]],
'C' => [['A', 2], ['B', 1], ['D', 8]],
'D' => [['B', 5], ['C', 8]]
}
# One possible spanning tree: A-C (2), C-B (1), B-D (5)
# Total weight: 8
The spanning tree problem appears in network routing protocols, where routers construct spanning trees to prevent broadcast storms. The IEEE 802.1D Spanning Tree Protocol uses this principle to create loop-free topologies in bridged networks.
Key Principles
Spanning trees satisfy three fundamental properties: connectivity, acyclicity, and minimality. Connectivity ensures every vertex remains reachable from every other vertex. Acyclicity prevents loops, making the structure a tree. Minimality means the tree uses exactly n-1 edges for n vertices, with no redundant connections.
The cut property forms the theoretical foundation for MST algorithms. A cut partitions graph vertices into two disjoint sets. Any edge crossing this cut with minimum weight must belong to some MST. This property guarantees the correctness of greedy algorithms like Kruskal's and Prim's.
The cycle property provides an alternative characterization. For any cycle in the graph, the maximum-weight edge in that cycle cannot be part of any MST. This property enables optimization algorithms that start with all edges and remove maximum edges from cycles.
Union-Find data structures support efficient cycle detection during spanning tree construction. The structure tracks connected components and determines whether adding an edge would create a cycle. Two vertices belong to the same component if a path exists between them in the current forest.
class UnionFind
def initialize(size)
@parent = (0...size).to_a
@rank = Array.new(size, 0)
end
def find(x)
if @parent[x] != x
@parent[x] = find(@parent[x]) # Path compression
end
@parent[x]
end
def union(x, y)
root_x = find(x)
root_y = find(y)
return false if root_x == root_y # Already connected
# Union by rank
if @rank[root_x] < @rank[root_y]
@parent[root_x] = root_y
elsif @rank[root_x] > @rank[root_y]
@parent[root_y] = root_x
else
@parent[root_y] = root_x
@rank[root_x] += 1
end
true
end
end
Greedy algorithms dominate spanning tree construction due to the matroid structure of the problem. A matroid is a mathematical structure where locally optimal choices lead to globally optimal solutions. Both Kruskal's and Prim's algorithms make greedy choices that provably produce optimal results.
The spanning tree uniqueness theorem states that if all edge weights are distinct, exactly one MST exists. When edge weights contain duplicates, multiple MSTs may exist with the same total weight. This ambiguity affects applications requiring deterministic results.
Graph density influences algorithm selection. Sparse graphs with few edges favor Kruskal's algorithm, which processes edges directly. Dense graphs with many edges favor Prim's algorithm, which grows the tree from a starting vertex.
Ruby Implementation
Ruby's standard library lacks built-in spanning tree algorithms, requiring custom implementations or third-party gems. The RGL (Ruby Graph Library) gem provides graph data structures but still requires manual MST implementation for most use cases.
Kruskal's Algorithm Implementation
Kruskal's algorithm processes edges in ascending weight order, adding edges that connect previously disconnected components. The algorithm terminates when n-1 edges have been added.
class Graph
attr_reader :vertices, :edges
def initialize
@vertices = []
@edges = []
end
def add_edge(u, v, weight)
@vertices |= [u, v]
@edges << [weight, u, v]
end
def kruskal_mst
mst_edges = []
vertex_indices = @vertices.each_with_index.to_h
uf = UnionFind.new(@vertices.size)
@edges.sort.each do |weight, u, v|
u_idx = vertex_indices[u]
v_idx = vertex_indices[v]
if uf.union(u_idx, v_idx)
mst_edges << [u, v, weight]
break if mst_edges.size == @vertices.size - 1
end
end
mst_edges
end
end
# Usage
graph = Graph.new
graph.add_edge('A', 'B', 4)
graph.add_edge('A', 'C', 2)
graph.add_edge('B', 'C', 1)
graph.add_edge('B', 'D', 5)
graph.add_edge('C', 'D', 8)
mst = graph.kruskal_mst
# => [["B", "C", 1], ["A", "C", 2], ["B", "D", 5]]
This implementation sorts edges once at the beginning (O(E log E)) and processes each edge with near-constant-time union-find operations. Path compression and union by rank optimizations keep union-find operations effectively O(α(n)), where α is the inverse Ackermann function.
Prim's Algorithm Implementation
Prim's algorithm grows a spanning tree from a starting vertex, repeatedly adding the minimum-weight edge that connects the tree to an unvisited vertex.
require 'set'
class PrimGraph
def initialize
@adjacency = Hash.new { |h, k| h[k] = [] }
end
def add_edge(u, v, weight)
@adjacency[u] << [v, weight]
@adjacency[v] << [u, weight]
end
def prim_mst(start)
visited = Set.new([start])
mst_edges = []
# Min-heap: [weight, from, to]
edges = @adjacency[start].map { |v, w| [w, start, v] }
edges.sort!
while mst_edges.size < @adjacency.keys.size - 1 && !edges.empty?
weight, u, v = edges.shift
next if visited.include?(v)
visited.add(v)
mst_edges << [u, v, weight]
@adjacency[v].each do |neighbor, w|
unless visited.include?(neighbor)
# Insert edge in sorted position
idx = edges.bsearch_index { |e| e[0] >= w } || edges.size
edges.insert(idx, [w, v, neighbor])
end
end
end
mst_edges
end
end
# Usage
graph = PrimGraph.new
graph.add_edge('A', 'B', 4)
graph.add_edge('A', 'C', 2)
graph.add_edge('B', 'C', 1)
graph.add_edge('B', 'D', 5)
graph.add_edge('C', 'D', 8)
mst = graph.prim_mst('A')
# => [["A", "C", 2], ["C", "B", 1], ["B", "D", 5]]
This implementation maintains edges in sorted order through binary search insertion. Production implementations should use a proper priority queue data structure for O(log E) insertions instead of maintaining sorted arrays.
Priority Queue Implementation
Ruby lacks a built-in priority queue, but the standard library provides a binary heap through the PriorityQueue pattern or custom implementations.
class MinHeap
def initialize
@elements = []
end
def insert(priority, value)
@elements << [priority, value]
bubble_up(@elements.size - 1)
end
def extract_min
return nil if @elements.empty?
min = @elements[0]
@elements[0] = @elements.pop
bubble_down(0) unless @elements.empty?
min
end
def empty?
@elements.empty?
end
private
def bubble_up(index)
while index > 0
parent = (index - 1) / 2
break if @elements[parent][0] <= @elements[index][0]
@elements[parent], @elements[index] = @elements[index], @elements[parent]
index = parent
end
end
def bubble_down(index)
loop do
left = 2 * index + 1
right = 2 * index + 2
smallest = index
if left < @elements.size && @elements[left][0] < @elements[smallest][0]
smallest = left
end
if right < @elements.size && @elements[right][0] < @elements[smallest][0]
smallest = right
end
break if smallest == index
@elements[index], @elements[smallest] = @elements[smallest], @elements[index]
index = smallest
end
end
end
def prim_mst_heap(graph, start)
visited = Set.new
mst_edges = []
heap = MinHeap.new
visited.add(start)
graph[start].each { |neighbor, weight| heap.insert(weight, [start, neighbor]) }
until heap.empty?
weight, (u, v) = heap.extract_min
next if visited.include?(v)
visited.add(v)
mst_edges << [u, v, weight]
graph[v].each do |neighbor, w|
heap.insert(w, [v, neighbor]) unless visited.include?(neighbor)
end
end
mst_edges
end
The heap-based implementation reduces edge processing time from O(E²) to O(E log E), making it practical for large graphs.
Handling Disconnected Graphs
Disconnected graphs require spanning forest algorithms that process each connected component independently.
def spanning_forest(graph)
all_vertices = graph.vertices
visited = Set.new
forests = []
all_vertices.each do |start|
next if visited.include?(start)
component_vertices = Set.new
queue = [start]
until queue.empty?
vertex = queue.shift
next if visited.include?(vertex)
visited.add(vertex)
component_vertices.add(vertex)
graph.neighbors(vertex).each do |neighbor|
queue << neighbor unless visited.include?(neighbor)
end
end
# Build MST for this component
subgraph = graph.subgraph(component_vertices)
forests << subgraph.kruskal_mst
end
forests
end
Common Patterns
Kruskal's Algorithm Pattern
Kruskal's algorithm excels with sparse graphs where edge count remains small relative to vertex count. The pattern sorts edges by weight and greedily selects edges that don't form cycles.
Time complexity: O(E log E) for sorting edges, O(E α(V)) for union-find operations. Space complexity: O(V) for union-find structure, O(E) for edge storage.
def kruskal_with_details(edges, num_vertices)
edges_sorted = edges.sort_by { |u, v, w| w }
uf = UnionFind.new(num_vertices)
mst = []
rejected = []
edges_sorted.each do |u, v, weight|
if uf.union(u, v)
mst << [u, v, weight]
break if mst.size == num_vertices - 1
else
rejected << [u, v, weight] # Would create cycle
end
end
{ mst: mst, rejected: rejected, total_weight: mst.sum { |_, _, w| w } }
end
The algorithm remains optimal for graphs with E << V², such as road networks or sparse social networks. The union-find structure maintains disjoint sets efficiently through path compression.
Prim's Algorithm Pattern
Prim's algorithm works efficiently with dense graphs where edge count approaches V². The pattern maintains a cut between visited and unvisited vertices, always selecting the minimum-weight edge crossing the cut.
Time complexity: O(E log V) with binary heap, O(E + V log V) with Fibonacci heap. Space complexity: O(V) for visited set and priority queue.
def prim_with_tracking(adjacency, start)
visited = Set.new([start])
heap = MinHeap.new
mst = []
cut_edges = [] # Edges crossing the cut at each step
adjacency[start].each { |v, w| heap.insert(w, [start, v]) }
until heap.empty? || visited.size == adjacency.size
weight, (u, v) = heap.extract_min
if visited.include?(v)
next # Edge crosses within visited set
end
visited.add(v)
mst << [u, v, weight]
adjacency[v].each do |neighbor, w|
unless visited.include?(neighbor)
heap.insert(w, [v, neighbor])
cut_edges << [v, neighbor, w]
end
end
end
{ mst: mst, total_weight: mst.sum { |_, _, w| w } }
end
Prim's algorithm maintains a connected tree throughout execution, unlike Kruskal's forest-based approach. This property makes Prim's algorithm suitable for online scenarios where partial results need immediate availability.
Borůvka's Algorithm Pattern
Borůvka's algorithm, the oldest MST algorithm, works by simultaneously growing multiple trees in parallel. Each component selects its minimum outgoing edge, then all selected edges merge components.
def boruvka_mst(edges, num_vertices)
uf = UnionFind.new(num_vertices)
mst = []
loop do
# Find minimum outgoing edge for each component
min_edge = Array.new(num_vertices)
edges.each do |u, v, weight|
comp_u = uf.find(u)
comp_v = uf.find(v)
next if comp_u == comp_v # Same component
# Track minimum edge for each component
min_edge[comp_u] = [u, v, weight] if min_edge[comp_u].nil? || weight < min_edge[comp_u][2]
min_edge[comp_v] = [u, v, weight] if min_edge[comp_v].nil? || weight < min_edge[comp_v][2]
end
added = false
min_edge.compact.uniq.each do |u, v, weight|
if uf.union(u, v)
mst << [u, v, weight]
added = true
end
end
break unless added || mst.size >= num_vertices - 1
end
mst
end
Borůvka's algorithm performs well on parallel architectures where multiple components can process edges simultaneously. The algorithm completes in O(log V) iterations, making it attractive for distributed computing scenarios.
Reverse-Delete Algorithm Pattern
The reverse-delete algorithm starts with all edges and removes maximum-weight edges that don't disconnect the graph. This approach mirrors Kruskal's algorithm in reverse.
def reverse_delete_mst(edges, num_vertices)
mst = edges.sort_by { |u, v, w| -w } # Sort descending
mst.reject! do |u, v, weight|
# Try removing edge
test_graph = build_graph(mst - [[u, v, weight]])
# Check if graph remains connected
!connected?(test_graph, num_vertices)
end
mst
end
def connected?(graph, num_vertices)
visited = Set.new
queue = [graph.keys.first]
until queue.empty?
vertex = queue.shift
next if visited.include?(vertex)
visited.add(vertex)
graph[vertex]&.each { |neighbor| queue << neighbor }
end
visited.size == num_vertices
end
This algorithm requires O(E) connectivity checks, each taking O(E + V) time, resulting in O(E²(E + V)) worst-case complexity. The algorithm demonstrates correctness through the cycle property but remains impractical for large graphs.
Performance Considerations
Algorithm selection depends on graph characteristics and available data structures. Kruskal's algorithm performs better on sparse graphs with E = O(V), while Prim's algorithm excels with dense graphs where E = O(V²).
Time Complexity Analysis
Kruskal's algorithm: O(E log E) sorting dominates with simple union-find. The sorting step processes all edges once. Union-find operations approach O(1) amortized time with path compression and union by rank, contributing O(E α(V)) where α(V) grows extremely slowly (α(V) < 5 for any practical V).
Prim's algorithm: O(E log V) with binary heap. Each edge enters and leaves the priority queue once, with each operation costing O(log V). The algorithm processes V vertices and E edges total. With Fibonacci heaps, this improves to O(E + V log V) by reducing decrease-key operations to O(1) amortized time.
Borůvka's algorithm: O(E log V) for sequential execution. Each iteration merges components, reducing component count by at least half. This guarantees O(log V) iterations with O(E) work per iteration.
Space Complexity Considerations
Kruskal's algorithm: O(V) for union-find structure, O(E) for edge list. The algorithm requires storing all edges in memory for sorting. Edge-weighted graphs with millions of edges may exceed available memory, requiring external sorting algorithms.
Prim's algorithm: O(V) for visited set and priority queue, O(E) for adjacency representation. The algorithm can process edges lazily, reading from disk as needed. This property makes Prim's algorithm suitable for out-of-core computation with graphs too large for memory.
def memory_efficient_prim(edge_iterator, num_vertices, start)
visited = Set.new([start])
heap = MinHeap.new
mst = []
# Initial edges from start vertex
edge_iterator.edges_from(start).each do |v, weight|
heap.insert(weight, [start, v])
end
until heap.empty? || visited.size == num_vertices
weight, (u, v) = heap.extract_min
next if visited.include?(v)
visited.add(v)
mst << [u, v, weight]
# Load edges lazily
edge_iterator.edges_from(v).each do |neighbor, w|
heap.insert(w, [v, neighbor]) unless visited.include?(neighbor)
end
end
mst
end
Cache Performance
Modern processors perform better with algorithms exhibiting good spatial locality. Kruskal's algorithm processes edges sequentially after sorting, providing excellent cache behavior. Prim's algorithm performs random access to adjacency lists, potentially causing cache misses.
# Cache-friendly edge storage
class EdgeList
def initialize(edges)
@edges = edges.sort_by { |u, v, w| [u, v] } # Group by source
@vertex_start = {}
current_vertex = nil
@edges.each_with_index do |(u, v, w), idx|
if u != current_vertex
@vertex_start[u] = idx
current_vertex = u
end
end
end
def edges_from(vertex)
start_idx = @vertex_start[vertex]
return [] unless start_idx
end_idx = start_idx
while end_idx < @edges.size && @edges[end_idx][0] == vertex
end_idx += 1
end
@edges[start_idx...end_idx].map { |u, v, w| [v, w] }
end
end
Parallel Processing
Borůvka's algorithm parallelizes naturally by processing components independently. Each component identifies its minimum outgoing edge without coordination.
require 'parallel'
def parallel_boruvka(edges, num_vertices, num_threads: 4)
uf = UnionFind.new(num_vertices)
mst = []
loop do
components = (0...num_vertices).group_by { |v| uf.find(v) }
# Find minimum edge per component in parallel
min_edges = Parallel.map(components.values, in_threads: num_threads) do |comp|
min_edge = nil
min_weight = Float::INFINITY
edges.each do |u, v, weight|
if comp.include?(u) && !comp.include?(v) && weight < min_weight
min_edge = [u, v, weight]
min_weight = weight
end
end
min_edge
end
break unless min_edges.any?
min_edges.compact.each do |u, v, weight|
mst << [u, v, weight] if uf.union(u, v)
end
break if mst.size >= num_vertices - 1
end
mst
end
Parallel Prim's algorithm requires more sophisticated synchronization but can partition the graph and merge results.
Benchmarking Trade-offs
require 'benchmark'
def compare_algorithms(edges, num_vertices)
Benchmark.bm(15) do |x|
x.report("Kruskal:") do
graph = Graph.new
edges.each { |u, v, w| graph.add_edge(u, v, w) }
graph.kruskal_mst
end
x.report("Prim (array):") do
adj = build_adjacency(edges)
prim_mst_array(adj, edges[0][0])
end
x.report("Prim (heap):") do
adj = build_adjacency(edges)
prim_mst_heap(adj, edges[0][0])
end
x.report("Boruvka:") do
boruvka_mst(edges, num_vertices)
end
end
end
Empirical testing reveals that for graphs with fewer than 1000 vertices, implementation overhead often matters more than theoretical complexity. Simple array-based implementations may outperform heap-based algorithms due to lower constant factors.
Practical Examples
Network Design
Network topology design requires connecting nodes with minimum total cable cost. Each cable has installation cost based on distance.
class NetworkDesigner
def initialize
@locations = {}
@connections = []
end
def add_location(name, x, y)
@locations[name] = [x, y]
end
def calculate_costs
@locations.keys.combination(2).each do |loc1, loc2|
x1, y1 = @locations[loc1]
x2, y2 = @locations[loc2]
distance = Math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
@connections << [loc1, loc2, distance.round(2)]
end
end
def optimal_network
graph = Graph.new
@connections.each { |u, v, w| graph.add_edge(u, v, w) }
mst = graph.kruskal_mst
total_cost = mst.sum { |_, _, w| w }
{
connections: mst,
total_cost: total_cost,
savings: @connections.sum { |_, _, w| w } - total_cost
}
end
end
# Usage
designer = NetworkDesigner.new
designer.add_location('Server', 0, 0)
designer.add_location('Office A', 3, 4)
designer.add_location('Office B', 6, 8)
designer.add_location('Office C', 9, 2)
designer.calculate_costs
result = designer.optimal_network
# => {
# connections: [["Office A", "Server", 5.0], ["Office A", "Office B", 5.0], ...],
# total_cost: 18.47,
# savings: 23.15
# }
The spanning tree ensures all offices connect with minimum cable while avoiding redundant connections that increase cost without improving connectivity.
Clustering and Image Segmentation
Spanning trees support clustering by identifying natural groups in data. Removing the heaviest edges from an MST creates clusters.
class ImageSegmenter
def initialize(pixels)
@pixels = pixels # Array of [r, g, b] values
@width = Math.sqrt(@pixels.size).to_i
end
def segment(num_clusters)
# Build graph where edge weight = color difference
edges = []
@pixels.each_with_index do |pixel, idx|
x, y = idx % @width, idx / @width
# Connect to right neighbor
if x < @width - 1
neighbor_idx = idx + 1
weight = color_distance(pixel, @pixels[neighbor_idx])
edges << [idx, neighbor_idx, weight]
end
# Connect to bottom neighbor
if y < @width - 1
neighbor_idx = idx + @width
weight = color_distance(pixel, @pixels[neighbor_idx])
edges << [idx, neighbor_idx, weight]
end
end
# Build MST
graph = Graph.new
edges.each { |u, v, w| graph.add_edge(u, v, w) }
mst = graph.kruskal_mst
# Remove heaviest edges to create clusters
mst_sorted = mst.sort_by { |_, _, w| -w }
keep_edges = mst_sorted.drop(num_clusters - 1)
build_clusters(keep_edges)
end
private
def color_distance(c1, c2)
Math.sqrt((c1[0] - c2[0])**2 + (c1[1] - c2[1])**2 + (c1[2] - c2[2])**2)
end
def build_clusters(edges)
uf = UnionFind.new(@pixels.size)
edges.each { |u, v, _| uf.union(u, v) }
clusters = Hash.new { |h, k| h[k] = [] }
@pixels.each_with_index do |_, idx|
clusters[uf.find(idx)] << idx
end
clusters.values
end
end
This approach segments images by treating pixels as graph vertices and color similarity as edge weights. The MST captures strong color relationships, and cutting heavy edges separates distinct regions.
Routing Protocol Simulation
Network routing protocols use spanning trees to prevent packet loops while maintaining connectivity.
class NetworkRouter
def initialize
@network = {}
@bridge_id = {}
end
def add_bridge(id, priority)
@bridge_id[id] = priority
@network[id] = []
end
def add_link(bridge1, bridge2, cost)
@network[bridge1] << [bridge2, cost]
@network[bridge2] << [bridge1, cost]
end
def calculate_spanning_tree
# Select root bridge (lowest priority)
root = @bridge_id.min_by { |_, priority| priority }[0]
# Run Prim's algorithm from root
visited = Set.new([root])
heap = MinHeap.new
tree = []
blocked_ports = {}
@network[root].each do |neighbor, cost|
heap.insert(cost, [root, neighbor, cost])
end
while !heap.empty? && visited.size < @network.size
cost, (from, to, link_cost) = heap.extract_min
if visited.include?(to)
# Block port to prevent loop
blocked_ports[to] ||= []
blocked_ports[to] << from
next
end
visited.add(to)
tree << { from: from, to: to, cost: link_cost }
@network[to].each do |neighbor, neighbor_cost|
unless visited.include?(neighbor)
heap.insert(neighbor_cost, [to, neighbor, neighbor_cost])
end
end
end
{ tree: tree, blocked: blocked_ports, root: root }
end
end
# Usage
router = NetworkRouter.new
router.add_bridge('A', 1) # Root bridge (lowest priority)
router.add_bridge('B', 2)
router.add_bridge('C', 3)
router.add_bridge('D', 4)
router.add_link('A', 'B', 10)
router.add_link('A', 'C', 5)
router.add_link('B', 'C', 15)
router.add_link('B', 'D', 8)
router.add_link('C', 'D', 12)
result = router.calculate_spanning_tree
# Blocks redundant links to prevent broadcast storms
The spanning tree protocol ensures loop-free forwarding while maintaining network connectivity, critical for Ethernet bridging.
Maze Generation
Spanning trees generate perfect mazes (mazes with exactly one path between any two points). Random spanning tree algorithms create mazes with uniform distribution.
class MazeGenerator
def initialize(width, height)
@width = width
@height = height
@cells = width * height
end
def generate
# Build grid graph
edges = []
(0...@cells).each do |cell|
x, y = cell % @width, cell / @width
# Add edge to right cell
if x < @width - 1
edges << [cell, cell + 1, rand]
end
# Add edge to bottom cell
if y < @height - 1
edges << [cell, cell + @width, rand]
end
end
# Build random spanning tree (Kruskal's with random weights)
graph = Graph.new
edges.each { |u, v, w| graph.add_edge(u, v, w) }
mst = graph.kruskal_mst
format_maze(mst)
end
private
def format_maze(edges)
passages = Set.new
edges.each { |u, v, _| passages.add([u, v].sort) }
maze = Array.new(@height) { Array.new(@width, '█') }
(0...@cells).each do |cell|
x, y = cell % @width, cell / @width
maze[y][x] = ' '
# Add passages
if passages.include?([cell, cell + 1].sort)
maze[y][x + 1] = ' ' if x < @width - 1
end
if passages.include?([cell, cell + @width].sort)
maze[y + 1][x] = ' ' if y < @height - 1
end
end
maze.map { |row| row.join }.join("\n")
end
end
generator = MazeGenerator.new(10, 10)
puts generator.generate
Random spanning trees ensure maze solvability (exactly one solution) while preventing loops or isolated regions.
Reference
Algorithm Comparison
| Algorithm | Time Complexity | Space | Best For | Characteristics |
|---|---|---|---|---|
| Kruskal | O(E log E) | O(V) | Sparse graphs | Edge-based, requires sorting |
| Prim (binary heap) | O(E log V) | O(V) | Dense graphs | Vertex-based, grows single tree |
| Prim (Fibonacci heap) | O(E + V log V) | O(V) | Dense graphs | Optimal for dense graphs |
| Boruvka | O(E log V) | O(V) | Parallel processing | Component-based, parallelizable |
| Reverse-Delete | O(E² log V) | O(E) | Theoretical interest | Edge removal, impractical |
Union-Find Operations
| Operation | Without Optimization | With Path Compression | With Union by Rank | With Both |
|---|---|---|---|---|
| Find | O(n) | O(log n) | O(log n) | O(α(n)) |
| Union | O(n) | O(log n) | O(log n) | O(α(n)) |
| Space | O(n) | O(n) | O(n) | O(n) |
Graph Properties
| Property | Description | MST Impact |
|---|---|---|
| Connected | All vertices reachable | Required for spanning tree |
| Cyclic | Contains loops | MST breaks all cycles |
| Weighted | Edges have costs | MST minimizes total weight |
| Directed | Edges have direction | Requires directed spanning tree algorithms |
| Sparse | E = O(V) | Favors Kruskal's algorithm |
| Dense | E = O(V²) | Favors Prim's algorithm |
Common Edge Weight Functions
| Weight Function | Formula | Use Case |
|---|---|---|
| Euclidean distance | sqrt((x2-x1)² + (y2-y1)²) | Geographic networks |
| Manhattan distance | abs(x2-x1) + abs(y2-y1) | Grid-based routing |
| Inverse similarity | 1 / similarity(u,v) | Clustering |
| Network latency | measured_latency | Network routing |
| Installation cost | fixed_cost + distance * unit_cost | Infrastructure planning |
Ruby Implementation Patterns
| Pattern | Code | Purpose |
|---|---|---|
| Graph initialization | graph = Hash.new { Hash.new } | Adjacency list with default |
| Edge sorting | edges.sort_by { u, v, w / w } | Kruskal preprocessing |
| Priority queue | heap.insert(weight, data) | Prim's algorithm |
| Union-Find check | uf.find(u) == uf.find(v) | Cycle detection |
| Component merging | uf.union(u, v) | Tree construction |
Performance Characteristics by Graph Size
| Vertices | Edges (sparse) | Edges (dense) | Recommended Algorithm |
|---|---|---|---|
| < 100 | < 500 | < 5000 | Any (overhead dominates) |
| 100-1000 | 500-5000 | 5000-500000 | Kruskal for sparse, Prim for dense |
| 1000-10000 | 5000-50000 | > 500000 | Prim with Fibonacci heap |
| > 10000 | > 50000 | N/A | Parallel Boruvka or distributed |