CrackedRuby - Spanning Trees

Overview

A spanning tree represents a subgraph that connects all vertices in an undirected graph using the minimum number of edges without creating cycles. Given a graph with n vertices, a spanning tree contains exactly n-1 edges. The concept originated from electrical network analysis and now applies to network design, clustering algorithms, and optimization problems.

For a graph to have a spanning tree, it must be connected. Disconnected graphs cannot have spanning trees but can have spanning forests, where each connected component has its own spanning tree.

A minimum spanning tree (MST) extends this concept to weighted graphs by finding the spanning tree with the smallest total edge weight. Multiple spanning trees may exist for a single graph, but the MST is unique when all edge weights are distinct.

# Simple graph representation
graph = {
  'A' => [['B', 4], ['C', 2]],
  'B' => [['A', 4], ['C', 1], ['D', 5]],
  'C' => [['A', 2], ['B', 1], ['D', 8]],
  'D' => [['B', 5], ['C', 8]]
}

# One possible spanning tree: A-C (2), C-B (1), B-D (5)
# Total weight: 8

The spanning tree problem appears in network routing protocols, where routers construct spanning trees to prevent broadcast storms. The IEEE 802.1D Spanning Tree Protocol uses this principle to create loop-free topologies in bridged networks.

Key Principles

Spanning trees satisfy three fundamental properties: connectivity, acyclicity, and minimality. Connectivity ensures every vertex remains reachable from every other vertex. Acyclicity prevents loops, making the structure a tree. Minimality means the tree uses exactly n-1 edges for n vertices, with no redundant connections.

The cut property forms the theoretical foundation for MST algorithms. A cut partitions graph vertices into two disjoint sets. Any edge crossing this cut with minimum weight must belong to some MST. This property guarantees the correctness of greedy algorithms like Kruskal's and Prim's.

The cycle property provides an alternative characterization. For any cycle in the graph, the maximum-weight edge in that cycle cannot be part of any MST. This property enables optimization algorithms that start with all edges and remove maximum edges from cycles.

Union-Find data structures support efficient cycle detection during spanning tree construction. The structure tracks connected components and determines whether adding an edge would create a cycle. Two vertices belong to the same component if a path exists between them in the current forest.

class UnionFind
  def initialize(size)
    @parent = (0...size).to_a
    @rank = Array.new(size, 0)
  end
  
  def find(x)
    if @parent[x] != x
      @parent[x] = find(@parent[x])  # Path compression
    end
    @parent[x]
  end
  
  def union(x, y)
    root_x = find(x)
    root_y = find(y)
    return false if root_x == root_y  # Already connected
    
    # Union by rank
    if @rank[root_x] < @rank[root_y]
      @parent[root_x] = root_y
    elsif @rank[root_x] > @rank[root_y]
      @parent[root_y] = root_x
    else
      @parent[root_y] = root_x
      @rank[root_x] += 1
    end
    true
  end
end

Greedy algorithms dominate spanning tree construction due to the matroid structure of the problem. A matroid is a mathematical structure where locally optimal choices lead to globally optimal solutions. Both Kruskal's and Prim's algorithms make greedy choices that provably produce optimal results.

The spanning tree uniqueness theorem states that if all edge weights are distinct, exactly one MST exists. When edge weights contain duplicates, multiple MSTs may exist with the same total weight. This ambiguity affects applications requiring deterministic results.

Graph density influences algorithm selection. Sparse graphs with few edges favor Kruskal's algorithm, which processes edges directly. Dense graphs with many edges favor Prim's algorithm, which grows the tree from a starting vertex.

Ruby Implementation

Ruby's standard library lacks built-in spanning tree algorithms, requiring custom implementations or third-party gems. The RGL (Ruby Graph Library) gem provides graph data structures but still requires manual MST implementation for most use cases.

Kruskal's Algorithm Implementation

Kruskal's algorithm processes edges in ascending weight order, adding edges that connect previously disconnected components. The algorithm terminates when n-1 edges have been added.

class Graph
  attr_reader :vertices, :edges
  
  def initialize
    @vertices = []
    @edges = []
  end
  
  def add_edge(u, v, weight)
    @vertices |= [u, v]
    @edges << [weight, u, v]
  end
  
  def kruskal_mst
    mst_edges = []
    vertex_indices = @vertices.each_with_index.to_h
    uf = UnionFind.new(@vertices.size)
    
    @edges.sort.each do |weight, u, v|
      u_idx = vertex_indices[u]
      v_idx = vertex_indices[v]
      
      if uf.union(u_idx, v_idx)
        mst_edges << [u, v, weight]
        break if mst_edges.size == @vertices.size - 1
      end
    end
    
    mst_edges
  end
end

# Usage
graph = Graph.new
graph.add_edge('A', 'B', 4)
graph.add_edge('A', 'C', 2)
graph.add_edge('B', 'C', 1)
graph.add_edge('B', 'D', 5)
graph.add_edge('C', 'D', 8)

mst = graph.kruskal_mst
# => [["B", "C", 1], ["A", "C", 2], ["B", "D", 5]]

This implementation sorts edges once at the beginning (O(E log E)) and processes each edge with near-constant-time union-find operations. Path compression and union by rank optimizations keep union-find operations effectively O(α(n)), where α is the inverse Ackermann function.

Prim's Algorithm Implementation

Prim's algorithm grows a spanning tree from a starting vertex, repeatedly adding the minimum-weight edge that connects the tree to an unvisited vertex.

require 'set'

class PrimGraph
  def initialize
    @adjacency = Hash.new { |h, k| h[k] = [] }
  end
  
  def add_edge(u, v, weight)
    @adjacency[u] << [v, weight]
    @adjacency[v] << [u, weight]
  end
  
  def prim_mst(start)
    visited = Set.new([start])
    mst_edges = []
    # Min-heap: [weight, from, to]
    edges = @adjacency[start].map { |v, w| [w, start, v] }
    edges.sort!
    
    while mst_edges.size < @adjacency.keys.size - 1 && !edges.empty?
      weight, u, v = edges.shift
      
      next if visited.include?(v)
      
      visited.add(v)
      mst_edges << [u, v, weight]
      
      @adjacency[v].each do |neighbor, w|
        unless visited.include?(neighbor)
          # Insert edge in sorted position
          idx = edges.bsearch_index { |e| e[0] >= w } || edges.size
          edges.insert(idx, [w, v, neighbor])
        end
      end
    end
    
    mst_edges
  end
end

# Usage
graph = PrimGraph.new
graph.add_edge('A', 'B', 4)
graph.add_edge('A', 'C', 2)
graph.add_edge('B', 'C', 1)
graph.add_edge('B', 'D', 5)
graph.add_edge('C', 'D', 8)

mst = graph.prim_mst('A')
# => [["A", "C", 2], ["C", "B", 1], ["B", "D", 5]]

This implementation maintains edges in sorted order through binary search insertion. Production implementations should use a proper priority queue data structure for O(log E) insertions instead of maintaining sorted arrays.

Priority Queue Implementation

Ruby lacks a built-in priority queue, but the standard library provides a binary heap through the PriorityQueue pattern or custom implementations.

class MinHeap
  def initialize
    @elements = []
  end
  
  def insert(priority, value)
    @elements << [priority, value]
    bubble_up(@elements.size - 1)
  end
  
  def extract_min
    return nil if @elements.empty?
    
    min = @elements[0]
    @elements[0] = @elements.pop
    bubble_down(0) unless @elements.empty?
    min
  end
  
  def empty?
    @elements.empty?
  end
  
  private
  
  def bubble_up(index)
    while index > 0
      parent = (index - 1) / 2
      break if @elements[parent][0] <= @elements[index][0]
      
      @elements[parent], @elements[index] = @elements[index], @elements[parent]
      index = parent
    end
  end
  
  def bubble_down(index)
    loop do
      left = 2 * index + 1
      right = 2 * index + 2
      smallest = index
      
      if left < @elements.size && @elements[left][0] < @elements[smallest][0]
        smallest = left
      end
      
      if right < @elements.size && @elements[right][0] < @elements[smallest][0]
        smallest = right
      end
      
      break if smallest == index
      
      @elements[index], @elements[smallest] = @elements[smallest], @elements[index]
      index = smallest
    end
  end
end

def prim_mst_heap(graph, start)
  visited = Set.new
  mst_edges = []
  heap = MinHeap.new
  
  visited.add(start)
  graph[start].each { |neighbor, weight| heap.insert(weight, [start, neighbor]) }
  
  until heap.empty?
    weight, (u, v) = heap.extract_min
    next if visited.include?(v)
    
    visited.add(v)
    mst_edges << [u, v, weight]
    
    graph[v].each do |neighbor, w|
      heap.insert(w, [v, neighbor]) unless visited.include?(neighbor)
    end
  end
  
  mst_edges
end

The heap-based implementation reduces edge processing time from O(E²) to O(E log E), making it practical for large graphs.

Handling Disconnected Graphs

Disconnected graphs require spanning forest algorithms that process each connected component independently.

def spanning_forest(graph)
  all_vertices = graph.vertices
  visited = Set.new
  forests = []
  
  all_vertices.each do |start|
    next if visited.include?(start)
    
    component_vertices = Set.new
    queue = [start]
    
    until queue.empty?
      vertex = queue.shift
      next if visited.include?(vertex)
      
      visited.add(vertex)
      component_vertices.add(vertex)
      
      graph.neighbors(vertex).each do |neighbor|
        queue << neighbor unless visited.include?(neighbor)
      end
    end
    
    # Build MST for this component
    subgraph = graph.subgraph(component_vertices)
    forests << subgraph.kruskal_mst
  end
  
  forests
end

Common Patterns

Kruskal's Algorithm Pattern

Kruskal's algorithm excels with sparse graphs where edge count remains small relative to vertex count. The pattern sorts edges by weight and greedily selects edges that don't form cycles.

Time complexity: O(E log E) for sorting edges, O(E α(V)) for union-find operations. Space complexity: O(V) for union-find structure, O(E) for edge storage.

def kruskal_with_details(edges, num_vertices)
  edges_sorted = edges.sort_by { |u, v, w| w }
  uf = UnionFind.new(num_vertices)
  mst = []
  rejected = []
  
  edges_sorted.each do |u, v, weight|
    if uf.union(u, v)
      mst << [u, v, weight]
      break if mst.size == num_vertices - 1
    else
      rejected << [u, v, weight]  # Would create cycle
    end
  end
  
  { mst: mst, rejected: rejected, total_weight: mst.sum { |_, _, w| w } }
end

The algorithm remains optimal for graphs with E << V², such as road networks or sparse social networks. The union-find structure maintains disjoint sets efficiently through path compression.

Prim's Algorithm Pattern

Prim's algorithm works efficiently with dense graphs where edge count approaches V². The pattern maintains a cut between visited and unvisited vertices, always selecting the minimum-weight edge crossing the cut.

Time complexity: O(E log V) with binary heap, O(E + V log V) with Fibonacci heap. Space complexity: O(V) for visited set and priority queue.

def prim_with_tracking(adjacency, start)
  visited = Set.new([start])
  heap = MinHeap.new
  mst = []
  cut_edges = []  # Edges crossing the cut at each step
  
  adjacency[start].each { |v, w| heap.insert(w, [start, v]) }
  
  until heap.empty? || visited.size == adjacency.size
    weight, (u, v) = heap.extract_min
    
    if visited.include?(v)
      next  # Edge crosses within visited set
    end
    
    visited.add(v)
    mst << [u, v, weight]
    
    adjacency[v].each do |neighbor, w|
      unless visited.include?(neighbor)
        heap.insert(w, [v, neighbor])
        cut_edges << [v, neighbor, w]
      end
    end
  end
  
  { mst: mst, total_weight: mst.sum { |_, _, w| w } }
end

Prim's algorithm maintains a connected tree throughout execution, unlike Kruskal's forest-based approach. This property makes Prim's algorithm suitable for online scenarios where partial results need immediate availability.

Borůvka's Algorithm Pattern

Borůvka's algorithm, the oldest MST algorithm, works by simultaneously growing multiple trees in parallel. Each component selects its minimum outgoing edge, then all selected edges merge components.

def boruvka_mst(edges, num_vertices)
  uf = UnionFind.new(num_vertices)
  mst = []
  
  loop do
    # Find minimum outgoing edge for each component
    min_edge = Array.new(num_vertices)
    
    edges.each do |u, v, weight|
      comp_u = uf.find(u)
      comp_v = uf.find(v)
      
      next if comp_u == comp_v  # Same component
      
      # Track minimum edge for each component
      min_edge[comp_u] = [u, v, weight] if min_edge[comp_u].nil? || weight < min_edge[comp_u][2]
      min_edge[comp_v] = [u, v, weight] if min_edge[comp_v].nil? || weight < min_edge[comp_v][2]
    end
    
    added = false
    min_edge.compact.uniq.each do |u, v, weight|
      if uf.union(u, v)
        mst << [u, v, weight]
        added = true
      end
    end
    
    break unless added || mst.size >= num_vertices - 1
  end
  
  mst
end

Borůvka's algorithm performs well on parallel architectures where multiple components can process edges simultaneously. The algorithm completes in O(log V) iterations, making it attractive for distributed computing scenarios.

Reverse-Delete Algorithm Pattern

The reverse-delete algorithm starts with all edges and removes maximum-weight edges that don't disconnect the graph. This approach mirrors Kruskal's algorithm in reverse.

def reverse_delete_mst(edges, num_vertices)
  mst = edges.sort_by { |u, v, w| -w }  # Sort descending
  
  mst.reject! do |u, v, weight|
    # Try removing edge
    test_graph = build_graph(mst - [[u, v, weight]])
    # Check if graph remains connected
    !connected?(test_graph, num_vertices)
  end
  
  mst
end

def connected?(graph, num_vertices)
  visited = Set.new
  queue = [graph.keys.first]
  
  until queue.empty?
    vertex = queue.shift
    next if visited.include?(vertex)
    
    visited.add(vertex)
    graph[vertex]&.each { |neighbor| queue << neighbor }
  end
  
  visited.size == num_vertices
end

This algorithm requires O(E) connectivity checks, each taking O(E + V) time, resulting in O(E²(E + V)) worst-case complexity. The algorithm demonstrates correctness through the cycle property but remains impractical for large graphs.

Performance Considerations

Algorithm selection depends on graph characteristics and available data structures. Kruskal's algorithm performs better on sparse graphs with E = O(V), while Prim's algorithm excels with dense graphs where E = O(V²).

Time Complexity Analysis

Kruskal's algorithm: O(E log E) sorting dominates with simple union-find. The sorting step processes all edges once. Union-find operations approach O(1) amortized time with path compression and union by rank, contributing O(E α(V)) where α(V) grows extremely slowly (α(V) < 5 for any practical V).

Prim's algorithm: O(E log V) with binary heap. Each edge enters and leaves the priority queue once, with each operation costing O(log V). The algorithm processes V vertices and E edges total. With Fibonacci heaps, this improves to O(E + V log V) by reducing decrease-key operations to O(1) amortized time.

Borůvka's algorithm: O(E log V) for sequential execution. Each iteration merges components, reducing component count by at least half. This guarantees O(log V) iterations with O(E) work per iteration.

Space Complexity Considerations

Kruskal's algorithm: O(V) for union-find structure, O(E) for edge list. The algorithm requires storing all edges in memory for sorting. Edge-weighted graphs with millions of edges may exceed available memory, requiring external sorting algorithms.

Prim's algorithm: O(V) for visited set and priority queue, O(E) for adjacency representation. The algorithm can process edges lazily, reading from disk as needed. This property makes Prim's algorithm suitable for out-of-core computation with graphs too large for memory.

def memory_efficient_prim(edge_iterator, num_vertices, start)
  visited = Set.new([start])
  heap = MinHeap.new
  mst = []
  
  # Initial edges from start vertex
  edge_iterator.edges_from(start).each do |v, weight|
    heap.insert(weight, [start, v])
  end
  
  until heap.empty? || visited.size == num_vertices
    weight, (u, v) = heap.extract_min
    next if visited.include?(v)
    
    visited.add(v)
    mst << [u, v, weight]
    
    # Load edges lazily
    edge_iterator.edges_from(v).each do |neighbor, w|
      heap.insert(w, [v, neighbor]) unless visited.include?(neighbor)
    end
  end
  
  mst
end

Cache Performance

Modern processors perform better with algorithms exhibiting good spatial locality. Kruskal's algorithm processes edges sequentially after sorting, providing excellent cache behavior. Prim's algorithm performs random access to adjacency lists, potentially causing cache misses.

# Cache-friendly edge storage
class EdgeList
  def initialize(edges)
    @edges = edges.sort_by { |u, v, w| [u, v] }  # Group by source
    @vertex_start = {}
    
    current_vertex = nil
    @edges.each_with_index do |(u, v, w), idx|
      if u != current_vertex
        @vertex_start[u] = idx
        current_vertex = u
      end
    end
  end
  
  def edges_from(vertex)
    start_idx = @vertex_start[vertex]
    return [] unless start_idx
    
    end_idx = start_idx
    while end_idx < @edges.size && @edges[end_idx][0] == vertex
      end_idx += 1
    end
    
    @edges[start_idx...end_idx].map { |u, v, w| [v, w] }
  end
end

Parallel Processing

Borůvka's algorithm parallelizes naturally by processing components independently. Each component identifies its minimum outgoing edge without coordination.

require 'parallel'

def parallel_boruvka(edges, num_vertices, num_threads: 4)
  uf = UnionFind.new(num_vertices)
  mst = []
  
  loop do
    components = (0...num_vertices).group_by { |v| uf.find(v) }
    
    # Find minimum edge per component in parallel
    min_edges = Parallel.map(components.values, in_threads: num_threads) do |comp|
      min_edge = nil
      min_weight = Float::INFINITY
      
      edges.each do |u, v, weight|
        if comp.include?(u) && !comp.include?(v) && weight < min_weight
          min_edge = [u, v, weight]
          min_weight = weight
        end
      end
      
      min_edge
    end
    
    break unless min_edges.any?
    
    min_edges.compact.each do |u, v, weight|
      mst << [u, v, weight] if uf.union(u, v)
    end
    
    break if mst.size >= num_vertices - 1
  end
  
  mst
end

Parallel Prim's algorithm requires more sophisticated synchronization but can partition the graph and merge results.

Benchmarking Trade-offs

require 'benchmark'

def compare_algorithms(edges, num_vertices)
  Benchmark.bm(15) do |x|
    x.report("Kruskal:") do
      graph = Graph.new
      edges.each { |u, v, w| graph.add_edge(u, v, w) }
      graph.kruskal_mst
    end
    
    x.report("Prim (array):") do
      adj = build_adjacency(edges)
      prim_mst_array(adj, edges[0][0])
    end
    
    x.report("Prim (heap):") do
      adj = build_adjacency(edges)
      prim_mst_heap(adj, edges[0][0])
    end
    
    x.report("Boruvka:") do
      boruvka_mst(edges, num_vertices)
    end
  end
end

Empirical testing reveals that for graphs with fewer than 1000 vertices, implementation overhead often matters more than theoretical complexity. Simple array-based implementations may outperform heap-based algorithms due to lower constant factors.

Practical Examples

Network Design

Network topology design requires connecting nodes with minimum total cable cost. Each cable has installation cost based on distance.

class NetworkDesigner
  def initialize
    @locations = {}
    @connections = []
  end
  
  def add_location(name, x, y)
    @locations[name] = [x, y]
  end
  
  def calculate_costs
    @locations.keys.combination(2).each do |loc1, loc2|
      x1, y1 = @locations[loc1]
      x2, y2 = @locations[loc2]
      distance = Math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
      @connections << [loc1, loc2, distance.round(2)]
    end
  end
  
  def optimal_network
    graph = Graph.new
    @connections.each { |u, v, w| graph.add_edge(u, v, w) }
    
    mst = graph.kruskal_mst
    total_cost = mst.sum { |_, _, w| w }
    
    {
      connections: mst,
      total_cost: total_cost,
      savings: @connections.sum { |_, _, w| w } - total_cost
    }
  end
end

# Usage
designer = NetworkDesigner.new
designer.add_location('Server', 0, 0)
designer.add_location('Office A', 3, 4)
designer.add_location('Office B', 6, 8)
designer.add_location('Office C', 9, 2)
designer.calculate_costs

result = designer.optimal_network
# => {
#   connections: [["Office A", "Server", 5.0], ["Office A", "Office B", 5.0], ...],
#   total_cost: 18.47,
#   savings: 23.15
# }

The spanning tree ensures all offices connect with minimum cable while avoiding redundant connections that increase cost without improving connectivity.

Clustering and Image Segmentation

Spanning trees support clustering by identifying natural groups in data. Removing the heaviest edges from an MST creates clusters.

class ImageSegmenter
  def initialize(pixels)
    @pixels = pixels  # Array of [r, g, b] values
    @width = Math.sqrt(@pixels.size).to_i
  end
  
  def segment(num_clusters)
    # Build graph where edge weight = color difference
    edges = []
    
    @pixels.each_with_index do |pixel, idx|
      x, y = idx % @width, idx / @width
      
      # Connect to right neighbor
      if x < @width - 1
        neighbor_idx = idx + 1
        weight = color_distance(pixel, @pixels[neighbor_idx])
        edges << [idx, neighbor_idx, weight]
      end
      
      # Connect to bottom neighbor
      if y < @width - 1
        neighbor_idx = idx + @width
        weight = color_distance(pixel, @pixels[neighbor_idx])
        edges << [idx, neighbor_idx, weight]
      end
    end
    
    # Build MST
    graph = Graph.new
    edges.each { |u, v, w| graph.add_edge(u, v, w) }
    mst = graph.kruskal_mst
    
    # Remove heaviest edges to create clusters
    mst_sorted = mst.sort_by { |_, _, w| -w }
    keep_edges = mst_sorted.drop(num_clusters - 1)
    
    build_clusters(keep_edges)
  end
  
  private
  
  def color_distance(c1, c2)
    Math.sqrt((c1[0] - c2[0])**2 + (c1[1] - c2[1])**2 + (c1[2] - c2[2])**2)
  end
  
  def build_clusters(edges)
    uf = UnionFind.new(@pixels.size)
    edges.each { |u, v, _| uf.union(u, v) }
    
    clusters = Hash.new { |h, k| h[k] = [] }
    @pixels.each_with_index do |_, idx|
      clusters[uf.find(idx)] << idx
    end
    
    clusters.values
  end
end

This approach segments images by treating pixels as graph vertices and color similarity as edge weights. The MST captures strong color relationships, and cutting heavy edges separates distinct regions.

Routing Protocol Simulation

Network routing protocols use spanning trees to prevent packet loops while maintaining connectivity.

class NetworkRouter
  def initialize
    @network = {}
    @bridge_id = {}
  end
  
  def add_bridge(id, priority)
    @bridge_id[id] = priority
    @network[id] = []
  end
  
  def add_link(bridge1, bridge2, cost)
    @network[bridge1] << [bridge2, cost]
    @network[bridge2] << [bridge1, cost]
  end
  
  def calculate_spanning_tree
    # Select root bridge (lowest priority)
    root = @bridge_id.min_by { |_, priority| priority }[0]
    
    # Run Prim's algorithm from root
    visited = Set.new([root])
    heap = MinHeap.new
    tree = []
    blocked_ports = {}
    
    @network[root].each do |neighbor, cost|
      heap.insert(cost, [root, neighbor, cost])
    end
    
    while !heap.empty? && visited.size < @network.size
      cost, (from, to, link_cost) = heap.extract_min
      
      if visited.include?(to)
        # Block port to prevent loop
        blocked_ports[to] ||= []
        blocked_ports[to] << from
        next
      end
      
      visited.add(to)
      tree << { from: from, to: to, cost: link_cost }
      
      @network[to].each do |neighbor, neighbor_cost|
        unless visited.include?(neighbor)
          heap.insert(neighbor_cost, [to, neighbor, neighbor_cost])
        end
      end
    end
    
    { tree: tree, blocked: blocked_ports, root: root }
  end
end

# Usage
router = NetworkRouter.new
router.add_bridge('A', 1)  # Root bridge (lowest priority)
router.add_bridge('B', 2)
router.add_bridge('C', 3)
router.add_bridge('D', 4)

router.add_link('A', 'B', 10)
router.add_link('A', 'C', 5)
router.add_link('B', 'C', 15)
router.add_link('B', 'D', 8)
router.add_link('C', 'D', 12)

result = router.calculate_spanning_tree
# Blocks redundant links to prevent broadcast storms

The spanning tree protocol ensures loop-free forwarding while maintaining network connectivity, critical for Ethernet bridging.

Maze Generation

Spanning trees generate perfect mazes (mazes with exactly one path between any two points). Random spanning tree algorithms create mazes with uniform distribution.

class MazeGenerator
  def initialize(width, height)
    @width = width
    @height = height
    @cells = width * height
  end
  
  def generate
    # Build grid graph
    edges = []
    
    (0...@cells).each do |cell|
      x, y = cell % @width, cell / @width
      
      # Add edge to right cell
      if x < @width - 1
        edges << [cell, cell + 1, rand]
      end
      
      # Add edge to bottom cell
      if y < @height - 1
        edges << [cell, cell + @width, rand]
      end
    end
    
    # Build random spanning tree (Kruskal's with random weights)
    graph = Graph.new
    edges.each { |u, v, w| graph.add_edge(u, v, w) }
    mst = graph.kruskal_mst
    
    format_maze(mst)
  end
  
  private
  
  def format_maze(edges)
    passages = Set.new
    edges.each { |u, v, _| passages.add([u, v].sort) }
    
    maze = Array.new(@height) { Array.new(@width, '█') }
    
    (0...@cells).each do |cell|
      x, y = cell % @width, cell / @width
      maze[y][x] = ' '
      
      # Add passages
      if passages.include?([cell, cell + 1].sort)
        maze[y][x + 1] = ' ' if x < @width - 1
      end
      
      if passages.include?([cell, cell + @width].sort)
        maze[y + 1][x] = ' ' if y < @height - 1
      end
    end
    
    maze.map { |row| row.join }.join("\n")
  end
end

generator = MazeGenerator.new(10, 10)
puts generator.generate

Random spanning trees ensure maze solvability (exactly one solution) while preventing loops or isolated regions.

Reference

Algorithm Comparison

Algorithm	Time Complexity	Space	Best For	Characteristics
Kruskal	O(E log E)	O(V)	Sparse graphs	Edge-based, requires sorting
Prim (binary heap)	O(E log V)	O(V)	Dense graphs	Vertex-based, grows single tree
Prim (Fibonacci heap)	O(E + V log V)	O(V)	Dense graphs	Optimal for dense graphs
Boruvka	O(E log V)	O(V)	Parallel processing	Component-based, parallelizable
Reverse-Delete	O(E² log V)	O(E)	Theoretical interest	Edge removal, impractical

Union-Find Operations

Operation	Without Optimization	With Path Compression	With Union by Rank	With Both
Find	O(n)	O(log n)	O(log n)	O(α(n))
Union	O(n)	O(log n)	O(log n)	O(α(n))
Space	O(n)	O(n)	O(n)	O(n)

Graph Properties

Property	Description	MST Impact
Connected	All vertices reachable	Required for spanning tree
Cyclic	Contains loops	MST breaks all cycles
Weighted	Edges have costs	MST minimizes total weight
Directed	Edges have direction	Requires directed spanning tree algorithms
Sparse	E = O(V)	Favors Kruskal's algorithm
Dense	E = O(V²)	Favors Prim's algorithm

Common Edge Weight Functions

Weight Function	Formula	Use Case
Euclidean distance	sqrt((x2-x1)² + (y2-y1)²)	Geographic networks
Manhattan distance	abs(x2-x1) + abs(y2-y1)	Grid-based routing
Inverse similarity	1 / similarity(u,v)	Clustering
Network latency	measured_latency	Network routing
Installation cost	fixed_cost + distance * unit_cost	Infrastructure planning

Ruby Implementation Patterns

Pattern	Code	Purpose
Graph initialization	graph = Hash.new { Hash.new }	Adjacency list with default
Edge sorting	edges.sort_by { u, v, w / w }	Kruskal preprocessing
Priority queue	heap.insert(weight, data)	Prim's algorithm
Union-Find check	uf.find(u) == uf.find(v)	Cycle detection
Component merging	uf.union(u, v)	Tree construction

Performance Characteristics by Graph Size

Vertices	Edges (sparse)	Edges (dense)	Recommended Algorithm
< 100	< 500	< 5000	Any (overhead dominates)
100-1000	500-5000	5000-500000	Kruskal for sparse, Prim for dense
1000-10000	5000-50000	> 500000	Prim with Fibonacci heap
> 10000	> 50000	N/A	Parallel Boruvka or distributed

Spanning Trees