CrackedRuby - Network Flow

Overview

Network flow addresses a fundamental class of optimization problems in computer science and operations research. The core problem involves a directed graph where edges have capacity constraints, and the goal is to determine the maximum amount of flow that can be sent from a source node to a sink node while respecting these constraints.

The concept originated from transportation and distribution problems. Consider a pipeline network where oil flows from production facilities to refineries, or a road network where traffic moves from suburbs to city centers. Each pipe or road has a maximum capacity, and the network flow problem determines the maximum throughput possible given these constraints.

Network flow problems appear across diverse domains: matching problems in bipartite graphs, image segmentation in computer vision, airline crew scheduling, network reliability analysis, and resource allocation in distributed systems. The algorithms developed for network flow also provide solutions to seemingly unrelated problems through problem transformation.

A flow network consists of a directed graph G = (V, E) with:

A source vertex s where flow originates
A sink vertex t where flow terminates
Capacity function c(u, v) ≥ 0 for each edge (u, v)
Flow function f(u, v) representing actual flow on each edge

The maximum flow problem asks: what is the maximum amount of flow that can be sent from s to t?

# Simple flow network representation
network = {
  's' => { 'a' => 10, 'b' => 5 },
  'a' => { 'b' => 15, 't' => 10 },
  'b' => { 't' => 10 }
}
# Maximum flow from 's' to 't' is 15

Key Principles

The foundation of network flow rests on several mathematical principles that govern how flow behaves in a network.

Capacity Constraints: For each edge (u, v) in the network, the flow cannot exceed the edge's capacity: 0 ≤ f(u, v) ≤ c(u, v). This constraint ensures physical or logical limits are respected. In a water pipe network, you cannot pump more water than the pipe's diameter allows.

Flow Conservation: At every vertex except the source and sink, the total incoming flow equals the total outgoing flow. Mathematically, for all vertices v ∈ V - {s, t}: Σ f(u, v) = Σ f(v, w) where the first sum is over all incoming edges and the second over all outgoing edges. This principle ensures flow is neither created nor destroyed at intermediate nodes.

Skew Symmetry: The flow from u to v is the negative of flow from v to u: f(u, v) = -f(v, u). This property handles the mathematical representation of flow in both directions and simplifies certain algorithmic implementations.

The value of a flow |f| equals the total flow leaving the source (or equivalently, entering the sink): |f| = Σ f(s, v) for all v where (s, v) ∈ E.

A residual network G_f represents remaining capacity after accounting for current flow. For each edge (u, v) with flow f(u, v) and capacity c(u, v), the residual network contains:

Forward edge (u, v) with residual capacity c_f(u, v) = c(u, v) - f(u, v)
Backward edge (v, u) with residual capacity c_f(v, u) = f(u, v)

The backward edge allows the algorithm to "undo" flow sent earlier if a better path is found.

An augmenting path is a path from s to t in the residual network where every edge has positive residual capacity. The existence of an augmenting path indicates the current flow can be increased. The minimum residual capacity along an augmenting path determines how much additional flow can be pushed through that path.

A cut (S, T) partitions vertices into two sets where s ∈ S and t ∈ T. The capacity of a cut equals the sum of capacities of edges going from S to T. A minimum cut is a cut with minimum capacity among all cuts.

The Max-Flow Min-Cut Theorem states that the maximum value of flow in a network equals the minimum capacity of all cuts. This fundamental result connects flow maximization with cut minimization and provides the theoretical foundation for proving algorithm correctness.

# Residual capacity calculation
def residual_capacity(capacity, flow, u, v)
  # Forward edge: remaining capacity
  forward = capacity[[u, v]] - flow[[u, v]]
  # Backward edge: flow that can be cancelled
  backward = flow[[v, u]]
  forward + backward
end

Ruby Implementation

Implementing network flow algorithms in Ruby requires representing the graph structure and flow values, then applying an augmenting path algorithm until no more augmenting paths exist.

Graph Representation: A hash-based adjacency list provides a clean representation where keys are vertices and values are hashes mapping neighbors to capacities.

class FlowNetwork
  attr_reader :graph, :flow
  
  def initialize
    @graph = Hash.new { |h, k| h[k] = {} }
    @flow = Hash.new(0)
  end
  
  def add_edge(from, to, capacity)
    @graph[from][to] = capacity
    @graph[to][from] ||= 0  # Ensure reverse edge exists
  end
  
  def residual_capacity(u, v)
    capacity = @graph[u][v] || 0
    capacity - @flow[[u, v]] + @flow[[v, u]]
  end
end

Ford-Fulkerson Method: This method repeatedly finds augmenting paths and pushes flow along them. The method terminates when no augmenting path exists. The specific implementation depends on how augmenting paths are found.

class FordFulkerson
  def initialize(network)
    @network = network
    @flow = Hash.new(0)
  end
  
  def max_flow(source, sink)
    total_flow = 0
    
    while path = find_path(source, sink)
      # Find minimum residual capacity along path
      flow_increment = path.each_cons(2).map { |u, v|
        @network.residual_capacity(u, v)
      }.min
      
      # Update flow along path
      path.each_cons(2) do |u, v|
        @flow[[u, v]] += flow_increment
        @flow[[v, u]] -= flow_increment
      end
      
      total_flow += flow_increment
    end
    
    @network.flow = @flow
    total_flow
  end
  
  private
  
  def find_path(source, sink)
    visited = Set.new
    queue = [[source, [source]]]
    
    while !queue.empty?
      current, path = queue.shift
      return path if current == sink
      
      next if visited.include?(current)
      visited.add(current)
      
      @network.graph[current].each do |neighbor, _|
        if @network.residual_capacity(current, neighbor) > 0
          queue << [neighbor, path + [neighbor]]
        end
      end
    end
    
    nil
  end
end

Edmonds-Karp Algorithm: This variant of Ford-Fulkerson uses breadth-first search to find the shortest augmenting path (in terms of number of edges). This guarantees polynomial time complexity.

class EdmondsKarp
  def initialize(network)
    @network = network
    @flow = Hash.new(0)
  end
  
  def max_flow(source, sink)
    total_flow = 0
    
    while path_data = bfs_path(source, sink)
      path, min_capacity = path_data
      
      # Update flow
      path.each_cons(2) do |u, v|
        @flow[[u, v]] += min_capacity
        @flow[[v, u]] -= min_capacity
      end
      
      total_flow += min_capacity
    end
    
    @network.flow = @flow
    total_flow
  end
  
  private
  
  def bfs_path(source, sink)
    parent = {}
    visited = Set.new([source])
    queue = [source]
    
    while !queue.empty?
      current = queue.shift
      
      @network.graph[current].each do |neighbor, capacity|
        residual = @network.residual_capacity(current, neighbor)
        
        if residual > 0 && !visited.include?(neighbor)
          parent[neighbor] = current
          visited.add(neighbor)
          return build_path(parent, source, sink) if neighbor == sink
          queue << neighbor
        end
      end
    end
    
    nil
  end
  
  def build_path(parent, source, sink)
    path = [sink]
    current = sink
    
    while current != source
      current = parent[current]
      path.unshift(current)
    end
    
    min_capacity = path.each_cons(2).map { |u, v|
      @network.residual_capacity(u, v)
    }.min
    
    [path, min_capacity]
  end
end

Handling Multiple Sources or Sinks: Transform the problem by adding a supersource connected to all sources with infinite capacity, or a supersink connected from all sinks with infinite capacity.

def add_supersource(network, sources)
  supersource = :supersource
  sources.each do |source|
    network.add_edge(supersource, source, Float::INFINITY)
  end
  supersource
end

def add_supersink(network, sinks)
  supersink = :supersink
  sinks.each do |sink|
    network.add_edge(sink, supersink, Float::INFINITY)
  end
  supersink
end

Minimum Cut Calculation: After computing maximum flow, the minimum cut can be found by identifying vertices reachable from the source in the residual graph.

def minimum_cut(network, source, sink)
  reachable = find_reachable(network, source)
  
  cut_edges = []
  reachable.each do |u|
    network.graph[u].each do |v, capacity|
      if !reachable.include?(v) && capacity > 0
        cut_edges << [u, v, capacity]
      end
    end
  end
  
  cut_edges
end

def find_reachable(network, source)
  visited = Set.new
  queue = [source]
  
  while !queue.empty?
    current = queue.shift
    next if visited.include?(current)
    visited.add(current)
    
    network.graph[current].each do |neighbor, _|
      if network.residual_capacity(current, neighbor) > 0
        queue << neighbor
      end
    end
  end
  
  visited
end

Practical Examples

Network flow algorithms solve a wide range of real-world problems through direct modeling or problem transformation.

Bipartite Matching: Given a bipartite graph with sets L and R, find the maximum number of edges such that no two edges share a vertex. Transform to max flow by adding a source connected to all L vertices and a sink connected from all R vertices, with all edges having capacity 1.

def max_bipartite_matching(left, right, edges)
  network = FlowNetwork.new
  source = :source
  sink = :sink
  
  # Connect source to left vertices
  left.each do |l|
    network.add_edge(source, l, 1)
  end
  
  # Add edges between left and right
  edges.each do |l, r|
    network.add_edge(l, r, 1)
  end
  
  # Connect right vertices to sink
  right.each do |r|
    network.add_edge(r, sink, 1)
  end
  
  solver = EdmondsKarp.new(network)
  matching_size = solver.max_flow(source, sink)
  
  # Extract matching edges
  matching = []
  left.each do |l|
    right.each do |r|
      if network.flow[[l, r]] == 1
        matching << [l, r]
      end
    end
  end
  
  { size: matching_size, edges: matching }
end

# Example: Job assignment
programmers = [:alice, :bob, :charlie]
projects = [:web, :mobile, :backend]
skills = [
  [:alice, :web], [:alice, :backend],
  [:bob, :web], [:bob, :mobile],
  [:charlie, :backend], [:charlie, :mobile]
]

result = max_bipartite_matching(programmers, projects, skills)
# => { size: 3, edges: [[:alice, :backend], [:bob, :mobile], [:charlie, :web]] }

Network Reliability: Determine the minimum number of edges whose removal disconnects source from sink. This equals the value of maximum flow when all edges have capacity 1.

def edge_connectivity(graph, source, sink)
  network = FlowNetwork.new
  
  graph.each do |u, neighbors|
    neighbors.each do |v|
      network.add_edge(u, v, 1)
    end
  end
  
  solver = EdmondsKarp.new(network)
  solver.max_flow(source, sink)
end

# Network with redundant paths
network_graph = {
  'A' => ['B', 'C'],
  'B' => ['D', 'E'],
  'C' => ['D', 'E'],
  'D' => ['F'],
  'E' => ['F']
}

connectivity = edge_connectivity(network_graph, 'A', 'F')
# => 2 (must remove at least 2 edges to disconnect A from F)

Project Selection: Given projects with profits and dependencies, select a subset maximizing total profit. Transform to min-cut problem where cut capacity represents lost profit from rejected projects.

def project_selection(projects, dependencies)
  network = FlowNetwork.new
  source = :source
  sink = :sink
  
  total_profit = 0
  
  # Projects with positive profit connect from source
  projects.each do |project, profit|
    if profit > 0
      network.add_edge(source, project, profit)
      total_profit += profit
    else
      network.add_edge(project, sink, -profit)
    end
  end
  
  # Dependencies: if we select dependent, must select dependency
  dependencies.each do |dependent, dependency|
    network.add_edge(dependency, dependent, Float::INFINITY)
  end
  
  solver = EdmondsKarp.new(network)
  min_cut_value = solver.max_flow(source, sink)
  
  # Maximum profit = total positive profit - min cut
  max_profit = total_profit - min_cut_value
  
  # Selected projects are those reachable from source
  selected = find_reachable(network, source) - [source]
  
  { profit: max_profit, projects: selected }
end

projects_data = {
  :A => 100, :B => 200, :C => -50, :D => 150
}
deps = [[:A, :C], [:B, :C], [:D, :A]]

selection = project_selection(projects_data, deps)
# => { profit: 250, projects: [:B, :C, :D, :A] }

Image Segmentation: Partition image pixels into foreground and background by minimizing boundary cost. Each pixel becomes a vertex, with edges to neighbors weighted by similarity. Source connects to likely foreground pixels, sink to likely background pixels.

def image_segmentation(pixels, foreground_weights, background_weights, neighbor_weights)
  network = FlowNetwork.new
  source = :source
  sink = :sink
  
  pixels.each do |pixel|
    # Foreground likelihood
    if foreground_weights[pixel] > 0
      network.add_edge(source, pixel, foreground_weights[pixel])
    end
    
    # Background likelihood
    if background_weights[pixel] > 0
      network.add_edge(pixel, sink, background_weights[pixel])
    end
  end
  
  # Neighbor similarity (cost to separate)
  neighbor_weights.each do |(p1, p2), weight|
    network.add_edge(p1, p2, weight)
    network.add_edge(p2, p1, weight)
  end
  
  solver = EdmondsKarp.new(network)
  solver.max_flow(source, sink)
  
  foreground = find_reachable(network, source) - [source]
  background = pixels - foreground
  
  { foreground: foreground, background: background }
end

Performance Considerations

Different network flow algorithms exhibit vastly different performance characteristics depending on network structure and edge capacities.

Ford-Fulkerson Method: The runtime depends on how augmenting paths are found and the maximum flow value. In the worst case with integer capacities, if the maximum flow is F and there are E edges, the algorithm runs in O(E * F) time. Each augmenting path increases flow by at least 1, so at most F iterations occur, and each path search takes O(E) time.

This analysis reveals a critical weakness: performance degrades when maximum flow value is large. Consider a network where maximum flow equals 1,000,000 but only 3 augmenting paths are needed if chosen wisely. Ford-Fulkerson might find 1,000,000 augmenting paths if it makes poor choices, each increasing flow by only 1.

Edmonds-Karp Algorithm: Using BFS to find shortest augmenting paths guarantees O(V * E^2) time complexity, independent of flow values. The algorithm performs at most O(V * E) augmenting path iterations because BFS ensures each iteration increases the distance from source to some vertex. Each BFS takes O(E) time.

This represents a significant improvement when capacities are large integers or irrational numbers. The algorithm's performance depends only on network size, not capacity values.

require 'benchmark'

def benchmark_flow_algorithms(network, source, sink)
  Benchmark.bm(20) do |x|
    x.report("Ford-Fulkerson:") {
      ff = FordFulkerson.new(network.dup)
      ff.max_flow(source, sink)
    }
    
    x.report("Edmonds-Karp:") {
      ek = EdmondsKarp.new(network.dup)
      ek.max_flow(source, sink)
    }
  end
end

Dinic's Algorithm: This advanced algorithm runs in O(V^2 * E) time by using level graphs and blocking flows. The algorithm builds a level graph using BFS, then finds blocking flows using DFS. A blocking flow is a flow where every path from source to sink contains at least one saturated edge.

Dinic's algorithm performs better than Edmonds-Karp on dense graphs or graphs with special structure. For bipartite matching, Dinic's achieves O(E * √V) time, significantly faster than the general bound.

Push-Relabel Algorithms: These algorithms achieve O(V^2 * E) or O(V^3) time through a different approach. Instead of finding augmenting paths, they maintain a preflow (which violates conservation at some vertices) and push excess flow toward the sink. The relabel operation adjusts vertex heights to enable pushing flow.

Push-relabel algorithms excel on dense graphs and often outperform augmenting path methods in practice despite similar theoretical complexity.

Capacity Scaling: When edge capacities are integers, capacity scaling reduces the effective maximum flow value by processing high-capacity edges first. The algorithm runs in O(E^2 * log U) time where U is the maximum capacity. For networks where U is small compared to the maximum flow, this provides substantial speedup.

Practical Performance Factors: Real-world performance depends on graph density, capacity distribution, and source-sink distance:

Sparse graphs (E ≈ V): Edmonds-Karp or Dinic's algorithm performs well
Dense graphs (E ≈ V^2): Push-relabel algorithms often faster
Small integer capacities: Capacity scaling effective
Unit capacity networks: Specialized algorithms like Hopcroft-Karp for bipartite matching
Long paths from source to sink: Algorithms with better distance handling perform better

# Performance comparison on different graph types
def compare_on_graph_types
  # Sparse graph
  sparse = create_sparse_network(1000, 2000)
  
  # Dense graph  
  dense = create_dense_network(100, 4000)
  
  # Unit capacity graph
  unit = create_unit_capacity_network(500, 1500)
  
  [
    ["Sparse (V=1000, E=2000)", sparse],
    ["Dense (V=100, E=4000)", dense],
    ["Unit capacity (V=500, E=1500)", unit]
  ].each do |name, network|
    puts "\n#{name}:"
    benchmark_flow_algorithms(network, :s, :t)
  end
end

Common Patterns

Several recurring patterns appear when modeling problems as network flow.

Source-Sink Transformation Pattern: Problems with multiple sources or sinks transform to single source-sink by adding supersource or supersink nodes. The supersource connects to all original sources with infinite capacity, ensuring it never constrains flow. Similarly, all original sinks connect to the supersink with infinite capacity.

This pattern applies when modeling scenarios like:

Multiple warehouses (sources) shipping to multiple stores (sinks)
Multiple power plants (sources) supplying multiple cities (sinks)
Multiple data sources feeding into multiple processing systems

Vertex Capacity Pattern: When vertices have capacity constraints (not just edges), split each vertex v into v_in and v_out with an edge (v_in, v_out) of capacity equal to the vertex capacity. All incoming edges connect to v_in, and all outgoing edges leave from v_out.

def add_vertex_capacity(network, vertex, capacity)
  v_in = "#{vertex}_in".to_sym
  v_out = "#{vertex}_out".to_sym
  
  # Capacity edge through vertex
  network.add_edge(v_in, v_out, capacity)
  
  # Redirect edges
  network.graph[vertex].each do |neighbor, edge_capacity|
    network.add_edge(v_out, neighbor, edge_capacity)
    network.graph[vertex].delete(neighbor)
  end
  
  network.graph.each do |u, neighbors|
    if neighbors.key?(vertex)
      capacity = neighbors[vertex]
      network.add_edge(u, v_in, capacity)
      neighbors.delete(vertex)
    end
  end
end

Disjoint Path Pattern: Finding maximum number of edge-disjoint paths from source to sink reduces to max flow with all capacities set to 1. The flow value equals the number of disjoint paths, and the paths can be extracted from the flow assignment.

def find_disjoint_paths(graph, source, sink)
  network = FlowNetwork.new
  
  graph.each do |u, neighbors|
    neighbors.each do |v|
      network.add_edge(u, v, 1)
    end
  end
  
  solver = EdmondsKarp.new(network)
  num_paths = solver.max_flow(source, sink)
  
  # Extract paths from flow
  paths = []
  num_paths.times do
    path = trace_path(network, source, sink)
    paths << path if path
  end
  
  paths
end

def trace_path(network, source, sink)
  path = [source]
  current = source
  
  while current != sink
    next_vertex = network.graph[current].find do |neighbor, _|
      network.flow[[current, neighbor]] > 0
    end
    
    return nil unless next_vertex
    
    neighbor = next_vertex[0]
    path << neighbor
    network.flow[[current, neighbor]] -= 1
    current = neighbor
  end
  
  path
end

Circulation Pattern: A circulation is a flow with no source or sink where flow is conserved at all vertices. Additionally, each edge may have both a lower bound l(u, v) and upper bound u(u, v) on flow: l(u, v) ≤ f(u, v) ≤ u(u, v).

Circulation with lower bounds transforms to standard max flow by:

Send l(u, v) flow on each edge initially
Create new edge capacities: c'(u, v) = u(u, v) - l(u, v)
Add a source and sink to balance excess/deficit at vertices

Cost Flow Pattern: Minimum cost maximum flow assigns costs to edges and seeks the maximum flow with minimum total cost. While beyond standard max flow, the pattern appears frequently in optimization problems like:

Shipping products to minimize transportation costs
Assigning tasks to minimize completion time
Network routing to minimize latency

The successive shortest paths algorithm solves min-cost max flow by repeatedly finding augmenting paths with minimum cost (using Bellman-Ford or Dijkstra).

Reference

Algorithm Comparison

Algorithm	Time Complexity	Space	Best Use Case
Ford-Fulkerson	O(E times max_flow)	O(V + E)	Small integer capacities
Edmonds-Karp	O(V times E squared)	O(V + E)	General purpose, guaranteed polynomial
Dinic	O(V squared times E)	O(V + E)	Dense graphs, bipartite matching
Push-Relabel	O(V squared times E)	O(V + E)	Dense graphs, practical performance
Capacity Scaling	O(E squared times log U)	O(V + E)	Small maximum capacity U

Key Terminology

Term	Definition
Flow Network	Directed graph with source, sink, and edge capacities
Capacity	Maximum flow allowed on an edge
Flow	Actual amount flowing on an edge
Source	Vertex where flow originates
Sink	Vertex where flow terminates
Augmenting Path	Path from source to sink in residual network
Residual Network	Graph showing remaining capacity after current flow
Residual Capacity	Remaining capacity on an edge given current flow
Cut	Partition of vertices into source-side and sink-side sets
Capacity of Cut	Sum of capacities from source-side to sink-side
Minimum Cut	Cut with minimum capacity
Blocking Flow	Flow where every source-to-sink path has saturated edge
Level Graph	Subgraph with edges on shortest paths from source

Problem Transformations

Problem	Transformation
Bipartite Matching	Add source to left vertices, sink from right vertices, capacity 1
Edge Connectivity	All edge capacities 1, max flow equals min edges to remove
Vertex Connectivity	Split vertices, all capacities 1
Project Selection	Positive profits from source, negative to sink, dependencies infinite
Image Segmentation	Pixels as vertices, source for foreground, sink for background
Multiple Sources	Add supersource with infinite capacity to each source
Multiple Sinks	Add supersink with infinite capacity from each sink
Vertex Capacities	Split vertex into in and out nodes with capacity edge
Disjoint Paths	All capacities 1, flow value equals number of paths

Complexity Classes

Graph Type	V	E	Best Algorithm	Time
Sparse	n	O(n)	Edmonds-Karp	O(n cubed)
Dense	n	O(n squared)	Push-Relabel	O(n cubed)
Bipartite	n	m	Dinic	O(m times sqrt n)
Unit Capacity	n	m	Dinic	O(min(n to 2/3, m to 1/2) times m)
Small Capacity U	n	m	Scaling	O(m squared times log U)

Implementation Checklist

Component	Considerations
Graph Representation	Hash of hashes, adjacency list, edge list
Flow Storage	Hash of edge pairs to flow values
Residual Capacity	Forward and backward edges, handle both directions
Path Finding	BFS for Edmonds-Karp, DFS for Dinic, custom for others
Flow Update	Increment forward, decrement backward along path
Termination	No augmenting path exists in residual network
Flow Value	Sum of flow from source or into sink
Min Cut	BFS/DFS from source in residual graph after max flow

Network Flow