Overview
Network flow addresses a fundamental class of optimization problems in computer science and operations research. The core problem involves a directed graph where edges have capacity constraints, and the goal is to determine the maximum amount of flow that can be sent from a source node to a sink node while respecting these constraints.
The concept originated from transportation and distribution problems. Consider a pipeline network where oil flows from production facilities to refineries, or a road network where traffic moves from suburbs to city centers. Each pipe or road has a maximum capacity, and the network flow problem determines the maximum throughput possible given these constraints.
Network flow problems appear across diverse domains: matching problems in bipartite graphs, image segmentation in computer vision, airline crew scheduling, network reliability analysis, and resource allocation in distributed systems. The algorithms developed for network flow also provide solutions to seemingly unrelated problems through problem transformation.
A flow network consists of a directed graph G = (V, E) with:
- A source vertex s where flow originates
- A sink vertex t where flow terminates
- Capacity function c(u, v) ≥ 0 for each edge (u, v)
- Flow function f(u, v) representing actual flow on each edge
The maximum flow problem asks: what is the maximum amount of flow that can be sent from s to t?
# Simple flow network representation
network = {
's' => { 'a' => 10, 'b' => 5 },
'a' => { 'b' => 15, 't' => 10 },
'b' => { 't' => 10 }
}
# Maximum flow from 's' to 't' is 15
Key Principles
The foundation of network flow rests on several mathematical principles that govern how flow behaves in a network.
Capacity Constraints: For each edge (u, v) in the network, the flow cannot exceed the edge's capacity: 0 ≤ f(u, v) ≤ c(u, v). This constraint ensures physical or logical limits are respected. In a water pipe network, you cannot pump more water than the pipe's diameter allows.
Flow Conservation: At every vertex except the source and sink, the total incoming flow equals the total outgoing flow. Mathematically, for all vertices v ∈ V - {s, t}: Σ f(u, v) = Σ f(v, w) where the first sum is over all incoming edges and the second over all outgoing edges. This principle ensures flow is neither created nor destroyed at intermediate nodes.
Skew Symmetry: The flow from u to v is the negative of flow from v to u: f(u, v) = -f(v, u). This property handles the mathematical representation of flow in both directions and simplifies certain algorithmic implementations.
The value of a flow |f| equals the total flow leaving the source (or equivalently, entering the sink): |f| = Σ f(s, v) for all v where (s, v) ∈ E.
A residual network G_f represents remaining capacity after accounting for current flow. For each edge (u, v) with flow f(u, v) and capacity c(u, v), the residual network contains:
- Forward edge (u, v) with residual capacity c_f(u, v) = c(u, v) - f(u, v)
- Backward edge (v, u) with residual capacity c_f(v, u) = f(u, v)
The backward edge allows the algorithm to "undo" flow sent earlier if a better path is found.
An augmenting path is a path from s to t in the residual network where every edge has positive residual capacity. The existence of an augmenting path indicates the current flow can be increased. The minimum residual capacity along an augmenting path determines how much additional flow can be pushed through that path.
A cut (S, T) partitions vertices into two sets where s ∈ S and t ∈ T. The capacity of a cut equals the sum of capacities of edges going from S to T. A minimum cut is a cut with minimum capacity among all cuts.
The Max-Flow Min-Cut Theorem states that the maximum value of flow in a network equals the minimum capacity of all cuts. This fundamental result connects flow maximization with cut minimization and provides the theoretical foundation for proving algorithm correctness.
# Residual capacity calculation
def residual_capacity(capacity, flow, u, v)
# Forward edge: remaining capacity
forward = capacity[[u, v]] - flow[[u, v]]
# Backward edge: flow that can be cancelled
backward = flow[[v, u]]
forward + backward
end
Ruby Implementation
Implementing network flow algorithms in Ruby requires representing the graph structure and flow values, then applying an augmenting path algorithm until no more augmenting paths exist.
Graph Representation: A hash-based adjacency list provides a clean representation where keys are vertices and values are hashes mapping neighbors to capacities.
class FlowNetwork
attr_reader :graph, :flow
def initialize
@graph = Hash.new { |h, k| h[k] = {} }
@flow = Hash.new(0)
end
def add_edge(from, to, capacity)
@graph[from][to] = capacity
@graph[to][from] ||= 0 # Ensure reverse edge exists
end
def residual_capacity(u, v)
capacity = @graph[u][v] || 0
capacity - @flow[[u, v]] + @flow[[v, u]]
end
end
Ford-Fulkerson Method: This method repeatedly finds augmenting paths and pushes flow along them. The method terminates when no augmenting path exists. The specific implementation depends on how augmenting paths are found.
class FordFulkerson
def initialize(network)
@network = network
@flow = Hash.new(0)
end
def max_flow(source, sink)
total_flow = 0
while path = find_path(source, sink)
# Find minimum residual capacity along path
flow_increment = path.each_cons(2).map { |u, v|
@network.residual_capacity(u, v)
}.min
# Update flow along path
path.each_cons(2) do |u, v|
@flow[[u, v]] += flow_increment
@flow[[v, u]] -= flow_increment
end
total_flow += flow_increment
end
@network.flow = @flow
total_flow
end
private
def find_path(source, sink)
visited = Set.new
queue = [[source, [source]]]
while !queue.empty?
current, path = queue.shift
return path if current == sink
next if visited.include?(current)
visited.add(current)
@network.graph[current].each do |neighbor, _|
if @network.residual_capacity(current, neighbor) > 0
queue << [neighbor, path + [neighbor]]
end
end
end
nil
end
end
Edmonds-Karp Algorithm: This variant of Ford-Fulkerson uses breadth-first search to find the shortest augmenting path (in terms of number of edges). This guarantees polynomial time complexity.
class EdmondsKarp
def initialize(network)
@network = network
@flow = Hash.new(0)
end
def max_flow(source, sink)
total_flow = 0
while path_data = bfs_path(source, sink)
path, min_capacity = path_data
# Update flow
path.each_cons(2) do |u, v|
@flow[[u, v]] += min_capacity
@flow[[v, u]] -= min_capacity
end
total_flow += min_capacity
end
@network.flow = @flow
total_flow
end
private
def bfs_path(source, sink)
parent = {}
visited = Set.new([source])
queue = [source]
while !queue.empty?
current = queue.shift
@network.graph[current].each do |neighbor, capacity|
residual = @network.residual_capacity(current, neighbor)
if residual > 0 && !visited.include?(neighbor)
parent[neighbor] = current
visited.add(neighbor)
return build_path(parent, source, sink) if neighbor == sink
queue << neighbor
end
end
end
nil
end
def build_path(parent, source, sink)
path = [sink]
current = sink
while current != source
current = parent[current]
path.unshift(current)
end
min_capacity = path.each_cons(2).map { |u, v|
@network.residual_capacity(u, v)
}.min
[path, min_capacity]
end
end
Handling Multiple Sources or Sinks: Transform the problem by adding a supersource connected to all sources with infinite capacity, or a supersink connected from all sinks with infinite capacity.
def add_supersource(network, sources)
supersource = :supersource
sources.each do |source|
network.add_edge(supersource, source, Float::INFINITY)
end
supersource
end
def add_supersink(network, sinks)
supersink = :supersink
sinks.each do |sink|
network.add_edge(sink, supersink, Float::INFINITY)
end
supersink
end
Minimum Cut Calculation: After computing maximum flow, the minimum cut can be found by identifying vertices reachable from the source in the residual graph.
def minimum_cut(network, source, sink)
reachable = find_reachable(network, source)
cut_edges = []
reachable.each do |u|
network.graph[u].each do |v, capacity|
if !reachable.include?(v) && capacity > 0
cut_edges << [u, v, capacity]
end
end
end
cut_edges
end
def find_reachable(network, source)
visited = Set.new
queue = [source]
while !queue.empty?
current = queue.shift
next if visited.include?(current)
visited.add(current)
network.graph[current].each do |neighbor, _|
if network.residual_capacity(current, neighbor) > 0
queue << neighbor
end
end
end
visited
end
Practical Examples
Network flow algorithms solve a wide range of real-world problems through direct modeling or problem transformation.
Bipartite Matching: Given a bipartite graph with sets L and R, find the maximum number of edges such that no two edges share a vertex. Transform to max flow by adding a source connected to all L vertices and a sink connected from all R vertices, with all edges having capacity 1.
def max_bipartite_matching(left, right, edges)
network = FlowNetwork.new
source = :source
sink = :sink
# Connect source to left vertices
left.each do |l|
network.add_edge(source, l, 1)
end
# Add edges between left and right
edges.each do |l, r|
network.add_edge(l, r, 1)
end
# Connect right vertices to sink
right.each do |r|
network.add_edge(r, sink, 1)
end
solver = EdmondsKarp.new(network)
matching_size = solver.max_flow(source, sink)
# Extract matching edges
matching = []
left.each do |l|
right.each do |r|
if network.flow[[l, r]] == 1
matching << [l, r]
end
end
end
{ size: matching_size, edges: matching }
end
# Example: Job assignment
programmers = [:alice, :bob, :charlie]
projects = [:web, :mobile, :backend]
skills = [
[:alice, :web], [:alice, :backend],
[:bob, :web], [:bob, :mobile],
[:charlie, :backend], [:charlie, :mobile]
]
result = max_bipartite_matching(programmers, projects, skills)
# => { size: 3, edges: [[:alice, :backend], [:bob, :mobile], [:charlie, :web]] }
Network Reliability: Determine the minimum number of edges whose removal disconnects source from sink. This equals the value of maximum flow when all edges have capacity 1.
def edge_connectivity(graph, source, sink)
network = FlowNetwork.new
graph.each do |u, neighbors|
neighbors.each do |v|
network.add_edge(u, v, 1)
end
end
solver = EdmondsKarp.new(network)
solver.max_flow(source, sink)
end
# Network with redundant paths
network_graph = {
'A' => ['B', 'C'],
'B' => ['D', 'E'],
'C' => ['D', 'E'],
'D' => ['F'],
'E' => ['F']
}
connectivity = edge_connectivity(network_graph, 'A', 'F')
# => 2 (must remove at least 2 edges to disconnect A from F)
Project Selection: Given projects with profits and dependencies, select a subset maximizing total profit. Transform to min-cut problem where cut capacity represents lost profit from rejected projects.
def project_selection(projects, dependencies)
network = FlowNetwork.new
source = :source
sink = :sink
total_profit = 0
# Projects with positive profit connect from source
projects.each do |project, profit|
if profit > 0
network.add_edge(source, project, profit)
total_profit += profit
else
network.add_edge(project, sink, -profit)
end
end
# Dependencies: if we select dependent, must select dependency
dependencies.each do |dependent, dependency|
network.add_edge(dependency, dependent, Float::INFINITY)
end
solver = EdmondsKarp.new(network)
min_cut_value = solver.max_flow(source, sink)
# Maximum profit = total positive profit - min cut
max_profit = total_profit - min_cut_value
# Selected projects are those reachable from source
selected = find_reachable(network, source) - [source]
{ profit: max_profit, projects: selected }
end
projects_data = {
:A => 100, :B => 200, :C => -50, :D => 150
}
deps = [[:A, :C], [:B, :C], [:D, :A]]
selection = project_selection(projects_data, deps)
# => { profit: 250, projects: [:B, :C, :D, :A] }
Image Segmentation: Partition image pixels into foreground and background by minimizing boundary cost. Each pixel becomes a vertex, with edges to neighbors weighted by similarity. Source connects to likely foreground pixels, sink to likely background pixels.
def image_segmentation(pixels, foreground_weights, background_weights, neighbor_weights)
network = FlowNetwork.new
source = :source
sink = :sink
pixels.each do |pixel|
# Foreground likelihood
if foreground_weights[pixel] > 0
network.add_edge(source, pixel, foreground_weights[pixel])
end
# Background likelihood
if background_weights[pixel] > 0
network.add_edge(pixel, sink, background_weights[pixel])
end
end
# Neighbor similarity (cost to separate)
neighbor_weights.each do |(p1, p2), weight|
network.add_edge(p1, p2, weight)
network.add_edge(p2, p1, weight)
end
solver = EdmondsKarp.new(network)
solver.max_flow(source, sink)
foreground = find_reachable(network, source) - [source]
background = pixels - foreground
{ foreground: foreground, background: background }
end
Performance Considerations
Different network flow algorithms exhibit vastly different performance characteristics depending on network structure and edge capacities.
Ford-Fulkerson Method: The runtime depends on how augmenting paths are found and the maximum flow value. In the worst case with integer capacities, if the maximum flow is F and there are E edges, the algorithm runs in O(E * F) time. Each augmenting path increases flow by at least 1, so at most F iterations occur, and each path search takes O(E) time.
This analysis reveals a critical weakness: performance degrades when maximum flow value is large. Consider a network where maximum flow equals 1,000,000 but only 3 augmenting paths are needed if chosen wisely. Ford-Fulkerson might find 1,000,000 augmenting paths if it makes poor choices, each increasing flow by only 1.
Edmonds-Karp Algorithm: Using BFS to find shortest augmenting paths guarantees O(V * E^2) time complexity, independent of flow values. The algorithm performs at most O(V * E) augmenting path iterations because BFS ensures each iteration increases the distance from source to some vertex. Each BFS takes O(E) time.
This represents a significant improvement when capacities are large integers or irrational numbers. The algorithm's performance depends only on network size, not capacity values.
require 'benchmark'
def benchmark_flow_algorithms(network, source, sink)
Benchmark.bm(20) do |x|
x.report("Ford-Fulkerson:") {
ff = FordFulkerson.new(network.dup)
ff.max_flow(source, sink)
}
x.report("Edmonds-Karp:") {
ek = EdmondsKarp.new(network.dup)
ek.max_flow(source, sink)
}
end
end
Dinic's Algorithm: This advanced algorithm runs in O(V^2 * E) time by using level graphs and blocking flows. The algorithm builds a level graph using BFS, then finds blocking flows using DFS. A blocking flow is a flow where every path from source to sink contains at least one saturated edge.
Dinic's algorithm performs better than Edmonds-Karp on dense graphs or graphs with special structure. For bipartite matching, Dinic's achieves O(E * √V) time, significantly faster than the general bound.
Push-Relabel Algorithms: These algorithms achieve O(V^2 * E) or O(V^3) time through a different approach. Instead of finding augmenting paths, they maintain a preflow (which violates conservation at some vertices) and push excess flow toward the sink. The relabel operation adjusts vertex heights to enable pushing flow.
Push-relabel algorithms excel on dense graphs and often outperform augmenting path methods in practice despite similar theoretical complexity.
Capacity Scaling: When edge capacities are integers, capacity scaling reduces the effective maximum flow value by processing high-capacity edges first. The algorithm runs in O(E^2 * log U) time where U is the maximum capacity. For networks where U is small compared to the maximum flow, this provides substantial speedup.
Practical Performance Factors: Real-world performance depends on graph density, capacity distribution, and source-sink distance:
- Sparse graphs (E ≈ V): Edmonds-Karp or Dinic's algorithm performs well
- Dense graphs (E ≈ V^2): Push-relabel algorithms often faster
- Small integer capacities: Capacity scaling effective
- Unit capacity networks: Specialized algorithms like Hopcroft-Karp for bipartite matching
- Long paths from source to sink: Algorithms with better distance handling perform better
# Performance comparison on different graph types
def compare_on_graph_types
# Sparse graph
sparse = create_sparse_network(1000, 2000)
# Dense graph
dense = create_dense_network(100, 4000)
# Unit capacity graph
unit = create_unit_capacity_network(500, 1500)
[
["Sparse (V=1000, E=2000)", sparse],
["Dense (V=100, E=4000)", dense],
["Unit capacity (V=500, E=1500)", unit]
].each do |name, network|
puts "\n#{name}:"
benchmark_flow_algorithms(network, :s, :t)
end
end
Common Patterns
Several recurring patterns appear when modeling problems as network flow.
Source-Sink Transformation Pattern: Problems with multiple sources or sinks transform to single source-sink by adding supersource or supersink nodes. The supersource connects to all original sources with infinite capacity, ensuring it never constrains flow. Similarly, all original sinks connect to the supersink with infinite capacity.
This pattern applies when modeling scenarios like:
- Multiple warehouses (sources) shipping to multiple stores (sinks)
- Multiple power plants (sources) supplying multiple cities (sinks)
- Multiple data sources feeding into multiple processing systems
Vertex Capacity Pattern: When vertices have capacity constraints (not just edges), split each vertex v into v_in and v_out with an edge (v_in, v_out) of capacity equal to the vertex capacity. All incoming edges connect to v_in, and all outgoing edges leave from v_out.
def add_vertex_capacity(network, vertex, capacity)
v_in = "#{vertex}_in".to_sym
v_out = "#{vertex}_out".to_sym
# Capacity edge through vertex
network.add_edge(v_in, v_out, capacity)
# Redirect edges
network.graph[vertex].each do |neighbor, edge_capacity|
network.add_edge(v_out, neighbor, edge_capacity)
network.graph[vertex].delete(neighbor)
end
network.graph.each do |u, neighbors|
if neighbors.key?(vertex)
capacity = neighbors[vertex]
network.add_edge(u, v_in, capacity)
neighbors.delete(vertex)
end
end
end
Disjoint Path Pattern: Finding maximum number of edge-disjoint paths from source to sink reduces to max flow with all capacities set to 1. The flow value equals the number of disjoint paths, and the paths can be extracted from the flow assignment.
def find_disjoint_paths(graph, source, sink)
network = FlowNetwork.new
graph.each do |u, neighbors|
neighbors.each do |v|
network.add_edge(u, v, 1)
end
end
solver = EdmondsKarp.new(network)
num_paths = solver.max_flow(source, sink)
# Extract paths from flow
paths = []
num_paths.times do
path = trace_path(network, source, sink)
paths << path if path
end
paths
end
def trace_path(network, source, sink)
path = [source]
current = source
while current != sink
next_vertex = network.graph[current].find do |neighbor, _|
network.flow[[current, neighbor]] > 0
end
return nil unless next_vertex
neighbor = next_vertex[0]
path << neighbor
network.flow[[current, neighbor]] -= 1
current = neighbor
end
path
end
Circulation Pattern: A circulation is a flow with no source or sink where flow is conserved at all vertices. Additionally, each edge may have both a lower bound l(u, v) and upper bound u(u, v) on flow: l(u, v) ≤ f(u, v) ≤ u(u, v).
Circulation with lower bounds transforms to standard max flow by:
- Send l(u, v) flow on each edge initially
- Create new edge capacities: c'(u, v) = u(u, v) - l(u, v)
- Add a source and sink to balance excess/deficit at vertices
Cost Flow Pattern: Minimum cost maximum flow assigns costs to edges and seeks the maximum flow with minimum total cost. While beyond standard max flow, the pattern appears frequently in optimization problems like:
- Shipping products to minimize transportation costs
- Assigning tasks to minimize completion time
- Network routing to minimize latency
The successive shortest paths algorithm solves min-cost max flow by repeatedly finding augmenting paths with minimum cost (using Bellman-Ford or Dijkstra).
Reference
Algorithm Comparison
| Algorithm | Time Complexity | Space | Best Use Case |
|---|---|---|---|
| Ford-Fulkerson | O(E times max_flow) | O(V + E) | Small integer capacities |
| Edmonds-Karp | O(V times E squared) | O(V + E) | General purpose, guaranteed polynomial |
| Dinic | O(V squared times E) | O(V + E) | Dense graphs, bipartite matching |
| Push-Relabel | O(V squared times E) | O(V + E) | Dense graphs, practical performance |
| Capacity Scaling | O(E squared times log U) | O(V + E) | Small maximum capacity U |
Key Terminology
| Term | Definition |
|---|---|
| Flow Network | Directed graph with source, sink, and edge capacities |
| Capacity | Maximum flow allowed on an edge |
| Flow | Actual amount flowing on an edge |
| Source | Vertex where flow originates |
| Sink | Vertex where flow terminates |
| Augmenting Path | Path from source to sink in residual network |
| Residual Network | Graph showing remaining capacity after current flow |
| Residual Capacity | Remaining capacity on an edge given current flow |
| Cut | Partition of vertices into source-side and sink-side sets |
| Capacity of Cut | Sum of capacities from source-side to sink-side |
| Minimum Cut | Cut with minimum capacity |
| Blocking Flow | Flow where every source-to-sink path has saturated edge |
| Level Graph | Subgraph with edges on shortest paths from source |
Problem Transformations
| Problem | Transformation |
|---|---|
| Bipartite Matching | Add source to left vertices, sink from right vertices, capacity 1 |
| Edge Connectivity | All edge capacities 1, max flow equals min edges to remove |
| Vertex Connectivity | Split vertices, all capacities 1 |
| Project Selection | Positive profits from source, negative to sink, dependencies infinite |
| Image Segmentation | Pixels as vertices, source for foreground, sink for background |
| Multiple Sources | Add supersource with infinite capacity to each source |
| Multiple Sinks | Add supersink with infinite capacity from each sink |
| Vertex Capacities | Split vertex into in and out nodes with capacity edge |
| Disjoint Paths | All capacities 1, flow value equals number of paths |
Complexity Classes
| Graph Type | V | E | Best Algorithm | Time |
|---|---|---|---|---|
| Sparse | n | O(n) | Edmonds-Karp | O(n cubed) |
| Dense | n | O(n squared) | Push-Relabel | O(n cubed) |
| Bipartite | n | m | Dinic | O(m times sqrt n) |
| Unit Capacity | n | m | Dinic | O(min(n to 2/3, m to 1/2) times m) |
| Small Capacity U | n | m | Scaling | O(m squared times log U) |
Implementation Checklist
| Component | Considerations |
|---|---|
| Graph Representation | Hash of hashes, adjacency list, edge list |
| Flow Storage | Hash of edge pairs to flow values |
| Residual Capacity | Forward and backward edges, handle both directions |
| Path Finding | BFS for Edmonds-Karp, DFS for Dinic, custom for others |
| Flow Update | Increment forward, decrement backward along path |
| Termination | No augmenting path exists in residual network |
| Flow Value | Sum of flow from source or into sink |
| Min Cut | BFS/DFS from source in residual graph after max flow |