Overview
Graph traversal algorithms explore nodes and edges in a graph data structure. Breadth-First Search (BFS) and Depth-First Search (DFS) represent the two primary traversal strategies, each visiting all reachable nodes from a starting point but differing fundamentally in their exploration order.
BFS explores nodes level by level, visiting all neighbors of the current node before moving to neighbors of those neighbors. This approach uses a queue data structure to maintain the order of exploration, ensuring that nodes closer to the starting point are visited before more distant nodes.
DFS explores as far as possible along each branch before backtracking. This approach uses either recursion or an explicit stack, diving deep into the graph structure before exploring breadth. The algorithm visits a node, then immediately proceeds to an unvisited neighbor, continuing this pattern until reaching a dead end.
Both algorithms guarantee visiting every reachable node exactly once when properly implemented with visited tracking. The choice between them depends on the specific problem requirements, graph characteristics, and resource constraints.
Graph traversal forms the foundation for numerous algorithms including pathfinding, cycle detection, topological sorting, connected component identification, and network analysis. Web crawlers use BFS to discover pages, social networks employ graph traversal to compute relationships, and compilers use DFS for syntax tree analysis.
# Simple graph representation
graph = {
'A' => ['B', 'C'],
'B' => ['D', 'E'],
'C' => ['F'],
'D' => [],
'E' => ['F'],
'F' => []
}
# BFS visits: A, B, C, D, E, F
# DFS visits: A, B, D, E, F, C (one possible order)
Key Principles
Graph traversal operates on graph structures consisting of vertices (nodes) and edges (connections between nodes). The traversal process marks nodes as visited to prevent infinite loops in graphs with cycles and ensures each node is processed exactly once.
BFS employs a queue following First-In-First-Out (FIFO) ordering. The algorithm initializes by enqueuing the starting node and marking it as visited. During each iteration, it dequeues a node, processes it, then enqueues all unvisited neighbors. This creates a wave-like expansion pattern radiating outward from the starting point. The queue ensures that nodes at distance d from the start are all visited before any nodes at distance d+1.
# BFS exploration pattern for tree
# 1
# / \
# 2 3
# / \ \
# 4 5 6
#
# Visit order: 1, 2, 3, 4, 5, 6
# Level 0: 1
# Level 1: 2, 3
# Level 2: 4, 5, 6
DFS employs a stack following Last-In-First-Out (LIFO) ordering, either explicitly or through recursive call stack. The algorithm visits a node, marks it as visited, then immediately proceeds to an unvisited neighbor. When no unvisited neighbors remain, it backtracks to the previous node and continues exploration. This creates a path-like exploration pattern diving deep before exploring breadth.
# DFS exploration pattern for same tree
# Visit order (one possibility): 1, 2, 4, 5, 3, 6
# Path: 1 -> 2 -> 4 (backtrack) -> 5 (backtrack) -> 3 -> 6
The visited set tracks which nodes have been explored, preventing duplicate processing and infinite loops in cyclic graphs. Without visited tracking, cycles cause infinite traversal as the algorithm revisits the same nodes repeatedly.
# Graph with cycle
cycle_graph = {
'A' => ['B'],
'B' => ['C'],
'C' => ['A'] # cycle back to A
}
# Without visited tracking: A -> B -> C -> A -> B -> C -> ...
# With visited tracking: A -> B -> C (stops)
The traversal frontier represents nodes discovered but not yet processed. BFS maintains this frontier as a queue, while DFS maintains it as a stack. The frontier structure determines exploration order and distinguishes the two algorithms.
Parent tracking during traversal enables path reconstruction from the starting node to any visited node. Each node records which node discovered it, creating a tree structure overlaid on the original graph. This proves essential for shortest path finding and navigation applications.
Graph representation affects traversal implementation. Adjacency lists store each node's neighbors in a collection, while adjacency matrices use a 2D array indicating edge presence. Adjacency lists generally provide better performance for sparse graphs, while matrices suit dense graphs or frequent edge existence queries.
Design Considerations
Selecting between BFS and DFS depends on problem requirements, graph characteristics, and resource constraints. Neither algorithm is universally superior; each excels in specific scenarios.
BFS guarantees finding the shortest path in unweighted graphs because it explores nodes in order of increasing distance from the start. When the first path to the target is discovered, no shorter path exists. This property makes BFS the correct choice for minimum hop count problems, such as finding degrees of separation in social networks or minimum moves in game state spaces.
# Finding shortest path between two users
def shortest_connection(graph, start_user, target_user)
queue = [[start_user]]
visited = Set.new([start_user])
until queue.empty?
path = queue.shift
node = path.last
return path if node == target_user
graph[node].each do |neighbor|
unless visited.include?(neighbor)
visited.add(neighbor)
queue.push(path + [neighbor])
end
end
end
nil
end
DFS requires less memory for deep graphs, storing only the current path from root to the current node plus unexplored siblings. Memory usage grows with graph depth rather than breadth. For graphs with high branching factors but limited depth, DFS uses significantly less memory than BFS.
DFS naturally suits problems requiring path enumeration or backtracking. Topological sorting, finding strongly connected components, and solving puzzles with constraint satisfaction all benefit from DFS's path-oriented exploration. The recursive structure of DFS aligns with problems that build solutions incrementally and backtrack when constraints are violated.
# Finding all paths between two nodes
def find_all_paths(graph, start, target, path = [], visited = Set.new)
path = path + [start]
visited = visited.dup.add(start)
return [path] if start == target
paths = []
graph[start].each do |neighbor|
unless visited.include?(neighbor)
paths.concat(find_all_paths(graph, neighbor, target, path, visited))
end
end
paths
end
BFS performs better for shallow targets in graphs with high branching factors. If the target resides near the starting point, BFS discovers it quickly without exploring unnecessary deep branches. Search engines crawling web pages use BFS to find content close to seed URLs before exploring distant pages.
DFS performs better for deep targets in graphs with low branching factors. In decision trees or game state exploration where solutions lie deep in the search space, DFS reaches them without expanding the massive frontiers that BFS would create.
Cycle detection requires DFS to track nodes in the current recursion path, distinguishing between back edges (creating cycles) and cross edges (connecting different branches). BFS cannot efficiently detect cycles because it lacks the path context that recursive DFS naturally maintains.
Memory-constrained environments favor iterative DFS with explicit stack size limits, while thread-safe implementations favor BFS because queue operations are easier to synchronize than recursive stack manipulation.
Ruby Implementation
Ruby provides built-in data structures that facilitate both BFS and DFS implementation. Arrays serve as both queues and stacks through shift/push operations, while Sets efficiently track visited nodes.
require 'set'
class Graph
def initialize
@adjacency_list = Hash.new { |h, k| h[k] = [] }
end
def add_edge(from, to)
@adjacency_list[from] << to
end
def neighbors(node)
@adjacency_list[node]
end
def bfs(start, &block)
visited = Set.new
queue = [start]
visited.add(start)
until queue.empty?
node = queue.shift
block.call(node) if block
neighbors(node).each do |neighbor|
unless visited.include?(neighbor)
visited.add(neighbor)
queue.push(neighbor)
end
end
end
visited
end
def dfs_iterative(start, &block)
visited = Set.new
stack = [start]
until stack.empty?
node = stack.pop
next if visited.include?(node)
visited.add(node)
block.call(node) if block
neighbors(node).reverse_each do |neighbor|
stack.push(neighbor) unless visited.include?(neighbor)
end
end
visited
end
def dfs_recursive(start, visited = Set.new, &block)
return if visited.include?(start)
visited.add(start)
block.call(start) if block
neighbors(start).each do |neighbor|
dfs_recursive(neighbor, visited, &block)
end
visited
end
end
The BFS implementation uses shift to dequeue from the front and push to enqueue at the back, maintaining FIFO ordering. Nodes are marked as visited when enqueued rather than when dequeued, preventing duplicate entries in the queue.
The iterative DFS uses pop to retrieve the most recently added node, creating LIFO ordering. The reverse iteration over neighbors maintains the same left-to-right exploration order as recursive DFS, though exploration order varies based on neighbor processing sequence.
# Building and traversing a graph
graph = Graph.new
graph.add_edge('A', 'B')
graph.add_edge('A', 'C')
graph.add_edge('B', 'D')
graph.add_edge('B', 'E')
graph.add_edge('C', 'F')
graph.add_edge('E', 'F')
puts "BFS traversal:"
graph.bfs('A') { |node| puts node }
# Output: A B C D E F
puts "\nDFS traversal (iterative):"
graph.dfs_iterative('A') { |node| puts node }
# Output: A C F B E D (one possible order)
puts "\nDFS traversal (recursive):"
graph.dfs_recursive('A') { |node| puts node }
# Output: A B D E F C (one possible order)
Implementing shortest path finding requires tracking parent relationships during BFS:
class Graph
def shortest_path(start, target)
return [start] if start == target
visited = Set.new([start])
queue = [start]
parent = { start => nil }
until queue.empty?
node = queue.shift
return reconstruct_path(parent, start, target) if node == target
neighbors(node).each do |neighbor|
unless visited.include?(neighbor)
visited.add(neighbor)
parent[neighbor] = node
queue.push(neighbor)
end
end
end
nil
end
private
def reconstruct_path(parent, start, target)
path = []
current = target
while current
path.unshift(current)
current = parent[current]
end
path
end
end
DFS lends itself to detecting cycles through color marking: white (unvisited), gray (currently being processed), and black (fully processed). A gray node encountering another gray node indicates a back edge and thus a cycle:
class Graph
def has_cycle?
color = Hash.new(:white)
@adjacency_list.keys.each do |node|
return true if dfs_cycle_check(node, color)
end
false
end
private
def dfs_cycle_check(node, color)
return false if color[node] == :black
return true if color[node] == :gray
color[node] = :gray
neighbors(node).each do |neighbor|
return true if dfs_cycle_check(neighbor, color)
end
color[node] = :black
false
end
end
Practical Examples
Graph traversal solves numerous real-world problems across different domains. Web crawlers use BFS to systematically discover and index web pages starting from seed URLs:
require 'set'
require 'uri'
class WebCrawler
def initialize(max_depth: 3)
@max_depth = max_depth
@visited = Set.new
end
def crawl(start_url)
queue = [[start_url, 0]]
@visited.add(start_url)
results = []
until queue.empty?
url, depth = queue.shift
next if depth > @max_depth
page_data = fetch_page(url)
results << page_data
extract_links(page_data[:content]).each do |link|
normalized = normalize_url(link, url)
unless @visited.include?(normalized)
@visited.add(normalized)
queue.push([normalized, depth + 1])
end
end
end
results
end
private
def fetch_page(url)
{ url: url, content: "page content", links: [] }
end
def extract_links(content)
[]
end
def normalize_url(link, base)
URI.join(base, link).to_s
rescue
nil
end
end
Social networks determine relationship distances and suggest connections using BFS to compute degrees of separation:
class SocialNetwork
def initialize
@friendships = Hash.new { |h, k| h[k] = Set.new }
end
def add_friendship(user1, user2)
@friendships[user1].add(user2)
@friendships[user2].add(user1)
end
def degrees_of_separation(user1, user2)
return 0 if user1 == user2
visited = Set.new([user1])
queue = [[user1, 0]]
until queue.empty?
user, distance = queue.shift
@friendships[user].each do |friend|
return distance + 1 if friend == user2
unless visited.include?(friend)
visited.add(friend)
queue.push([friend, distance + 1])
end
end
end
Float::INFINITY
end
def suggest_friends(user, max_distance: 2)
suggestions = Set.new
visited = Set.new([user])
queue = [[user, 0]]
until queue.empty?
current, distance = queue.shift
next if distance >= max_distance
@friendships[current].each do |friend|
unless visited.include?(friend)
visited.add(friend)
suggestions.add(friend) if distance > 0
queue.push([friend, distance + 1])
end
end
end
suggestions
end
end
File system traversal uses DFS to recursively explore directory structures:
class FileSystemTraversal
def initialize
@results = []
end
def find_files(directory, pattern: nil, max_depth: nil)
@results = []
dfs_traverse(directory, pattern, 0, max_depth)
@results
end
private
def dfs_traverse(path, pattern, depth, max_depth)
return if max_depth && depth > max_depth
return unless File.directory?(path)
Dir.entries(path).each do |entry|
next if entry == '.' || entry == '..'
full_path = File.join(path, entry)
if File.directory?(full_path)
dfs_traverse(full_path, pattern, depth + 1, max_depth)
elsif pattern.nil? || entry.match?(pattern)
@results << full_path
end
end
rescue Errno::EACCES
# Skip directories without permission
end
end
# Usage
traversal = FileSystemTraversal.new
ruby_files = traversal.find_files('/project', pattern: /\.rb$/, max_depth: 5)
Dependency resolution for build systems and package managers employs DFS to determine installation order through topological sorting:
class DependencyResolver
def initialize
@dependencies = Hash.new { |h, k| h[k] = [] }
end
def add_dependency(package, depends_on)
@dependencies[package] << depends_on
end
def installation_order(package)
visited = Set.new
stack = []
return nil unless topological_sort(package, visited, Set.new, stack)
stack.reverse
end
private
def topological_sort(package, visited, current_path, stack)
return true if visited.include?(package)
return false if current_path.include?(package) # Cycle detected
current_path.add(package)
@dependencies[package].each do |dependency|
unless topological_sort(dependency, visited, current_path, stack)
return false
end
end
current_path.delete(package)
visited.add(package)
stack.push(package)
true
end
end
Performance Considerations
BFS and DFS exhibit identical time complexity but different space complexity characteristics. Both algorithms visit each vertex once and examine each edge once, resulting in O(V + E) time complexity where V represents vertices and E represents edges. This linear time relationship holds for both adjacency list and adjacency matrix representations, though matrix representations incur higher constant factors.
BFS space complexity reaches O(V) in the worst case when the queue stores an entire level of the graph. The maximum queue size occurs at the widest level, which in a complete binary tree contains roughly V/2 nodes. For graphs with high branching factors like social networks where nodes have hundreds of connections, BFS memory consumption becomes substantial.
# BFS memory usage grows with breadth
# Graph with branching factor b and depth d
# Maximum queue size: b^d nodes at deepest level
#
# Example: Social network with average 100 friends
# Depth 3: 100^3 = 1,000,000 nodes in queue
DFS space complexity reaches O(V) in the worst case but typically uses O(h) where h represents maximum path length from root to any node. For balanced trees and sparse graphs, DFS uses significantly less memory than BFS. However, in pathological cases like linked lists or highly unbalanced trees, both algorithms converge to O(V) space usage.
# DFS memory usage grows with depth
# Recursive DFS: stack frames = path length
# Iterative DFS: explicit stack = path length + siblings
#
# Example: Binary tree with depth d
# DFS stack size: d frames (recursive) or d + log(V) nodes (iterative)
The visited set dominates memory usage for both algorithms in large graphs, requiring O(V) space regardless of traversal strategy. Hash-based sets provide O(1) lookup but consume more memory than bitmap representations for dense integer-keyed graphs.
Cache performance differs between algorithms. BFS exhibits poor cache locality because it jumps between distant graph regions as it expands each level. DFS maintains better cache locality by following edges deeper into related subgraphs before backtracking. This difference affects performance significantly for large graphs exceeding CPU cache capacity.
require 'benchmark'
# Performance comparison on large graph
graph = Graph.new
# Create graph with 10000 nodes
10000.times do |i|
rand(1..10).times do
graph.add_edge(i, rand(10000))
end
end
Benchmark.bm(15) do |x|
x.report("BFS:") { graph.bfs(0) }
x.report("DFS iterative:") { graph.dfs_iterative(0) }
x.report("DFS recursive:") { graph.dfs_recursive(0) }
end
Recursive DFS risks stack overflow for deep graphs exceeding the language runtime's stack size limit. Ruby's default stack size typically handles thousands of recursive calls, but graphs with depth exceeding this limit crash. Iterative implementations avoid this limitation at the cost of explicit stack management.
For weighted graphs requiring shortest paths, neither BFS nor DFS suffices. Dijkstra's algorithm or A* search provide correct results at higher computational cost. BFS finds shortest paths only when all edges have equal weight.
Common Patterns
Visited tracking prevents infinite loops and duplicate processing. Three approaches exist: boolean sets, color marking, and timestamp tracking. Boolean sets suffice for simple traversal, while color marking (white/gray/black) enables cycle detection and classification of edge types.
# Boolean visited set (simple traversal)
visited = Set.new
# Color marking (cycle detection)
color = Hash.new(:white) # :white, :gray, :black
# Timestamp tracking (edge classification)
discovery = {}
finish = {}
time = 0
Iterative deepening DFS combines DFS's memory efficiency with BFS's optimality for shortest paths. The algorithm performs repeated DFS searches with increasing depth limits, guaranteeing to find the shortest path while using only O(d) memory where d represents solution depth.
def iterative_deepening_dfs(graph, start, target)
depth = 0
loop do
result = depth_limited_dfs(graph, start, target, depth)
return result if result
depth += 1
break if depth > graph.size
end
nil
end
def depth_limited_dfs(graph, node, target, limit, visited = Set.new)
return [node] if node == target
return nil if limit <= 0
visited.add(node)
graph.neighbors(node).each do |neighbor|
next if visited.include?(neighbor)
path = depth_limited_dfs(graph, neighbor, target, limit - 1, visited)
return [node] + path if path
end
visited.delete(node)
nil
end
Bidirectional search runs simultaneous BFS from both start and target nodes, terminating when the frontiers intersect. This technique reduces search space from O(b^d) to O(b^(d/2)) for branching factor b and depth d.
def bidirectional_search(graph, start, target)
return [start] if start == target
forward_queue = [start]
backward_queue = [target]
forward_visited = { start => nil }
backward_visited = { target => nil }
until forward_queue.empty? || backward_queue.empty?
# Expand smaller frontier
if forward_queue.size <= backward_queue.size
node = forward_queue.shift
graph.neighbors(node).each do |neighbor|
if backward_visited.key?(neighbor)
return build_path(forward_visited, backward_visited, neighbor)
end
unless forward_visited.key?(neighbor)
forward_visited[neighbor] = node
forward_queue.push(neighbor)
end
end
else
node = backward_queue.shift
graph.neighbors(node).each do |neighbor|
if forward_visited.key?(neighbor)
return build_path(forward_visited, backward_visited, neighbor)
end
unless backward_visited.key?(neighbor)
backward_visited[neighbor] = node
backward_queue.push(neighbor)
end
end
end
end
nil
end
Parent tracking during traversal enables path reconstruction and tree construction. Each discovered node records its predecessor, allowing path extraction through backward traversal.
# Track parents during traversal
parent = { start => nil }
# Later reconstruct path
def build_path(parent, start, target)
path = []
current = target
while current
path.unshift(current)
current = parent[current]
end
path
end
Level tracking during BFS groups nodes by distance from the start, enabling level-order processing or computing exact distances.
def bfs_with_levels(graph, start)
visited = Set.new([start])
queue = [[start, 0]]
levels = Hash.new { |h, k| h[k] = [] }
until queue.empty?
node, level = queue.shift
levels[level] << node
graph.neighbors(node).each do |neighbor|
unless visited.include?(neighbor)
visited.add(neighbor)
queue.push([neighbor, level + 1])
end
end
end
levels
end
Multiple component traversal processes disconnected graphs by initiating traversal from each unvisited node, identifying all connected components.
def connected_components(graph)
visited = Set.new
components = []
graph.nodes.each do |node|
next if visited.include?(node)
component = Set.new
dfs(graph, node, visited, component)
components << component
end
components
end
def dfs(graph, node, visited, component)
return if visited.include?(node)
visited.add(node)
component.add(node)
graph.neighbors(node).each do |neighbor|
dfs(graph, neighbor, visited, component)
end
end
Reference
Algorithm Comparison
| Characteristic | BFS | DFS |
|---|---|---|
| Data Structure | Queue (FIFO) | Stack or recursion (LIFO) |
| Time Complexity | O(V + E) | O(V + E) |
| Space Complexity | O(V) worst case, O(w) typical | O(V) worst case, O(h) typical |
| Shortest Path | Finds shortest path in unweighted graphs | Does not guarantee shortest path |
| Memory Usage | High for wide graphs | High for deep graphs |
| Implementation | Iterative with queue | Recursive or iterative with stack |
| Use Case | Shortest paths, level-order | Topological sort, cycle detection |
Time Complexity by Graph Representation
| Representation | BFS | DFS | Space |
|---|---|---|---|
| Adjacency List | O(V + E) | O(V + E) | O(V + E) |
| Adjacency Matrix | O(V²) | O(V²) | O(V²) |
| Edge List | O(V * E) | O(V * E) | O(E) |
Space Complexity Components
| Component | BFS | DFS Recursive | DFS Iterative |
|---|---|---|---|
| Visited Set | O(V) | O(V) | O(V) |
| Frontier | O(V) | O(h) stack frames | O(h) explicit stack |
| Total | O(V) | O(V + h) | O(V + h) |
Common Applications
| Problem | Algorithm | Reason |
|---|---|---|
| Shortest path in unweighted graph | BFS | Level-by-level guarantees minimum distance |
| Path existence | Either | Both work equally well |
| All paths enumeration | DFS | Backtracking naturally enumerates paths |
| Topological sorting | DFS | Finish times determine ordering |
| Cycle detection | DFS | Gray nodes identify back edges |
| Connected components | Either | Both identify components |
| Strongly connected components | DFS | Required for Tarjan's and Kosaraju's algorithms |
| Bipartite checking | BFS | Level tracking enables color assignment |
| Maze solving | DFS | Memory efficient for deep mazes |
| Web crawling | BFS | Prioritizes pages closer to seed URLs |
Graph Traversal Methods
| Method | Return Type | Description |
|---|---|---|
| bfs(start) | Set | Returns all nodes reachable from start using BFS |
| dfs_iterative(start) | Set | Returns all nodes reachable from start using iterative DFS |
| dfs_recursive(start) | Set | Returns all nodes reachable from start using recursive DFS |
| shortest_path(start, target) | Array or nil | Returns shortest path between nodes or nil if unreachable |
| has_cycle? | Boolean | Detects cycle presence in directed graph |
Edge Types in DFS
| Edge Type | Definition | Detection |
|---|---|---|
| Tree Edge | Part of DFS spanning tree | Leads to unvisited (white) node |
| Back Edge | Creates cycle | Leads to ancestor (gray) node |
| Forward Edge | Shortcut in tree | Leads to descendant (black) node |
| Cross Edge | Between subtrees | Leads to unrelated (black) node |
Performance Characteristics
| Scenario | Preferred Algorithm | Space Usage | Notes |
|---|---|---|---|
| Wide shallow graph | DFS | O(h) where h is small | Queue would be very large |
| Deep narrow graph | DFS | O(h) where h is large | Still better than BFS queue |
| Need shortest path | BFS | O(w) where w is width | Required for correctness |
| Path enumeration | DFS | O(h) | Natural backtracking |
| Memory constrained | DFS iterative | O(h) | Bounded stack depth |
| Stack overflow risk | BFS | O(w) | No recursion depth limit |