Overview
Greedy algorithms solve optimization problems by making the locally optimal choice at each step. This approach selects the best immediate option without reconsidering previous decisions. The algorithm builds a solution incrementally, choosing what appears optimal at each decision point.
The greedy method applies to problems with specific structural properties. Not all optimization problems admit greedy solutions. When applicable, greedy algorithms often provide simpler and faster solutions than dynamic programming or exhaustive search approaches.
A classic example demonstrates the greedy approach: the coin change problem for standard currency denominations. To make 41 cents using U.S. coins (quarters, dimes, nickels, pennies), the greedy algorithm selects the largest coin that doesn't exceed the remaining amount:
def greedy_coin_change(amount, coins)
  sorted = coins.sort.reverse  # largest denomination first, without mutating the input
  result = []
  sorted.each do |coin|
    while amount >= coin
      result << coin
      amount -= coin
    end
  end
  result
end

greedy_coin_change(41, [1, 5, 10, 25])
# => [25, 10, 5, 1]
This produces an optimal solution: one quarter, one dime, one nickel, and one penny. The greedy choice (always taking the largest coin) happens to work for standard denominations. However, greedy algorithms don't guarantee optimal solutions for all problem instances.
Greedy algorithms appear throughout computer science: in graph algorithms (Dijkstra's shortest path, Prim's minimum spanning tree), compression (Huffman coding), scheduling problems, and resource allocation. Understanding when greedy choices lead to optimal solutions distinguishes expert algorithm designers from novice programmers.
Key Principles
The greedy method requires two fundamental properties for correctness:
Greedy Choice Property: A globally optimal solution can be constructed by making locally optimal choices. At each decision point, the algorithm chooses what appears best at that moment without considering future consequences. This choice must not preclude finding an optimal solution.
Optimal Substructure: An optimal solution contains optimal solutions to subproblems. After making a greedy choice, the remaining problem must be another instance of the same type that can be solved optimally.
These properties distinguish problems solvable by greedy algorithms from those requiring dynamic programming. Dynamic programming reconsiders previous choices through overlapping subproblems. Greedy algorithms never look back.
The activity selection problem illustrates these principles. Given activities with start and finish times, select the maximum number of non-overlapping activities:
Activity = Struct.new(:name, :start, :finish)

def activity_selection(activities)
  # Sort by finish time (greedy choice criterion)
  sorted = activities.sort_by(&:finish)
  selected = [sorted.first]
  sorted.each do |activity|
    # Select any activity that starts after the last selected one finishes
    if activity.start >= selected.last.finish
      selected << activity
    end
  end
  selected
end

activities = [
  Activity.new('A', 1, 4),
  Activity.new('B', 3, 5),
  Activity.new('C', 0, 6),
  Activity.new('D', 5, 7),
  Activity.new('E', 3, 9),
  Activity.new('F', 5, 9),
  Activity.new('G', 6, 10),
  Activity.new('H', 8, 11),
  Activity.new('I', 8, 12),
  Activity.new('J', 2, 14),
  Activity.new('K', 12, 16)
]

activity_selection(activities)
# => activities A, D, H, K (as Activity structs)
The greedy choice selects the activity finishing earliest. This choice maximizes remaining time for subsequent activities. After selecting an activity, the remaining problem (finding maximum activities compatible with the chosen one) exhibits the same structure.
Proof of Correctness: Greedy algorithm correctness requires proof. The exchange argument represents a common technique. Assume an optimal solution differs from the greedy solution. Show that swapping elements from the optimal solution with greedy choices produces an equally good or better solution. Continue until the optimal solution matches the greedy solution.
For activity selection, suppose an optimal solution chooses activity k instead of the greedy choice (earliest-finishing activity f). Since f finishes before k, replacing k with f cannot reduce the number of selected activities. The remaining time available stays the same or increases.
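The exchange argument can also be sanity-checked empirically on small inputs by comparing the greedy count against exhaustive search. A minimal sketch (the `Act` struct and helper names here are illustrative; brute force enumerates every subset, so it only scales to small n):

```ruby
Act = Struct.new(:start, :finish)

# Greedy: earliest finish time first
def greedy_count(acts)
  count = 0
  last_finish = -Float::INFINITY
  acts.sort_by(&:finish).each do |a|
    if a.start >= last_finish
      count += 1
      last_finish = a.finish
    end
  end
  count
end

# Brute force: try every subset, keep the largest pairwise-compatible one
def brute_force_count(acts)
  (0...2**acts.size).map { |mask|
    chosen = acts.each_with_index.select { |_, i| mask[i] == 1 }.map(&:first)
    sorted = chosen.sort_by(&:start)
    ok = sorted.each_cons(2).all? { |a, b| b.start >= a.finish }
    ok ? chosen.size : 0
  }.max
end

# Random trials: the greedy answer should always match the optimum
100.times do
  acts = Array.new(8) { s = rand(20); Act.new(s, s + 1 + rand(6)) }
  raise 'greedy suboptimal' unless greedy_count(acts) == brute_force_count(acts)
end
```

Such randomized checking is no substitute for a proof, but it catches a wrong greedy criterion quickly.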
Matroid Theory: Mathematical matroid structures formalize when greedy algorithms work. A matroid consists of a ground set and a collection of independent sets satisfying specific properties. Problems expressible as matroid optimization admit greedy solutions. This theory provides rigorous foundations but exceeds requirements for most practical applications.
The difference between local and global optimality creates the primary challenge. A locally optimal choice may lead to a globally suboptimal solution. Consider the coin change problem with denominations [1, 3, 4] for amount 6. The greedy approach chooses 4, then two 1s (three coins total). The optimal solution uses two 3s (two coins).
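This failure is easy to reproduce. The sketch below (function names are illustrative) contrasts the greedy pick with the true minimum computed by dynamic programming:

```ruby
# Greedy: repeatedly take the largest coin that fits
def greedy_coins(amount, coins)
  result = []
  coins.sort.reverse.each do |coin|
    while amount >= coin
      result << coin
      amount -= coin
    end
  end
  result
end

# Dynamic programming: true minimum number of coins
def min_coins(amount, coins)
  dp = Array.new(amount + 1, Float::INFINITY)
  dp[0] = 0
  1.upto(amount) do |i|
    coins.each { |c| dp[i] = [dp[i], dp[i - c] + 1].min if c <= i }
  end
  dp[amount]
end

greedy_coins(6, [1, 3, 4])  # => [4, 1, 1]  (three coins)
min_coins(6, [1, 3, 4])     # => 2          (3 + 3)
```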
Ruby Implementation
Ruby's expressive syntax suits implementing greedy algorithms. The language provides built-in methods for sorting, iterating, and building collections that greedy algorithms rely on.
Fractional Knapsack: Given items with weights and values, and a knapsack capacity, select items (or fractions) to maximize total value:
class Item
  attr_reader :name, :weight, :value

  def initialize(name, weight, value)
    @name = name
    @weight = weight
    @value = value
  end

  def value_per_unit
    value.to_f / weight
  end
end

def fractional_knapsack(items, capacity)
  # Greedy choice: highest value per unit weight
  sorted = items.sort_by { |item| -item.value_per_unit }
  total_value = 0.0
  remaining = capacity
  selection = []
  sorted.each do |item|
    if item.weight <= remaining
      # Take the entire item
      selection << { item: item.name, fraction: 1.0 }
      total_value += item.value
      remaining -= item.weight
    elsif remaining > 0
      # Take a fraction
      fraction = remaining.to_f / item.weight
      selection << { item: item.name, fraction: fraction }
      total_value += item.value * fraction
      remaining = 0
    end
    break if remaining == 0
  end
  { total_value: total_value, selection: selection }
end

items = [
  Item.new('A', 10, 60),
  Item.new('B', 20, 100),
  Item.new('C', 30, 120)
]

fractional_knapsack(items, 50)
# => {:total_value=>240.0, :selection=>[
#      {:item=>"A", :fraction=>1.0},
#      {:item=>"B", :fraction=>1.0},
#      {:item=>"C", :fraction=>0.6666666666666666}
#    ]}
The sort_by method with negative value creates descending order. Ruby's Struct and class system organize problem data cleanly. The fractional knapsack admits a greedy solution, while the 0/1 knapsack (where items cannot be divided) requires dynamic programming.
Huffman Coding: Build an optimal prefix-free code for data compression by repeatedly combining lowest-frequency nodes:
class HuffmanNode
  attr_accessor :char, :freq, :left, :right

  def initialize(char, freq, left = nil, right = nil)
    @char = char
    @freq = freq
    @left = left
    @right = right
  end

  def leaf?
    left.nil? && right.nil?
  end
end

def huffman_encoding(frequencies)
  # Stand-in for a min-priority queue: re-sort the array each round.
  # A real binary heap would reduce this from O(n^2 log n) to O(n log n).
  nodes = frequencies.map { |char, freq| HuffmanNode.new(char, freq) }
  while nodes.size > 1
    nodes.sort_by!(&:freq)
    # Greedy choice: combine the two minimum-frequency nodes
    left = nodes.shift
    right = nodes.shift
    parent = HuffmanNode.new(nil, left.freq + right.freq, left, right)
    nodes << parent
  end
  build_codes(nodes.first)
end

def build_codes(node, prefix = '', codes = {})
  if node.leaf?
    codes[node.char] = prefix.empty? ? '0' : prefix
  else
    build_codes(node.left, prefix + '0', codes) if node.left
    build_codes(node.right, prefix + '1', codes) if node.right
  end
  codes
end

frequencies = { 'a' => 45, 'b' => 13, 'c' => 12, 'd' => 16, 'e' => 9, 'f' => 5 }
huffman_encoding(frequencies)
# => {"a"=>"0", "c"=>"100", "b"=>"101", "f"=>"1100", "e"=>"1101", "d"=>"111"}
The greedy choice combines the two least frequent characters or subtrees at each step. This produces an optimal prefix-free code minimizing expected code length. Ruby's recursive tree traversal builds the final encoding efficiently.
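The resulting table can be exercised directly. A small sketch using the code table shown above (the `encode`/`decode` helpers are illustrative) demonstrates that a prefix-free code decodes unambiguously in a single left-to-right scan:

```ruby
# Code table produced above for the sample frequencies
codes = { 'a' => '0', 'b' => '101', 'c' => '100',
          'd' => '111', 'e' => '1101', 'f' => '1100' }

def encode(text, codes)
  text.chars.map { |ch| codes[ch] }.join
end

# Prefix-freeness means no buffered prefix is ever ambiguous
def decode(bits, codes)
  inverse = codes.invert
  out = ''
  buffer = ''
  bits.each_char do |bit|
    buffer << bit
    if inverse.key?(buffer)
      out << inverse[buffer]
      buffer = ''
    end
  end
  out
end

encoded = encode('abacdaed', codes)          # => "0101010011101101111"
decode(encoded, codes) == 'abacdaed'         # => true
```

Frequent characters get short codes, so the encoded length (19 bits here) beats a fixed 3-bit code (24 bits) for the same string.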
Minimum Spanning Tree (Prim's Algorithm): Connect all vertices with minimum total edge weight:
require 'set'

def prims_mst(graph, start)
  mst_edges = []
  visited = Set.new([start])
  edges = []
  # Add all edges from the start vertex
  graph[start].each do |neighbor, weight|
    edges << [weight, start, neighbor]
  end
  while visited.size < graph.size && !edges.empty?
    edges.sort!
    # Greedy choice: minimum-weight edge to an unvisited vertex
    weight, from, to = edges.shift
    next if visited.include?(to)
    visited.add(to)
    mst_edges << [from, to, weight]
    # Add new edges from the newly visited vertex
    graph[to].each do |neighbor, edge_weight|
      edges << [edge_weight, to, neighbor] unless visited.include?(neighbor)
    end
  end
  mst_edges
end

graph = {
  'A' => { 'B' => 4, 'C' => 2 },
  'B' => { 'A' => 4, 'C' => 1, 'D' => 5 },
  'C' => { 'A' => 2, 'B' => 1, 'D' => 8, 'E' => 10 },
  'D' => { 'B' => 5, 'C' => 8, 'E' => 2 },
  'E' => { 'C' => 10, 'D' => 2 }
}

prims_mst(graph, 'A')
# => [["A", "C", 2], ["C", "B", 1], ["B", "D", 5], ["D", "E", 2]]
The greedy choice selects the minimum-weight edge connecting visited vertices to unvisited vertices. This produces a minimum spanning tree. Using a priority queue instead of array sorting improves performance for large graphs.
Practical Examples
Job Scheduling with Deadlines: Schedule jobs to maximize total profit, where each job has a deadline and profit:
Job = Struct.new(:id, :deadline, :profit)

def job_scheduling(jobs)
  # Greedy choice: sort by profit descending
  sorted_jobs = jobs.sort_by { |job| -job.profit }
  max_deadline = jobs.map(&:deadline).max
  schedule = Array.new(max_deadline)
  total_profit = 0
  sorted_jobs.each do |job|
    # Find the latest available slot before the deadline
    slot = [job.deadline - 1, max_deadline - 1].min
    while slot >= 0
      if schedule[slot].nil?
        schedule[slot] = job
        total_profit += job.profit
        break
      end
      slot -= 1
    end
  end
  { schedule: schedule.compact, total_profit: total_profit }
end

jobs = [
  Job.new('J1', 2, 100),
  Job.new('J2', 1, 19),
  Job.new('J3', 2, 27),
  Job.new('J4', 1, 25),
  Job.new('J5', 3, 15)
]

job_scheduling(jobs)
# => {:schedule=>[#<struct Job id="J3", deadline=2, profit=27>,
#      #<struct Job id="J1", deadline=2, profit=100>,
#      #<struct Job id="J5", deadline=3, profit=15>],
#     :total_profit=>142}
The greedy approach selects jobs in order of decreasing profit, assigning each to the latest available slot before its deadline. This maximizes total profit while respecting deadlines.
Egyptian Fractions: Represent a fraction as a sum of distinct unit fractions (fractions with numerator 1):
def egyptian_fraction(numerator, denominator)
  result = []
  while numerator > 0
    # Greedy choice: largest unit fraction not exceeding the current fraction.
    # Integer ceiling division avoids floating point rounding errors.
    unit_denominator = (denominator + numerator - 1) / numerator
    result << "1/#{unit_denominator}"
    # Subtract the unit fraction from the remainder
    numerator = numerator * unit_denominator - denominator
    denominator = denominator * unit_denominator
    # Reduce the fraction
    gcd = numerator.gcd(denominator)
    numerator /= gcd
    denominator /= gcd
  end
  result
end

egyptian_fraction(6, 14)
# => ["1/3", "1/11", "1/231"]

egyptian_fraction(2, 3)
# => ["1/2", "1/6"]
The greedy algorithm selects the largest unit fraction at each step. While not always producing the shortest representation, this approach guarantees termination and provides a simple solution method.
Load Balancing: Distribute jobs across processors to minimize maximum load:
def load_balancing(jobs, num_processors)
  processors = Array.new(num_processors, 0)
  assignment = []  # [job, processor] pairs; jobs may repeat, so a Hash would drop entries
  # Greedy choice: consider jobs in decreasing size (the LPT rule)
  jobs.sort.reverse.each do |job|
    # Assign to the processor with minimum current load
    min_processor = processors.each_with_index.min_by { |load, _| load }[1]
    processors[min_processor] += job
    assignment << [job, min_processor]
  end
  {
    max_load: processors.max,
    loads: processors,
    assignment: assignment
  }
end

jobs = [7, 5, 4, 4, 3, 3, 2]
load_balancing(jobs, 3)
# => {:max_load=>10,
#     :loads=>[10, 10, 8],
#     :assignment=>[[7, 0], [5, 1], [4, 2], [4, 2], [3, 1], [3, 0], [2, 1]]}
The greedy approach assigns each job to the currently least-loaded processor. Processing jobs in decreasing size (the longest-processing-time rule) is not always optimal, but it guarantees a makespan within a factor of 4/3 - 1/(3m) of optimal for m processors.
Interval Partitioning: Partition intervals into minimum number of groups where no two intervals in the same group overlap:
Interval = Struct.new(:start, :finish)

def interval_partitioning(intervals)
  sorted = intervals.sort_by(&:start)
  groups = []
  sorted.each do |interval|
    # Try to place in an existing group
    placed = false
    groups.each do |group|
      # The interval fits if it starts after the last in the group finishes
      if interval.start >= group.last.finish
        group << interval
        placed = true
        break
      end
    end
    # Create a new group if needed
    groups << [interval] unless placed
  end
  groups
end

intervals = [
  Interval.new(1, 3),
  Interval.new(2, 5),
  Interval.new(4, 7),
  Interval.new(6, 9),
  Interval.new(8, 10)
]

result = interval_partitioning(intervals)
# => [[#<struct Interval start=1, finish=3>,
#      #<struct Interval start=4, finish=7>,
#      #<struct Interval start=8, finish=10>],
#     [#<struct Interval start=2, finish=5>,
#      #<struct Interval start=6, finish=9>]]
The greedy strategy assigns each interval to the first compatible group. The number of groups equals the maximum number of intervals overlapping at any point (depth of the interval set).
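That equality can be checked with a sweep over interval endpoints. A sketch (the `Iv` struct and `depth` helper are illustrative):

```ruby
Iv = Struct.new(:start, :finish)

# Depth = maximum number of intervals overlapping any single point.
# Sweep events in time order; on ties, process a finish (-1) before a
# start (+1), since one interval may begin exactly when another ends.
def depth(intervals)
  events = intervals.flat_map { |iv| [[iv.start, 1], [iv.finish, -1]] }
  events.sort_by! { |time, delta| [time, delta] }
  active = 0
  events.map { |_, delta| active += delta }.max
end

intervals = [Iv.new(1, 3), Iv.new(2, 5), Iv.new(4, 7), Iv.new(6, 9), Iv.new(8, 10)]
depth(intervals)  # => 2, matching the two groups found above
```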
Common Patterns
Selection Problems: Choose a subset of elements maximizing or minimizing an objective function. The greedy choice criterion determines which element to add next. Examples include activity selection, job scheduling, and fractional knapsack.
Pattern structure:
- Define greedy choice criterion (earliest finish, highest value density, etc.)
- Sort elements by this criterion
- Iterate through sorted elements, selecting those compatible with previous choices
- Return selected subset
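The pattern is generic enough to capture as a higher-order helper, with the criterion and compatibility test passed in as callables (a sketch; the names are illustrative):

```ruby
Slot = Struct.new(:start, :finish)

# Generic sort-and-scan greedy selection:
# criterion ranks candidates (best first); compatible decides whether a
# candidate can join the elements selected so far.
def greedy_select(elements, criterion, compatible)
  selected = []
  elements.sort_by(&criterion).each do |element|
    selected << element if compatible.call(selected, element)
  end
  selected
end

# Activity selection expressed in the generic skeleton
by_finish  = ->(a) { a.finish }
fits_after = ->(selected, a) { selected.empty? || a.start >= selected.last.finish }

slots = [Slot.new(1, 4), Slot.new(3, 5), Slot.new(0, 6), Slot.new(5, 7)]
greedy_select(slots, by_finish, fits_after)
# => the (1, 4) and (5, 7) slots
```

Swapping in a different criterion lambda yields other selection problems without touching the scan loop.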
Merging Problems: Repeatedly combine elements based on a criterion until one remains. Huffman coding and merging sorted sequences follow this pattern. Minimum cost typically requires combining smallest elements first.
Pattern structure:
- Initialize collection of elements
- While more than one element remains:
- Select elements to merge based on criterion
- Create merged element
- Add back to collection
- Return final element or trace of merges
Graph Traversal Problems: Build spanning structures (trees, paths) by selecting edges greedily. Minimum spanning tree algorithms (Prim's, Kruskal's) and shortest path algorithms (Dijkstra's) exemplify this pattern.
Pattern structure:
- Initialize partial solution (single vertex or empty edge set)
- Maintain set of candidate edges/vertices
- Repeatedly select best candidate
- Update solution and candidates
- Continue until solution complete
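Kruskal's algorithm, mentioned above, follows this pattern with a globally sorted edge list and a union-find structure to reject edges that would form cycles. A sketch on the same adjacency-hash format used earlier (the `UnionFind` class is a minimal illustration):

```ruby
# Minimal union-find with path compression
class UnionFind
  def initialize(items)
    @parent = items.to_h { |i| [i, i] }
  end

  def find(x)
    @parent[x] = find(@parent[x]) unless @parent[x] == x
    @parent[x]
  end

  def union(x, y)
    rx, ry = find(x), find(y)
    return false if rx == ry  # already connected: this edge would form a cycle
    @parent[rx] = ry
    true
  end
end

def kruskal_mst(graph)
  # Collect each undirected edge once, then sort by weight (the greedy order)
  edges = graph.flat_map { |u, nbrs| nbrs.map { |v, w| [w, u, v] if u < v }.compact }
  uf = UnionFind.new(graph.keys)
  edges.sort.select { |_, u, v| uf.union(u, v) }.map { |w, u, v| [u, v, w] }
end

graph = {
  'A' => { 'B' => 4, 'C' => 2 },
  'B' => { 'A' => 4, 'C' => 1, 'D' => 5 },
  'C' => { 'A' => 2, 'B' => 1, 'D' => 8, 'E' => 10 },
  'D' => { 'B' => 5, 'C' => 8, 'E' => 2 },
  'E' => { 'C' => 10, 'D' => 2 }
}
kruskal_mst(graph)
# => [["B", "C", 1], ["A", "C", 2], ["D", "E", 2], ["B", "D", 5]]
```

The total weight (10) matches Prim's result on the same graph, as any two MST algorithms must when edge weights are distinct enough to fix the tree.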
Scheduling and Sequencing: Order tasks or activities to optimize completion time, profit, or resource usage. Job sequencing, task scheduling, and minimizing latency follow this pattern.
Pattern structure:
- Define priority ordering for tasks
- Sort tasks by priority
- Process tasks in order, resolving conflicts greedily
- Return schedule or sequence
Approximation Algorithms: Greedy approaches provide fast approximations for NP-hard problems. Set cover, vertex cover, and traveling salesman problem admit greedy approximations with proven performance guarantees.
The set cover problem demonstrates greedy approximation:
def greedy_set_cover(universe, sets)
  uncovered = universe.dup
  selected = []
  until uncovered.empty?
    # Greedy choice: the set covering the most uncovered elements
    best_set = sets.max_by { |set| (set & uncovered).size }
    selected << best_set
    uncovered -= best_set
  end
  selected
end

universe = (1..10).to_a
sets = [
  [1, 2, 3, 8, 9, 10],
  [1, 2, 3, 4, 5],
  [4, 5, 6, 7],
  [5, 6, 7, 8, 9, 10]
]

greedy_set_cover(universe, sets)
# => [[1, 2, 3, 8, 9, 10], [4, 5, 6, 7]]
The greedy set cover achieves logarithmic approximation ratio. While not optimal, this provides practical solutions when exact algorithms prove too slow.
Performance Considerations
Greedy algorithms typically achieve better time complexity than dynamic programming solutions. The absence of overlapping subproblems eliminates memoization overhead. Most greedy algorithms run in O(n log n) time, dominated by sorting.
Activity Selection: O(n log n) time for sorting activities by finish time. The selection phase runs in O(n) time with a single scan. Space complexity: O(1) excluding input storage.
Huffman Coding: O(n log n) time using a priority queue. Each of the n - 1 merge steps costs O(log n) with an efficient heap (binary or Fibonacci); the re-sort-each-round simulation shown earlier is simpler but costs O(n^2 log n). Space: O(n) for tree nodes.
Minimum Spanning Tree:
- Prim's algorithm: O(E log V) with binary heap priority queue, O(E + V log V) with Fibonacci heap
- Kruskal's algorithm: O(E log E) dominated by edge sorting
Dijkstra's Shortest Path: O((V + E) log V) with binary heap, O(E + V log V) with Fibonacci heap. The greedy choice extracts minimum distance vertex repeatedly.
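For reference, Dijkstra's greedy choice can be sketched on the same adjacency-hash format used earlier; this version scans for the minimum-distance vertex instead of using a heap, giving O(V^2) rather than the heap-based bounds above:

```ruby
def dijkstra(graph, source)
  dist = graph.keys.to_h { |v| [v, Float::INFINITY] }
  dist[source] = 0
  unvisited = graph.keys.dup

  until unvisited.empty?
    # Greedy choice: closest unvisited vertex
    u = unvisited.min_by { |v| dist[v] }
    unvisited.delete(u)
    # Relax all edges leaving u
    graph[u].each do |v, w|
      dist[v] = [dist[v], dist[u] + w].min
    end
  end
  dist
end

graph = {
  'A' => { 'B' => 4, 'C' => 2 },
  'B' => { 'A' => 4, 'C' => 1, 'D' => 5 },
  'C' => { 'A' => 2, 'B' => 1, 'D' => 8, 'E' => 10 },
  'D' => { 'B' => 5, 'C' => 8, 'E' => 2 },
  'E' => { 'C' => 10, 'D' => 2 }
}
dijkstra(graph, 'A')
# => {"A"=>0, "B"=>3, "C"=>2, "D"=>8, "E"=>10}
```

The greedy choice is only safe with non-negative edge weights; negative edges break the "closest vertex is finished" invariant.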
Comparison with dynamic programming shows performance differences:
require 'benchmark'

# Greedy approach: O(m) for m coin denominations
def greedy_coin_change(amount, coins)
  count = 0
  coins.sort.reverse.each do |coin|
    count += amount / coin
    amount %= coin
  end
  count
end

# Dynamic programming approach: O(amount * m)
def dp_coin_change(amount, coins)
  dp = Array.new(amount + 1, Float::INFINITY)
  dp[0] = 0
  1.upto(amount) do |i|
    coins.each do |coin|
      dp[i] = [dp[i], dp[i - coin] + 1].min if i >= coin
    end
  end
  dp[amount]
end

coins = [1, 5, 10, 25]
Benchmark.bm do |x|
  x.report("Greedy:") { 100000.times { greedy_coin_change(99, coins) } }
  x.report("DP:    ") { 100000.times { dp_coin_change(99, coins) } }
end
# Greedy runs far faster: O(m) per call versus O(amount * m) for DP
Greedy algorithms excel when applicable. The challenge lies in recognizing when greedy choices produce optimal solutions. No general test exists; problem-specific analysis determines correctness.
Space Complexity: Greedy algorithms often use O(1) extra space beyond input storage. Dynamic programming requires O(n) or O(n*m) space for memoization tables. This difference matters for large-scale problems with memory constraints.
Trade-offs: Greedy algorithms sacrifice generality for speed. They work only on specific problem classes. Dynamic programming handles broader problem sets but runs slower. Approximation algorithms using greedy strategies sacrifice optimality for tractability on NP-hard problems.
Common Pitfalls
Assuming Greedy Works Without Proof: The most dangerous pitfall. Greedy algorithms feel intuitive but fail on many problems. The 0/1 knapsack problem demonstrates this:
# This greedy approach FAILS for 0/1 knapsack
def incorrect_01_knapsack(items, capacity)
  sorted = items.sort_by { |item| -item.value_per_unit }
  total_value = 0
  sorted.each do |item|
    if item.weight <= capacity
      total_value += item.value
      capacity -= item.weight
    end
  end
  total_value
end

# Consider capacity = 50 with items A(weight 10, value 60),
# B(weight 20, value 100), C(weight 30, value 120).
# Greedy by density selects A and B: value = 160
# Optimal selects B and C: value = 220
The greedy approach based on value density fails because items cannot be fractioned. This problem requires dynamic programming for optimal solutions.
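For comparison, a compact 0/1 knapsack DP over the same instance recovers the true optimum (a sketch assuming non-negative integer weights; names are illustrative):

```ruby
# dp[w] = best value achievable with capacity w
def knapsack_01(items, capacity)
  dp = Array.new(capacity + 1, 0)
  items.each do |weight, value|
    # Iterate capacity downward so each item is used at most once
    capacity.downto(weight) do |w|
      dp[w] = [dp[w], dp[w - weight] + value].max
    end
  end
  dp[capacity]
end

items = [[10, 60], [20, 100], [30, 120]]  # [weight, value]
knapsack_01(items, 50)  # => 220, versus 160 from the greedy density pick
```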
Incorrect Greedy Choice Criterion: Selecting the wrong criterion produces suboptimal results. In interval scheduling, choosing activities by shortest duration or earliest start time fails:
# WRONG: selecting shortest activities
def wrong_activity_selection(activities)
  sorted = activities.sort_by { |a| a.finish - a.start }
  selected = [sorted.first]
  sorted.each do |activity|
    if activity.start >= selected.last.finish
      selected << activity
    end
  end
  selected
end
# This produces suboptimal solutions for many inputs
Only sorting by finish time guarantees optimality. The greedy choice criterion requires careful analysis and proof.
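A concrete counterexample makes the failure visible: one short activity straddling two long ones blocks both (the `Task` struct and `select_by` helper are illustrative):

```ruby
Task = Struct.new(:start, :finish)

# Sort by the given criterion, then scan for compatible activities
def select_by(tasks, &criterion)
  selected = []
  tasks.sort_by(&criterion).each do |t|
    selected << t if selected.empty? || t.start >= selected.last.finish
  end
  selected
end

tasks = [Task.new(0, 10), Task.new(9, 12), Task.new(11, 21)]

# Shortest duration first: picks (9, 12), which conflicts with both others
select_by(tasks) { |t| t.finish - t.start }.size  # => 1

# Earliest finish first: picks (0, 10) and (11, 21)
select_by(tasks, &:finish).size                   # => 2
```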
Ignoring Edge Cases: Empty inputs, single elements, or all elements conflicting require special handling:
def safe_activity_selection(activities)
  return [] if activities.empty?
  return activities if activities.size == 1

  sorted = activities.sort_by(&:finish)
  selected = [sorted.first]
  sorted.each do |activity|
    selected << activity if activity.start >= selected.last.finish
  end
  selected
end
Precision Issues with Floating Point: Greedy algorithms comparing fractional values encounter floating point errors:
# Potential precision issue
def fractional_knapsack_unsafe(items, capacity)
  sorted = items.sort_by { |item| -item.value.to_f / item.weight }
  # Comparisons may fail due to floating point representation
end

# Better: use Rational for exact arithmetic
def fractional_knapsack_safe(items, capacity)
  sorted = items.sort_by { |item| -Rational(item.value, item.weight) }
  # ...
end
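The underlying issue is that binary floating point cannot represent many decimal fractions exactly, while `Rational` arithmetic is exact. A minimal illustration:

```ruby
# Float accumulates representation error: 0.1 has no exact binary form
0.1 * 3 == 0.3                           # => false

# Rational arithmetic is exact
Rational(1, 10) * 3 == Rational(3, 10)   # => true

# Comparing value densities as exact ratios avoids such artifacts
Rational(60, 10) > Rational(100, 20)     # => true (6 > 5)
```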
Not Verifying Solution Feasibility: Greedy algorithms may produce invalid solutions when problem constraints prevent finding any valid solution:
def coin_change_with_validation(amount, coins)
  result = []
  remaining = amount
  coins.sort.reverse.each do |coin|
    while remaining >= coin
      result << coin
      remaining -= coin
    end
  end
  # Verify the solution is complete
  if remaining > 0
    raise "No solution exists for amount #{amount}"
  end
  result
end

# For amount = 3, coins = [2], this correctly raises an error
Mutating Shared Data: Sorting input arrays in-place causes side effects:
# BAD: mutates input
def activity_selection_bad(activities)
  activities.sort_by!(&:finish)
  # ...
end

# GOOD: creates a new sorted array
def activity_selection_good(activities)
  sorted = activities.sort_by(&:finish)
  # ...
end
Confusing Greedy with Heuristics: Greedy algorithms either produce optimal solutions or have proven approximation ratios. Random or rule-of-thumb approaches without guarantees are heuristics, not greedy algorithms. The distinction matters for correctness claims.
Reference
Classic Greedy Algorithms
| Algorithm | Problem | Greedy Choice | Time Complexity | Space |
|---|---|---|---|---|
| Activity Selection | Maximum non-overlapping activities | Earliest finish time | O(n log n) | O(1) |
| Fractional Knapsack | Maximize value within weight limit | Highest value density | O(n log n) | O(1) |
| Huffman Coding | Optimal prefix-free code | Minimum frequency nodes | O(n log n) | O(n) |
| Dijkstra's Algorithm | Shortest paths from source | Minimum distance vertex | O((V+E) log V) | O(V) |
| Prim's Algorithm | Minimum spanning tree | Minimum weight edge to tree | O(E log V) | O(V) |
| Kruskal's Algorithm | Minimum spanning tree | Minimum weight edge overall | O(E log E) | O(V) |
| Job Scheduling | Maximize profit with deadlines | Highest profit first | O(n^2) | O(n) |
| Interval Partitioning | Minimum groups for intervals | Earliest start time | O(n log n) | O(n) |
Greedy vs Dynamic Programming Decision
| Factor | Prefer Greedy | Prefer Dynamic Programming |
|---|---|---|
| Optimal substructure | Required | Required |
| Greedy choice property | Must hold | Not required |
| Overlapping subproblems | Not present | Present |
| Time complexity | O(n log n) typical | O(n^2) or higher typical |
| Space complexity | O(1) typical | O(n) or higher |
| Proof difficulty | High (must prove correctness) | Lower (recurrence often clear) |
| Problem types | Scheduling, graph optimization | Counting, sequence problems |
Greedy Choice Criteria
| Problem Type | Common Criteria | Examples |
|---|---|---|
| Interval Scheduling | Earliest finish, latest start, shortest duration | Activity selection, meeting rooms |
| Resource Allocation | Value density, efficiency ratio | Fractional knapsack, load balancing |
| Graph Problems | Minimum weight, minimum distance | MST, shortest path |
| Sequencing | Profit, deadline, processing time | Job scheduling, task ordering |
| Compression | Frequency, probability | Huffman coding, data compression |
| Partitioning | First fit, best fit | Bin packing, interval partitioning |
Complexity Patterns
| Pattern | Sorting | Selection Phase | Total | Example |
|---|---|---|---|---|
| Sort and scan | O(n log n) | O(n) | O(n log n) | Activity selection |
| Priority queue | O(n) build | O(n log n) operations | O(n log n) | Huffman coding |
| Graph traversal | O(E log E) edge sort | O(E) union-find | O(E log E) | Kruskal's MST |
| Repeated minimum | None | O(n^2) naive | O(n^2) | Job scheduling basic |
| Heap operations | O(n) heapify | O(n log n) extractions | O(n log n) | Dijkstra's algorithm |
Verification Checklist
| Check | Description | Method |
|---|---|---|
| Greedy choice property | Local optimum leads to global | Exchange argument proof |
| Optimal substructure | Subproblems solved optimally | Induction on problem size |
| Correctness proof | Algorithm produces optimal solution | Mathematical proof or counterexample |
| Edge cases | Empty input, single element, all conflicts | Test boundary conditions |
| Feasibility | Solution exists and is valid | Validate constraints satisfied |
| Approximation ratio | For NP-hard problems | Theoretical analysis or empirical testing |
Common Mistakes
| Mistake | Impact | Solution |
|---|---|---|
| No correctness proof | Incorrect solutions accepted | Prove greedy choice and optimal substructure |
| Wrong greedy criterion | Suboptimal results | Analyze problem structure carefully |
| Floating point comparison | Precision errors | Use Rational or integer arithmetic |
| Mutating input | Side effects | Create copies before sorting |
| Ignoring infeasibility | Invalid solutions returned | Check if solution exists |
| Confusing with heuristics | No optimality guarantees | Distinguish proven algorithms from rules of thumb |