CrackedRuby CrackedRuby

Traveling Salesman Problem Approaches

Overview

The Traveling Salesman Problem (TSP) requires finding the shortest possible route that visits each city exactly once and returns to the starting city. This combinatorial optimization problem belongs to the NP-hard complexity class, meaning no known polynomial-time algorithm exists for finding optimal solutions to all instances.

The problem has direct applications in logistics, circuit board manufacturing, DNA sequencing, and network routing. A delivery company with 20 stops faces 2,432,902,008,176,640,000 possible routes. Testing each route at one billion routes per second would require over 77,000 years.

The mathematical formulation defines a complete graph with n vertices (cities) and weighted edges representing distances. The objective function minimizes the total distance:

minimize: Σ(i=1 to n) distance(city[i], city[i+1])
constraint: each city visited exactly once
constraint: return to starting city

For a symmetric TSP, distance(A,B) equals distance(B,A). Asymmetric variants model one-way streets or different costs based on direction.

The gap between finding any solution and finding the optimal solution drives the variety of approaches. A simple nearest neighbor heuristic completes in seconds but may produce routes 25% longer than optimal. An exact branch-and-bound algorithm guarantees optimality but becomes intractable beyond 20-30 cities.

Key Principles

The state space grows factorially with city count. For n cities, there are (n-1)!/2 distinct tours in symmetric TSP. This factorial growth explains why brute force becomes impossible:

  • 10 cities: 181,440 tours
  • 15 cities: 43,589,145,600 tours
  • 20 cities: 60,822,550,204,416,000 tours

Graph Representation

TSP operates on weighted graphs where vertices represent cities and edge weights represent costs. The distance matrix stores all pairwise distances:

     A    B    C    D
A    0   10   15   20
B   10    0   35   25
C   15   35    0   30
D   20   25   30    0

A tour corresponds to a Hamiltonian cycle—a path visiting each vertex exactly once and returning to the start.

Optimization Objectives

The primary objective minimizes total tour length. Secondary objectives may include:

  • Minimizing maximum edge weight (minimax TSP)
  • Time windows for city visits (TSP with time constraints)
  • Multiple salesmen (mTSP)
  • Pickup and delivery constraints (vehicle routing)

Solution Quality Metrics

Approximation ratio compares a heuristic solution to the optimal solution. A 1.5-approximation algorithm produces tours at most 50% longer than optimal. Tour improvement measures the percentage reduction from initial to final tour length.

Lower Bounds

Computing lower bounds helps evaluate solution quality without knowing the optimum. Common lower bound techniques:

  • Minimum spanning tree: Sum of MST edges plus two shortest edges not in MST
  • Assignment problem relaxation: Solve as an assignment problem, ignoring subtour constraints
  • Held-Karp bound: Strongest bound, computed via Lagrangian relaxation

Local Optimality

Many algorithms produce locally optimal solutions—tours that cannot be improved by small modifications. A 2-opt local optimum has no improving edge swaps. Local optima may differ significantly from global optima.

Implementation Approaches

TSP approaches fall into three categories based on solution guarantees and computational requirements.

Exact Algorithms

Exact algorithms guarantee finding the optimal tour but have exponential worst-case complexity.

Dynamic programming (Held-Karp) computes optimal subtours systematically. For each subset S of cities and each city j in S, it stores the minimum cost to visit all cities in S exactly once, ending at j. The recurrence relation:

C(S, j) = min{C(S-{j}, i) + distance(i,j)} for all i in S-{j}

Time complexity O(n² × 2^n) and space complexity O(n × 2^n) limit this to 20-25 cities despite being more efficient than brute force.

Branch-and-bound explores the solution tree while pruning branches that cannot improve the best known solution. Each node represents a partial tour. The algorithm:

  1. Computes a lower bound for completing the partial tour
  2. Compares to the best complete tour found
  3. Prunes if lower bound exceeds best tour length
  4. Otherwise, branches to explore extensions

Effective bounding functions determine performance. Assignment problem relaxation provides tight bounds efficiently.

Integer linear programming formulates TSP with binary variables x_ij indicating whether edge (i,j) appears in the tour:

minimize: ΣΣ distance(i,j) × x_ij
subject to: Σ x_ij = 1 for all j (enter each city once)
            Σ x_ij = 1 for all i (leave each city once)
            no subtours (subtour elimination constraints)

Modern ILP solvers handle 100+ cities using cutting planes and branch-and-cut.

Constructive Heuristics

Constructive heuristics build tours incrementally without guaranteeing optimality.

Nearest neighbor starts at an arbitrary city and repeatedly visits the nearest unvisited city. Complexity O(n²) makes it fast but produces tours averaging 25% above optimal.

Greedy insertion selects edges in ascending distance order, adding them to the tour if they maintain tour validity. This produces better solutions than nearest neighbor but remains O(n² log n).

Christofides algorithm guarantees solutions within 50% of optimal for metric TSP:

  1. Compute minimum spanning tree
  2. Find minimum weight perfect matching on odd-degree vertices
  3. Combine to form Eulerian graph
  4. Extract Hamiltonian cycle

Complexity O(n³) and 1.5-approximation ratio make this the best guaranteed approximation algorithm for metric TSP.

Local Search and Metaheuristics

Local search iteratively improves tours by making local modifications.

2-opt removes two edges and reconnects the tour. For each pair of edges (i,i+1) and (j,j+1), it tests whether reversing the segment between them reduces tour length. Complexity O(n²) per iteration typically requires multiple iterations.

3-opt considers removing three edges and reconnecting, testing seven possible reconnections. Higher computational cost O(n³) per iteration often produces better solutions.

Lin-Kernighan uses variable-depth search, determining the number of edge swaps dynamically. This produces high-quality solutions but has complex implementation.

Simulated annealing accepts worse solutions with probability decreasing over time, escaping local optima:

if new_tour better than current:
    accept
else:
    accept with probability exp(-(new_cost - current_cost) / temperature)
reduce temperature

Temperature schedule determines performance. Geometric cooling multiplies temperature by 0.95-0.99 each iteration.

Genetic algorithms maintain a population of tours, combining good tours to create offspring. Crossover operators preserve tour segments from parents. Edge recombination crossover maintains edge relationships.

Ant colony optimization simulates ants depositing pheromones on good paths. Artificial ants construct tours probabilistically, favoring edges with strong pheromones and short distances. Pheromone evaporation prevents premature convergence.

Ruby Implementation

Nearest Neighbor Heuristic

class NearestNeighborTSP
  def initialize(distance_matrix)
    @distances = distance_matrix
    @n = distance_matrix.size
  end
  
  def solve(start_city = 0)
    unvisited = (0...@n).to_a
    current = start_city
    tour = [current]
    unvisited.delete(current)
    total_distance = 0
    
    while unvisited.any?
      nearest = unvisited.min_by { |city| @distances[current][city] }
      total_distance += @distances[current][nearest]
      tour << nearest
      current = nearest
      unvisited.delete(nearest)
    end
    
    # Return to start
    total_distance += @distances[current][start_city]
    tour << start_city
    
    { tour: tour, distance: total_distance }
  end
end

# Usage
distances = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0]
]

solver = NearestNeighborTSP.new(distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

2-Opt Local Search

class TwoOptTSP
  def initialize(distance_matrix)
    @distances = distance_matrix
    @n = distance_matrix.size
  end
  
  def optimize(initial_tour)
    tour = initial_tour.dup
    improved = true
    
    while improved
      improved = false
      
      (0...@n-1).each do |i|
        (i+1...@n).each do |j|
          new_tour = two_opt_swap(tour, i, j)
          new_distance = calculate_distance(new_tour)
          
          if new_distance < calculate_distance(tour)
            tour = new_tour
            improved = true
          end
        end
      end
    end
    
    { tour: tour, distance: calculate_distance(tour) }
  end
  
  private
  
  def two_opt_swap(tour, i, j)
    # Reverse segment between i and j
    tour[0..i] + tour[i+1..j].reverse + tour[j+1..-1]
  end
  
  def calculate_distance(tour)
    total = 0
    (0...tour.size-1).each do |i|
      total += @distances[tour[i]][tour[i+1]]
    end
    total
  end
end

# Usage with initial nearest neighbor solution
nn_solver = NearestNeighborTSP.new(distances)
initial = nn_solver.solve[:tour]

opt_solver = TwoOptTSP.new(distances)
result = opt_solver.optimize(initial)
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Dynamic Programming (Held-Karp)

class HeldKarpTSP
  def initialize(distance_matrix)
    @distances = distance_matrix
    @n = distance_matrix.size
    @memo = {}
  end
  
  def solve
    # Bitmask representing all cities except start (city 0)
    all_cities = (1 << (@n - 1)) - 1
    
    # Find minimum cost to visit all cities, ending at each city
    min_cost = Float::INFINITY
    final_city = -1
    
    (1...@n).each do |end_city|
      cost = tsp(all_cities, end_city) + @distances[end_city][0]
      if cost < min_cost
        min_cost = cost
        final_city = end_city
      end
    end
    
    tour = reconstruct_tour(all_cities, final_city)
    { tour: tour, distance: min_cost }
  end
  
  private
  
  def tsp(mask, pos)
    # Base case: only one city left
    return @distances[0][pos] if mask == 0
    
    key = [mask, pos]
    return @memo[key] if @memo.key?(key)
    
    min_cost = Float::INFINITY
    
    # Try visiting each unvisited city next
    (1...@n).each do |city|
      # Check if city is in the subset
      bit = 1 << (city - 1)
      next unless (mask & bit) != 0
      
      # Remove city from subset and recurse
      new_mask = mask & ~bit
      cost = @distances[pos][city] + tsp(new_mask, city)
      min_cost = [min_cost, cost].min
    end
    
    @memo[key] = min_cost
  end
  
  def reconstruct_tour(mask, pos)
    tour = [0, pos]
    current_mask = mask & ~(1 << (pos - 1))
    current_pos = pos
    
    while current_mask > 0
      next_city = nil
      min_cost = Float::INFINITY
      
      (1...@n).each do |city|
        bit = 1 << (city - 1)
        next unless (current_mask & bit) != 0
        
        new_mask = current_mask & ~bit
        cost = @distances[current_pos][city] + tsp(new_mask, city)
        
        if cost < min_cost
          min_cost = cost
          next_city = city
        end
      end
      
      tour << next_city
      current_mask &= ~(1 << (next_city - 1))
      current_pos = next_city
    end
    
    tour << 0
    tour
  end
end

# Usage (practical only for small instances)
small_distances = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0]
]

solver = HeldKarpTSP.new(small_distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Simulated Annealing

class SimulatedAnnealingTSP
  def initialize(distance_matrix, initial_temp: 1000, cooling_rate: 0.995, iterations: 10000)
    @distances = distance_matrix
    @n = distance_matrix.size
    @initial_temp = initial_temp
    @cooling_rate = cooling_rate
    @iterations = iterations
  end
  
  def solve
    current_tour = (0...@n).to_a + [0]
    current_cost = calculate_cost(current_tour)
    
    best_tour = current_tour.dup
    best_cost = current_cost
    
    temperature = @initial_temp
    
    @iterations.times do
      # Generate neighbor by swapping two random cities
      new_tour = generate_neighbor(current_tour)
      new_cost = calculate_cost(new_tour)
      
      # Accept if better, or with probability based on temperature
      if new_cost < current_cost || 
         rand < Math.exp((current_cost - new_cost) / temperature)
        current_tour = new_tour
        current_cost = new_cost
        
        if current_cost < best_cost
          best_tour = current_tour.dup
          best_cost = current_cost
        end
      end
      
      temperature *= @cooling_rate
    end
    
    { tour: best_tour, distance: best_cost }
  end
  
  private
  
  def generate_neighbor(tour)
    new_tour = tour.dup
    # Swap two random positions (excluding start/end)
    i = rand(1...@n)
    j = rand(1...@n)
    new_tour[i], new_tour[j] = new_tour[j], new_tour[i]
    new_tour
  end
  
  def calculate_cost(tour)
    total = 0
    (0...tour.size-1).each do |i|
      total += @distances[tour[i]][tour[i+1]]
    end
    total
  end
end

# Usage
solver = SimulatedAnnealingTSP.new(distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Genetic Algorithm

class GeneticAlgorithmTSP
  def initialize(distance_matrix, population_size: 100, generations: 500)
    @distances = distance_matrix
    @n = distance_matrix.size
    @population_size = population_size
    @generations = generations
  end
  
  def solve
    population = initialize_population
    
    @generations.times do
      population = evolve(population)
    end
    
    best_tour = population.min_by { |tour| fitness(tour) }
    { tour: best_tour + [best_tour[0]], distance: fitness(best_tour) }
  end
  
  private
  
  def initialize_population
    Array.new(@population_size) { (0...@n).to_a.shuffle }
  end
  
  def evolve(population)
    new_population = []
    
    # Elitism: keep best 10%
    elite_size = @population_size / 10
    sorted = population.sort_by { |tour| fitness(tour) }
    new_population.concat(sorted[0...elite_size])
    
    # Generate rest through crossover and mutation
    while new_population.size < @population_size
      parent1 = tournament_select(population)
      parent2 = tournament_select(population)
      child = crossover(parent1, parent2)
      child = mutate(child)
      new_population << child
    end
    
    new_population
  end
  
  def tournament_select(population, tournament_size: 5)
    tournament = population.sample(tournament_size)
    tournament.min_by { |tour| fitness(tour) }
  end
  
  def crossover(parent1, parent2)
    start = rand(@n)
    finish = rand(start...@n)
    
    child = Array.new(@n)
    child[start..finish] = parent1[start..finish]
    
    # Fill remaining positions from parent2
    pos = 0
    parent2.each do |city|
      next if child.include?(city)
      while child[pos]
        pos += 1
      end
      child[pos] = city
    end
    
    child
  end
  
  def mutate(tour, mutation_rate: 0.1)
    if rand < mutation_rate
      i, j = rand(@n), rand(@n)
      tour[i], tour[j] = tour[j], tour[i]
    end
    tour
  end
  
  def fitness(tour)
    total = 0
    (0...@n).each do |i|
      total += @distances[tour[i]][tour[(i+1) % @n]]
    end
    total
  end
end

# Usage
solver = GeneticAlgorithmTSP.new(distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Performance Considerations

Complexity Analysis

Algorithm time complexities demonstrate the fundamental trade-off between solution quality and computation time:

  • Brute force: O(n!) - infeasible beyond 12 cities
  • Dynamic programming: O(n² × 2^n) - practical to 20-25 cities
  • Branch-and-bound: O(n!) worst case, often much better with pruning
  • Nearest neighbor: O(n²) - fast but poor quality
  • 2-opt: O(n²) per iteration, typically O(kn²) total
  • Christofides: O(n³) - best guaranteed approximation
  • Genetic algorithms: O(g × p × n) where g=generations, p=population size
  • Simulated annealing: O(i × n) where i=iterations

Scalability Limits

Different approaches hit practical limits at different problem sizes:

Held-Karp dynamic programming uses exponential memory O(n × 2^n). For 25 cities, this requires storing 838 million states. Memory becomes the limiting factor before time.

Branch-and-bound performance depends critically on bound quality and branching strategy. Well-tuned implementations solve 100-city instances but may take hours. Poor bounding functions reduce this to 30-40 cities.

Modern ILP solvers using cutting planes handle symmetric TSP instances with several hundred cities. The current record for exact solution exceeds 85,000 cities using advanced techniques, though this required distributed computing and months of runtime.

Heuristics scale much better. 2-opt handles thousands of cities in reasonable time. Genetic algorithms and simulated annealing solve 10,000+ city instances, though solution quality relative to optimum becomes harder to assess.

Memory Usage Patterns

Dynamic programming stores partial solutions for each subset-endpoint pair. Bitmask representation uses 64-bit integers, limiting practical implementation to 64 cities even with sufficient memory.

Population-based metaheuristics maintain multiple candidate solutions. Genetic algorithms with 1000-member populations solving 1000-city problems store millions of values. Proper memory management prevents excessive allocation.

Tour representation affects memory. Storing tours as arrays of integers uses n×4 bytes per tour. Edge-set representations using bitmasks reduce storage but complicate operations.

Optimization Techniques

Distance matrix precomputation trades memory for speed. Calculating distances on-demand saves memory but adds overhead when the same distance gets computed repeatedly.

Early termination improves practical performance. Setting a target solution quality allows stopping once achieved. Time limits prevent unbounded execution on hard instances.

Parallel evaluation accelerates population-based methods. Evaluating fitness for each genetic algorithm individual represents independent work. Ruby parallel libraries enable concurrent evaluation on multi-core systems.

Incremental distance calculation avoids recalculating entire tour costs. When swapping edges, only four edge distances change. Updating the total incrementally reduces O(n) operations to O(1).

Cache Efficiency

Distance matrix access patterns affect cache performance. Row-major iteration maximizes cache hits. Random access patterns in metaheuristics cause cache misses.

For large instances, distance matrices exceed cache size. Blocked algorithms process subsets fitting in cache before moving to new data. Tiled 2-opt divides the tour into segments processable in cache.

Benchmark Comparisons

TSPLIB provides standard test instances for performance comparison. Problems range from 14 cities (burma14) to 85,900 cities (pla85900). Solutions are known for verification.

Typical performance on a 100-city instance:

  • Nearest neighbor: <1 second, 25% above optimal
  • 2-opt on random tour: 2-5 seconds, 5-8% above optimal
  • Genetic algorithm (500 generations): 30-60 seconds, 3-5% above optimal
  • Lin-Kernighan heuristic: 10-20 seconds, 1-2% above optimal
  • Concorde exact solver: 5-300 seconds, optimal

For 1000 cities, exact methods become impractical. Lin-Kernighan produces near-optimal solutions in minutes. Simpler heuristics complete in seconds but sacrifice 5-10% optimality.

Practical Examples

Route Optimization for Delivery Fleet

A delivery service must visit 50 customers daily, minimizing total distance. Customer locations and depot position are known.

# Load customer coordinates
customers = [
  { id: 0, lat: 40.7128, lon: -74.0060, name: "Depot" },
  { id: 1, lat: 40.7589, lon: -73.9851, name: "Customer A" },
  # ... 48 more customers
]

# Calculate distance matrix using Haversine formula
class DistanceCalculator
  EARTH_RADIUS = 6371.0 # kilometers
  
  def self.haversine(lat1, lon1, lat2, lon2)
    dlat = (lat2 - lat1) * Math::PI / 180
    dlon = (lon2 - lon1) * Math::PI / 180
    
    a = Math.sin(dlat/2)**2 + 
        Math.cos(lat1 * Math::PI / 180) * 
        Math.cos(lat2 * Math::PI / 180) * 
        Math.sin(dlon/2)**2
    
    c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a))
    EARTH_RADIUS * c
  end
  
  def self.build_matrix(locations)
    n = locations.size
    matrix = Array.new(n) { Array.new(n, 0) }
    
    n.times do |i|
      n.times do |j|
        next if i == j
        matrix[i][j] = haversine(
          locations[i][:lat], locations[i][:lon],
          locations[j][:lat], locations[j][:lon]
        )
      end
    end
    
    matrix
  end
end

distances = DistanceCalculator.build_matrix(customers)

# Use hybrid approach: nearest neighbor + 2-opt
nn_solver = NearestNeighborTSP.new(distances)
initial = nn_solver.solve
puts "Initial tour: #{initial[:distance].round(2)} km"

opt_solver = TwoOptTSP.new(distances)
optimized = opt_solver.optimize(initial[:tour])
puts "Optimized tour: #{optimized[:distance].round(2)} km"
puts "Improvement: #{((initial[:distance] - optimized[:distance]) / initial[:distance] * 100).round(2)}%"

# Extract turn-by-turn directions
tour_sequence = optimized[:tour].map { |idx| customers[idx][:name] }
puts "Route: #{tour_sequence.join(' -> ')}"

This approach balances speed and quality. Nearest neighbor constructs an initial solution in under one second. 2-opt refinement typically improves this by 10-15% in a few seconds, producing a practical route for same-day use.

Circuit Board Drilling

Manufacturing printed circuit boards requires drilling thousands of holes. Minimizing drill head travel time reduces production cost. Holes have (x,y) coordinates in millimeters.

class CircuitBoardRouter
  def initialize(hole_coordinates)
    @holes = hole_coordinates
    @distances = build_euclidean_matrix
  end
  
  def route_drilling
    # Use simulated annealing for good solutions quickly
    solver = SimulatedAnnealingTSP.new(
      @distances,
      initial_temp: 5000,
      cooling_rate: 0.998,
      iterations: 50000
    )
    
    result = solver.solve
    
    # Convert to physical coordinates
    {
      sequence: result[:tour].map { |idx| @holes[idx] },
      total_distance: result[:distance],
      estimated_time: result[:distance] / 50.0 # mm per second
    }
  end
  
  private
  
  def build_euclidean_matrix
    n = @holes.size
    matrix = Array.new(n) { Array.new(n, 0) }
    
    n.times do |i|
      n.times do |j|
        next if i == j
        dx = @holes[i][:x] - @holes[j][:x]
        dy = @holes[i][:y] - @holes[j][:y]
        matrix[i][j] = Math.sqrt(dx**2 + dy**2)
      end
    end
    
    matrix
  end
end

# 1000 holes on 200mm x 200mm board
holes = Array.new(1000) do |i|
  { id: i, x: rand * 200, y: rand * 200 }
end

router = CircuitBoardRouter.new(holes)
result = router.route_drilling

puts "Total drill distance: #{result[:total_distance].round(2)} mm"
puts "Estimated time: #{result[:estimated_time].round(2)} seconds"

For 1000 holes, simulated annealing finds near-optimal solutions in 5-10 seconds. Exact methods would require hours. The 2-3% suboptimality translates to mere seconds of extra drilling time, acceptable for the speed of solution.

Multi-Depot Vehicle Routing

A company has three warehouses and must service 60 locations. Each warehouse handles a subset of customers.

class MultiDepotTSP
  def initialize(distance_matrix, depot_assignments)
    @distances = distance_matrix
    @assignments = depot_assignments
  end
  
  def solve_all_routes
    routes = {}
    
    @assignments.each do |depot, customers|
      # Build subproblem for this depot
      indices = [depot] + customers
      sub_distances = extract_submatrix(indices)
      
      # Solve subproblem
      solver = GeneticAlgorithmTSP.new(sub_distances)
      result = solver.solve
      
      # Map back to original indices
      routes[depot] = {
        sequence: result[:tour].map { |i| indices[i] },
        distance: result[:distance]
      }
    end
    
    routes
  end
  
  private
  
  def extract_submatrix(indices)
    n = indices.size
    submatrix = Array.new(n) { Array.new(n) }
    
    n.times do |i|
      n.times do |j|
        submatrix[i][j] = @distances[indices[i]][indices[j]]
      end
    end
    
    submatrix
  end
end

# Depot 0: customers 1-20
# Depot 21: customers 22-40
# Depot 41: customers 42-60
assignments = {
  0 => (1..20).to_a,
  21 => (22..40).to_a,
  41 => (42..60).to_a
}

solver = MultiDepotTSP.new(full_distance_matrix, assignments)
routes = solver.solve_all_routes

routes.each do |depot, route|
  puts "Depot #{depot}: #{route[:distance].round(2)} km"
end

Decomposing into independent subproblems reduces complexity dramatically. Three 20-city problems solve faster than one 60-city problem, though the solution may not be globally optimal if customer-depot assignments are suboptimal.

Design Considerations

When to Use Exact Algorithms

Exact algorithms make sense when:

  • Problem size remains under 20-25 cities for dynamic programming
  • Optimality guarantees justify computational cost
  • Solutions will be used repeatedly or have high value
  • Legal or regulatory requirements demand provably optimal solutions
  • Time is available for extended computation

Branch-and-bound with good bounds extends this to 50-100 cities. ILP solvers handle larger instances but require commercial licenses for best performance.

Avoid exact methods when:

  • Problem size exceeds 100 cities
  • Solutions are needed in real-time or near-real-time
  • Approximate solutions provide sufficient value
  • Problem instances change frequently

Heuristic Selection Criteria

Nearest neighbor provides quick baseline solutions. Use for:

  • Initial solutions for local search
  • Very large instances where any reasonable solution suffices
  • Real-time applications requiring sub-second response
  • Situations where solution quality matters less than consistency

Christofides offers guaranteed approximation quality. Choose when:

  • Solution quality bounds are required
  • Problem is metric TSP (triangle inequality holds)
  • Moderate computation time is acceptable
  • Implementation simplicity matters

Metaheuristic Trade-offs

Simulated annealing excels at escaping local optima. Benefits:

  • Simple implementation
  • Few parameters to tune
  • Consistent performance across problem types
  • Embarrassingly parallel for multiple runs

Drawbacks include sensitivity to cooling schedule and lack of solution diversity.

Genetic algorithms maintain population diversity. Advantages:

  • Explore multiple solution regions simultaneously
  • Natural parallelization across population
  • Flexible through different operators

Disadvantages include higher memory use and more parameters requiring tuning.

Ant colony optimization suits dynamic problems. Strengths:

  • Adapts to changing distance matrices
  • Probabilistic exploration prevents premature convergence
  • Positive feedback reinforces good solutions

Weaknesses include slower convergence and complex parameter interactions.

Hybrid Approaches

Combining methods often outperforms single approaches:

Nearest neighbor + 2-opt provides fast, quality solutions. Constructive heuristic builds initial tour, local search refines it. This combination handles 100-1000 cities effectively.

Genetic algorithm with local search improves population members. After crossover, applying 2-opt to offspring enhances overall quality. Extra computational cost is often worthwhile.

Multiple independent runs with different random seeds produces solution sets. Taking the best of 10 simulated annealing runs costs 10x time but significantly improves expected quality.

Problem Characteristics Impact

Clustered cities benefit from cluster-first, route-second decomposition. Identify city clusters using k-means, solve TSP within each cluster, connect clusters optimally. This reduces an n-city problem to k smaller problems plus a k-city problem.

Asymmetric distance matrices eliminate some optimizations. 2-opt moves that reverse tour segments don't preserve distances in asymmetric TSP. Alternative operators like 2.5-opt handle asymmetry.

Triangle inequality violations prevent Christofides guarantees. When direct routes may be longer than indirect routes, constructive heuristics perform unpredictably.

Time Constraints vs Quality Requirements

Real-time constraints (<1 second) limit choices to nearest neighbor and simple construction heuristics. Solution quality sacrifices are necessary.

Interactive time (<10 seconds) enables 2-opt and limited metaheuristic iterations. Multiple rounds of improvement are feasible.

Batch processing (minutes to hours) allows genetic algorithms, extensive local search, or exact methods on moderate instances.

Instance-Specific Tuning

Repeated solving of similar instances justifies parameter tuning. Metaheuristic parameters optimized for specific distance distributions improve results. Machine learning can select algorithms based on instance features.

One-off instances receive less tuning benefit. Default parameters and robust algorithms make more sense.

Reference

Algorithm Complexity Comparison

Algorithm Time Complexity Space Complexity Solution Quality Typical Size Limit
Brute Force O(n!) O(n) Optimal 12 cities
Dynamic Programming O(n² × 2^n) O(n × 2^n) Optimal 25 cities
Branch and Bound O(n!) worst case O(n) Optimal 100 cities
Nearest Neighbor O(n²) O(n) 125% average Unlimited
Greedy Insertion O(n² log n) O(n) 115% average Unlimited
Christofides O(n³) O(n²) 150% worst case 10000 cities
2-Opt O(n²) per iteration O(n) Local optimum Unlimited
3-Opt O(n³) per iteration O(n) Better local opt 5000 cities
Lin-Kernighan O(n²·³) average O(n) Near optimal Unlimited
Simulated Annealing O(iterations × n) O(n) High quality Unlimited
Genetic Algorithm O(gen × pop × n) O(pop × n) High quality Unlimited
Ant Colony O(ants × iter × n²) O(n²) High quality Unlimited

Distance Calculation Methods

Distance Type Formula Use Case Properties
Euclidean sqrt((x₁-x₂)² + (y₁-y₂)²) Planar coordinates Symmetric, metric
Manhattan abs(x₁-x₂) + abs(y₁-y₂) Grid-based movement Symmetric, metric
Haversine Great circle distance Geographic coordinates Symmetric, metric
Geodesic Ellipsoid distance Precise geography Symmetric, metric
Asymmetric Different by direction One-way streets Non-symmetric
Time-based Traffic-adjusted Real-world routing Non-metric

Parameter Guidelines

Algorithm Parameter Typical Range Impact
Simulated Annealing Initial temperature 1000-10000 Higher = more exploration
Simulated Annealing Cooling rate 0.95-0.999 Lower = faster convergence
Simulated Annealing Iterations 10000-100000 More = better quality
Genetic Algorithm Population size 50-500 Larger = more diversity
Genetic Algorithm Generations 100-1000 More = better convergence
Genetic Algorithm Mutation rate 0.01-0.2 Higher = more exploration
Genetic Algorithm Elite percentage 0.05-0.2 Higher = faster convergence
Ant Colony Ant count 10-100 More = better exploration
Ant Colony Pheromone evaporation 0.01-0.3 Higher = less persistence
Ant Colony Alpha 1-3 Pheromone importance
Ant Colony Beta 2-5 Distance importance

Algorithm Selection Guide

Problem Size Time Available Quality Required Recommended Approach
1-15 cities Any Optimal Dynamic Programming
16-25 cities >1 minute Optimal Dynamic Programming
26-100 cities >10 minutes Optimal Branch and Bound or ILP
Any size <1 second Any Nearest Neighbor
10-1000 cities <10 seconds Good NN + 2-Opt
100-10000 cities <1 minute High Genetic Algorithm
Any size <5 minutes Highest possible Multiple SA runs
Clustered cities Any Good Cluster decomposition
Dynamic distances Any Good Ant Colony

Implementation Checklist

Requirement Considerations
Distance calculation Precompute matrix vs on-demand, symmetric vs asymmetric
Tour representation Array of cities, edge set, adjacency structure
Initial solution Random, nearest neighbor, greedy insertion
Termination criteria Fixed iterations, time limit, quality threshold, no improvement
Solution validation Check each city visited once, tour is closed
Performance monitoring Track iterations, best solution, convergence rate
Parallel opportunities Multiple runs, population evaluation, local search
Memory management Matrix size, population storage, memoization cache