CrackedRuby - Traveling Salesman Problem Approaches

Overview

The Traveling Salesman Problem (TSP) requires finding the shortest possible route that visits each city exactly once and returns to the starting city. This combinatorial optimization problem belongs to the NP-hard complexity class, meaning no known polynomial-time algorithm exists for finding optimal solutions to all instances.

The problem has direct applications in logistics, circuit board manufacturing, DNA sequencing, and network routing. A delivery company with 20 stops faces 2,432,902,008,176,640,000 possible routes. Testing each route at one billion routes per second would require over 77,000 years.

The mathematical formulation defines a complete graph with n vertices (cities) and weighted edges representing distances. The objective function minimizes the total distance:

minimize: Σ(i=1 to n) distance(city[i], city[i+1])
constraint: each city visited exactly once
constraint: return to starting city

For a symmetric TSP, distance(A,B) equals distance(B,A). Asymmetric variants model one-way streets or different costs based on direction.

The gap between finding any solution and finding the optimal solution drives the variety of approaches. A simple nearest neighbor heuristic completes in seconds but may produce routes 25% longer than optimal. An exact branch-and-bound algorithm guarantees optimality but becomes intractable beyond 20-30 cities.

Key Principles

The state space grows factorially with city count. For n cities, there are (n-1)!/2 distinct tours in symmetric TSP. This factorial growth explains why brute force becomes impossible:

10 cities: 181,440 tours
15 cities: 43,589,145,600 tours
20 cities: 60,822,550,204,416,000 tours

Graph Representation

TSP operates on weighted graphs where vertices represent cities and edge weights represent costs. The distance matrix stores all pairwise distances:

     A    B    C    D
A    0   10   15   20
B   10    0   35   25
C   15   35    0   30
D   20   25   30    0

A tour corresponds to a Hamiltonian cycle—a path visiting each vertex exactly once and returning to the start.

Optimization Objectives

The primary objective minimizes total tour length. Secondary objectives may include:

Minimizing maximum edge weight (minimax TSP)
Time windows for city visits (TSP with time constraints)
Multiple salesmen (mTSP)
Pickup and delivery constraints (vehicle routing)

Solution Quality Metrics

Approximation ratio compares a heuristic solution to the optimal solution. A 1.5-approximation algorithm produces tours at most 50% longer than optimal. Tour improvement measures the percentage reduction from initial to final tour length.

Lower Bounds

Computing lower bounds helps evaluate solution quality without knowing the optimum. Common lower bound techniques:

Minimum spanning tree: Sum of MST edges plus two shortest edges not in MST
Assignment problem relaxation: Solve as an assignment problem, ignoring subtour constraints
Held-Karp bound: Strongest bound, computed via Lagrangian relaxation

Local Optimality

Many algorithms produce locally optimal solutions—tours that cannot be improved by small modifications. A 2-opt local optimum has no improving edge swaps. Local optima may differ significantly from global optima.

Implementation Approaches

TSP approaches fall into three categories based on solution guarantees and computational requirements.

Exact Algorithms

Exact algorithms guarantee finding the optimal tour but have exponential worst-case complexity.

Dynamic programming (Held-Karp) computes optimal subtours systematically. For each subset S of cities and each city j in S, it stores the minimum cost to visit all cities in S exactly once, ending at j. The recurrence relation:

C(S, j) = min{C(S-{j}, i) + distance(i,j)} for all i in S-{j}

Time complexity O(n² × 2^n) and space complexity O(n × 2^n) limit this to 20-25 cities despite being more efficient than brute force.

Branch-and-bound explores the solution tree while pruning branches that cannot improve the best known solution. Each node represents a partial tour. The algorithm:

Computes a lower bound for completing the partial tour
Compares to the best complete tour found
Prunes if lower bound exceeds best tour length
Otherwise, branches to explore extensions

Effective bounding functions determine performance. Assignment problem relaxation provides tight bounds efficiently.

Integer linear programming formulates TSP with binary variables x_ij indicating whether edge (i,j) appears in the tour:

minimize: ΣΣ distance(i,j) × x_ij
subject to: Σ x_ij = 1 for all j (enter each city once)
            Σ x_ij = 1 for all i (leave each city once)
            no subtours (subtour elimination constraints)

Modern ILP solvers handle 100+ cities using cutting planes and branch-and-cut.

Constructive Heuristics

Constructive heuristics build tours incrementally without guaranteeing optimality.

Nearest neighbor starts at an arbitrary city and repeatedly visits the nearest unvisited city. Complexity O(n²) makes it fast but produces tours averaging 25% above optimal.

Greedy insertion selects edges in ascending distance order, adding them to the tour if they maintain tour validity. This produces better solutions than nearest neighbor but remains O(n² log n).

Christofides algorithm guarantees solutions within 50% of optimal for metric TSP:

Compute minimum spanning tree
Find minimum weight perfect matching on odd-degree vertices
Combine to form Eulerian graph
Extract Hamiltonian cycle

Complexity O(n³) and 1.5-approximation ratio make this the best guaranteed approximation algorithm for metric TSP.

Local Search and Metaheuristics

Local search iteratively improves tours by making local modifications.

2-opt removes two edges and reconnects the tour. For each pair of edges (i,i+1) and (j,j+1), it tests whether reversing the segment between them reduces tour length. Complexity O(n²) per iteration typically requires multiple iterations.

3-opt considers removing three edges and reconnecting, testing seven possible reconnections. Higher computational cost O(n³) per iteration often produces better solutions.

Lin-Kernighan uses variable-depth search, determining the number of edge swaps dynamically. This produces high-quality solutions but has complex implementation.

Simulated annealing accepts worse solutions with probability decreasing over time, escaping local optima:

if new_tour better than current:
    accept
else:
    accept with probability exp(-(new_cost - current_cost) / temperature)
reduce temperature

Temperature schedule determines performance. Geometric cooling multiplies temperature by 0.95-0.99 each iteration.

Genetic algorithms maintain a population of tours, combining good tours to create offspring. Crossover operators preserve tour segments from parents. Edge recombination crossover maintains edge relationships.

Ant colony optimization simulates ants depositing pheromones on good paths. Artificial ants construct tours probabilistically, favoring edges with strong pheromones and short distances. Pheromone evaporation prevents premature convergence.

Ruby Implementation

Nearest Neighbor Heuristic

class NearestNeighborTSP
  def initialize(distance_matrix)
    @distances = distance_matrix
    @n = distance_matrix.size
  end
  
  def solve(start_city = 0)
    unvisited = (0...@n).to_a
    current = start_city
    tour = [current]
    unvisited.delete(current)
    total_distance = 0
    
    while unvisited.any?
      nearest = unvisited.min_by { |city| @distances[current][city] }
      total_distance += @distances[current][nearest]
      tour << nearest
      current = nearest
      unvisited.delete(nearest)
    end
    
    # Return to start
    total_distance += @distances[current][start_city]
    tour << start_city
    
    { tour: tour, distance: total_distance }
  end
end

# Usage
distances = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0]
]

solver = NearestNeighborTSP.new(distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

2-Opt Local Search

class TwoOptTSP
  def initialize(distance_matrix)
    @distances = distance_matrix
    @n = distance_matrix.size
  end
  
  def optimize(initial_tour)
    tour = initial_tour.dup
    improved = true
    
    while improved
      improved = false
      
      (0...@n-1).each do |i|
        (i+1...@n).each do |j|
          new_tour = two_opt_swap(tour, i, j)
          new_distance = calculate_distance(new_tour)
          
          if new_distance < calculate_distance(tour)
            tour = new_tour
            improved = true
          end
        end
      end
    end
    
    { tour: tour, distance: calculate_distance(tour) }
  end
  
  private
  
  def two_opt_swap(tour, i, j)
    # Reverse segment between i and j
    tour[0..i] + tour[i+1..j].reverse + tour[j+1..-1]
  end
  
  def calculate_distance(tour)
    total = 0
    (0...tour.size-1).each do |i|
      total += @distances[tour[i]][tour[i+1]]
    end
    total
  end
end

# Usage with initial nearest neighbor solution
nn_solver = NearestNeighborTSP.new(distances)
initial = nn_solver.solve[:tour]

opt_solver = TwoOptTSP.new(distances)
result = opt_solver.optimize(initial)
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Dynamic Programming (Held-Karp)

class HeldKarpTSP
  def initialize(distance_matrix)
    @distances = distance_matrix
    @n = distance_matrix.size
    @memo = {}
  end
  
  def solve
    # Bitmask representing all cities except start (city 0)
    all_cities = (1 << (@n - 1)) - 1
    
    # Find minimum cost to visit all cities, ending at each city
    min_cost = Float::INFINITY
    final_city = -1
    
    (1...@n).each do |end_city|
      cost = tsp(all_cities, end_city) + @distances[end_city][0]
      if cost < min_cost
        min_cost = cost
        final_city = end_city
      end
    end
    
    tour = reconstruct_tour(all_cities, final_city)
    { tour: tour, distance: min_cost }
  end
  
  private
  
  def tsp(mask, pos)
    # Base case: only one city left
    return @distances[0][pos] if mask == 0
    
    key = [mask, pos]
    return @memo[key] if @memo.key?(key)
    
    min_cost = Float::INFINITY
    
    # Try visiting each unvisited city next
    (1...@n).each do |city|
      # Check if city is in the subset
      bit = 1 << (city - 1)
      next unless (mask & bit) != 0
      
      # Remove city from subset and recurse
      new_mask = mask & ~bit
      cost = @distances[pos][city] + tsp(new_mask, city)
      min_cost = [min_cost, cost].min
    end
    
    @memo[key] = min_cost
  end
  
  def reconstruct_tour(mask, pos)
    tour = [0, pos]
    current_mask = mask & ~(1 << (pos - 1))
    current_pos = pos
    
    while current_mask > 0
      next_city = nil
      min_cost = Float::INFINITY
      
      (1...@n).each do |city|
        bit = 1 << (city - 1)
        next unless (current_mask & bit) != 0
        
        new_mask = current_mask & ~bit
        cost = @distances[current_pos][city] + tsp(new_mask, city)
        
        if cost < min_cost
          min_cost = cost
          next_city = city
        end
      end
      
      tour << next_city
      current_mask &= ~(1 << (next_city - 1))
      current_pos = next_city
    end
    
    tour << 0
    tour
  end
end

# Usage (practical only for small instances)
small_distances = [
  [0, 10, 15, 20],
  [10, 0, 35, 25],
  [15, 35, 0, 30],
  [20, 25, 30, 0]
]

solver = HeldKarpTSP.new(small_distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Simulated Annealing

class SimulatedAnnealingTSP
  def initialize(distance_matrix, initial_temp: 1000, cooling_rate: 0.995, iterations: 10000)
    @distances = distance_matrix
    @n = distance_matrix.size
    @initial_temp = initial_temp
    @cooling_rate = cooling_rate
    @iterations = iterations
  end
  
  def solve
    current_tour = (0...@n).to_a + [0]
    current_cost = calculate_cost(current_tour)
    
    best_tour = current_tour.dup
    best_cost = current_cost
    
    temperature = @initial_temp
    
    @iterations.times do
      # Generate neighbor by swapping two random cities
      new_tour = generate_neighbor(current_tour)
      new_cost = calculate_cost(new_tour)
      
      # Accept if better, or with probability based on temperature
      if new_cost < current_cost || 
         rand < Math.exp((current_cost - new_cost) / temperature)
        current_tour = new_tour
        current_cost = new_cost
        
        if current_cost < best_cost
          best_tour = current_tour.dup
          best_cost = current_cost
        end
      end
      
      temperature *= @cooling_rate
    end
    
    { tour: best_tour, distance: best_cost }
  end
  
  private
  
  def generate_neighbor(tour)
    new_tour = tour.dup
    # Swap two random positions (excluding start/end)
    i = rand(1...@n)
    j = rand(1...@n)
    new_tour[i], new_tour[j] = new_tour[j], new_tour[i]
    new_tour
  end
  
  def calculate_cost(tour)
    total = 0
    (0...tour.size-1).each do |i|
      total += @distances[tour[i]][tour[i+1]]
    end
    total
  end
end

# Usage
solver = SimulatedAnnealingTSP.new(distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Genetic Algorithm

class GeneticAlgorithmTSP
  def initialize(distance_matrix, population_size: 100, generations: 500)
    @distances = distance_matrix
    @n = distance_matrix.size
    @population_size = population_size
    @generations = generations
  end
  
  def solve
    population = initialize_population
    
    @generations.times do
      population = evolve(population)
    end
    
    best_tour = population.min_by { |tour| fitness(tour) }
    { tour: best_tour + [best_tour[0]], distance: fitness(best_tour) }
  end
  
  private
  
  def initialize_population
    Array.new(@population_size) { (0...@n).to_a.shuffle }
  end
  
  def evolve(population)
    new_population = []
    
    # Elitism: keep best 10%
    elite_size = @population_size / 10
    sorted = population.sort_by { |tour| fitness(tour) }
    new_population.concat(sorted[0...elite_size])
    
    # Generate rest through crossover and mutation
    while new_population.size < @population_size
      parent1 = tournament_select(population)
      parent2 = tournament_select(population)
      child = crossover(parent1, parent2)
      child = mutate(child)
      new_population << child
    end
    
    new_population
  end
  
  def tournament_select(population, tournament_size: 5)
    tournament = population.sample(tournament_size)
    tournament.min_by { |tour| fitness(tour) }
  end
  
  def crossover(parent1, parent2)
    start = rand(@n)
    finish = rand(start...@n)
    
    child = Array.new(@n)
    child[start..finish] = parent1[start..finish]
    
    # Fill remaining positions from parent2
    pos = 0
    parent2.each do |city|
      next if child.include?(city)
      while child[pos]
        pos += 1
      end
      child[pos] = city
    end
    
    child
  end
  
  def mutate(tour, mutation_rate: 0.1)
    if rand < mutation_rate
      i, j = rand(@n), rand(@n)
      tour[i], tour[j] = tour[j], tour[i]
    end
    tour
  end
  
  def fitness(tour)
    total = 0
    (0...@n).each do |i|
      total += @distances[tour[i]][tour[(i+1) % @n]]
    end
    total
  end
end

# Usage
solver = GeneticAlgorithmTSP.new(distances)
result = solver.solve
# => { tour: [0, 1, 3, 2, 0], distance: 80 }

Performance Considerations

Complexity Analysis

Algorithm time complexities demonstrate the fundamental trade-off between solution quality and computation time:

Brute force: O(n!) - infeasible beyond 12 cities
Dynamic programming: O(n² × 2^n) - practical to 20-25 cities
Branch-and-bound: O(n!) worst case, often much better with pruning
Nearest neighbor: O(n²) - fast but poor quality
2-opt: O(n²) per iteration, typically O(kn²) total
Christofides: O(n³) - best guaranteed approximation
Genetic algorithms: O(g × p × n) where g=generations, p=population size
Simulated annealing: O(i × n) where i=iterations

Scalability Limits

Different approaches hit practical limits at different problem sizes:

Held-Karp dynamic programming uses exponential memory O(n × 2^n). For 25 cities, this requires storing 838 million states. Memory becomes the limiting factor before time.

Branch-and-bound performance depends critically on bound quality and branching strategy. Well-tuned implementations solve 100-city instances but may take hours. Poor bounding functions reduce this to 30-40 cities.

Modern ILP solvers using cutting planes handle symmetric TSP instances with several hundred cities. The current record for exact solution exceeds 85,000 cities using advanced techniques, though this required distributed computing and months of runtime.

Heuristics scale much better. 2-opt handles thousands of cities in reasonable time. Genetic algorithms and simulated annealing solve 10,000+ city instances, though solution quality relative to optimum becomes harder to assess.

Memory Usage Patterns

Dynamic programming stores partial solutions for each subset-endpoint pair. Bitmask representation uses 64-bit integers, limiting practical implementation to 64 cities even with sufficient memory.

Population-based metaheuristics maintain multiple candidate solutions. Genetic algorithms with 1000-member populations solving 1000-city problems store millions of values. Proper memory management prevents excessive allocation.

Tour representation affects memory. Storing tours as arrays of integers uses n×4 bytes per tour. Edge-set representations using bitmasks reduce storage but complicate operations.

Optimization Techniques

Distance matrix precomputation trades memory for speed. Calculating distances on-demand saves memory but adds overhead when the same distance gets computed repeatedly.

Early termination improves practical performance. Setting a target solution quality allows stopping once achieved. Time limits prevent unbounded execution on hard instances.

Parallel evaluation accelerates population-based methods. Evaluating fitness for each genetic algorithm individual represents independent work. Ruby parallel libraries enable concurrent evaluation on multi-core systems.

Incremental distance calculation avoids recalculating entire tour costs. When swapping edges, only four edge distances change. Updating the total incrementally reduces O(n) operations to O(1).

Cache Efficiency

Distance matrix access patterns affect cache performance. Row-major iteration maximizes cache hits. Random access patterns in metaheuristics cause cache misses.

For large instances, distance matrices exceed cache size. Blocked algorithms process subsets fitting in cache before moving to new data. Tiled 2-opt divides the tour into segments processable in cache.

Benchmark Comparisons

TSPLIB provides standard test instances for performance comparison. Problems range from 14 cities (burma14) to 85,900 cities (pla85900). Solutions are known for verification.

Typical performance on a 100-city instance:

Nearest neighbor: <1 second, 25% above optimal
2-opt on random tour: 2-5 seconds, 5-8% above optimal
Genetic algorithm (500 generations): 30-60 seconds, 3-5% above optimal
Lin-Kernighan heuristic: 10-20 seconds, 1-2% above optimal
Concorde exact solver: 5-300 seconds, optimal

For 1000 cities, exact methods become impractical. Lin-Kernighan produces near-optimal solutions in minutes. Simpler heuristics complete in seconds but sacrifice 5-10% optimality.

Practical Examples

Route Optimization for Delivery Fleet

A delivery service must visit 50 customers daily, minimizing total distance. Customer locations and depot position are known.

# Load customer coordinates
customers = [
  { id: 0, lat: 40.7128, lon: -74.0060, name: "Depot" },
  { id: 1, lat: 40.7589, lon: -73.9851, name: "Customer A" },
  # ... 48 more customers
]

# Calculate distance matrix using Haversine formula
class DistanceCalculator
  EARTH_RADIUS = 6371.0 # kilometers
  
  def self.haversine(lat1, lon1, lat2, lon2)
    dlat = (lat2 - lat1) * Math::PI / 180
    dlon = (lon2 - lon1) * Math::PI / 180
    
    a = Math.sin(dlat/2)**2 + 
        Math.cos(lat1 * Math::PI / 180) * 
        Math.cos(lat2 * Math::PI / 180) * 
        Math.sin(dlon/2)**2
    
    c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a))
    EARTH_RADIUS * c
  end
  
  def self.build_matrix(locations)
    n = locations.size
    matrix = Array.new(n) { Array.new(n, 0) }
    
    n.times do |i|
      n.times do |j|
        next if i == j
        matrix[i][j] = haversine(
          locations[i][:lat], locations[i][:lon],
          locations[j][:lat], locations[j][:lon]
        )
      end
    end
    
    matrix
  end
end

distances = DistanceCalculator.build_matrix(customers)

# Use hybrid approach: nearest neighbor + 2-opt
nn_solver = NearestNeighborTSP.new(distances)
initial = nn_solver.solve
puts "Initial tour: #{initial[:distance].round(2)} km"

opt_solver = TwoOptTSP.new(distances)
optimized = opt_solver.optimize(initial[:tour])
puts "Optimized tour: #{optimized[:distance].round(2)} km"
puts "Improvement: #{((initial[:distance] - optimized[:distance]) / initial[:distance] * 100).round(2)}%"

# Extract turn-by-turn directions
tour_sequence = optimized[:tour].map { |idx| customers[idx][:name] }
puts "Route: #{tour_sequence.join(' -> ')}"

This approach balances speed and quality. Nearest neighbor constructs an initial solution in under one second. 2-opt refinement typically improves this by 10-15% in a few seconds, producing a practical route for same-day use.

Circuit Board Drilling

Manufacturing printed circuit boards requires drilling thousands of holes. Minimizing drill head travel time reduces production cost. Holes have (x,y) coordinates in millimeters.

class CircuitBoardRouter
  def initialize(hole_coordinates)
    @holes = hole_coordinates
    @distances = build_euclidean_matrix
  end
  
  def route_drilling
    # Use simulated annealing for good solutions quickly
    solver = SimulatedAnnealingTSP.new(
      @distances,
      initial_temp: 5000,
      cooling_rate: 0.998,
      iterations: 50000
    )
    
    result = solver.solve
    
    # Convert to physical coordinates
    {
      sequence: result[:tour].map { |idx| @holes[idx] },
      total_distance: result[:distance],
      estimated_time: result[:distance] / 50.0 # mm per second
    }
  end
  
  private
  
  def build_euclidean_matrix
    n = @holes.size
    matrix = Array.new(n) { Array.new(n, 0) }
    
    n.times do |i|
      n.times do |j|
        next if i == j
        dx = @holes[i][:x] - @holes[j][:x]
        dy = @holes[i][:y] - @holes[j][:y]
        matrix[i][j] = Math.sqrt(dx**2 + dy**2)
      end
    end
    
    matrix
  end
end

# 1000 holes on 200mm x 200mm board
holes = Array.new(1000) do |i|
  { id: i, x: rand * 200, y: rand * 200 }
end

router = CircuitBoardRouter.new(holes)
result = router.route_drilling

puts "Total drill distance: #{result[:total_distance].round(2)} mm"
puts "Estimated time: #{result[:estimated_time].round(2)} seconds"

For 1000 holes, simulated annealing finds near-optimal solutions in 5-10 seconds. Exact methods would require hours. The 2-3% suboptimality translates to mere seconds of extra drilling time, acceptable for the speed of solution.

Multi-Depot Vehicle Routing

A company has three warehouses and must service 60 locations. Each warehouse handles a subset of customers.

class MultiDepotTSP
  def initialize(distance_matrix, depot_assignments)
    @distances = distance_matrix
    @assignments = depot_assignments
  end
  
  def solve_all_routes
    routes = {}
    
    @assignments.each do |depot, customers|
      # Build subproblem for this depot
      indices = [depot] + customers
      sub_distances = extract_submatrix(indices)
      
      # Solve subproblem
      solver = GeneticAlgorithmTSP.new(sub_distances)
      result = solver.solve
      
      # Map back to original indices
      routes[depot] = {
        sequence: result[:tour].map { |i| indices[i] },
        distance: result[:distance]
      }
    end
    
    routes
  end
  
  private
  
  def extract_submatrix(indices)
    n = indices.size
    submatrix = Array.new(n) { Array.new(n) }
    
    n.times do |i|
      n.times do |j|
        submatrix[i][j] = @distances[indices[i]][indices[j]]
      end
    end
    
    submatrix
  end
end

# Depot 0: customers 1-20
# Depot 21: customers 22-40
# Depot 41: customers 42-60
assignments = {
  0 => (1..20).to_a,
  21 => (22..40).to_a,
  41 => (42..60).to_a
}

solver = MultiDepotTSP.new(full_distance_matrix, assignments)
routes = solver.solve_all_routes

routes.each do |depot, route|
  puts "Depot #{depot}: #{route[:distance].round(2)} km"
end

Decomposing into independent subproblems reduces complexity dramatically. Three 20-city problems solve faster than one 60-city problem, though the solution may not be globally optimal if customer-depot assignments are suboptimal.

Design Considerations

When to Use Exact Algorithms

Exact algorithms make sense when:

Problem size remains under 20-25 cities for dynamic programming
Optimality guarantees justify computational cost
Solutions will be used repeatedly or have high value
Legal or regulatory requirements demand provably optimal solutions
Time is available for extended computation

Branch-and-bound with good bounds extends this to 50-100 cities. ILP solvers handle larger instances but require commercial licenses for best performance.

Avoid exact methods when:

Problem size exceeds 100 cities
Solutions are needed in real-time or near-real-time
Approximate solutions provide sufficient value
Problem instances change frequently

Heuristic Selection Criteria

Nearest neighbor provides quick baseline solutions. Use for:

Initial solutions for local search
Very large instances where any reasonable solution suffices
Real-time applications requiring sub-second response
Situations where solution quality matters less than consistency

Christofides offers guaranteed approximation quality. Choose when:

Solution quality bounds are required
Problem is metric TSP (triangle inequality holds)
Moderate computation time is acceptable
Implementation simplicity matters

Metaheuristic Trade-offs

Simulated annealing excels at escaping local optima. Benefits:

Simple implementation
Few parameters to tune
Consistent performance across problem types
Embarrassingly parallel for multiple runs

Drawbacks include sensitivity to cooling schedule and lack of solution diversity.

Genetic algorithms maintain population diversity. Advantages:

Explore multiple solution regions simultaneously
Natural parallelization across population
Flexible through different operators

Disadvantages include higher memory use and more parameters requiring tuning.

Ant colony optimization suits dynamic problems. Strengths:

Adapts to changing distance matrices
Probabilistic exploration prevents premature convergence
Positive feedback reinforces good solutions

Weaknesses include slower convergence and complex parameter interactions.

Hybrid Approaches

Combining methods often outperforms single approaches:

Nearest neighbor + 2-opt provides fast, quality solutions. Constructive heuristic builds initial tour, local search refines it. This combination handles 100-1000 cities effectively.

Genetic algorithm with local search improves population members. After crossover, applying 2-opt to offspring enhances overall quality. Extra computational cost is often worthwhile.

Multiple independent runs with different random seeds produces solution sets. Taking the best of 10 simulated annealing runs costs 10x time but significantly improves expected quality.

Problem Characteristics Impact

Clustered cities benefit from cluster-first, route-second decomposition. Identify city clusters using k-means, solve TSP within each cluster, connect clusters optimally. This reduces an n-city problem to k smaller problems plus a k-city problem.

Asymmetric distance matrices eliminate some optimizations. 2-opt moves that reverse tour segments don't preserve distances in asymmetric TSP. Alternative operators like 2.5-opt handle asymmetry.

Triangle inequality violations prevent Christofides guarantees. When direct routes may be longer than indirect routes, constructive heuristics perform unpredictably.

Time Constraints vs Quality Requirements

Real-time constraints (<1 second) limit choices to nearest neighbor and simple construction heuristics. Solution quality sacrifices are necessary.

Interactive time (<10 seconds) enables 2-opt and limited metaheuristic iterations. Multiple rounds of improvement are feasible.

Batch processing (minutes to hours) allows genetic algorithms, extensive local search, or exact methods on moderate instances.

Instance-Specific Tuning

Repeated solving of similar instances justifies parameter tuning. Metaheuristic parameters optimized for specific distance distributions improve results. Machine learning can select algorithms based on instance features.

One-off instances receive less tuning benefit. Default parameters and robust algorithms make more sense.

Reference

Algorithm Complexity Comparison

Algorithm	Time Complexity	Space Complexity	Solution Quality	Typical Size Limit
Brute Force	O(n!)	O(n)	Optimal	12 cities
Dynamic Programming	O(n² × 2^n)	O(n × 2^n)	Optimal	25 cities
Branch and Bound	O(n!) worst case	O(n)	Optimal	100 cities
Nearest Neighbor	O(n²)	O(n)	125% average	Unlimited
Greedy Insertion	O(n² log n)	O(n)	115% average	Unlimited
Christofides	O(n³)	O(n²)	150% worst case	10000 cities
2-Opt	O(n²) per iteration	O(n)	Local optimum	Unlimited
3-Opt	O(n³) per iteration	O(n)	Better local opt	5000 cities
Lin-Kernighan	O(n²·³) average	O(n)	Near optimal	Unlimited
Simulated Annealing	O(iterations × n)	O(n)	High quality	Unlimited
Genetic Algorithm	O(gen × pop × n)	O(pop × n)	High quality	Unlimited
Ant Colony	O(ants × iter × n²)	O(n²)	High quality	Unlimited

Distance Calculation Methods

Distance Type	Formula	Use Case	Properties
Euclidean	sqrt((x₁-x₂)² + (y₁-y₂)²)	Planar coordinates	Symmetric, metric
Manhattan	abs(x₁-x₂) + abs(y₁-y₂)	Grid-based movement	Symmetric, metric
Haversine	Great circle distance	Geographic coordinates	Symmetric, metric
Geodesic	Ellipsoid distance	Precise geography	Symmetric, metric
Asymmetric	Different by direction	One-way streets	Non-symmetric
Time-based	Traffic-adjusted	Real-world routing	Non-metric

Parameter Guidelines

Algorithm	Parameter	Typical Range	Impact
Simulated Annealing	Initial temperature	1000-10000	Higher = more exploration
Simulated Annealing	Cooling rate	0.95-0.999	Lower = faster convergence
Simulated Annealing	Iterations	10000-100000	More = better quality
Genetic Algorithm	Population size	50-500	Larger = more diversity
Genetic Algorithm	Generations	100-1000	More = better convergence
Genetic Algorithm	Mutation rate	0.01-0.2	Higher = more exploration
Genetic Algorithm	Elite percentage	0.05-0.2	Higher = faster convergence
Ant Colony	Ant count	10-100	More = better exploration
Ant Colony	Pheromone evaporation	0.01-0.3	Higher = less persistence
Ant Colony	Alpha	1-3	Pheromone importance
Ant Colony	Beta	2-5	Distance importance

Algorithm Selection Guide

Problem Size	Time Available	Quality Required	Recommended Approach
1-15 cities	Any	Optimal	Dynamic Programming
16-25 cities	>1 minute	Optimal	Dynamic Programming
26-100 cities	>10 minutes	Optimal	Branch and Bound or ILP
Any size	<1 second	Any	Nearest Neighbor
10-1000 cities	<10 seconds	Good	NN + 2-Opt
100-10000 cities	<1 minute	High	Genetic Algorithm
Any size	<5 minutes	Highest possible	Multiple SA runs
Clustered cities	Any	Good	Cluster decomposition
Dynamic distances	Any	Good	Ant Colony

Implementation Checklist

Requirement	Considerations
Distance calculation	Precompute matrix vs on-demand, symmetric vs asymmetric
Tour representation	Array of cities, edge set, adjacency structure
Initial solution	Random, nearest neighbor, greedy insertion
Termination criteria	Fixed iterations, time limit, quality threshold, no improvement
Solution validation	Check each city visited once, tour is closed
Performance monitoring	Track iterations, best solution, convergence rate
Parallel opportunities	Multiple runs, population evaluation, local search
Memory management	Matrix size, population storage, memoization cache

Traveling Salesman Problem Approaches