CrackedRuby

Overview

Blue-green deployment maintains two identical production environments, designated "blue" and "green," with only one serving live traffic at any time. The inactive environment receives the new release, undergoes testing, and becomes active through a traffic switch. The previous environment remains available for immediate rollback if issues emerge.

The strategy originated at organizations managing large-scale web services requiring continuous availability. Traditional deployment approaches caused downtime during updates, created prolonged rollback procedures, and increased risk from deploying directly to production. Blue-green deployment addresses these issues by separating the deployment process from the traffic switch.

The core mechanism involves maintaining infrastructure parity between environments. Both environments run identical configurations except for application versions. A router, load balancer, or DNS system controls which environment receives production traffic. Deployment occurs to the inactive environment, validation confirms functionality, and traffic switches atomically to the new version.

# Simplified deployment state representation
class DeploymentEnvironment
  attr_reader :name, :version, :status, :traffic_percentage
  
  def initialize(name)
    @name = name
    @version = nil
    @status = :inactive
    @traffic_percentage = 0
  end
  
  def deploy(version)
    @version = version
    @status = :deployed
  end
  
  def activate
    @status = :active
    @traffic_percentage = 100
  end
  
  def deactivate
    @status = :inactive
    @traffic_percentage = 0
  end
end

blue = DeploymentEnvironment.new('blue')
green = DeploymentEnvironment.new('green')

# Initial state: blue is active with v1.0
blue.deploy('v1.0')
blue.activate
# => Blue: v1.0, active, 100% traffic

# Deploy v2.0 to green environment
green.deploy('v2.0')
# => Green: v2.0, deployed, 0% traffic

# After validation, switch traffic
blue.deactivate
green.activate
# => Blue: v1.0, inactive, 0% traffic
# => Green: v2.0, active, 100% traffic

The approach differs from rolling deployments, which gradually replace instances within a single environment. Blue-green deployment treats environments as atomic units, switching all traffic simultaneously. This creates clean separation between old and new versions without intermediate mixed states.

Organizations implement blue-green deployment at different infrastructure layers. Application-level implementations manage duplicate application servers behind a load balancer. Container orchestration platforms duplicate service definitions. Cloud providers offer environment cloning at the infrastructure level. The implementation layer affects automation complexity, resource costs, and switching mechanisms.

Key Principles

Environment Isolation: Each environment operates independently with dedicated resources. Database connections, message queues, caching layers, and external service integrations exist separately. Isolation prevents cross-contamination during deployment and testing. Shared resources create dependencies that compromise the ability to validate the new environment independently.

Infrastructure Parity: Both environments maintain identical configurations for compute resources, network topology, security policies, and operational tooling. Parity ensures that behavior validated in the inactive environment accurately predicts production behavior. Configuration drift between environments causes unexpected failures during traffic switching.

Atomic Traffic Switching: The transition from old to new versions occurs as a single operation. Users experience either the old version or the new version, never intermediate states. Atomic switching requires coordination between the routing layer and application health checks. The switch completes when the routing layer confirms all new requests reach the new environment.

Stateless Application Design: Applications must handle traffic redirection without session disruption. Stateless designs store session data in external systems accessible from both environments. Stateful applications require session draining before traffic switches, complicating the deployment process. Session migration between environments adds complexity and failure points.

Rollback Capability: The previous environment remains operational and ready to receive traffic. Rollback reverses the traffic switch, returning users to the previous version. The rollback operation uses the same mechanism as the forward switch, ensuring reliability. Maintaining the previous environment incurs resource costs but provides insurance against deployment failures.

Health Verification: The new environment undergoes validation before receiving traffic. Health checks verify application responsiveness, dependency connectivity, and business logic correctness. Automated testing in the inactive environment catches issues before user exposure. Manual verification adds confidence for critical deployments but slows the process.
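The aggregate check that gates the traffic switch can be sketched as a small class that runs a set of named probes and reports failures. The check names and lambdas below are illustrative stand-ins for real dependency probes:

```ruby
# Aggregate health check used to gate the traffic switch (sketch)
class HealthCheck
  def initialize(checks)
    @checks = checks # { name => callable returning true/false }
  end

  # Returns { healthy: bool, failures: [names of failed checks] }
  def run
    failures = @checks.reject { |_name, check| safely(check) }.keys
    { healthy: failures.empty?, failures: failures }
  end

  private

  # Any raised error counts as a failed check
  def safely(check)
    check.call
  rescue StandardError
    false
  end
end

checks = {
  database: -> { true },                  # e.g. ActiveRecord::Base.connection.active?
  cache:    -> { true },                  # e.g. Redis ping
  payments: -> { raise "gateway down" }   # simulated dependency failure
}

result = HealthCheck.new(checks).run
# => { healthy: false, failures: [:payments] }
```

Because each probe is a callable, the same class covers responsiveness, connectivity, and business-logic checks by swapping in different lambdas.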

Database Compatibility: Schema changes must support both application versions during the transition period. Forward-compatible migrations add new structures without removing old ones. Backward-compatible code reads from both old and new schema elements. Database changes often occur separately from application deployments to decouple migration risk.

# Database migration compatibility pattern
class AddUserPreferences < ActiveRecord::Migration[7.0]
  def up
    # Add new column without removing old one
    add_column :users, :preferences_json, :jsonb, default: {}
    
    # Both versions can coexist
    # Old version: uses individual preference columns
    # New version: uses preferences_json
  end
  
  def down
    remove_column :users, :preferences_json
  end
end

# Application code supporting both schemas
class User < ApplicationRecord
  def preferences
    # New version reads from JSON column
    if preferences_json.present?
      preferences_json
    else
      # Fallback to old column structure
      {
        theme: theme_preference,
        notifications: notification_preference
      }
    end
  end
  
  def preferences=(value)
    # Write to both for transition period
    self.preferences_json = value
    self.theme_preference = value[:theme]
    self.notification_preference = value[:notifications]
  end
end

Resource Cost Acceptance: Running duplicate environments doubles infrastructure costs during deployment. Organizations accept this cost for the reliability and safety benefits. Cloud auto-scaling can reduce costs by sizing the inactive environment smaller until deployment begins. The cost calculation includes compute resources, storage, and managed service instances.

Monitoring Continuity: Observability systems must track both environments independently. Metrics collection, log aggregation, and alerting operate on environment-specific namespaces. During traffic switching, monitoring transitions smoothly from old to new environment. Correlation between environments helps identify deployment-related issues.

Network Routing Layer: A traffic routing mechanism sits between users and environments. Load balancers provide layer 7 routing with health-check integration. DNS-based routing offers simplicity but suffers from TTL-based delays. Service meshes enable fine-grained traffic control with gradual migration capabilities. The routing layer choice affects switching speed and rollback reliability.

Implementation Approaches

Load Balancer Based Switching: A load balancer manages two backend pools representing blue and green environments. The active pool receives all traffic while the inactive pool remains ready. Deployment updates the inactive pool's instances, health checks verify readiness, and the load balancer configuration switches active pools. This approach works across infrastructure types and provides subsecond switching times.

# Load balancer configuration management
class LoadBalancerManager
  def initialize(lb_client, blue_pool, green_pool)
    @lb_client = lb_client
    @blue_pool = blue_pool
    @green_pool = green_pool
  end
  
  def current_active_pool
    config = @lb_client.get_forwarding_rule
    config.backend_service == @blue_pool ? :blue : :green
  end
  
  def switch_traffic(target_pool)
    pool_name = target_pool == :blue ? @blue_pool : @green_pool
    
    # Wait for health checks
    wait_for_healthy_instances(pool_name)
    
    # Atomic switch
    @lb_client.update_forwarding_rule(
      backend_service: pool_name
    )
    
    # Verify switch completed
    sleep 2
    raise "Switch failed" unless current_active_pool == target_pool
    
    target_pool
  end
  
  private
  
  def wait_for_healthy_instances(pool_name)
    timeout = Time.now + 300
    
    loop do
      health = @lb_client.get_backend_health(pool_name)
      healthy_count = health.count { |h| h.status == 'HEALTHY' }
      
      # Guard against an empty pool reporting as "healthy"
      return if health.any? && healthy_count == health.size
      
      raise "Health check timeout" if Time.now > timeout
      sleep 5
    end
  end
end

# Usage in deployment script
manager = LoadBalancerManager.new(
  GoogleCloudLoadBalancer.new,
  'blue-backend-pool',
  'green-backend-pool'
)

# Deploy to inactive environment
current = manager.current_active_pool
target = current == :blue ? :green : :blue

puts "Deploying to #{target} environment"
deploy_application(target, new_version)

# Switch traffic
puts "Switching traffic to #{target}"
manager.switch_traffic(target)
puts "Deployment complete"

DNS-Based Routing: DNS records point to the active environment's endpoints. Deployment updates DNS to reference the new environment. DNS propagation delays affect switching speed, typically ranging from seconds to minutes depending on TTL settings. This approach offers simplicity and works across any infrastructure but lacks immediate rollback capability during DNS propagation.
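The DNS switch reduces to upserting a record that points at the new environment. The sketch below builds the change-batch hash in the shape expected by the aws-sdk-route53 gem's `change_resource_record_sets`; the record and endpoint names are placeholders:

```ruby
# Builds a Route 53 change batch for a DNS-based switch (pure hash, no I/O)
def dns_switch_change_batch(record:, target_endpoint:, ttl: 60)
  {
    comment: "Blue-green switch to #{target_endpoint}",
    changes: [{
      action: "UPSERT",
      resource_record_set: {
        name: record,
        type: "CNAME",
        ttl: ttl, # keep TTL low so rollbacks propagate quickly
        resource_records: [{ value: target_endpoint }]
      }
    }]
  }
end

batch = dns_switch_change_batch(record: "www.example.com",
                                target_endpoint: "green.example.com")
# Then, assuming the aws-sdk-route53 gem and a real zone ID:
#   Aws::Route53::Client.new.change_resource_record_sets(
#     hosted_zone_id: "Z123EXAMPLE", change_batch: batch)
```

Keeping the TTL short before a planned switch is the main lever for reducing the propagation window this section describes.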

Container Orchestration: Kubernetes and similar platforms manage blue-green deployments through service label selectors. Both environments run as separate deployments with identical pod templates. The service selector switches between deployment labels to route traffic. This approach integrates with existing orchestration workflows and provides native health checking.

# Kubernetes service switching script
require 'kubeclient'

class KubernetesBlueGreen
  def initialize(namespace)
    # In-cluster API endpoint; a real client also needs auth_options and
    # ssl_options (service account token and CA certificate)
    @client = Kubeclient::Client.new(
      'https://kubernetes.default.svc',
      'v1'
    )
    @namespace = namespace
  end
  
  def switch_service(service_name, target_version)
    service = @client.get_service(service_name, @namespace)
    
    # Update selector to point to target deployment
    service.spec.selector = {
      'app' => service_name,
      'version' => target_version
    }
    
    @client.update_service(service)
    
    # Verify pods are ready
    wait_for_pods(service_name, target_version)
  end
  
  private
  
  def wait_for_pods(app_name, version)
    timeout = Time.now + 300
    
    loop do
      pods = @client.get_pods(
        namespace: @namespace,
        label_selector: "app=#{app_name},version=#{version}"
      )
      
      ready_count = pods.count do |pod|
        pod.status.conditions.any? do |c|
          c.type == 'Ready' && c.status == 'True'
        end
      end
      
      return if ready_count == pods.size && pods.size > 0
      
      raise "Pod readiness timeout" if Time.now > timeout
      sleep 3
    end
  end
end

# Deployment workflow
k8s = KubernetesBlueGreen.new('production')

# Deploy new version
system("kubectl apply -f green-deployment.yaml")

# Switch service
k8s.switch_service('web-app', 'green')
puts "Traffic switched to green version"

Infrastructure as Code Cloning: Cloud infrastructure tools replicate entire environments including compute, networking, and managed services. Terraform workspaces or separate state files manage blue and green environments. Traffic routing updates after infrastructure validation. This approach provides maximum isolation but increases complexity and cost.
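A thin Ruby wrapper can drive the Terraform workspace workflow. This is a sketch under the assumption that blue and green live in separate workspaces with per-environment tfvars files; command construction is kept pure so it can be inspected before anything runs:

```ruby
# Drives `terraform workspace` + `terraform apply` for one environment (sketch)
class TerraformEnvironment
  def initialize(workspace, var_file: nil)
    @workspace = workspace
    @var_file = var_file
  end

  # Pure command builder, usable for dry runs and tests
  def commands(version)
    [
      "terraform workspace select #{@workspace}",
      "terraform apply -auto-approve -var app_version=#{version}" \
        "#{@var_file ? " -var-file=#{@var_file}" : ""}"
    ]
  end

  def apply!(version)
    commands(version).each { |cmd| system(cmd) || raise("#{cmd} failed") }
  end
end

green = TerraformEnvironment.new("green", var_file: "green.tfvars")
green.commands("v2.0")
# => ["terraform workspace select green",
#     "terraform apply -auto-approve -var app_version=v2.0 -var-file=green.tfvars"]
```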

Application-Level Routing: Applications implement internal routing logic based on configuration flags or environment variables. A configuration service controls which application version handles requests. This approach requires application-level awareness of the deployment strategy and adds complexity to application code.
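In Ruby web stacks this internal routing often takes the form of a Rack middleware that dispatches to whichever version the configuration flag names. A minimal sketch, with the version apps standing in for real application instances:

```ruby
# Rack-style router that dispatches to the configured active version (sketch)
class VersionRouter
  def initialize(apps, active:)
    @apps = apps        # { "blue" => rack_app, "green" => rack_app }
    @active = active    # callable returning the current active version
  end

  def call(env)
    @apps.fetch(@active.call).call(env)
  end
end

blue_app  = ->(env) { [200, {}, ["v1.0"]] }
green_app = ->(env) { [200, {}, ["v2.0"]] }

# The active version would normally come from a configuration service
router = VersionRouter.new({ "blue" => blue_app, "green" => green_app },
                           active: -> { "green" })
status, _headers, body = router.call({})
# => status 200, body ["v2.0"]
```

Because the flag is read per request, flipping it in the configuration service switches traffic without a process restart, which is exactly the complexity trade-off the paragraph describes.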

Database-First vs Application-First: Deployment sequences affect data consistency during transitions. Database-first approaches migrate schemas before application deployment, requiring backward-compatible changes. Application-first approaches deploy code that handles both old and new schemas. The sequence choice depends on change complexity and organizational risk tolerance.

Design Considerations

When to Use Blue-Green: Organizations prioritize zero-downtime deployments for customer-facing services with strict availability requirements. Applications with complex deployment processes benefit from the ability to validate completely before traffic exposure. Teams lacking confidence in automated testing use blue-green deployment to reduce risk through validation periods. Services requiring instant rollback capability justify the resource overhead.

When to Avoid Blue-Green: Resource-constrained environments struggle with doubled infrastructure costs. Stateful applications with complex data synchronization requirements face deployment complexity. Small organizations with simple deployment needs find the automation overhead excessive. Applications with frequent deployments benefit more from canary or rolling deployment strategies that avoid full environment duplication.

Blue-Green vs Rolling Deployment: Rolling deployments gradually replace instances within a single environment, reducing resource costs and deployment complexity. Blue-green provides instant rollback and complete isolation for validation. Rolling deployments mix old and new versions during the transition, requiring version compatibility. Blue-green maintains clean separation at the cost of doubled resources.

Blue-Green vs Canary Deployment: Canary deployments gradually shift traffic to new versions, enabling phased validation with production load. Blue-green switches traffic atomically without gradual migration. Canary approaches detect issues affecting small user percentages before full rollout. Blue-green assumes complete pre-production validation and optimizes for instant switching.

State Management Strategy: Stateless applications deploy easily with blue-green approaches. Session-based applications require external session storage accessible from both environments. Connection draining periods allow in-flight requests to complete before switching. WebSocket and long-polling connections need graceful handling during transitions. Applications with persistent connections benefit from gradual migration strategies.

Database Migration Complexity: Simple schema additions deploy safely with blue-green approaches. Column removals require two-phase deployments: first deploy code that ignores the column, then remove the column. Schema changes affecting application logic need careful sequencing. Data migrations with long execution times complicate atomic deployments.

Cost-Benefit Analysis: Organizations calculate cost justification by comparing doubled infrastructure costs against downtime costs. The analysis includes revenue loss during outages, customer trust impact, and engineering time spent on incident response. Services with high availability requirements justify costs easily. Development environments rarely warrant blue-green deployment overhead.
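The comparison reduces to simple arithmetic: does expected downtime cost exceed the cost of the duplicate environment? All figures below are illustrative:

```ruby
# Back-of-envelope cost-benefit check for blue-green deployment (sketch)
def blue_green_justified?(monthly_infra_cost:, expected_outage_hours:,
                          revenue_per_hour:, incident_eng_cost: 0)
  duplicate_cost = monthly_infra_cost # the second environment roughly doubles spend
  downtime_cost  = expected_outage_hours * revenue_per_hour + incident_eng_cost
  downtime_cost > duplicate_cost
end

blue_green_justified?(monthly_infra_cost: 20_000,
                      expected_outage_hours: 2,
                      revenue_per_hour: 15_000,
                      incident_eng_cost: 5_000)
# => true (35,000 in expected downtime cost vs 20,000 for the duplicate environment)
```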

Testing Requirements: Blue-green deployment requires comprehensive pre-production validation. Automated test suites must cover critical functionality and integration points. Performance testing in the inactive environment identifies issues before user exposure. Security scanning and compliance checks complete before switching. The testing comprehensiveness determines deployment confidence.

Organizational Maturity: Teams need automation capabilities to manage environment complexity. Monitoring infrastructure must support multi-environment tracking. Incident response procedures must account for quick rollbacks. Organizations transitioning to blue-green deployment invest in tooling and process development before production adoption.

Tools & Ecosystem

Cloud Platform Native Tools: AWS Elastic Beanstalk provides built-in blue-green deployment through environment cloning and DNS switching. Google Cloud Platform supports blue-green through load balancer backend service switching. Azure App Service offers deployment slots for staging and production environments. These platforms handle infrastructure provisioning and traffic routing automatically.

Container Orchestration: Kubernetes supports blue-green through deployment objects and service selectors. Deployments maintain separate replica sets for blue and green versions. Services route traffic by updating label selectors. Helm charts can template blue-green deployments with version parameters. Kubernetes operators automate the deployment and switching process.

# Helm values for blue-green deployment
# values.yaml structure
config = {
  'activeVersion' => 'blue',
  'blue' => {
    'image' => {
      'tag' => 'v1.2.0'
    },
    'replicas' => 3
  },
  'green' => {
    'image' => {
      'tag' => 'v1.3.0'
    },
    'replicas' => 3
  }
}

# Deployment automation script
require 'yaml'
require 'tempfile'

class HelmBlueGreenDeployer
  def initialize(release_name, namespace)
    @release_name = release_name
    @namespace = namespace
  end
  
  def deploy_to_inactive(new_version)
    current_values = get_current_values
    current_active = current_values['activeVersion']
    target = current_active == 'blue' ? 'green' : 'blue'
    
    # Update inactive environment version
    current_values[target]['image']['tag'] = new_version
    
    helm_upgrade(current_values)
    target
  end
  
  def switch_traffic(target_version)
    current_values = get_current_values
    current_values['activeVersion'] = target_version
    helm_upgrade(current_values)
  end
  
  private
  
  # Kernel#system cannot feed stdin, so write the merged values
  # to a temporary file and pass it with --values
  def helm_upgrade(values)
    Tempfile.create(['values', '.yaml']) do |file|
      file.write(values.to_yaml)
      file.flush
      
      system(
        "helm upgrade #{@release_name} ./chart " \
        "--namespace #{@namespace} " \
        "--values #{file.path} " \
        "--wait"
      ) or raise "helm upgrade failed"
    end
  end
  
  def get_current_values
    values_yaml = `helm get values #{@release_name} -n #{@namespace} -o yaml`
    YAML.safe_load(values_yaml)
  end
end

Ruby Deployment Tools: Capistrano provides deployment automation with blue-green support through custom tasks. Deployments can target different environments through stage configuration. The capistrano-blue-green gem adds blue-green specific functionality. Custom Capistrano tasks handle health checks and traffic switching.
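A custom Capistrano 3 task for the switch step might look like the sketch below; the `:lb` role and the `switch_traffic.rb` script path are assumptions, not part of Capistrano itself:

```ruby
# lib/capistrano/tasks/blue_green.rake (sketch)
namespace :deploy do
  desc "Switch the load balancer to the newly deployed environment"
  task :switch_traffic do
    on roles(:lb) do
      # target_environment is expected to be set in the stage configuration
      target = fetch(:target_environment)
      execute :ruby, "/opt/deploy/switch_traffic.rb", "--to", target
    end
  end

  # Run the switch only after the standard deploy flow finishes
  after :finished, "deploy:switch_traffic"
end
```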

Traffic Management: HAProxy and Nginx support backend pool switching through runtime API updates. Cloud load balancers from AWS ALB, GCP Load Balancing, and Azure Application Gateway provide automated health checking. Service mesh tools like Istio enable sophisticated traffic splitting and routing. DNS management tools like Route53 support weighted routing for gradual traffic shifts.
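HAProxy's runtime API accepts `enable server` and `disable server` commands over its admin socket, which is one way to implement the pool switch. The socket path, backend names, and server names below are illustrative; the command builder is pure so it can be checked without a live HAProxy:

```ruby
# Toggles HAProxy backend servers over the runtime admin socket (sketch)
require "socket"

class HaproxyRuntime
  def initialize(socket_path = "/var/run/haproxy.sock")
    @socket_path = socket_path
  end

  # Pure command builder, usable in dry runs and tests
  def switch_commands(to:, from:, servers:)
    servers.flat_map do |srv|
      ["enable server #{to}/#{srv}", "disable server #{from}/#{srv}"]
    end
  end

  def switch_traffic(to:, from:, servers:)
    switch_commands(to: to, from: from, servers: servers).each do |cmd|
      UNIXSocket.open(@socket_path) { |s| s.puts(cmd); s.read }
    end
  end
end

runtime = HaproxyRuntime.new
runtime.switch_commands(to: "green_pool", from: "blue_pool", servers: %w[web1 web2])
# => ["enable server green_pool/web1", "disable server blue_pool/web1",
#     "enable server green_pool/web2", "disable server blue_pool/web2"]
```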

Monitoring and Observability: Datadog, New Relic, and Prometheus support multi-environment monitoring with label-based segregation. Log aggregation platforms like Elasticsearch distinguish between environment logs. Application Performance Monitoring tools track environment-specific metrics. Alert rules need environment-aware configuration to avoid duplicate notifications.

Configuration Management: Consul and etcd store environment-specific configuration. Tools read configuration based on deployment environment identifiers. Environment variables inject configuration during container startup. Secret management systems like Vault provide environment-scoped credentials.
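Environment-scoped lookup can be sketched with the KV client injected, so the same code works against Consul, etcd, or a test double; the key layout below is an assumption:

```ruby
# Environment-scoped configuration lookup with an injected KV store (sketch)
class EnvConfig
  def initialize(kv, environment)
    @kv = kv                  # any object responding to #get(key)
    @environment = environment
  end

  def fetch(name)
    # Environment-specific value wins; fall back to the shared default
    @kv.get("config/#{@environment}/#{name}") || @kv.get("config/shared/#{name}")
  end
end

# A hash doubles as the KV store for illustration
kv = {
  "config/green/db_url"     => "postgres://green-db/app",
  "config/shared/log_level" => "info"
}
def kv.get(key); self[key]; end

cfg = EnvConfig.new(kv, "green")
cfg.fetch("db_url")    # => "postgres://green-db/app"
cfg.fetch("log_level") # => "info"
```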

Automation Frameworks: Terraform manages infrastructure for both environments through workspace separation. AWS CloudFormation uses stack parameters for environment differentiation. Ansible playbooks deploy applications to environment-specific inventories. CI/CD platforms like Jenkins, GitLab CI, and GitHub Actions orchestrate multi-environment deployments.

# GitLab CI blue-green deployment
# .gitlab-ci.yml automation
class GitLabBlueGreenCI
  def self.deployment_script
    <<~YAML
      stages:
        - build
        - deploy_inactive
        - validate
        - switch_traffic
        
      variables:
        ACTIVE_ENV: "blue"  # Managed externally
      
      deploy_to_inactive:
        stage: deploy_inactive
        script:
          - |
            if [ "$ACTIVE_ENV" = "blue" ]; then
              TARGET="green"
            else
              TARGET="blue"
            fi
            
            echo "Deploying to $TARGET environment"
            ./deploy.rb --environment $TARGET --version $CI_COMMIT_TAG
        only:
          - tags
      
      validate_inactive:
        stage: validate
        script:
          - |
            if [ "$ACTIVE_ENV" = "blue" ]; then
              TARGET="green"
            else
              TARGET="blue"
            fi
            
            ./run_tests.rb --environment $TARGET --type smoke
        only:
          - tags
      
      switch_traffic:
        stage: switch_traffic
        script:
          - |
            # Recompute TARGET: shell variables do not carry across CI jobs
            if [ "$ACTIVE_ENV" = "blue" ]; then
              TARGET="green"
            else
              TARGET="blue"
            fi
            
            ./switch_traffic.rb --to $TARGET
        when: manual
        only:
          - tags
    YAML
  end
end

Health Check Tools: Custom health check endpoints verify application readiness. External monitoring services like Pingdom validate environment accessibility. Container orchestration health probes ensure pod readiness. Application-level checks validate database connectivity and external service availability.

Feature Flag Systems: LaunchDarkly and Split.io enable progressive feature rollout independent of deployment. Feature flags decouple deployment from feature activation. Teams can deploy code to both environments while controlling feature visibility. This reduces deployment risk by separating code deployment from feature release.
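The deploy/release split can be sketched with a flag store standing in for LaunchDarkly or Split.io: the new code path ships to both environments but stays dark until the flag flips. The flag name and flows are illustrative:

```ruby
# Feature-flag gate decoupling deployment from release (sketch)
class FeatureFlags
  def initialize(flags = {})
    @flags = flags
  end

  def enabled?(flag, default: false)
    @flags.fetch(flag, default)
  end
end

# In production this would be a client for the flag service
FLAGS = FeatureFlags.new("new_checkout" => false)

def checkout(cart)
  # New flow is deployed to both environments, dark until the flag flips
  if FLAGS.enabled?("new_checkout")
    "new checkout flow for #{cart}"
  else
    "legacy checkout flow for #{cart}"
  end
end

checkout("cart-42")
# => "legacy checkout flow for cart-42"
```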

Real-World Applications

E-commerce Platform Deployments: Online retailers deploy during low-traffic periods using blue-green strategies. The inactive environment receives the new version with updated product catalog features. Automated tests verify checkout flows, payment processing, and inventory management. Traffic switches after midnight when transaction volume drops. The previous environment remains active for 24 hours to support rollback if issues emerge.

# E-commerce deployment orchestration
class EcommerceDeployment
  def initialize(config)
    @config = config
    # Match LoadBalancerManager's three-argument initializer defined earlier
    @lb_manager = LoadBalancerManager.new(
      config.lb_client, config.blue_pool, config.green_pool
    )
    @monitor = MonitoringClient.new(config.monitoring)
  end
    current_env = @lb_manager.current_active_pool
    target_env = current_env == :blue ? :green : :blue
    
    # Pre-deployment checks
    verify_low_traffic_period
    
    # Deploy to inactive environment
    log("Deploying version #{version} to #{target_env}")
    deploy_application(target_env, version)
    
    # Run validation suite
    log("Running checkout flow tests")
    run_tests(target_env, [
      CheckoutFlowTest,
      PaymentProcessingTest,
      InventoryValidationTest,
      SearchFunctionalityTest
    ])
    
    # Monitor error rates
    wait_for_stable_metrics(target_env, duration: 300)
    
    # Switch traffic
    log("Switching traffic to #{target_env}")
    @lb_manager.switch_traffic(target_env)
    
    # Monitor post-switch metrics
    monitor_deployment_health(target_env)
    
  rescue DeploymentError => e
    log("Deployment failed: #{e.message}")
    rollback(current_env)
    raise
  end
  
  private
  
  def verify_low_traffic_period
    current_hour = Time.now.hour
    raise "Traffic too high" unless (0..5).include?(current_hour)
  end
  
  def wait_for_stable_metrics(env, duration:)
    end_time = Time.now + duration
    
    while Time.now < end_time
      metrics = @monitor.get_metrics(env, window: 60)
      
      if metrics.error_rate > 0.01
        raise DeploymentError, "Error rate too high: #{metrics.error_rate}"
      end
      
      sleep 30
    end
  end
  
  def monitor_deployment_health(env)
    # Monitor for first hour after switch
    60.times do
      metrics = @monitor.get_metrics(env, window: 60)
      
      if metrics.error_rate > 0.02
        raise DeploymentError, "Post-deployment errors detected"
      end
      
      sleep 60
    end
  end
end

Financial Services Deployments: Banks and payment processors require zero-downtime deployments due to regulatory requirements. Blue-green deployments occur during scheduled maintenance windows with customer notification. Both environments maintain PCI compliance and security certifications. Database migrations use backward-compatible patterns to support both versions during transition. Rollback procedures undergo quarterly testing to ensure reliability.

SaaS Application Updates: Multi-tenant SaaS platforms deploy new features to inactive environments with tenant-specific testing. Feature flags control which tenants see new functionality after deployment. The deployment process includes data migration validation for each tenant. Performance testing verifies resource usage patterns before full traffic switch. Customer-facing dashboards show deployment status and planned downtime.

Media Streaming Services: Video and audio streaming platforms deploy during low viewership periods. The inactive environment undergoes load testing with simulated user traffic patterns. CDN configurations update to point to new application endpoints. Encoding pipeline changes validate against sample media before production switch. Geographic routing ensures users in different regions switch incrementally.

API Service Deployments: Public API providers deploy with extensive contract testing to ensure backward compatibility. Blue-green environments run different API versions supporting various client integrations. Rate limiting and authentication systems operate across both environments. API documentation updates automatically after successful deployment. Monitoring tracks endpoint-specific error rates and latency.

Mobile Backend Services: Mobile applications cannot force user updates, requiring backends to support multiple client versions. Blue-green deployment maintains compatibility with older app versions. Background job processing continues in both environments during deployment. Push notification systems coordinate to avoid duplicate messages. Database read replicas distribute load during traffic switching.

Microservices Architecture: Individual services deploy independently using blue-green strategies. Service mesh routing manages traffic between service versions. Inter-service communication protocols support version negotiation. Distributed tracing identifies issues spanning multiple services. Circuit breakers prevent cascading failures during deployments.

Common Pitfalls

Database Migration Failures: Teams deploy schema changes that break backward compatibility. The new application version requires columns that don't exist in production. Forward-compatible migrations add structures before application deployment. Backward-compatible code handles both old and new schemas during transition periods.

# Problematic migration approach
class RemoveOldColumn < ActiveRecord::Migration[7.0]
  def change
    # Breaks blue-green deployment
    remove_column :users, :old_preference_format
  end
end

# Correct two-phase approach
# Phase 1: Deploy application ignoring old column
class User < ApplicationRecord
  # New code doesn't reference old_preference_format
  def preferences
    preferences_json # jsonb columns already deserialize to a Hash
  end
end

# Phase 2: Remove column after full deployment
class RemoveOldColumnSafely < ActiveRecord::Migration[7.0]
  def up
    # Only run after both environments use new code
    remove_column :users, :old_preference_format if column_exists?(:users, :old_preference_format)
  end
  
  def down
    add_column :users, :old_preference_format, :text
  end
end

Session State Loss: Applications store sessions locally, losing user sessions during environment switches. Users get logged out unexpectedly when traffic moves to the new environment. External session storage in Redis or databases maintains sessions across environments. Session cookies must not bind to specific environment instances.
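In a Rails application, the fix is to point sessions at a store both environments share. A minimal sketch, assuming a shared Redis instance whose URL arrives via `SESSION_REDIS_URL`:

```ruby
# config/environments/production.rb (sketch; the Redis URL is a placeholder)
Rails.application.configure do
  # Shared cache reachable from both blue and green environments
  config.cache_store = :redis_cache_store, { url: ENV["SESSION_REDIS_URL"] }

  # Keep sessions in the shared cache, not on individual instances
  config.session_store :cache_store, key: "_app_session", expire_after: 14.days
end
```

With this in place, the traffic switch moves users between environments without invalidating their sessions, since neither environment holds session state locally.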

Resource Exhaustion: Both environments run simultaneously, doubling resource consumption. Organizations underestimate cloud costs for dual environments. Auto-scaling policies need adjustment to prevent runaway costs. The inactive environment can run at reduced capacity until deployment begins.

Incomplete Health Checks: Superficial health checks pass while critical functionality fails. Applications report healthy status despite broken database connections. Comprehensive health checks verify all dependencies and critical paths. Synthetic transaction testing validates end-to-end functionality.

DNS Propagation Delays: DNS-based blue-green deployments suffer from TTL-related delays. Rollback attempts during DNS propagation create mixed routing states. Some users reach old environments while others hit new versions. Load balancer-based switching provides deterministic cutover timing.

Configuration Drift: Environments diverge in configuration over time. The inactive environment misses critical configuration updates between deployments. Infrastructure as code prevents configuration drift through version control. Automated validation compares environment configurations before deployment.
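The pre-deployment comparison can be a simple diff over the two environments' configuration maps; the keys below are illustrative:

```ruby
# Reports configuration keys whose values differ between environments (sketch)
def config_drift(blue, green)
  (blue.keys | green.keys).each_with_object({}) do |key, drift|
    drift[key] = { blue: blue[key], green: green[key] } if blue[key] != green[key]
  end
end

blue_cfg  = { "ruby_version" => "3.2.2", "pool_size" => 10, "tls" => true }
green_cfg = { "ruby_version" => "3.2.2", "pool_size" => 5,  "tls" => true }

config_drift(blue_cfg, green_cfg)
# => { "pool_size" => { blue: 10, green: 5 } }
```

An empty result is a reasonable gate for proceeding with deployment; any non-empty drift report should block the switch until reconciled.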

Monitoring Gaps: Teams forget to update monitoring dashboards for the new environment. Alerts fire incorrectly because thresholds don't account for traffic switches. Environment-aware monitoring tracks both blue and green independently. Alert rules need logic to handle the active environment dynamically.

Insufficient Rollback Testing: Teams assume rollback works without regular testing. Traffic switching logic fails under pressure during incident response. Quarterly rollback drills verify the entire process. Automated rollback based on error rate thresholds reduces response time.
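An error-rate-triggered rollback can be sketched as a guard that samples metrics after the switch and reverses it on breach. The metrics and switcher objects are injected stand-ins, which also makes the guard itself easy to drill in tests:

```ruby
# Rolls back automatically when post-switch error rate breaches a threshold (sketch)
class RollbackGuard
  def initialize(metrics, switcher, threshold: 0.02)
    @metrics = metrics      # responds to #error_rate(env)
    @switcher = switcher    # responds to #switch_traffic(env)
    @threshold = threshold
  end

  # Returns :rolled_back or :stable
  def watch(new_env, previous_env, samples: 5)
    samples.times do
      if @metrics.error_rate(new_env) > @threshold
        @switcher.switch_traffic(previous_env)
        return :rolled_back
      end
      # a real guard would sleep between samples
    end
    :stable
  end
end

# Test doubles simulating a failing green environment
metrics = Object.new
def metrics.error_rate(_env); 0.05; end

switcher = Object.new
def switcher.switch_traffic(env); @active = env; end
def switcher.active; @active; end

result = RollbackGuard.new(metrics, switcher).watch(:green, :blue)
# => :rolled_back (switcher.active is now :blue)
```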

Background Job Duplication: Both environments process background jobs simultaneously, duplicating work. Job queues need environment awareness to prevent duplicate processing. Job coordination systems ensure only the active environment processes tasks. Idempotent job design handles accidental duplicate execution.

# Job coordination for blue-green environments
require 'sidekiq'
require 'redis'

class EnvironmentAwareJob
  include Sidekiq::Job
  
  def perform(*args)
    # Only run in active environment
    return unless active_environment?
    
    # Idempotent processing
    process_with_lock(*args)
  end
  
  private
  
  def active_environment?
    current_env = ENV['DEPLOYMENT_ENVIRONMENT']
    active_env = get_active_environment_from_config
    current_env == active_env
  end
  
  def process_with_lock(*args)
    lock_key = "job_lock:#{self.class.name}:#{args.join(':')}"
    
    # SET with nx: succeeds only for the first worker to claim the lock;
    # check the return value rather than passing a block
    return unless redis.set(lock_key, Time.now.to_i, nx: true, ex: 3600)
    
    execute_business_logic(*args)
  end
  
  def redis
    @redis ||= Redis.new(url: ENV.fetch('REDIS_URL', 'redis://localhost:6379'))
  end
  
  def get_active_environment_from_config
    # Read from a centralized configuration service
    # (Consul::KV stands in for your configuration client)
    Consul::KV.get('deployment/active_environment')
  end
end

Deployment Window Mismanagement: Teams switch traffic during peak usage periods, amplifying incident impact. Deployment schedules should target low-traffic windows based on historical patterns. Automated deployment prevents manual errors during late-night deployments. Gradual traffic shifts reduce risk compared to atomic switches.

Reference

Deployment Process Checklist

| Phase | Action | Validation |
|-------|--------|------------|
| Pre-deployment | Verify environment parity | Configuration comparison passed |
| Pre-deployment | Run database migrations | Migration compatible with both versions |
| Pre-deployment | Update inactive environment | Application deployed successfully |
| Pre-deployment | Execute smoke tests | Critical paths functional |
| Pre-deployment | Verify health checks | All instances reporting healthy |
| Deployment | Switch traffic routing | Active environment changed |
| Deployment | Monitor error rates | Error rate below threshold |
| Deployment | Check performance metrics | Latency within acceptable range |
| Post-deployment | Observe for stability period | No anomalies for 1 hour minimum |
| Post-deployment | Document deployment | Runbook updated with lessons |

Traffic Switching Methods

| Method | Switch Speed | Rollback Speed | Complexity | Cost Impact |
|--------|--------------|----------------|------------|-------------|
| Load balancer | Subsecond | Subsecond | Medium | High |
| DNS | Minutes to hours | Minutes to hours | Low | Low |
| Service mesh | Seconds | Seconds | High | Medium |
| Container orchestration | Seconds | Seconds | Medium | Medium |
| CDN configuration | Seconds to minutes | Seconds to minutes | Medium | Medium |

Environment State Matrix

| State | Blue Environment | Green Environment | Active Traffic | Action Available |
|-------|------------------|-------------------|----------------|------------------|
| Initial | Production v1.0 | Empty | Blue 100% | Deploy to green |
| Deployed | Production v1.0 | Staging v2.0 | Blue 100% | Validate green |
| Validated | Production v1.0 | Ready v2.0 | Blue 100% | Switch to green |
| Switched | Standby v1.0 | Production v2.0 | Green 100% | Rollback or deploy |
| Rolled back | Production v1.0 | Failed v2.0 | Blue 100% | Investigate green |
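The state matrix can be encoded as a transition table, so a deployment tool rejects actions that are illegal in the current state. State and action names follow the matrix; the helper itself is a sketch, and the post-investigation transition back to `:initial` is an assumption:

```ruby
# Transition table derived from the environment state matrix;
# the rolled_back -> initial edge is an illustrative assumption
TRANSITIONS = {
  initial:     { deploy: :deployed },
  deployed:    { validate: :validated },
  validated:   { switch: :switched },
  switched:    { rollback: :rolled_back, deploy: :deployed },
  rolled_back: { investigate: :initial }
}.freeze

def advance(state, action)
  TRANSITIONS.fetch(state).fetch(action) do
    raise ArgumentError, "#{action} not allowed in state #{state}"
  end
end

advance(:initial, :deploy)   # => :deployed
advance(:validated, :switch) # => :switched
```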

Health Check Configuration

| Check Type | Purpose | Timeout | Interval | Failure Threshold |
|------------|---------|---------|----------|-------------------|
| HTTP endpoint | Application responsiveness | 5 seconds | 10 seconds | 3 consecutive failures |
| Database connection | Data layer availability | 3 seconds | 30 seconds | 2 consecutive failures |
| Dependency services | External integration | 10 seconds | 60 seconds | 3 consecutive failures |
| Synthetic transaction | End-to-end functionality | 30 seconds | 300 seconds | 1 failure |
| Resource utilization | Capacity validation | 1 second | 5 seconds | Sustained over threshold |
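The HTTP endpoint row can be sketched as a checker using its parameters (5-second timeout, unhealthy after 3 consecutive failures); the class name and URL are illustrative:

```ruby
require 'net/http'

# Sketch of the HTTP endpoint health check: 5-second timeout,
# unhealthy after 3 consecutive failures. URL is illustrative.
class HttpHealthCheck
  def initialize(url, timeout: 5, failure_threshold: 3)
    @uri = URI(url)
    @timeout = timeout
    @failure_threshold = failure_threshold
    @consecutive_failures = 0
  end

  # Returns true when the endpoint answered 2xx; any error
  # or timeout counts as a failure
  def check
    response = Net::HTTP.start(@uri.host, @uri.port,
                               open_timeout: @timeout,
                               read_timeout: @timeout) do |http|
      http.get(@uri.path.empty? ? '/' : @uri.path)
    end
    record(response.is_a?(Net::HTTPSuccess))
  rescue StandardError
    record(false)
  end

  def unhealthy?
    @consecutive_failures >= @failure_threshold
  end

  private

  # Successes reset the counter, matching the "consecutive" semantics
  def record(success)
    @consecutive_failures = success ? 0 : @consecutive_failures + 1
    success
  end
end
```

A scheduler would call `check` every 10 seconds per the table's interval column and remove the instance from rotation once `unhealthy?` returns true.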

Monitoring Metrics

| Metric | Threshold | Action | Collection Interval |
|--------|-----------|--------|---------------------|
| Error rate | Greater than 1% | Automatic rollback | 60 seconds |
| Response time p95 | Greater than 2x baseline | Alert operations | 30 seconds |
| Request rate | Less than 50% of normal | Investigate routing | 60 seconds |
| Memory utilization | Greater than 85% | Scale up resources | 30 seconds |
| CPU utilization | Greater than 80% | Scale up resources | 30 seconds |
| Database connection pool | Greater than 90% used | Alert database team | 60 seconds |
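These metrics map directly to a rule list. The sketch below evaluates samples against baselines and returns the actions to take; thresholds follow the table, while the rule structure and action names are placeholders:

```ruby
# Illustrative rules for the traffic-relative metrics; thresholds
# follow the monitoring table, action symbols are placeholders
MONITORING_RULES = [
  { metric: :error_rate,   breach: ->(v, base) { v > 0.01 },       action: :automatic_rollback },
  { metric: :p95_latency,  breach: ->(v, base) { v > 2 * base },   action: :alert_operations },
  { metric: :request_rate, breach: ->(v, base) { v < 0.5 * base }, action: :investigate_routing }
].freeze

# Returns the actions whose rules are breached; metrics missing
# from the samples are skipped
def actions_for(samples, baselines)
  MONITORING_RULES.filter_map do |rule|
    value = samples[rule[:metric]]
    rule[:action] if value && rule[:breach].call(value, baselines[rule[:metric]])
  end
end

actions_for({ error_rate: 0.03, p95_latency: 450 },
            { error_rate: 0.001, p95_latency: 300 })
# => [:automatic_rollback]
```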

Ruby Deployment Script Template

#!/usr/bin/env ruby

require 'json'
require 'net/http'
require 'logger'

class BlueGreenDeployment
  attr_reader :logger
  
  def initialize
    @logger = Logger.new(STDOUT)
    @config = load_configuration
  end
  
  def deploy(version)
    current = current_active_environment
    target = inactive_environment(current)
    
    logger.info "Starting deployment to #{target}"
    
    deploy_application(target, version)
    run_health_checks(target)
    run_smoke_tests(target)
    
    logger.info "Switching traffic to #{target}"
    switch_traffic(target)
    
    monitor_post_deployment(target)
    
    logger.info "Deployment complete"
  rescue StandardError => e
    logger.error "Deployment failed: #{e.message}"
    # Skip rollback if failure occurred before the active environment was resolved
    rollback(current) if current
    raise
  end
  
  private
  
  def load_configuration
    JSON.parse(File.read('deployment_config.json'))
  end
  
  def current_active_environment
    # Implementation specific to routing mechanism
  end
  
  def inactive_environment(current)
    current == 'blue' ? 'green' : 'blue'
  end
  
  def deploy_application(env, version)
    # Deploy logic
  end
  
  def run_health_checks(env)
    # Health verification
  end
  
  def switch_traffic(env)
    # Traffic routing update
  end
  
  def monitor_post_deployment(env)
    # Post-deployment monitoring
  end
  
  def rollback(env)
    # Rollback procedure
  end
end

if __FILE__ == $0
  version = ARGV[0] || abort("Usage: #{$0} VERSION")
  BlueGreenDeployment.new.deploy(version)
end