Overview
Deployment strategies define how software transitions from development to production environments. Each strategy balances competing concerns: minimizing downtime, reducing deployment risk, enabling fast rollback, and managing infrastructure costs. The choice of deployment strategy affects application availability, operational complexity, and the team's ability to deliver updates safely.
Traditional deployments involved taking systems offline, replacing code, and restarting services. Modern deployment strategies eliminate or minimize downtime by orchestrating multiple instances, routing traffic intelligently, and validating new versions before full rollout. These strategies emerged from the need to deploy frequently while maintaining high availability in production systems.
Deployment strategies operate on several core concepts. Instances are running copies of the application. Traffic routing directs user requests to specific instances. Health checks verify instance readiness. Rollback reverts to the previous version when problems occur. The deployment window is the time period during which changes are applied.
# Health check endpoint example
class HealthController < ApplicationController
  def show
    database_healthy = ActiveRecord::Base.connection.active?
    # Assumes the Redis cache store; Rails.cache.redis is only available
    # with ActiveSupport::Cache::RedisCacheStore.
    cache_healthy = Rails.cache.redis.ping == "PONG"
    status = database_healthy && cache_healthy ? :ok : :service_unavailable

    render json: {
      status: status,
      database: database_healthy,
      cache: cache_healthy,
      version: ENV['APP_VERSION']
    }, status: status
  end
end
The deployment strategy determines application behavior during updates. A web application with 1000 requests per second cannot tolerate strategies that cause dropped requests or extended downtime. Different applications require different strategies based on their availability requirements, traffic patterns, and architectural constraints.
Key Principles
Deployment strategies share fundamental principles that govern their operation. Availability measures the percentage of time the application serves requests successfully. Zero-downtime deployment maintains availability during updates by ensuring some instances always serve traffic. Atomicity means deployments either complete fully or roll back entirely, avoiding partial states.
Risk mitigation limits the impact of defective releases. Strategies that expose new code to small traffic percentages detect problems before they affect all users. Rollback capability enables rapid reversion when deployments introduce bugs or performance problems. Fast rollback requires preserving the previous version and maintaining the ability to redirect traffic.
Validation confirms that newly deployed code functions correctly before serving production traffic. Validation includes health checks, smoke tests, and metric monitoring. Health checks verify basic functionality like database connectivity. Smoke tests execute critical paths through the application. Metric monitoring detects anomalies in error rates, response times, or throughput.
# Deployment validation script
require 'net/http'
require 'json'
require 'uri'

class DeploymentValidator
  def initialize(endpoint)
    uri = URI(endpoint)
    @http = Net::HTTP.new(uri.host, uri.port)
  end

  def validate
    health_check && smoke_tests && metric_checks
  end

  def health_check
    response = @http.get('/health')
    response.code == '200' && JSON.parse(response.body)['status'] == 'ok'
  end

  def smoke_tests
    critical_endpoints.all? do |path|
      response = @http.get(path)
      (200..299).include?(response.code.to_i)
    end
  end

  def metric_checks
    # error_rate, p95_latency, and critical_endpoints are placeholders
    # that would query the team's metrics backend and route list.
    error_rate < 0.01 && p95_latency < 500
  end
end
State management handles application and database state during deployments. Stateless applications simplify deployments because instances can start and stop independently. Stateful applications require coordination to avoid data loss or corruption. Database schema changes must be compatible with both old and new application versions during transitions.
Traffic shaping controls request routing during deployments. Load balancers direct traffic based on instance health, deployment stage, or request characteristics. Traffic shaping enables gradual rollouts where new code serves increasing percentages of requests.
Monitoring and observability provide visibility into deployment progress and application health. Metrics track error rates, latencies, throughput, and resource utilization. Logs capture detailed information about request processing. Distributed tracing shows request flow across services. These signals enable teams to detect problems quickly and make informed rollback decisions.
Implementation Approaches
Recreate Deployment
Recreate deployment stops all running instances, deploys new code, and starts new instances. This strategy causes downtime equal to the stop-deploy-start cycle duration. The simplicity appeals for applications that tolerate downtime or deploy during maintenance windows.
The deployment process terminates all instances simultaneously, updates code on each server, and launches the new version. Load balancers mark the application as unavailable during this window. Users receive error responses until instances restart and pass health checks.
Recreate deployments require minimal infrastructure. The application needs only production instances without spare capacity. No traffic routing logic handles multiple versions simultaneously. Database migrations run before starting new instances, knowing old code no longer executes.
This strategy fits applications with scheduled maintenance windows, internal tools with limited users, or systems where deployment simplicity outweighs availability requirements. It fails for customer-facing services requiring 24/7 availability or applications with long startup times.
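The stop-deploy-start cycle can be sketched in a few lines. `Instance` here is a hypothetical stand-in for whatever object wraps a real server; only its stop/deploy/start/healthy? protocol matters:

```ruby
# Minimal sketch of a recreate deployment. Instance is an illustrative
# wrapper around a server, not a real library class.
Instance = Struct.new(:state, :version) do
  def stop;      self.state = :stopped;  end
  def start;     self.state = :running;  end
  def deploy(v); self.version = v;       end
  def healthy?;  state == :running;      end
end

class RecreateDeployment
  def initialize(instances)
    @instances = instances
  end

  # All instances go down before any come back: the downtime window
  # is the full stop-deploy-start cycle.
  def deploy(version)
    @instances.each(&:stop)                    # site is now fully offline
    @instances.each { |i| i.deploy(version) }
    @instances.each(&:start)
    @instances.all?(&:healthy?)                # gate on health checks
  end
end

fleet = Array.new(3) { Instance.new(:running, 'v1') }
RecreateDeployment.new(fleet).deploy('v2')
# fleet now runs v2 on every instance
```

The simplicity is visible in the sketch: no traffic routing, no version coexistence, just a serial cycle with downtime in the middle.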
Rolling Deployment
Rolling deployment updates instances incrementally, replacing a subset of instances at a time while others continue serving traffic. The deployment proceeds in waves: stop instances, update code, start instances, verify health, then proceed to the next wave. This maintains partial availability throughout the deployment.
The wave size determines deployment characteristics. Small waves (updating one instance at a time) minimize risk by limiting exposure to defects but extend deployment duration. Large waves speed deployment but increase the number of users affected by defects. A common approach updates 25% of instances per wave.
# Rolling deployment orchestration
class RollingDeployment
  def initialize(instances:, wave_size:)
    @instances = instances
    @wave_size = wave_size
  end

  def deploy(version)
    waves.each do |wave_instances|
      deploy_wave(wave_instances, version)
      verify_wave(wave_instances) || rollback_wave(wave_instances)
    end
  end

  private

  def waves
    @instances.each_slice(@wave_size).to_a
  end

  def deploy_wave(instances, version)
    instances.each do |instance|
      instance.mark_unhealthy   # drain the instance from the load balancer
      instance.deploy(version)
      instance.restart
      wait_for_health(instance)
    end
  end

  def verify_wave(instances)
    sleep 60 # Observation period
    instances.all? { |i| i.error_rate < 0.01 && i.healthy? }
  end

  # wait_for_health and rollback_wave are omitted; rollback redeploys
  # the previous version to the affected wave.
end
Rolling deployments reduce capacity during deployment because some instances are offline or starting. If normal capacity is 10 instances and waves update 2 instances, capacity drops to 80% during wave updates. Applications must handle this reduced capacity without degrading performance.
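The capacity arithmetic is worth making explicit; a one-method sketch reproduces the 80% figure:

```ruby
# Capacity during a rolling wave: instances not currently being
# replaced keep serving traffic.
def capacity_during_wave(total:, wave_size:)
  (total - wave_size) * 100.0 / total
end

capacity_during_wave(total: 10, wave_size: 2)  # => 80.0
capacity_during_wave(total: 10, wave_size: 1)  # => 90.0
```

Choosing a wave size is therefore a capacity planning question: the fleet must absorb normal peak traffic at the reduced percentage.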
Database compatibility becomes critical. Both old and new code versions run simultaneously during deployment. Schema changes must be backward-compatible, typically requiring multi-phase deployments: add columns in phase one, deploy code using new columns in phase two, remove old columns in phase three.
Blue-Green Deployment
Blue-green deployment maintains two identical production environments. One environment (blue) serves live traffic while the other (green) remains idle. Deployment updates the idle environment, validates it, then switches traffic from blue to green. The previous environment remains available for immediate rollback.
The deployment process deploys new code to the green environment while blue continues serving all traffic. Automated tests and manual validation confirm green functions correctly. Traffic switches from blue to green via load balancer reconfiguration. If problems occur, traffic switches back to blue.
# Blue-green traffic switch
class BlueGreenDeployment
  def initialize(load_balancer:, blue_env:, green_env:)
    @lb = load_balancer
    @blue = blue_env
    @green = green_env
  end

  def deploy(version)
    inactive_env = current_env == @blue ? @green : @blue
    inactive_env.deploy(version)
    inactive_env.start_all_instances
    return false unless validate_environment(inactive_env)

    @lb.switch_traffic(from: current_env, to: inactive_env)
    monitor_metrics(inactive_env, duration: 600)
  end

  def validate_environment(env)
    env.all_instances_healthy? &&
      run_smoke_tests(env) &&
      performance_acceptable?(env)
  end

  def rollback
    @lb.switch_traffic(from: current_env, to: previous_env)
  end

  # current_env and previous_env track which environment the load
  # balancer points at; their bookkeeping is omitted here.
end
Blue-green deployment requires double infrastructure capacity because both environments must handle full production load. This increases costs but provides the fastest possible rollback, since rolling back is merely a matter of redirecting traffic. The approach works well with containerized applications, where spinning up a duplicate environment is automated.
Database handling complicates blue-green deployments. Both environments typically share the same database, requiring schema compatibility between versions. Separate databases per environment enable true isolation but complicate data synchronization and increase storage costs.
Canary Deployment
Canary deployment gradually shifts traffic from the old version to the new version while monitoring metrics for problems. The deployment starts by routing a small traffic percentage (typically 5-10%) to the new version. If metrics remain healthy, traffic increases incrementally until 100% reaches the new version.
Traffic routing uses load balancer rules, service mesh configuration, or application-level routing. Requests can be routed randomly based on percentage, by specific user cohorts, or by request characteristics. Geographic routing sends traffic from one region to the new version while others remain on the old version.
# Canary routing with Rack middleware (uses the rack-proxy gem)
class CanaryRouter
  def initialize(app, canary_percentage:)
    @app = app
    @canary_percentage = canary_percentage
  end

  def call(env)
    if route_to_canary?
      env['HTTP_X_CANARY_VERSION'] = 'new'
      forward_to_canary_instances(env)
    else
      env['HTTP_X_CANARY_VERSION'] = 'stable'
      @app.call(env)
    end
  end

  private

  def route_to_canary?
    rand(100) < @canary_percentage
  end

  def forward_to_canary_instances(env)
    # Proxy the request to the canary instance pool
    proxy = Rack::Proxy.new(backend: ENV['CANARY_BACKEND'])
    proxy.call(env)
  end
end
Monitoring during canary deployment compares metrics between canary and stable versions. Key metrics include error rates, response latencies, throughput, and business metrics like conversion rates. Significant deviations trigger automatic rollback or halt traffic increases.
A typical canary progression follows a schedule such as 5% for 10 minutes, 25% for 20 minutes, 50% for 30 minutes, then 100%. Schedules balance risk (faster progression exposes more users to defects) against deployment speed (slower progression delays feature delivery). Automated systems adjust progression based on metric health.
Canary deployments excel at detecting problems that only manifest under production load or with real user data. Staging environments cannot replicate production diversity, making canary validation valuable. The approach requires metric infrastructure and automation to compare versions and control traffic routing.
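A schedule like the one above can be driven by a small controller. In this sketch the health callback is a placeholder for real metric comparison between canary and stable; the percentages are illustrative:

```ruby
# Sketch of a canary progression controller. The block passed to the
# constructor stands in for real metric checks (error rate, latency)
# comparing canary against stable at each step.
class CanaryProgression
  STEPS = [5, 25, 50, 100].freeze # traffic percentages

  def initialize(&healthy)
    @healthy = healthy
    @history = []
  end

  attr_reader :history

  # Walks the schedule; halts and returns 0 (canary gets no traffic,
  # i.e. rollback) as soon as a step looks unhealthy.
  def run
    STEPS.each do |pct|
      @history << pct
      return 0 unless @healthy.call(pct)
    end
    100
  end
end

CanaryProgression.new { |_pct| true }.run      # => 100 (full rollout)
CanaryProgression.new { |pct| pct < 25 }.run   # => 0   (halted at 25%)
```

Real implementations would also hold each percentage for its observation window before advancing; the waiting is omitted so the control flow stays visible.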
Feature Flag Deployment
Feature flag deployment deploys new code to production with features disabled by default. Flags control whether features activate for specific users, percentages of traffic, or globally. This separates deployment from feature release, enabling testing in production before widespread activation.
Flags can be boolean (on/off), percentage-based (active for X% of users), or targeted (active for specific user IDs, roles, or attributes). Complex flags combine multiple conditions: active for 10% of premium users in the US region.
# Feature flag implementation
require 'digest'

class FeatureFlags
  def initialize(user)
    @user = user
    @flags = FlagStore.new  # FlagStore is the application's flag backend
  end

  def enabled?(feature)
    flag = @flags.get(feature)
    return false unless flag.active?

    case flag.rollout_type
    when :boolean
      flag.value
    when :percentage
      user_hash % 100 < flag.percentage
    when :targeted
      flag.user_ids.include?(@user.id) ||
        flag.roles.include?(@user.role)
    end
  end

  private

  def user_hash
    Digest::MD5.hexdigest(@user.id.to_s).to_i(16)
  end
end

# Usage in application code
def checkout_process
  if feature_flags.enabled?(:new_payment_flow)
    render :new_checkout
  else
    render :legacy_checkout
  end
end
Feature flags enable gradual rollouts identical to canary deployments but at the application level rather than infrastructure level. Deploy code with flags disabled, enable for 5% of users, monitor metrics, increase to 25%, and so on. This provides fine-grained control without complex infrastructure routing.
Flag technical debt accumulates when old flags remain in code after full rollout. Teams must remove flags once features are fully enabled, treating flags as temporary constructs. Long-lived flags complicate code, increase test surface area, and create confusion about system behavior.
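Percentage rollouts depend on the bucketing being deterministic per user: a user must not flip between old and new code on successive requests. A small sketch using the same MD5 bucketing shown above illustrates two properties worth having, stability and monotonic growth:

```ruby
require 'digest'

# Deterministic percentage bucketing: a user's bucket depends only on
# their id, so cohort membership is stable across requests, and raising
# the percentage only adds users -- it never ejects anyone already in.
def in_rollout?(user_id, percentage)
  Digest::MD5.hexdigest(user_id.to_s).to_i(16) % 100 < percentage
end

# Same user, same answer on every call:
in_rollout?(42, 10) == in_rollout?(42, 10)  # always true
# And in_rollout?(id, 10) implies in_rollout?(id, 25), because the
# bucket value is fixed and only the threshold moves.
```

Using `rand` instead of a hash would break both properties, which is why the percentage-based flags earlier in this section hash the user id.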
Ruby Implementation
Capistrano Deployment Automation
Capistrano provides Ruby-based deployment automation, defining deployment workflows as Ruby code. It connects to servers via SSH, executes commands, manages releases, and handles rollback. Capistrano suits traditional server deployments where applications run on VMs or bare metal.
# Capistrano deployment configuration (config/deploy.rb)
lock '~> 3.18.0'

set :application, 'my_app'
set :repo_url, 'git@github.com:username/my_app.git'
set :deploy_to, '/var/www/my_app'
set :keep_releases, 5

namespace :deploy do
  desc 'Restart application'
  task :restart do
    on roles(:app) do
      execute :touch, release_path.join('tmp/restart.txt')
    end
  end

  desc 'Run database migrations'
  task :migrate do
    on primary(:db) do
      within release_path do
        with rails_env: fetch(:rails_env) do
          execute :rake, 'db:migrate'
        end
      end
    end
  end

  after :publishing, :restart
  before :restart, :migrate
end

# Rolling deployment implementation
namespace :rolling do
  task :deploy do
    on roles(:app), in: :sequence, wait: 30 do |host|
      invoke 'deploy:updating'
      invoke 'deploy:updated'
      invoke 'deploy:publishing'
      invoke 'deploy:published'
      invoke 'deploy:restart'
      # Wait and verify before moving to the next server
      sleep 60
      validate_instance(host.hostname) ||
        raise('Health check failed')
    end
  end
end

def validate_instance(host)
  require 'net/http'
  uri = URI("https://#{host}/health")
  response = Net::HTTP.get_response(uri)
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end
Capistrano organizes deployments into releases stored in separate directories. The current release symlinks to the active version. Rollback changes the symlink to the previous release. This structure enables fast rollback without redeploying code.
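The release layout can be illustrated with plain file operations. The timestamped directory names and temp paths below are throwaway examples, not real deployment configuration:

```ruby
require 'tmpdir'
require 'fileutils'

# Illustration of Capistrano's release layout: every deploy lands in
# its own timestamped directory under releases/, and `current` is a
# symlink. Publishing and rollback are both just repointing it.
def simulate_release_rollback
  Dir.mktmpdir do |deploy_to|
    releases = File.join(deploy_to, 'releases')
    current  = File.join(deploy_to, 'current')
    %w[20240101120000 20240102120000].each do |ts|
      FileUtils.mkdir_p(File.join(releases, ts))
    end

    point = lambda do |ts|
      # Remove the old link explicitly so we never create a link
      # *inside* the directory it points at.
      File.unlink(current) if File.symlink?(current)
      File.symlink(File.join(releases, ts), current)
    end

    point.call('20240102120000')  # publish: current -> newest release
    point.call('20240101120000')  # rollback: repoint, nothing recopied
    File.basename(File.readlink(current))
  end
end

simulate_release_rollback  # => "20240101120000"
```

Because the old release directory is still on disk (up to `keep_releases` of them), rollback is a constant-time symlink change rather than a redeploy.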
Health Check Implementation
Applications must provide health check endpoints for deployment orchestration and load balancer integration. Health checks verify database connectivity, cache availability, and critical service dependencies.
# Comprehensive health check
class HealthCheck
  class << self
    def status
      checks = {
        database: database_check,
        cache: cache_check,
        storage: storage_check,
        job_queue: queue_check
      }
      healthy = checks.values.all? { |check| check[:healthy] }

      {
        status: healthy ? 'healthy' : 'unhealthy',
        timestamp: Time.now.iso8601,
        version: ENV['APP_VERSION'],
        checks: checks
      }
    end

    private

    # Note: a bare `private` has no effect on `def self.` methods;
    # defining the helpers inside `class << self` makes them genuinely
    # private class methods.
    def database_check
      start = Time.now
      ActiveRecord::Base.connection.execute('SELECT 1')
      { healthy: true, response_time: Time.now - start }
    rescue StandardError => e
      { healthy: false, error: e.message }
    end

    def cache_check
      start = Time.now
      Rails.cache.write('health_check', Time.now.to_i)
      value = Rails.cache.read('health_check')
      { healthy: value.is_a?(Integer), response_time: Time.now - start }
    rescue StandardError => e
      { healthy: false, error: e.message }
    end

    # storage_check and queue_check follow the same pattern for the
    # blob store and background job queue.
  end
end
Deployment Hooks and Callbacks
Ruby applications integrate deployment logic through hooks that execute at specific deployment phases. These hooks handle tasks like asset compilation, cache warming, and service notifications.
# Rails deployment hooks (config/deploy.rb)
namespace :deploy do
  after :updated, :compile_assets do
    on roles(:app) do
      within release_path do
        execute :rake, 'assets:precompile'
      end
    end
  end

  after :publishing, :warm_cache do
    on roles(:app) do
      execute :curl, '-s', 'http://localhost/cache/warm'
    end
  end

  after :restart, :notify_deployment do
    on roles(:app) do
      execute :curl, '-X', 'POST', ENV['SLACK_WEBHOOK'],
              '-d', %Q({"text": "Deployed #{fetch(:current_revision)} to production"})
    end
  end

  after :rollback, :notify_rollback do
    on roles(:app) do
      execute :curl, '-X', 'POST', ENV['PAGERDUTY_WEBHOOK'],
              '-d', %Q({"incident_key": "deployment", "event_type": "trigger"})
    end
  end
end
Container Deployment with Ruby
Containerized Ruby applications deploy through orchestration platforms like Kubernetes. The deployment manifest defines rolling update parameters, health checks, and resource requirements.
# Dockerfile for Ruby application
FROM ruby:3.2-alpine
# curl is needed for the HEALTHCHECK below; alpine images do not ship it
RUN apk add --no-cache curl
WORKDIR /app
COPY Gemfile Gemfile.lock ./
RUN bundle install --without development test
COPY . .
RUN RAILS_ENV=production bundle exec rake assets:precompile
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s \
  CMD curl -f http://localhost:3000/health || exit 1
CMD ["bundle", "exec", "puma", "-C", "config/puma.rb"]
The Kubernetes deployment manifest configures rolling update strategy and health checks:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ruby-app
spec:
  replicas: 10
  selector:            # required: must match the pod template labels
    matchLabels:
      app: ruby-app
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 1
  template:
    metadata:
      labels:
        app: ruby-app
    spec:
      containers:
        - name: app
          image: myapp:v2.0
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
Design Considerations
Selecting a Deployment Strategy
Application characteristics determine appropriate deployment strategies. Availability requirements, traffic patterns, infrastructure costs, and team capabilities all influence selection.
Applications requiring 99.99% uptime cannot tolerate recreate deployments. Rolling or blue-green deployments maintain availability during updates. Applications with flexible availability requirements or maintenance windows can use simpler strategies.
Traffic volume affects strategy selection. Low-traffic applications tolerate brief outages or reduced capacity during rolling deployments. High-traffic applications need strategies that maintain full capacity or support rapid rollback when problems occur.
Infrastructure costs scale with deployment complexity. Blue-green deployments double infrastructure requirements. Rolling deployments temporarily reduce capacity. Recreate deployments use minimal resources. Organizations balance availability requirements against infrastructure costs.
Team operational capabilities constrain deployment strategies. Blue-green and canary deployments require automated orchestration, monitoring, and rollback procedures. Small teams may lack resources to build and maintain complex deployment infrastructure.
Trade-offs Between Strategies
Deployment strategies trade simplicity, speed, safety, and cost. Recreate deployment offers maximum simplicity but provides no safety measures and causes downtime. Blue-green deployment provides maximum safety and instant rollback but requires double infrastructure.
Rolling deployment balances many concerns: maintains availability, limits risk exposure through incremental updates, uses minimal extra infrastructure, and supports rollback by redeploying previous versions. The gradual rollout increases deployment duration compared to strategies that switch all traffic simultaneously.
Canary deployment provides maximum safety through gradual traffic shifting and automated monitoring but requires sophisticated traffic routing and metric collection. Teams must build automation to compare version metrics and control traffic percentages.
Feature flags offer maximum flexibility, enabling deployment and feature release decoupling, but accumulate technical debt when flags remain in code indefinitely. Applications with many feature flags become harder to test because each flag combination creates a different code path.
Database Migration Strategies
Database schema changes complicate deployments because schema updates affect all application versions. Several approaches handle schema changes during deployments.
Backward-compatible migrations deploy in multiple phases. Phase one adds new columns without removing old columns. Phase two deploys application code using new columns while maintaining old column compatibility. Phase three removes old columns after confirming all instances use new columns. This approach supports all deployment strategies but extends deployment timelines.
# Phase 1: Add new column
class AddEmailVerifiedToUsers < ActiveRecord::Migration[7.0]
  def change
    add_column :users, :email_verified_at, :datetime
    add_index :users, :email_verified_at
  end
end
# Old code still works during this phase

# Phase 2: Update code to use new column
class User < ApplicationRecord
  def email_verified?
    email_verified_at.present?
  end
end

# Phase 3: Remove old implementation after full deployment
class RemoveEmailVerifiedFromUsers < ActiveRecord::Migration[7.0]
  def change
    remove_column :users, :email_verified
  end
end
Database deployment coordination runs migrations before or after application deployment depending on compatibility. Adding columns runs before deployment so new code finds expected schema. Removing columns runs after deployment so old code does not reference missing columns.
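The ordering rule can be captured as a small lookup; the operation names are illustrative, not an exhaustive taxonomy:

```ruby
# The coordination rule from the text: additive schema changes run
# before the application deploy (new code expects the column to
# exist), destructive ones after (old code must stop referencing the
# column first).
MIGRATION_PHASE = {
  add_column:    :before_deploy,
  add_index:     :before_deploy,
  remove_column: :after_deploy,
  drop_table:    :after_deploy
}.freeze

MIGRATION_PHASE.fetch(:add_column)     # => :before_deploy
MIGRATION_PHASE.fetch(:remove_column)  # => :after_deploy
```

Renames and type changes do not fit either bucket cleanly; they are handled as an add-backfill-remove sequence, as the multi-phase example above shows.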
Blue-green with separate databases maintains separate databases for blue and green environments. This eliminates schema compatibility concerns but requires data replication or shared read replicas. The approach suits applications where environments can temporarily diverge.
Rollback Strategies
Effective rollback procedures restore service when deployments introduce defects. Rollback speed and reliability determine blast radius when problems occur.
Blue-green deployments provide instant rollback by redirecting traffic to the previous environment. Rolling deployments roll back by redeploying the previous version, which takes longer but requires no spare infrastructure. Canary deployments roll back by reducing canary traffic to zero.
Database rollbacks complicate application rollbacks. Rolling back application code without rolling back schema changes causes errors when old code expects old schema. Teams must consider database state when executing rollbacks, potentially requiring schema rollback migrations.
# Automated rollback trigger
class DeploymentMonitor
  THRESHOLDS = { error_rate: 0.01, p95_latency: 500 }.freeze

  def monitor(deployment_id, duration: 600)
    start_time = Time.now

    while Time.now - start_time < duration
      metrics = fetch_metrics(deployment_id)

      if metrics[:error_rate] > THRESHOLDS[:error_rate]
        trigger_rollback(deployment_id, reason: 'error_rate')
        return false
      end

      if metrics[:p95_latency] > THRESHOLDS[:p95_latency]
        trigger_rollback(deployment_id, reason: 'latency')
        return false
      end

      sleep 30
    end

    true
  end

  def trigger_rollback(deployment_id, reason:)
    deployment = Deployment.find(deployment_id)
    deployment.rollback!
    notify_team(
      deployment: deployment,
      reason: reason,
      metrics: fetch_metrics(deployment_id)
    )
  end

  # fetch_metrics and notify_team wrap the metrics backend and paging
  # system; their implementations are omitted here.
end
Tools & Ecosystem
Deployment Automation Tools
Capistrano automates Ruby application deployments to traditional servers. It connects via SSH, executes deployment commands, manages release directories, and handles rollback. Capistrano suits applications deployed to VMs or bare metal servers.
Ansible provides general-purpose automation including deployment workflows. It uses YAML playbooks to define deployment steps and supports idempotent operations. Ansible handles infrastructure provisioning, configuration management, and application deployment.
Terraform manages infrastructure as code but integrates with deployment workflows. Teams use Terraform to provision infrastructure then trigger application deployments through other tools. The combination enables complete environment reproduction.
Container Orchestration
Kubernetes orchestrates containerized applications with built-in rolling deployment support. Deployment manifests define desired state, and Kubernetes automatically handles instance updates, health checks, and rollback when health checks fail.
Docker Swarm provides simpler container orchestration than Kubernetes with rolling update support. Swarm suits smaller deployments requiring less complexity than Kubernetes provides.
Amazon ECS offers managed container orchestration on AWS with rolling deployment and blue-green deployment support through integration with Application Load Balancer.
Traffic Management
HAProxy provides high-performance load balancing with traffic routing rules for canary deployments. Configuration defines backend server pools and routing percentages.
NGINX offers load balancing and traffic routing through configuration or dynamic reconfiguration via API. NGINX Plus adds commercial features including advanced health checks and dynamic reconfiguration.
Service meshes (Istio, Linkerd) add traffic management to Kubernetes through sidecar proxies. They enable sophisticated traffic splitting, canary deployments, and A/B testing without application changes.
Feature Flag Platforms
LaunchDarkly provides commercial feature flag management with targeting rules, percentage rollouts, and metric integration. It offers SDKs for multiple languages including Ruby.
Flipper offers open-source feature flag management for Ruby applications. It stores flags in Redis, ActiveRecord, or other backends and supports boolean, percentage, and actor-based flags.
# Flipper usage
require 'flipper'
require 'flipper/adapters/active_record'

Flipper.configure do |config|
  # adapter takes a block in current Flipper versions
  config.adapter { Flipper::Adapters::ActiveRecord.new }
end

# Enable feature for a percentage of actors
Flipper.enable_percentage_of_actors(:new_ui, 25)

# Enable for specific users
Flipper.enable_actor(:premium_feature, current_user)

# Check in application
if Flipper.enabled?(:new_ui, current_user)
  render :new_ui
else
  render :legacy_ui
end
Monitoring and Observability
Prometheus collects metrics from applications and infrastructure with alert rules triggering on metric thresholds. Deployment monitoring queries Prometheus for error rates and latencies.
Datadog provides commercial monitoring with deployment tracking, anomaly detection, and alert notification. It correlates deployment events with metric changes to identify deployment-related problems.
New Relic offers application performance monitoring with deployment markers. Teams compare metrics before and after deployments to detect performance regressions.
Real-World Applications
High-Traffic Web Application Deployment
A web application serving 10,000 requests per second requires deployment strategies that maintain capacity and detect problems quickly. The application uses blue-green deployment with automated validation and monitoring.
The deployment process provisions a green environment matching blue capacity. Load tests confirm green handles expected traffic. Automated smoke tests verify critical functionality. The load balancer switches 5% of traffic to green for 10 minutes while monitoring error rates and latencies. Traffic increases to 25%, 50%, then 100% if metrics remain healthy.
# Production deployment orchestration. Environment, LoadBalancer,
# LoadTester, SmokeTests, and MetricMonitor are illustrative wrappers
# around the team's infrastructure APIs.
class ProductionDeployment
  def initialize(version)
    @version = version
    @blue_env = Environment.new('blue')
    @green_env = Environment.new('green')
    @lb = LoadBalancer.new
  end

  def deploy
    prepare_green_environment
    run_validation_suite || abort_deployment
    execute_gradual_rollout
  end

  private

  def prepare_green_environment
    @green_env.deploy(@version)
    @green_env.scale_to(instances: 50)
    @green_env.warm_caches
    wait_for_readiness(@green_env)
  end

  def run_validation_suite
    load_test_results = LoadTester.run(
      target: @green_env,
      duration: 300,
      rps: 1000
    )
    smoke_test_results = SmokeTests.run(@green_env)
    load_test_results.success? && smoke_test_results.success?
  end

  def execute_gradual_rollout
    [5, 25, 50, 100].each do |percentage|
      @lb.route_traffic(@green_env, percentage: percentage)
      monitor_period = percentage == 100 ? 600 : 300

      unless monitor_metrics(duration: monitor_period)
        @lb.route_traffic(@blue_env, percentage: 100)
        raise DeploymentFailed
      end
    end
  end

  def monitor_metrics(duration:)
    MetricMonitor.compare(
      baseline: @blue_env,
      canary: @green_env,
      duration: duration,
      thresholds: {
        error_rate: 0.01,
        p95_latency: 500,
        p99_latency: 1000
      }
    )
  end
end
Database migrations use the expand-contract pattern. The first deployment adds new columns and dual-writes to old and new columns. After confirming new column usage, a subsequent deployment removes old columns. This maintains compatibility during the transition.
Microservice Rolling Deployment
A microservices architecture with 20 services requires coordinated deployments that maintain service contracts. Rolling deployment updates one service at a time while others continue running.
Service deployments must maintain API compatibility because dependent services may not update simultaneously. Versioned APIs enable old and new versions to coexist. Services accept requests in old and new formats, responding in the requested format.
# Versioned API controller
class Api::V2::UsersController < ApiController
  def show
    user = User.find(params[:id])
    render json: V2::UserSerializer.new(user).as_json
  end
end

# Backward-compatible serializer
module V2
  class UserSerializer
    def initialize(user)
      @user = user
    end

    def as_json
      {
        id: @user.id,
        email: @user.email,
        profile: {
          name: @user.name,
          # New field in v2
          verified_at: @user.email_verified_at
        }
      }
    end
  end
end
Service mesh configuration controls traffic routing during deployments. The deployment updates one service instance, verifies health, then proceeds to the next instance. The mesh ensures requests route only to healthy instances.
Feature Flag Rollout
A major UI redesign deploys behind a feature flag with gradual rollout based on user cohorts. The initial deployment enables the new UI for internal employees only. After validation, rollout expands to 5% of free users and all premium users. Finally, the flag enables for all users.
# Cohort-based feature flag
require 'digest'

class NewUiFlag
  class << self
    def enabled?(user)
      return true if employee?(user)
      return true if premium_user?(user)
      return percentage_rollout?(user, percentage: 5) if free_user?(user)

      false
    end

    private

    # Defined inside `class << self` so `private` actually applies;
    # a bare `private` does not affect `def self.` methods.
    def employee?(user)
      user.email.end_with?('@company.com')
    end

    def premium_user?(user)
      user.subscription_tier == 'premium'
    end

    def free_user?(user)
      user.subscription_tier == 'free'
    end

    def percentage_rollout?(user, percentage:)
      user_hash = Digest::MD5.hexdigest(user.id.to_s).to_i(16)
      user_hash % 100 < percentage
    end
  end
end

# Usage in controller
def dashboard
  if NewUiFlag.enabled?(current_user)
    render :new_dashboard
  else
    render :legacy_dashboard
  end
end
Metrics track conversion rates, error rates, and user engagement for both UI versions. A/B testing infrastructure compares cohorts to measure the impact of UI changes on business metrics. If the new UI's conversion rate drops significantly, the flag is disabled until the issues are resolved.
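The kill-switch rule can be sketched as a simple guard; the threshold and rates below are illustrative, and a real check would also require statistical significance before acting:

```ruby
# Disable the flag when the new-UI cohort's conversion rate falls more
# than a tolerated relative margin below the legacy cohort.
def disable_new_ui?(legacy_rate:, new_rate:, max_relative_drop: 0.05)
  return false if legacy_rate.zero?  # no baseline to compare against
  (legacy_rate - new_rate) / legacy_rate > max_relative_drop
end

disable_new_ui?(legacy_rate: 0.040, new_rate: 0.039)  # => false (2.5% drop, tolerated)
disable_new_ui?(legacy_rate: 0.040, new_rate: 0.030)  # => true  (25% drop, disable)
```

Wiring this guard to the flag store closes the loop: the rollout advances automatically while the guard stays quiet and halts the moment it fires.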
Database Migration Deployment
A critical database schema change requires careful coordination between application and database updates. The deployment uses a three-phase approach to maintain zero downtime.
Phase one deploys application code that writes to both old and new columns while reading from old columns. This deployment uses rolling strategy, updating instances incrementally. Database migration adds new columns but does not remove old columns.
# Phase 1: Dual-write application code
class User < ApplicationRecord
  before_save :sync_email_verified

  def email_verified?
    # Read from old column during transition
    read_attribute(:email_verified)
  end

  def email_verified=(value)
    # Write to both columns
    write_attribute(:email_verified, value)
    write_attribute(:email_verified_at, value ? Time.now : nil)
  end

  private

  def sync_email_verified
    # Write the old column directly so the custom setter above does not
    # clobber an explicitly assigned timestamp.
    if email_verified_at_changed?
      write_attribute(:email_verified, email_verified_at.present?)
    end
  end
end
Phase two deploys code that reads from the new columns. A background job backfills the new columns for existing rows. Once the backfill completes and all instances run the new code, phase three removes the old columns.
This approach maintains database compatibility throughout deployment. Old code works with old columns. New code works with new columns. The transition period supports both until migration completes.
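The phase-two backfill can be sketched in plain Ruby. In production this would run as a background job over ActiveRecord batches (e.g. `in_batches`); the hash "rows", the `backfill_email_verified_at` name, and the batch size here are stand-ins for illustration.

```ruby
require 'time'

# Sketch: populate the new timestamp column for rows where the old
# boolean is set but the timestamp is still missing. Processing in
# small batches bounds lock time and lets the job resume safely.
def backfill_email_verified_at(rows, batch_size: 100)
  rows.each_slice(batch_size) do |batch|
    batch.each do |row|
      next unless row[:email_verified] && row[:email_verified_at].nil?
      row[:email_verified_at] = Time.now
    end
  end
  rows
end
```

Because the dual-write code from phase one already stamps new writes, the backfill only needs to be idempotent over rows created before the deployment, which the `nil?` guard provides.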
Reference
Strategy Comparison
| Strategy | Downtime | Rollback Speed | Infrastructure Cost | Complexity | Risk Level |
|---|---|---|---|---|---|
| Recreate | Minutes | Slow (redeploy) | Low (1x) | Low | High |
| Rolling | None | Medium (redeploy) | Low (1.1-1.2x) | Medium | Medium |
| Blue-Green | None | Instant (traffic switch) | High (2x) | Medium | Low |
| Canary | None | Fast (reduce traffic) | Medium (1.2-1.5x) | High | Very Low |
| Feature Flag | None | Instant (disable flag) | Low (1x) | High | Very Low |
Decision Matrix
| Requirement | Recommended Strategy |
|---|---|
| Zero downtime required | Rolling, Blue-Green, Canary, Feature Flag |
| Instant rollback needed | Blue-Green, Feature Flag |
| Cost optimization priority | Rolling, Recreate, Feature Flag |
| Maximum safety required | Canary, Feature Flag |
| Simple infrastructure | Recreate, Rolling |
| Gradual user exposure | Canary, Feature Flag |
| Scheduled maintenance window | Recreate |
| Database schema changes | Rolling with multi-phase migrations |
| High traffic volume | Blue-Green, Canary |
| Microservices architecture | Rolling, Canary |
Deployment Checklist
| Phase | Task | Validation |
|---|---|---|
| Pre-Deployment | Run test suite | All tests pass |
| Pre-Deployment | Review schema changes | Backward compatible |
| Pre-Deployment | Check dependency updates | No breaking changes |
| Pre-Deployment | Verify rollback procedure | Documented and tested |
| Deployment | Deploy to staging | Smoke tests pass |
| Deployment | Run load tests | Performance acceptable |
| Deployment | Update production | Health checks pass |
| Deployment | Monitor error rates | Below threshold |
| Post-Deployment | Verify critical paths | Business functions work |
| Post-Deployment | Check metric dashboards | No anomalies detected |
| Post-Deployment | Monitor for 1 hour | Metrics stable |
| Post-Deployment | Document issues | Incident log updated |
Health Check Response Format
| Field | Type | Description |
|---|---|---|
| status | string | healthy or unhealthy |
| timestamp | ISO8601 | Check execution time |
| version | string | Application version identifier |
| checks | object | Individual check results |
| checks.database | object | Database connectivity status |
| checks.cache | object | Cache system status |
| checks.storage | object | File storage status |
| checks.response_time | number | Check duration in milliseconds |
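A builder for this response format might look like the following sketch. The `health_payload` name and the probe lambdas are illustrative assumptions; real probes would ping the database, cache, and storage as in the controller shown earlier.

```ruby
require 'json'
require 'time'

# Sketch: assemble a health-check payload matching the table above.
# Each probe is a callable returning truthy when the dependency is up;
# a raised exception is treated as a failed check.
def health_payload(checks, version: 'unknown')
  started = Time.now
  results = checks.transform_values do |probe|
    { ok: !!(probe.call rescue false) }
  end
  healthy = results.values.all? { |r| r[:ok] }
  {
    status: healthy ? 'healthy' : 'unhealthy',
    timestamp: started.utc.iso8601,
    version: version,
    checks: results.merge(
      response_time: ((Time.now - started) * 1000).round(2)
    )
  }
end
```

A load balancer or deployment orchestrator would poll this endpoint and treat any `unhealthy` response as a signal to withhold traffic from the instance.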
Monitoring Metrics
| Metric | Description | Alert Threshold |
|---|---|---|
| Error Rate | Percentage of failed requests | Above 1% |
| P95 Latency | 95th percentile response time | Above 500ms |
| P99 Latency | 99th percentile response time | Above 1000ms |
| Throughput | Requests per second | Below 80% of baseline |
| CPU Usage | Percentage of CPU utilized | Above 80% |
| Memory Usage | Percentage of memory utilized | Above 90% |
| Database Connections | Active database connections | Above 80% of pool |
| Queue Depth | Pending background jobs | Above 1000 |
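The P95 and P99 thresholds assume a percentile computed over a window of latency samples. A nearest-rank sketch (the function name and sample values are illustrative):

```ruby
# Sketch: nearest-rank percentile over a sample window, the basis
# for the P95/P99 latency alerts above.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct / 100.0 * sorted.length).ceil - 1
  sorted[[rank, 0].max]
end

latencies_ms = [120, 95, 480, 210, 505, 130, 90, 1010, 140, 160]
percentile(latencies_ms, 95) # => 1010 under nearest-rank
```

Note that a single outlier dominates the tail percentiles in a small window, which is why alerting systems typically evaluate these thresholds over sustained periods rather than single samples.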
Common Deployment Commands
| Task | Capistrano Command |
|---|---|
| Deploy current branch | cap production deploy |
| Rollback to previous release | cap production deploy:rollback |
| Check deployment status | cap production deploy:check |
| Run database migrations | cap production deploy:migrate |
| Restart application | cap production deploy:restart |
| View deployed releases | cap production releases |
| Clean old releases | cap production deploy:cleanup |
| Deploy specific branch | cap production deploy BRANCH=feature-x |
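The commands above assume a Capistrano deploy configuration along these lines. This is a hedged sketch of a Capistrano 3 `config/deploy.rb`; the application name, repository URL, and paths are placeholders.

```ruby
# config/deploy.rb (sketch; values are placeholders)
lock '~> 3.17'

set :application, 'myapp'
set :repo_url, 'git@example.com:org/myapp.git'
set :deploy_to, '/var/www/myapp'

# deploy:cleanup prunes old releases down to this count
set :keep_releases, 5

# Supports `cap production deploy BRANCH=feature-x`
set :branch, ENV.fetch('BRANCH', 'main')
```

Capistrano's release-directory layout (one timestamped directory per deploy, with a `current` symlink) is what makes `deploy:rollback` a fast symlink switch rather than a full redeploy.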