Docker

Overview

The docker-api gem (required as docker) provides a Ruby interface to the Docker Engine API for programmatic container management. The library exposes Docker's REST API through Ruby classes and methods, enabling applications to create, manage, and monitor containers, images, networks, and volumes directly from Ruby code.

The primary entry point is the Docker module, which establishes connections to Docker daemons running locally or remotely. Core functionality centers on container lifecycle management, image operations, and resource monitoring. The gem supports both one-shot and streaming operations (for logs, stats, and attach), with HTTP communication handled by the Excon library.

require 'docker'

# Connect to local Docker daemon
Docker.url = 'unix:///var/run/docker.sock'

# Create and start a container
container = Docker::Container.create(
  'Image' => 'nginx:latest',
  'ExposedPorts' => { '80/tcp' => {} }
)
container.start

The library organizes Docker resources into distinct classes: Docker::Container for container operations, Docker::Image for image management, Docker::Network for network configuration, and Docker::Volume for persistent storage. Each class provides methods that map directly to Docker API endpoints while handling JSON serialization, HTTP communication, and response parsing automatically.

# List running containers (pass all: true to include stopped ones)
containers = Docker::Container.all(all: false)
containers.each { |c| puts c.info['Names'].first }

# Pull an image from registry
image = Docker::Image.create('fromImage' => 'redis:alpine')
puts image.info['RepoTags']

The gem handles authentication, connection timeouts, and response parsing transparently. It supports Docker API versions 1.16 through the current release; Docker.version queries the daemon so applications can check feature compatibility themselves. Authentication credentials for private registries can be configured globally or passed per operation, as shown below.
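
A minimal sketch of both styles; the registry host and username below are placeholders:

# Configure credentials globally; subsequent pulls and pushes reuse them
Docker.authenticate!(
  'username' => 'deploy',
  'password' => ENV['REGISTRY_PASSWORD'],
  'serveraddress' => 'registry.example.com'
)

# Or pass credentials for a single pull (second argument to Image.create)
image = Docker::Image.create(
  { 'fromImage' => 'registry.example.com/team/app:latest' },
  'username' => 'deploy', 'password' => ENV['REGISTRY_PASSWORD'],
  'serveraddress' => 'registry.example.com'
)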

Basic Usage

Container creation requires an image specification and optional configuration parameters. The Docker::Container.create method accepts a hash matching Docker's container configuration format, including port mappings, environment variables, and volume mounts.

container = Docker::Container.create(
  'Image' => 'ubuntu:20.04',
  'Cmd' => ['/bin/bash', '-c', 'echo "Hello Docker"'],
  'Env' => ['NODE_ENV=production', 'PORT=3000'],
  'ExposedPorts' => { '3000/tcp' => {} },
  'HostConfig' => {
    'PortBindings' => { '3000/tcp' => [{ 'HostPort' => '8080' }] },
    'Memory' => 512 * 1024 * 1024  # 512MB limit
  }
)

# Start the container and wait for completion
container.start
exit_code = container.wait['StatusCode']
puts "Container exited with code: #{exit_code}"

Image operations include pulling from registries, building from Dockerfiles, and managing local image storage. The Docker::Image class provides methods for image inspection, tagging, and removal. Images can be pulled by tag or digest, with authentication handled through registry credentials.

# Pull specific image version
image = Docker::Image.create('fromImage' => 'postgres:13.4')

# Build image from Dockerfile
build_image = Docker::Image.build_from_dir('/path/to/docker/context', {
  'dockerfile' => 'Dockerfile.production',
  't' => 'myapp:latest'
})

# Tag existing image
image.tag('repo' => 'myregistry.com/postgres', 'tag' => 'v13.4')

Container inspection and log retrieval provide runtime monitoring capabilities. The info method returns complete container metadata; #logs retrieves captured output, and #streaming_logs follows output as it is produced, both supporting timestamp and stream filtering options.

# Get container details
info = container.info
puts "Status: #{info['State']['Status']}"
puts "IP Address: #{info['NetworkSettings']['IPAddress']}"

# Stream logs with timestamps; following a live stream uses #streaming_logs,
# which yields each chunk along with its stream type
container.streaming_logs(
  stdout: true,
  stderr: true,
  timestamps: true,
  follow: true
) do |stream, chunk|
  puts "[#{stream}] #{chunk}"
end

Network and volume management enables complex multi-container applications. Networks can be created with custom drivers and configuration, while volumes provide persistent data storage that survives container removal.

# Create custom network (Docker::Network.create takes the name first)
network = Docker::Network.create(
  'myapp-network',
  'Driver' => 'bridge',
  'IPAM' => {
    'Config' => [{ 'Subnet' => '172.20.0.0/16' }]
  }
)

# Create named volume (the name is the first argument)
volume = Docker::Volume.create('postgres-data')

# Connect container to network and volume
container = Docker::Container.create(
  'Image' => 'postgres:13',
  'NetworkingConfig' => {
    'EndpointsConfig' => {
      'myapp-network' => {}
    }
  },
  'HostConfig' => {
    'Mounts' => [{
      'Type' => 'volume',
      'Source' => 'postgres-data',
      'Target' => '/var/lib/postgresql/data'
    }]
  }
)

Advanced Usage

Multi-stage container orchestration requires coordination between multiple containers, with dependency management and service discovery. Complex applications often involve database containers, application servers, and reverse proxies that must start in a specific order with proper network connectivity.

class DockerOrchestrator
  def initialize
    @network = create_application_network
    @containers = {}
  end

  def deploy_stack
    # Start database first
    @containers[:database] = create_database_container
    @containers[:database].start
    wait_for_database_ready

    # Start application containers
    @containers[:app] = create_application_container
    @containers[:app].start
    wait_for_application_ready

    # Start reverse proxy last
    @containers[:proxy] = create_proxy_container
    @containers[:proxy].start
    
    configure_health_checks
  end

  private

  def create_application_network
    # Docker::Network.create takes the network name as its first argument
    Docker::Network.create(
      'application-tier',
      'Driver' => 'bridge',
      'Options' => {
        'com.docker.network.driver.mtu' => '1450'
      },
      'Labels' => {
        'environment' => 'production',
        'project' => 'myapp'
      }
    )
  end

  def create_database_container
    Docker::Container.create(
      'Image' => 'postgres:13-alpine',
      'Env' => [
        'POSTGRES_DB=myapp_production',
        'POSTGRES_USER=appuser',
        'POSTGRES_PASSWORD=secure_password'
      ],
      'NetworkingConfig' => {
        'EndpointsConfig' => {
          'application-tier' => {
            'Aliases' => ['database', 'postgres']
          }
        }
      },
      'HostConfig' => {
        'RestartPolicy' => { 'Name' => 'unless-stopped' },
        'Memory' => 1024 * 1024 * 1024,  # 1GB
        'Mounts' => [{
          'Type' => 'volume',
          'Source' => 'postgres-data',
          'Target' => '/var/lib/postgresql/data'
        }]
      }
    )
  end

  def wait_for_database_ready
    30.times do
      begin
        result = @containers[:database].exec(['pg_isready', '-U', 'appuser'])
        return if result[2] == 0
      rescue Docker::Error::DockerError
        # Container not ready yet
      end
      sleep 2
    end
    raise 'Database failed to become ready'
  end
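
  # Sketch of one of the factories referenced in deploy_stack; the image
  # name and environment values are illustrative placeholders.
  def create_application_container
    Docker::Container.create(
      'Image' => 'myapp:latest',
      'Env' => ['RAILS_ENV=production', 'DATABASE_HOST=database'],
      'NetworkingConfig' => {
        'EndpointsConfig' => {
          'application-tier' => { 'Aliases' => ['app'] }
        }
      },
      'HostConfig' => {
        'RestartPolicy' => { 'Name' => 'unless-stopped' }
      }
    )
  end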
end

Image building with custom contexts and multi-stage builds enables optimized production images. The build process can include custom build arguments, labels, and target stage selection for complex Dockerfile scenarios.

def build_production_image(app_path, version_tag)
  # Create build context with .dockerignore handling
  build_context = create_build_context(app_path)
  
  image = Docker::Image.build_from_tar(
    build_context,
    'dockerfile' => 'Dockerfile.production',
    't' => "myapp:#{version_tag}",
    'target' => 'production',
    'buildargs' => {
      'RUBY_VERSION' => '3.0.2',
      'NODE_VERSION' => '16.14.0',
      'RAILS_ENV' => 'production',
      'BUNDLE_WITHOUT' => 'development:test'
    }.to_json,
    'labels' => {
      'version' => version_tag,
      'build.timestamp' => Time.now.iso8601,
      'git.commit' => `git rev-parse HEAD`.strip
    }.to_json
  ) do |chunk|
    # Stream build output for progress monitoring
    if chunk.match(/"stream":"(.+)"/)
      puts $1.gsub(/\\n/, "\n")
    end
  end

  # Tag for different environments
  image.tag('repo' => 'registry.company.com/myapp', 'tag' => version_tag)
  image.tag('repo' => 'registry.company.com/myapp', 'tag' => 'latest')
  
  image
end

def create_build_context(path)
  require 'tempfile'
  require 'zlib'
  require 'archive/tar/minitar'
  
  tarfile = Tempfile.new('docker-build-context')
  
  Zlib::GzipWriter.open(tarfile.path) do |gz|
    Archive::Tar::Minitar.pack(path, gz)
  end
  
  File.open(tarfile.path, 'rb')
end

Container resource monitoring and dynamic scaling require real-time metrics collection and automated responses to load changes. This involves monitoring CPU, memory, and network usage to make scaling decisions.

class ContainerScaler
  def initialize(base_image, target_containers: 3)
    @base_image = base_image
    @target_containers = target_containers
    @running_containers = []
    @load_balancer_config = []
  end

  def monitor_and_scale
    loop do
      current_load = calculate_average_load
      
      if current_load > 0.8 && @running_containers.size < @target_containers * 2
        scale_up
      elsif current_load < 0.3 && @running_containers.size > @target_containers
        scale_down
      end
      
      update_load_balancer_config
      sleep 30  # Monitor every 30 seconds
    end
  end

  private

  def calculate_average_load
    return 0.0 if @running_containers.empty?
    
    total_cpu = @running_containers.sum do |container|
      stats = container.stats(stream: false)
      calculate_cpu_percentage(stats)
    end
    
    total_cpu / @running_containers.size
  end

  def scale_up
    new_container = Docker::Container.create(
      'Image' => @base_image,
      'Labels' => { 'scaling.group' => 'web-servers' },
      'HostConfig' => {
        'Memory' => 512 * 1024 * 1024,
        'CpuShares' => 1024,
        'RestartPolicy' => { 'Name' => 'on-failure' }
      }
    )
    
    new_container.start
    @running_containers << new_container
    puts "Scaled up: #{@running_containers.size} containers running"
  end
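
  # Counterpart to scale_up (a sketch): retire the most recently started
  # container, letting Docker stop it gracefully before removal.
  def scale_down
    container = @running_containers.pop
    return unless container

    container.stop
    container.remove
    puts "Scaled down: #{@running_containers.size} containers running"
  end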

  def calculate_cpu_percentage(stats)
    cpu_delta = stats['cpu_stats']['cpu_usage']['total_usage'] - 
                stats['precpu_stats']['cpu_usage']['total_usage']
    system_delta = stats['cpu_stats']['system_cpu_usage'] - 
                   stats['precpu_stats']['system_cpu_usage']
    
    return 0.0 if system_delta <= 0
    
    (cpu_delta.to_f / system_delta) * 100.0
  end
end

Error Handling & Debugging

Docker operations generate various exception types that require specific handling strategies. Network timeouts, daemon connectivity issues, and resource constraints each produce different error patterns requiring distinct recovery approaches.

def robust_container_operation
  retry_count = 0
  max_retries = 3

  # Keep the configuration in a variable so rescue branches can adjust it
  container_config = {
    'Image' => 'nginx:latest',
    'HostConfig' => { 'Memory' => 256 * 1024 * 1024 }
  }

  begin
    container = Docker::Container.create(container_config)
    container.start
    
    # Wait for container to be healthy
    wait_for_health_check(container)
    
  rescue Docker::Error::TimeoutError => e
    retry_count += 1
    if retry_count <= max_retries
      puts "Timeout error (attempt #{retry_count}): #{e.message}"
      sleep(2 ** retry_count)  # Exponential backoff
      retry
    else
      raise "Container operation failed after #{max_retries} attempts: #{e.message}"
    end
    
  rescue Docker::Error::NotFoundError => e
    # Image doesn't exist, try pulling first
    puts "Image not found locally, attempting to pull: #{e.message}"
    Docker::Image.create('fromImage' => 'nginx:latest')
    retry
    
  rescue Docker::Error::ConflictError => e
    # Container name already exists
    puts "Container conflict: #{e.message}"
    existing_container = find_existing_container(e.message)
    existing_container&.remove(force: true)
    retry
    
  rescue Docker::Error::ServerError => e
    # Docker daemon errors
    if e.message.include?('insufficient memory')
      puts "Insufficient memory, reducing container limits"
      # Reduce memory allocation and retry
      container_config['HostConfig']['Memory'] = 128 * 1024 * 1024
      retry
    else
      raise "Docker daemon error: #{e.message}"
    end
    
  rescue Excon::Error::Socket => e
    # Connection to Docker daemon failed
    puts "Cannot connect to Docker daemon: #{e.message}"
    check_docker_daemon_status
    raise "Docker daemon unavailable"
  end
end
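
The helpers referenced in the rescue branches are not part of the gem; minimal sketches might look like this (the conflict-message parsing is a heuristic assumption):

def find_existing_container(error_message)
  # Conflict messages usually embed the existing container's id
  id = error_message[/([0-9a-f]{12,64})/, 1]
  id && Docker::Container.get(id)
rescue Docker::Error::NotFoundError
  nil
end

def check_docker_daemon_status
  Docker.ping  # Raises when the daemon is unreachable
  puts 'Docker daemon is reachable'
rescue Excon::Error::Socket, Docker::Error::DockerError => e
  puts "Docker daemon check failed: #{e.message}"
end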

def wait_for_health_check(container, timeout: 60)
  start_time = Time.now
  
  loop do
    container.refresh!
    status = container.info['State']['Status']
    health = container.info['State']['Health']
    
    return true if status == 'running' && (!health || health['Status'] == 'healthy')
    
    if Time.now - start_time > timeout
      logs = container.logs(stdout: true, stderr: true, tail: 50)
      raise "Container failed health check. Logs:\n#{logs}"
    end
    
    if status == 'exited'
      exit_code = container.info['State']['ExitCode']
      logs = container.logs(stdout: true, stderr: true)
      raise "Container exited with code #{exit_code}. Logs:\n#{logs}"
    end
    
    sleep 2
  end
end

Resource exhaustion debugging requires monitoring system resources and container limits to identify bottlenecks. Memory leaks, CPU spikes, and disk space issues manifest differently in containerized environments.

class ContainerDiagnostics
  def self.diagnose_performance_issues(container)
    stats = container.stats(stream: false)
    info = container.info
    
    diagnose_memory_usage(stats, info)
    diagnose_cpu_usage(stats)
    diagnose_network_issues(stats)
    diagnose_disk_usage(container)
  rescue => e
    puts "Diagnostic error: #{e.message}"
    fallback_diagnostics(container)
  end

  # NOTE: Ruby's `private` keyword does not apply to methods defined with
  # `def self.`, so these helpers remain public class methods.

  def self.diagnose_memory_usage(stats, info)
    memory_limit = info.dig('HostConfig', 'Memory') || 0
    memory_usage = stats.dig('memory_stats', 'usage') || 0
    
    if memory_limit > 0
      usage_percent = (memory_usage.to_f / memory_limit) * 100
      puts "Memory usage: #{usage_percent.round(2)}% (#{memory_usage / 1024 / 1024}MB / #{memory_limit / 1024 / 1024}MB)"
      
      if usage_percent > 90
        puts "WARNING: High memory usage detected"
        cache_usage = stats.dig('memory_stats', 'stats', 'cache') || 0
        puts "Cache usage: #{cache_usage / 1024 / 1024}MB"
        
        if (stats.dig('memory_stats', 'failcnt') || 0) > 0
          puts "CRITICAL: Memory limit exceeded, container may be killed"
        end
      end
    end
  end

  def self.diagnose_cpu_usage(stats)
    cpu_usage = calculate_cpu_percentage(stats)
    cpu_throttling = stats.dig('cpu_stats', 'throttling_data', 'throttled_periods') || 0
    
    puts "CPU usage: #{cpu_usage.round(2)}%"
    
    if cpu_throttling > 0
      throttled_time = stats.dig('cpu_stats', 'throttling_data', 'throttled_time') || 0
      puts "WARNING: CPU throttling detected (#{cpu_throttling} periods, #{throttled_time}ns throttled)"
    end
  end

  def self.diagnose_disk_usage(container)
    # Check container filesystem usage
    exec_result = container.exec(['df', '-h', '/'])
    if exec_result[2] == 0
      df_output = exec_result[0].join("\n")
      puts "Disk usage:\n#{df_output}"
      
      # Parse usage percentage
      if df_output.match(/(\d+)%/)
        usage_percent = $1.to_i
        puts "WARNING: High disk usage (#{usage_percent}%)" if usage_percent > 85
      end
    end
  rescue Docker::Error::DockerError => e
    puts "Cannot check disk usage: #{e.message}"
  end
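
  # CPU helper reusing the cpu_stats/precpu_stats delta formula from
  # ContainerScaler above; diagnose_network_issues and fallback_diagnostics
  # are left as exercises.
  def self.calculate_cpu_percentage(stats)
    cpu_delta = (stats.dig('cpu_stats', 'cpu_usage', 'total_usage') || 0) -
                (stats.dig('precpu_stats', 'cpu_usage', 'total_usage') || 0)
    system_delta = (stats.dig('cpu_stats', 'system_cpu_usage') || 0) -
                   (stats.dig('precpu_stats', 'system_cpu_usage') || 0)
    system_delta.positive? ? (cpu_delta.to_f / system_delta) * 100.0 : 0.0
  end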
end

Log analysis and debugging techniques help identify application and infrastructure issues within containers. Structured log parsing and correlation across multiple containers provides comprehensive troubleshooting capabilities.

def analyze_container_logs(containers, since: Time.now - 3600)
  log_entries = []
  
  containers.each do |container|
    container_logs = container.logs(
      stdout: true,
      stderr: true,
      since: since.to_i,
      timestamps: true
    )
    
    container_logs.split("\n").each do |line|
      if parsed_entry = parse_log_entry(line, container.id)
        log_entries << parsed_entry
      end
    end
  end
  
  # Sort by timestamp for chronological analysis
  log_entries.sort_by! { |entry| entry[:timestamp] }
  
  # Analyze patterns
  analyze_error_patterns(log_entries)
  analyze_performance_patterns(log_entries)
  
  log_entries
end

require 'time'  # Time.parse is provided by the stdlib 'time' extension

def parse_log_entry(log_line, container_id)
  # Parse Docker log format: timestamp stream_type log_content
  match = log_line.match(/^(\d{4}-\d{2}-\d{2}T[\d:.]+Z)\s+(.*)$/)
  return nil unless match
  
  timestamp = Time.parse(match[1])
  content = match[2]
  
  # Extract log level if present
  level = case content
           when /ERROR|FATAL/i then 'ERROR'
           when /WARN/i then 'WARN'  
           when /INFO/i then 'INFO'
           when /DEBUG/i then 'DEBUG'
           else 'UNKNOWN'
           end
  
  {
    timestamp: timestamp,
    container_id: container_id[0..11],  # Short ID
    level: level,
    content: content
  }
end

def analyze_error_patterns(log_entries)
  error_entries = log_entries.select { |e| e[:level] == 'ERROR' }
  
  # Group by error type
  error_groups = error_entries.group_by do |entry|
    # Extract the error class name from the log content
    entry[:content][/([A-Z][a-zA-Z]*Error|Exception)/, 1] || 'Unknown'
  end
  
  error_groups.each do |error_type, entries|
    puts "#{error_type}: #{entries.count} occurrences"
    if entries.count > 10
      puts "  WARNING: High frequency error pattern detected"
      puts "  First occurrence: #{entries.first[:timestamp]}"
      puts "  Latest occurrence: #{entries.last[:timestamp]}"
    end
  end
end

Production Patterns

Production Docker deployments require robust health checking, graceful shutdown handling, and comprehensive monitoring to ensure reliability. Applications must handle container lifecycle events properly and maintain service availability during updates and scaling operations.

class ProductionContainerManager
  def initialize(image_name, replicas: 3)
    @image_name = image_name
    @replicas = replicas
    @containers = []
    @load_balancer = LoadBalancerConfig.new
    @monitoring = ContainerMonitoring.new
  end

  def deploy_service
    # Create containers with production configuration
    @replicas.times do |index|
      container = create_production_container(index)
      container.start
      
      # Wait for health check before adding to load balancer
      wait_for_readiness(container)
      @containers << container
      @load_balancer.add_backend(container)
    end
    
    # Setup monitoring and alerting
    @monitoring.start_monitoring(@containers)
    configure_log_forwarding
    
    puts "Service deployed with #{@containers.size} healthy containers"
  end

  def rolling_update(new_image)
    puts "Starting rolling update to #{new_image}"
    
    @containers.each_with_index do |old_container, index|
      # Create new container with updated image
      new_container = create_production_container(index, image: new_image)
      new_container.start
      
      # Wait for new container to be ready
      wait_for_readiness(new_container)
      
      # Add to load balancer before removing old container
      @load_balancer.add_backend(new_container)
      
      # Drain traffic from the old container, then shut it down gracefully
      @load_balancer.remove_backend(old_container)
      shutdown_gracefully(old_container)
      
      # Clean up old container
      old_container.remove
      @containers[index] = new_container
      
      puts "Updated container #{index + 1}/#{@containers.size}"
    end
    
    puts "Rolling update completed successfully"
  end

  private

  def create_production_container(index, image: @image_name)
    Docker::Container.create(
      'Image' => image,
      'Env' => production_environment_variables(index),
      'Labels' => production_labels(index),
      'HostConfig' => {
        'Memory' => 1024 * 1024 * 1024,  # 1GB limit
        'CpuShares' => 1024,
        'RestartPolicy' => { 'Name' => 'unless-stopped' },
        'LogConfig' => {
          'Type' => 'json-file',
          'Config' => {
            'max-size' => '100m',
            'max-file' => '3'
          }
        },
        'SecurityOpt' => ['no-new-privileges:true'],
        'ReadonlyRootfs' => true,
        'Tmpfs' => { '/tmp' => 'rw,noexec,nosuid,size=256m' }
      },
      'Healthcheck' => {
        'Test' => ['CMD-SHELL', 'curl -f http://localhost:3000/health || exit 1'],
        'Interval' => 30000000000,  # 30 seconds in nanoseconds
        'Timeout' => 10000000000,   # 10 seconds
        'Retries' => 3,
        'StartPeriod' => 60000000000  # 60 seconds
      }
    )
  end

  def production_environment_variables(index)
    [
      'RAILS_ENV=production',
      'NODE_ENV=production',
      'PORT=3000',
      "INSTANCE_ID=#{index}",
      "HOSTNAME=app-#{index}",
      'DATABASE_URL=postgresql://user:pass@database:5432/prod',
      'REDIS_URL=redis://redis:6379/0',
      "SECRET_KEY_BASE=#{ENV['SECRET_KEY_BASE']}",
      "HONEYBADGER_API_KEY=#{ENV['HONEYBADGER_API_KEY']}"
    ]
  end

  def shutdown_gracefully(container, timeout: 30)
    puts "Gracefully shutting down container #{container.id[0..11]}"
    
    # Send SIGTERM to allow graceful shutdown
    container.kill!(signal: 'TERM')
    
    # Wait for graceful shutdown
    start_time = Time.now
    loop do
      container.refresh!
      break if container.info['State']['Status'] != 'running'
      
      if Time.now - start_time > timeout
        puts "Graceful shutdown timeout, forcing termination"
        container.kill!(signal: 'KILL')
        break
      end
      
      sleep 1
    end
  end
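
  # Sketch of the readiness gate used above: poll Docker's health status
  # until the container reports healthy. The timeout value is an assumption.
  def wait_for_readiness(container, timeout: 120)
    started = Time.now

    loop do
      health = container.json['State']['Health']
      return if health && health['Status'] == 'healthy'

      if Time.now - started > timeout
        raise "Container #{container.id[0..11]} never became ready"
      end

      sleep 2
    end
  end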
end

Secret management and configuration handling in production requires secure approaches to sensitive data access without exposing credentials in container images or environment variables.

class SecretManager
  def initialize(vault_client)
    @vault_client = vault_client
    @secret_cache = {}
    @cache_ttl = 300  # 5 minutes
  end

  def create_container_with_secrets(image, secret_paths)
    # Mount secrets as temporary files instead of environment variables
    secret_mounts = prepare_secret_mounts(secret_paths)
    
    container = Docker::Container.create(
      'Image' => image,
      'HostConfig' => {
        'Mounts' => secret_mounts,
        'SecurityOpt' => ['no-new-privileges:true'],
        'CapDrop' => ['ALL'],
        'CapAdd' => ['CHOWN', 'SETUID', 'SETGID'],
        'ReadonlyRootfs' => true,
        'Tmpfs' => {
          '/secrets' => 'rw,noexec,nosuid,size=64m,mode=0700'
        }
      },
      'User' => 'appuser:appgroup'
    )
    
    # Rotate secrets periodically
    Thread.new { rotate_secrets_periodically(container, secret_paths) }
    
    container
  end

  private

  def prepare_secret_mounts(secret_paths)
    mounts = []
    
    secret_paths.each do |secret_path|
      secret_value = fetch_secret(secret_path)
      temp_file = write_secret_to_temp_file(secret_value)
      
      mounts << {
        'Type' => 'bind',
        'Source' => temp_file,
        'Target' => "/secrets/#{File.basename(secret_path)}",
        'ReadOnly' => true
      }
    end
    
    mounts
  end

  def fetch_secret(path)
    cache_key = "secret:#{path}"
    cached = @secret_cache[cache_key]
    
    if cached && (Time.now - cached[:timestamp]) < @cache_ttl
      return cached[:value]
    end
    
    secret_value = @vault_client.logical.read(path).data[:value]
    @secret_cache[cache_key] = {
      value: secret_value,
      timestamp: Time.now
    }
    
    secret_value
  end

  def write_secret_to_temp_file(secret_value)
    temp_file = Tempfile.new('docker-secret')
    temp_file.write(secret_value)
    temp_file.close
    
    # Set restrictive permissions
    File.chmod(0600, temp_file.path)
    temp_file.path
  end
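
  # Sketch of the rotation loop started in create_container_with_secrets:
  # expire cached secrets on an interval so the next fetch re-reads Vault.
  # The interval is an assumption; rewriting the bind-mounted files so the
  # running container sees rotated values is left out for brevity.
  def rotate_secrets_periodically(container, secret_paths, interval: 3600)
    loop do
      sleep interval
      secret_paths.each { |path| @secret_cache.delete("secret:#{path}") }
    end
  rescue Docker::Error::DockerError
    # Stop rotating once the container is gone
  end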
end

Monitoring and alerting systems for production containers require comprehensive metrics collection, anomaly detection, and automated response to service degradation.

class ProductionMonitoring
  def initialize(alert_manager, metrics_client)
    @alert_manager = alert_manager
    @metrics_client = metrics_client
    @monitoring_threads = []
    @alert_thresholds = {
      cpu_percent: 80.0,
      memory_percent: 85.0,
      error_rate: 5.0,
      response_time_p95: 2000.0  # milliseconds
    }
  end

  def start_monitoring(containers)
    @monitoring_threads << Thread.new { monitor_resource_usage(containers) }
    @monitoring_threads << Thread.new { monitor_application_health(containers) }
    @monitoring_threads << Thread.new { monitor_log_patterns(containers) }
    
    # Setup graceful shutdown
    trap('TERM') { shutdown_monitoring }
    trap('INT') { shutdown_monitoring }
  end

  private

  def monitor_resource_usage(containers)
    loop do
      containers.each do |container|
        next unless container_running?(container)
        
        stats = container.stats(stream: false)
        metrics = calculate_container_metrics(stats, container)
        
        # Send metrics to monitoring system
        @metrics_client.gauge('container.cpu.percent', metrics[:cpu_percent], 
                             tags: ["container:#{container.id[0..11]}"])
        @metrics_client.gauge('container.memory.percent', metrics[:memory_percent],
                             tags: ["container:#{container.id[0..11]}"])
        
        # Check alert thresholds
        check_resource_alerts(container, metrics)
      end
      
      sleep 15  # Monitor every 15 seconds
    end
  rescue => e
    puts "Resource monitoring error: #{e.message}"
    @alert_manager.send_alert("Monitoring system error: #{e.message}", severity: 'critical')
  end

  def monitor_application_health(containers)
    loop do
      containers.each do |container|
        next unless container_running?(container)
        
        health_status = check_application_health(container)
        
        @metrics_client.gauge('container.health.status',
                             health_status[:healthy] ? 1 : 0,
                             tags: ["container:#{container.id[0..11]}"])
        
        unless health_status[:healthy]
          @alert_manager.send_alert(
            "Container #{container.id[0..11]} health check failed: #{health_status[:error]}",
            severity: 'warning'
          )
        end
      end
      
      sleep 30  # Health check every 30 seconds
    end
  end

  def check_application_health(container)
    result = container.exec(['curl', '-f', '-s', 'http://localhost:3000/health'])
    
    if result[2] == 0  # Exit code 0 means success
      response = result[0].join
      health_data = JSON.parse(response) rescue {}
      
      {
        healthy: health_data['status'] == 'ok',
        response_time: health_data['response_time_ms'],
        database_connected: health_data['database'] == 'connected',
        error: nil
      }
    else
      {
        healthy: false,
        error: result[1].join.strip
      }
    end
  rescue => e
    {
      healthy: false,
      error: e.message
    }
  end

  def check_resource_alerts(container, metrics)
    container_id = container.id[0..11]
    
    if metrics[:cpu_percent] > @alert_thresholds[:cpu_percent]
      @alert_manager.send_alert(
        "High CPU usage on container #{container_id}: #{metrics[:cpu_percent].round(2)}%",
        severity: 'warning'
      )
    end
    
    if metrics[:memory_percent] > @alert_thresholds[:memory_percent]
      @alert_manager.send_alert(
        "High memory usage on container #{container_id}: #{metrics[:memory_percent].round(2)}%",
        severity: 'critical'
      )
    end
  end
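
  # Sketches of the helpers referenced above. The metrics math reuses the
  # cpu_stats/precpu_stats delta formula shown earlier in this article.
  def container_running?(container)
    container.json['State']['Running']
  rescue Docker::Error::NotFoundError
    false
  end

  def calculate_container_metrics(stats, _container)
    cpu_delta = (stats.dig('cpu_stats', 'cpu_usage', 'total_usage') || 0) -
                (stats.dig('precpu_stats', 'cpu_usage', 'total_usage') || 0)
    system_delta = (stats.dig('cpu_stats', 'system_cpu_usage') || 0) -
                   (stats.dig('precpu_stats', 'system_cpu_usage') || 0)
    cpu_percent = system_delta.positive? ? (cpu_delta.to_f / system_delta) * 100.0 : 0.0

    usage = stats.dig('memory_stats', 'usage') || 0
    limit = stats.dig('memory_stats', 'limit') || 0
    memory_percent = limit.positive? ? (usage.to_f / limit) * 100.0 : 0.0

    { cpu_percent: cpu_percent, memory_percent: memory_percent }
  end

  def monitor_log_patterns(containers)
    # Delegates to analyze_container_logs from the previous section
    loop do
      analyze_container_logs(containers, since: Time.now - 300)
      sleep 300
    end
  end

  def shutdown_monitoring
    @monitoring_threads.each(&:kill)
  end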
end

Performance & Memory

Container resource optimization requires understanding Docker's resource allocation mechanisms, memory management patterns, and performance bottlenecks specific to containerized Ruby applications. Memory limits, CPU quotas, and I/O constraints affect application behavior differently than bare metal deployments.

class ContainerResourceOptimizer
  def self.optimize_container_config(base_config, workload_type)
    # Deep-copy so the merge! calls below do not mutate the caller's nested hashes
    optimized_config = Marshal.load(Marshal.dump(base_config))
    optimized_config['HostConfig'] ||= {}
    
    case workload_type
    when :web_server
      optimized_config['HostConfig'].merge!(
        'Memory' => 512 * 1024 * 1024,  # 512MB
        'MemorySwap' => 512 * 1024 * 1024,  # Disable swap
        'CpuShares' => 1024,
        'CpuQuota' => 100000,  # 1 CPU core
        'BlkioWeight' => 500,
        'Ulimits' => [
          { 'Name' => 'nofile', 'Soft' => 65536, 'Hard' => 65536 },
          { 'Name' => 'nproc', 'Soft' => 32768, 'Hard' => 32768 }
        ]
      )
      
    when :background_worker  
      optimized_config['HostConfig'].merge!(
        'Memory' => 1024 * 1024 * 1024,  # 1GB for processing
        'CpuShares' => 2048,  # Higher CPU priority
        'CpuQuota' => 200000,  # 2 CPU cores
        'BlkioWeight' => 300   # Lower I/O priority
      )
      
    when :database
      optimized_config['HostConfig'].merge!(
        'Memory' => 2048 * 1024 * 1024,  # 2GB
        'MemorySwappiness' => 1,  # Minimize swapping
        'CpuShares' => 2048,
        'BlkioWeight' => 1000,  # High I/O priority
        'ShmSize' => 256 * 1024 * 1024  # 256MB shared memory
      )
    end
    
    # Add resource monitoring labels
    optimized_config['Labels'] ||= {}
    optimized_config['Labels'].merge!(
      'resource.profile' => workload_type.to_s,
      'resource.memory_limit' => (optimized_config['HostConfig']['Memory'] / 1024 / 1024).to_s,
      'resource.cpu_shares' => optimized_config['HostConfig']['CpuShares'].to_s
    )
    
    optimized_config
  end

  def self.benchmark_container_performance(container, duration: 60)
    puts "Starting #{duration}s performance benchmark..."
    
    metrics = {
      cpu_samples: [],
      memory_samples: [],
      network_rx: [],
      network_tx: [],
      disk_read: [],
      disk_write: []
    }
    
    start_time = Time.now
    sample_count = 0
    
    while (Time.now - start_time) < duration
      stats = container.stats(stream: false)
      
      metrics[:cpu_samples] << calculate_cpu_percentage(stats)
      metrics[:memory_samples] << calculate_memory_usage(stats)
      metrics[:network_rx] << (stats.dig('networks', 'eth0', 'rx_bytes') || 0)
      metrics[:network_tx] << (stats.dig('networks', 'eth0', 'tx_bytes') || 0)
      metrics[:disk_read] << (stats.dig('blkio_stats', 'io_service_bytes_recursive', 0, 'value') || 0)
      metrics[:disk_write] << (stats.dig('blkio_stats', 'io_service_bytes_recursive', 1, 'value') || 0)
      
      sample_count += 1
      sleep 2
    end
    
    analyze_performance_metrics(metrics, duration, sample_count)
  end

  # NOTE: `private` does not apply to `def self.` methods; helpers stay public.

  def self.calculate_cpu_percentage(stats)
    cpu_delta = (stats.dig('cpu_stats', 'cpu_usage', 'total_usage') || 0) -
                (stats.dig('precpu_stats', 'cpu_usage', 'total_usage') || 0)
    system_delta = (stats.dig('cpu_stats', 'system_cpu_usage') || 0) -
                   (stats.dig('precpu_stats', 'system_cpu_usage') || 0)
    
    return 0.0 if system_delta <= 0 || cpu_delta < 0
    
    num_cpus = stats.dig('cpu_stats', 'online_cpus') || 1
    (cpu_delta.to_f / system_delta) * num_cpus * 100.0
  end

  def self.analyze_performance_metrics(metrics, duration, sample_count)
    
    # CPU analysis
    cpu_avg = metrics[:cpu_samples].sum / metrics[:cpu_samples].size
    cpu_max = metrics[:cpu_samples].max
    cpu_p95 = calculate_percentile(metrics[:cpu_samples], 95)
    
    # Memory analysis
    memory_avg = metrics[:memory_samples].sum.to_f / metrics[:memory_samples].size
    memory_max = metrics[:memory_samples].max
    memory_trend = calculate_trend(metrics[:memory_samples])
    
    # Network throughput
    network_rx_total = metrics[:network_rx].last - metrics[:network_rx].first
    network_tx_total = metrics[:network_tx].last - metrics[:network_tx].first
    network_rx_mbps = (network_rx_total * 8.0 / 1024 / 1024) / duration
    network_tx_mbps = (network_tx_total * 8.0 / 1024 / 1024) / duration
    
    puts "\n=== Performance Benchmark Results ==="
    puts "Duration: #{duration}s (#{sample_count} samples)"
    puts "\nCPU Usage:"
    puts "  Average: #{cpu_avg.round(2)}%"
    puts "  Maximum: #{cpu_max.round(2)}%"  
    puts "  95th percentile: #{cpu_p95.round(2)}%"
    puts "\nMemory Usage:"
    puts "  Average: #{(memory_avg / 1024 / 1024).round(2)}MB"
    puts "  Maximum: #{(memory_max / 1024 / 1024).round(2)}MB"
    puts "  Trend: #{memory_trend > 0 ? 'increasing' : 'stable/decreasing'}"
    puts "\nNetwork Throughput:"
    puts "  RX: #{network_rx_mbps.round(2)} Mbps"
    puts "  TX: #{network_tx_mbps.round(2)} Mbps"
    
    {
      cpu: { average: cpu_avg, max: cpu_max, p95: cpu_p95 },
      memory: { average: memory_avg, max: memory_max, trend: memory_trend },
      network: { rx_mbps: network_rx_mbps, tx_mbps: network_tx_mbps }
    }
  end
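
  # Sketches of the remaining helpers used by the benchmark above.
  def self.calculate_memory_usage(stats)
    stats.dig('memory_stats', 'usage') || 0
  end

  def self.calculate_percentile(values, percentile)
    return 0.0 if values.empty?

    sorted = values.sort
    index = ((percentile / 100.0) * (sorted.size - 1)).round
    sorted[index]
  end

  # Least-squares slope over the sample index; positive means growth
  def self.calculate_trend(values)
    return 0.0 if values.size < 2

    n = values.size
    xs = (0...n).to_a
    mean_x = xs.sum.to_f / n
    mean_y = values.sum.to_f / n
    numerator = xs.zip(values).sum { |x, y| (x - mean_x) * (y - mean_y) }
    denominator = xs.sum { |x| (x - mean_x)**2 }
    denominator.zero? ? 0.0 : numerator / denominator
  end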
end

Memory leak detection and analysis in containerized applications require monitoring allocation patterns, garbage collection behavior, and container-specific memory constraints that differ from traditional Ruby deployments.

class MemoryLeakDetector
  def initialize(container)
    @container = container
    @baseline_memory = nil
    @memory_samples = []
    @gc_samples = []
    @object_samples = []
  end

  def start_monitoring(sample_interval: 30)
    @sample_interval = sample_interval
    Thread.new do
      loop do
        collect_memory_sample
        collect_gc_sample if ruby_application?
        analyze_trends if @memory_samples.size >= 10
        
        sleep sample_interval
      end
    rescue => e
      puts "Memory monitoring error: #{e.message}"
    end
  end

  def generate_leak_report
    return unless @memory_samples.size >= 5
    
    # Calculate memory growth rate
    memory_growth = calculate_memory_growth_rate
    gc_efficiency = calculate_gc_efficiency
    
    puts "\n=== Memory Leak Analysis ==="
    puts "Sample period: #{@memory_samples.size} samples over #{(@memory_samples.size * 30) / 60} minutes"
    puts "Memory growth rate: #{memory_growth.round(2)} MB/hour"
    puts "GC efficiency: #{gc_efficiency.round(2)}%" if gc_efficiency
    
    if memory_growth > 10  # More than 10MB/hour growth
      puts "\n⚠️  POTENTIAL MEMORY LEAK DETECTED"
      identify_leak_sources
      provide_remediation_steps
    else
      puts "\n✅ No significant memory leak detected"
    end
  end

  private

  def collect_memory_sample
    stats = @container.stats(stream: false)
    memory_usage = stats.dig('memory_stats', 'usage') || 0
    timestamp = Time.now
    
    @baseline_memory ||= memory_usage
    
    sample = {
      timestamp: timestamp,
      memory_usage: memory_usage,
      memory_limit: stats.dig('memory_stats', 'limit') || 0,
      cache_usage: stats.dig('memory_stats', 'stats', 'cache') || 0,
      rss: stats.dig('memory_stats', 'stats', 'rss') || 0
    }
    
    @memory_samples << sample
    @memory_samples.shift if @memory_samples.size > 100  # Keep last 100 samples
  end

  def collect_gc_sample
    # Execute Ruby code in container to get GC stats
    result = @container.exec(['ruby', '-rjson', '-e', 'puts GC.stat.to_json'])
    return unless result[2] == 0  # Success exit code
    
    gc_stats = JSON.parse(result[0].join)
    @gc_samples << {
      timestamp: Time.now,
      heap_allocated_pages: gc_stats['heap_allocated_pages'],
      heap_sorted_length: gc_stats['heap_sorted_length'],
      old_objects: gc_stats['old_objects'],
      total_allocated_objects: gc_stats['total_allocated_objects'],
      major_gc_count: gc_stats['major_gc_count'],
      minor_gc_count: gc_stats['minor_gc_count']
    }
    
    @gc_samples.shift if @gc_samples.size > 100
  rescue => e
    # Silently continue if we can't get GC stats
  end

  def ruby_application?
    # Check if container is running a Ruby application
    result = @container.exec(['pgrep', 'ruby'])
    result[2] == 0
  rescue
    false
  end

  def calculate_memory_growth_rate
    return 0 if @memory_samples.size < 2
    
    first_sample = @memory_samples.first
    last_sample = @memory_samples.last
    
    time_diff_hours = (last_sample[:timestamp] - first_sample[:timestamp]) / 3600.0
    memory_diff_mb = (last_sample[:memory_usage] - first_sample[:memory_usage]) / 1024.0 / 1024.0
    
    return 0 if time_diff_hours <= 0
    memory_diff_mb / time_diff_hours
  end

  def calculate_gc_efficiency
    return nil if @gc_samples.size < 2
    
    recent_samples = @gc_samples.last(10)
    gc_frequency = calculate_gc_frequency(recent_samples)
    heap_growth = calculate_heap_growth(recent_samples)
    
    # Simple efficiency metric: lower heap growth with reasonable GC frequency
    return nil if gc_frequency == 0
    
    efficiency = [0, 100 - (heap_growth * gc_frequency / 10.0)].max
    [efficiency, 100].min
  end

  def identify_leak_sources
    puts "\nPotential leak sources:"
    
    # Check for excessive heap growth
    if @gc_samples.size >= 5
      heap_pages_growth = @gc_samples.last[:heap_allocated_pages] - @gc_samples[-5][:heap_allocated_pages]
      if heap_pages_growth > 100
        puts "• Excessive heap page allocation (#{heap_pages_growth} pages)"
      end
      
      old_objects_growth = @gc_samples.last[:old_objects] - @gc_samples[-5][:old_objects]
      if old_objects_growth > 10000
        puts "• High old object generation (#{old_objects_growth} objects)"
      end
    end
    
    # Check memory usage patterns
    cache_ratio = @memory_samples.last[:cache_usage].to_f / @memory_samples.last[:memory_usage]
    if cache_ratio < 0.1
      puts "• Low cache usage ratio (#{(cache_ratio * 100).round(2)}%)"
    end
    
    rss_ratio = @memory_samples.last[:rss].to_f / @memory_samples.last[:memory_usage]
    if rss_ratio > 0.8
      puts "• High RSS usage ratio (#{(rss_ratio * 100).round(2)}%)"
    end
  end
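
  # Sketches of the remaining private helpers. "GC frequency" here is
  # major GC runs per sample, a deliberate simplification.
  def analyze_trends
    growth = calculate_memory_growth_rate
    puts "Memory growth trend: #{growth.round(2)} MB/hour" if growth > 10
  end

  def provide_remediation_steps
    puts "\nSuggested next steps:"
    puts '• Capture a heap dump (ObjectSpace.dump_all) inside the container'
    puts '• Review long-lived caches and global collections for unbounded growth'
    puts '• Restart the container as a stopgap while investigating'
  end

  def calculate_gc_frequency(samples)
    return 0 if samples.size < 2

    gc_runs = samples.last[:major_gc_count] - samples.first[:major_gc_count]
    gc_runs.to_f / (samples.size - 1)
  end

  def calculate_heap_growth(samples)
    return 0 if samples.size < 2

    samples.last[:heap_allocated_pages] - samples.first[:heap_allocated_pages]
  end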
end

Performance profiling and bottleneck identification require container-aware tooling that accounts for resource limitations, shared-kernel overhead, and the behavior patterns of containerized applications.

class ContainerProfiler
  def self.profile_application(container, duration: 120)
    puts "Starting comprehensive application profiling..."
    
    profiler = new(container)
    profile_data = profiler.collect_profile_data(duration)
    profiler.analyze_bottlenecks(profile_data)
    profiler.generate_optimization_recommendations(profile_data)
  end

  def initialize(container)
    @container = container
    @profile_data = {
      cpu_samples: [],
      memory_samples: [],
      io_samples: [],
      network_samples: [],
      application_metrics: []
    }
  end

  def collect_profile_data(duration)
    threads = []
    
    # System resource sampling
    threads << Thread.new { sample_system_resources(duration) }
    
    # Application-level profiling if possible
    threads << Thread.new { sample_application_metrics(duration) }
    
    # I/O and network profiling
    threads << Thread.new { sample_io_patterns(duration) }
    
    # Wait for all profiling threads
    threads.each(&:join)
    
    @profile_data
  end

  # These helpers remain public so profile_application can call
  # analyze_bottlenecks and generate_optimization_recommendations
  # on the profiler instance.

  def sample_system_resources(duration)
    start_time = Time.now
    
    while (Time.now - start_time) < duration
      stats = @container.stats(stream: false)
      
      @profile_data[:cpu_samples] << {
        timestamp: Time.now,
        cpu_percent: calculate_cpu_percentage(stats),
        cpu_throttling: stats.dig('cpu_stats', 'throttling_data', 'throttled_periods') || 0,
        system_cpu_usage: stats.dig('cpu_stats', 'system_cpu_usage') || 0
      }
      
      @profile_data[:memory_samples] << {
        timestamp: Time.now,
        usage: stats.dig('memory_stats', 'usage') || 0,
        cache: stats.dig('memory_stats', 'stats', 'cache') || 0,
        rss: stats.dig('memory_stats', 'stats', 'rss') || 0,
        swap: stats.dig('memory_stats', 'stats', 'swap') || 0
      }
      
      @profile_data[:network_samples] << {
        timestamp: Time.now,
        rx_bytes: stats.dig('networks', 'eth0', 'rx_bytes') || 0,
        tx_bytes: stats.dig('networks', 'eth0', 'tx_bytes') || 0,
        rx_packets: stats.dig('networks', 'eth0', 'rx_packets') || 0,
        tx_packets: stats.dig('networks', 'eth0', 'tx_packets') || 0
      }
      
      sleep 5
    end
  end

  def sample_application_metrics(duration)
    return unless application_has_metrics_endpoint?
    
    start_time = Time.now
    
    while (Time.now - start_time) < duration
      begin
        result = @container.exec(['curl', '-s', 'http://localhost:3000/metrics'])
        if result[2] == 0
          metrics = parse_prometheus_metrics(result[0].join)
          @profile_data[:application_metrics] << {
            timestamp: Time.now,
            metrics: metrics
          }
        end
      rescue => e
        # Continue if metrics unavailable
      end
      
      sleep 10
    end
  end

  def analyze_bottlenecks(profile_data)
    puts "\n=== Performance Bottleneck Analysis ==="
    
    analyze_cpu_bottlenecks(profile_data[:cpu_samples])
    analyze_memory_bottlenecks(profile_data[:memory_samples])
    analyze_io_bottlenecks(profile_data[:io_samples])
    analyze_network_bottlenecks(profile_data[:network_samples])
  end

  def analyze_cpu_bottlenecks(cpu_samples)
    return if cpu_samples.empty?
    
    cpu_values = cpu_samples.map { |s| s[:cpu_percent] }
    avg_cpu = cpu_values.sum / cpu_values.size
    max_cpu = cpu_values.max
    throttling_events = cpu_samples.sum { |s| s[:cpu_throttling] }
    
    puts "\nCPU Analysis:"
    puts "  Average utilization: #{avg_cpu.round(2)}%"
    puts "  Peak utilization: #{max_cpu.round(2)}%"
    puts "  Throttling events: #{throttling_events}"
    
    if avg_cpu > 70
      puts "  ⚠️  High average CPU utilization indicates CPU bottleneck"
    end
    
    if throttling_events > 0
      puts "  🚨 CPU throttling detected - consider increasing CPU limits"
    end
    
    if max_cpu > 95
      puts "  🚨 CPU saturation detected - optimize CPU-intensive operations"
    end
  end

  def generate_optimization_recommendations(profile_data)
    puts "\n=== Optimization Recommendations ==="
    
    cpu_avg = profile_data[:cpu_samples].map { |s| s[:cpu_percent] }.sum / 
              profile_data[:cpu_samples].size rescue 0
    
    memory_max = profile_data[:memory_samples].map { |s| s[:usage] }.max rescue 0
    
    if cpu_avg > 80
      puts "• Consider horizontal scaling - CPU utilization is high"
      puts "• Profile application code for CPU-intensive operations"
      puts "• Consider using more efficient algorithms or data structures"
    end
    
    if memory_max > 800 * 1024 * 1024  # 800MB
      puts "• Monitor for memory leaks - high memory usage detected"
      puts "• Consider implementing object pooling for frequently created objects"
      puts "• Review caching strategies to reduce memory pressure"
    end
    
    puts "• Enable production profiling tools like Stackprof or Ruby Prof"
    puts "• Consider APM tools like New Relic or Datadog for ongoing monitoring"
    puts "• Implement circuit breakers for external service calls"
  end
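
  # Sketches of some of the helpers referenced above; the /metrics endpoint
  # and Prometheus text format are assumptions about the application. The
  # remaining analyze_* methods follow the same pattern as
  # analyze_cpu_bottlenecks.
  def calculate_cpu_percentage(stats)
    cpu_delta = (stats.dig('cpu_stats', 'cpu_usage', 'total_usage') || 0) -
                (stats.dig('precpu_stats', 'cpu_usage', 'total_usage') || 0)
    system_delta = (stats.dig('cpu_stats', 'system_cpu_usage') || 0) -
                   (stats.dig('precpu_stats', 'system_cpu_usage') || 0)
    system_delta.positive? ? (cpu_delta.to_f / system_delta) * 100.0 : 0.0
  end

  def sample_io_patterns(duration)
    start_time = Time.now

    while (Time.now - start_time) < duration
      stats = @container.stats(stream: false)
      @profile_data[:io_samples] << {
        timestamp: Time.now,
        read_bytes: stats.dig('blkio_stats', 'io_service_bytes_recursive', 0, 'value') || 0,
        write_bytes: stats.dig('blkio_stats', 'io_service_bytes_recursive', 1, 'value') || 0
      }
      sleep 10
    end
  end

  def application_has_metrics_endpoint?
    result = @container.exec(['curl', '-s', '-o', '/dev/null', '-w', '%{http_code}',
                              'http://localhost:3000/metrics'])
    result[2] == 0 && result[0].join.strip == '200'
  rescue Docker::Error::DockerError
    false
  end

  def parse_prometheus_metrics(body)
    body.each_line.with_object({}) do |line, metrics|
      next if line.start_with?('#') || line.strip.empty?

      name, value = line.split(/\s+/, 2)
      metrics[name] = value.to_f if name && value
    end
  end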
end

Reference

Core Classes and Methods

| Class | Primary Purpose | Key Methods |
| --- | --- | --- |
| Docker | Main entry point and configuration | .url=, .logger=, .authenticate! |
| Docker::Container | Container lifecycle management | .create, #start, #stop, #remove, #logs |
| Docker::Image | Image operations and building | .create, .build_from_dir, #tag, #remove |
| Docker::Network | Network management | .create, #connect, #disconnect, #remove |
| Docker::Volume | Volume operations | .create, #remove, #info |

Container Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| .create(opts) | Hash of container config | Docker::Container | Creates a new container with the specified configuration |
| #start | None | self | Starts a stopped container |
| #stop(timeout: 10) | timeout (Integer) | self | Gracefully stops a running container |
| #kill!(signal: 'KILL') | signal (String) | self | Sends a signal to the container's main process |
| #restart(timeout: 10) | timeout (Integer) | self | Restarts the container with an optional timeout |
| #remove(force: false) | force (Boolean) | true | Removes the container, optionally forcing removal |
| #logs(opts = {}) | stdout, stderr, timestamps, tail | String | Retrieves captured container logs with filtering options |
| #streaming_logs(opts = {}) | stdout, stderr, follow, plus a block | String | Streams log chunks to a block as they are produced |
| #exec(cmd, opts = {}) | cmd (Array), detach, tty | Array | Runs a command in the container; returns [stdout, stderr, exit_code] |
| #stats(opts = {}) | optional block for streaming | Hash | Returns a resource usage snapshot, or yields stats when given a block |
| #info | None | Hash | Returns complete container inspection data |
| #wait(timeout = nil) | timeout (Integer) | Hash | Waits for the container to stop; the hash includes 'StatusCode' |
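
A short illustration of the #exec return shape and a one-shot #stats call (the container name is a placeholder; exact stats keys depend on the daemon version):

container = Docker::Container.get('my-container')

stdout, stderr, exit_code = container.exec(['uname', '-a'])
puts stdout.join   # exec returns [stdout_chunks, stderr_chunks, exit_code]
puts exit_code     # => 0 on success

snapshot = container.stats  # single statistics hash when no block is given
puts snapshot['memory_stats']['usage']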

Image Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| .create(opts) | fromImage, repo, tag, registry | Docker::Image | Pulls an image from a registry |
| .build(dockerfile) | Dockerfile contents (String) | Docker::Image | Builds an image from Dockerfile content |
| .build_from_dir(path, opts) | path (String), build options | Docker::Image | Builds an image from a directory context |
| .build_from_tar(tar, opts) | tar (IO), build options | Docker::Image | Builds an image from a tar stream |
| #tag(opts) | repo, tag, force | true | Tags the image with a repository and tag |
| #push(creds, repo_tag) | credentials, repo_tag | self | Pushes the image to a registry |
| #remove(force: false) | force (Boolean) | true | Removes the image from local storage |
| #history | None | Array | Returns the image layer history |
| #info | None | Hash | Returns detailed image inspection data |

Network Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| .create(name, opts) | name (String); Driver, IPAM, Options | Docker::Network | Creates a custom network; the name is the first positional argument |
| #connect(container, opts) | container id, EndpointConfig | true | Connects a container to the network |
| #disconnect(container, force) | container id, force (Boolean) | true | Disconnects a container from the network |
| #remove | None | true | Removes the network |
| #info | None | Hash | Returns network inspection details |
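
A brief sketch of connecting and disconnecting a container (the names are placeholders):

network = Docker::Network.get('myapp-network')
container = Docker::Container.get('my-app')

network.connect(container.id)
puts Docker::Network.get('myapp-network').info['Containers'].keys
network.disconnect(container.id)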

Volume Methods

| Method | Parameters | Returns | Description |
| --- | --- | --- | --- |
| .create(name, opts) | name (String); Driver, DriverOpts, Labels | Docker::Volume | Creates a named volume; the name is the first positional argument |
| #remove(force: false) | force (Boolean) | true | Removes the volume and all of its data |
| #info | None | Hash | Returns volume details, including the mount point |

Configuration Options

Container Creation Options

| Option | Type | Purpose | Example |
| --- | --- | --- | --- |
| 'Image' | String | Base image name | 'nginx:alpine' |
| 'Cmd' | Array | Default command | ['/bin/sh', '-c', 'echo hello'] |
| 'Env' | Array | Environment variables | ['NODE_ENV=production', 'PORT=3000'] |
| 'ExposedPorts' | Hash | Container ports to expose | {'80/tcp' => {}, '443/tcp' => {}} |
| 'WorkingDir' | String | Working directory | '/app' |
| 'User' | String | User and group | 'nobody:nogroup' |
| 'Labels' | Hash | Metadata labels | {'version' => '1.0', 'env' => 'prod'} |

HostConfig Options

| Option | Type | Purpose | Example |
| --- | --- | --- | --- |
| 'Memory' | Integer | Memory limit in bytes | 512 * 1024 * 1024 |
| 'CpuShares' | Integer | Relative CPU weight | 1024 |
| 'CpuQuota' | Integer | CPU quota in microseconds | 100000 |
| 'PortBindings' | Hash | Host port mappings | {'80/tcp' => [{'HostPort' => '8080'}]} |
| 'Mounts' | Array | Volume and bind mounts | [{'Type' => 'bind', 'Source' => '/host', 'Target' => '/container'}] |
| 'RestartPolicy' | Hash | Restart behavior | {'Name' => 'unless-stopped', 'MaximumRetryCount' => 3} |
| 'LogConfig' | Hash | Logging configuration | {'Type' => 'json-file', 'Config' => {'max-size' => '10m'}} |

Error Classes

| Exception | Inherits From | When Raised |
| --- | --- | --- |
| Docker::Error::DockerError | StandardError | Base class for all Docker errors |
| Docker::Error::ClientError | DockerError | Client-side errors (4xx HTTP responses) |
| Docker::Error::ServerError | DockerError | Server-side errors (5xx HTTP responses) |
| Docker::Error::TimeoutError | DockerError | Request and connection timeouts |
| Docker::Error::NotFoundError | ClientError | Resource not found (404 responses) |
| Docker::Error::ConflictError | ClientError | Resource conflicts (409 responses) |
| Docker::Error::UnauthorizedError | ClientError | Authentication failures (401 responses) |
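
Because the client-side classes inherit from Docker::Error::DockerError, rescue clauses can run from specific to general:

begin
  Docker::Container.get('does-not-exist')
rescue Docker::Error::NotFoundError
  puts 'No such container'
rescue Docker::Error::ClientError => e
  puts "Request rejected: #{e.message}"    # other 4xx responses
rescue Docker::Error::DockerError => e
  puts "Docker API failure: #{e.message}"  # catch-all for the gem's errors
end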

Connection Configuration

| Setting | Default | Purpose |
| --- | --- | --- |
| Docker.url | 'unix:///var/run/docker.sock' | Docker daemon connection URL |
| Docker.connection | Auto-configured | HTTP connection object |
| Docker.logger | nil | Logger instance for debugging |
| Docker.options | {} | Excon connection options, including read_timeout and write_timeout (both default to 60 seconds) |
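
A typical configuration block, shown as a sketch (the DOCKER_HOST fallback and timeout values are illustrative):

require 'docker'
require 'logger'

Docker.url = ENV.fetch('DOCKER_HOST', 'unix:///var/run/docker.sock')
Docker.logger = Logger.new($stdout)

# Excon-level options, including timeouts, go through Docker.options
Docker.options = { read_timeout: 120, write_timeout: 120 }

puts Docker.version['Version']  # Verifies connectivity to the daemon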

Registry Authentication

# Configure registry credentials
Docker.authenticate!(
  'username' => 'myuser',
  'password' => 'mypassword',
  'email' => 'user@example.com',
  'serveraddress' => 'registry.example.com'
)

Build Context Options

| Option | Type | Purpose |
| --- | --- | --- |
| 'dockerfile' | String | Dockerfile name within the context |
| 't' | String | Image tag |
| 'buildargs' | String (JSON) | Build arguments |
| 'target' | String | Multi-stage build target |
| 'labels' | String (JSON) | Image labels |
| 'nocache' | Boolean | Disable the build cache |
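
The options combine freely. A hedged example passing several of them to build_from_dir (the tag and stage names are placeholders):

image = Docker::Image.build_from_dir('.', {
  'dockerfile' => 'Dockerfile',
  't' => 'myapp:ci',
  'target' => 'test',    # stop at the named build stage
  'nocache' => true,     # force every layer to rebuild
  'buildargs' => { 'RUBY_VERSION' => '3.2' }.to_json
})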