Overview
Docker Ruby provides a Ruby interface to the Docker Engine API for programmatic container management. The library exposes Docker's REST API through Ruby classes and methods, enabling applications to create, manage, and monitor containers, images, networks, and volumes directly from Ruby code.
The primary entry point is the Docker module, which establishes connections to Docker daemons running locally or remotely. Core functionality centers on container lifecycle management, image operations, and resource monitoring. Docker Ruby supports both blocking and streaming operations, and connection behavior such as read and write timeouts can be configured globally.
require 'docker'

# Connect to local Docker daemon
Docker.url = 'unix:///var/run/docker.sock'

# Create and start a container
container = Docker::Container.create(
  'Image' => 'nginx:latest',
  'ExposedPorts' => { '80/tcp' => {} }
)
container.start
The library organizes Docker resources into distinct classes: Docker::Container for container operations, Docker::Image for image management, Docker::Network for network configuration, and Docker::Volume for persistent storage. Each class provides methods that map directly to Docker API endpoints while handling JSON serialization, HTTP communication, and response parsing automatically.
# List all running containers
containers = Docker::Container.all(all: false)
containers.each { |c| puts c.info['Names'].first }
# Pull an image from registry
image = Docker::Image.create('fromImage' => 'redis:alpine')
puts image.info['RepoTags']
Docker Ruby manages authentication, connection timeouts, and error recovery transparently. The library supports Docker API versions 1.16 through the current release, with automatic version detection and feature compatibility checking. Authentication credentials for private registries can be configured globally or passed per operation.
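Connection settings and registry credentials are typically configured once at process startup. A minimal sketch using the library's global configuration accessors; the daemon address, timeout values, and credentials are illustrative:

require 'docker'

# Point the client at a remote daemon instead of the default Unix socket
Docker.url = 'tcp://docker-host.internal:2375'

# Raise client-side timeouts for slow operations such as large image pulls
Docker.options = { read_timeout: 300, write_timeout: 300 }

# Authenticate once; subsequent pulls and pushes reuse these credentials
Docker.authenticate!(
  'username' => 'deploy',
  'password' => ENV['REGISTRY_PASSWORD'],
  'serveraddress' => 'registry.example.com'
)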
Basic Usage
Container creation requires an image specification and optional configuration parameters. The Docker::Container.create method accepts a hash matching Docker's container configuration format, including port mappings, environment variables, and volume mounts.
container = Docker::Container.create(
  'Image' => 'ubuntu:20.04',
  'Cmd' => ['/bin/bash', '-c', 'echo "Hello Docker"'],
  'Env' => ['NODE_ENV=production', 'PORT=3000'],
  'ExposedPorts' => { '3000/tcp' => {} },
  'HostConfig' => {
    'PortBindings' => { '3000/tcp' => [{ 'HostPort' => '8080' }] },
    'Memory' => 512 * 1024 * 1024 # 512MB limit
  }
)

# Start the container and wait for completion
container.start
exit_code = container.wait['StatusCode']
puts "Container exited with code: #{exit_code}"
Image operations include pulling from registries, building from Dockerfiles, and managing local image storage. The Docker::Image class provides methods for image inspection, tagging, and removal. Images can be pulled by tag or digest, with authentication handled through registry credentials.
# Pull specific image version
image = Docker::Image.create('fromImage' => 'postgres:13.4')

# Build image from Dockerfile
build_image = Docker::Image.build_from_dir('/path/to/docker/context', {
  'dockerfile' => 'Dockerfile.production',
  't' => 'myapp:latest'
})

# Tag existing image
image.tag('repo' => 'myregistry.com/postgres', 'tag' => 'v13.4')
Container inspection and log retrieval provide runtime monitoring capabilities. The info method returns complete container metadata, while logs retrieves container output with timestamp and stream filtering options; streaming_logs streams it as it is produced.
# Get container details (refresh! re-inspects the container so State data is current)
container.refresh!
info = container.info
puts "Status: #{info['State']['Status']}"
puts "IP Address: #{info['NetworkSettings']['IPAddress']}"

# Stream logs with timestamps; streaming_logs yields chunks as they arrive
container.streaming_logs(
  stdout: true,
  stderr: true,
  timestamps: true,
  follow: true
) do |stream, chunk|
  puts chunk
end
Network and volume management enables complex multi-container applications. Networks can be created with custom drivers and configuration, while volumes provide persistent data storage that survives container removal.
# Create custom network (Docker::Network.create takes the name as its first argument)
network = Docker::Network.create(
  'myapp-network',
  'Driver' => 'bridge',
  'IPAM' => {
    'Config' => [{ 'Subnet' => '172.20.0.0/16' }]
  }
)

# Create named volume
volume = Docker::Volume.create('postgres-data')

# Connect container to network and volume
container = Docker::Container.create(
  'Image' => 'postgres:13',
  'NetworkingConfig' => {
    'EndpointsConfig' => {
      'myapp-network' => {}
    }
  },
  'HostConfig' => {
    'Mounts' => [{
      'Type' => 'volume',
      'Source' => 'postgres-data',
      'Target' => '/var/lib/postgresql/data'
    }]
  }
)
Advanced Usage
Multi-stage container orchestration requires coordinating multiple containers with dependency management and service discovery. Complex applications often involve database containers, application servers, and reverse proxies that must start in a specific order with proper network connectivity.
class DockerOrchestrator
  def initialize
    @network = create_application_network
    @containers = {}
  end

  def deploy_stack
    # Start database first
    @containers[:database] = create_database_container
    @containers[:database].start
    wait_for_database_ready

    # Start application containers
    @containers[:app] = create_application_container
    @containers[:app].start
    wait_for_application_ready

    # Start reverse proxy last
    @containers[:proxy] = create_proxy_container
    @containers[:proxy].start
    configure_health_checks
  end

  private

  def create_application_network
    Docker::Network.create(
      'application-tier',
      'Driver' => 'bridge',
      'Options' => {
        'com.docker.network.driver.mtu' => '1450'
      },
      'Labels' => {
        'environment' => 'production',
        'project' => 'myapp'
      }
    )
  end

  def create_database_container
    Docker::Container.create(
      'Image' => 'postgres:13-alpine',
      'Env' => [
        'POSTGRES_DB=myapp_production',
        'POSTGRES_USER=appuser',
        'POSTGRES_PASSWORD=secure_password'
      ],
      'NetworkingConfig' => {
        'EndpointsConfig' => {
          'application-tier' => {
            'Aliases' => ['database', 'postgres']
          }
        }
      },
      'HostConfig' => {
        'RestartPolicy' => { 'Name' => 'unless-stopped' },
        'Memory' => 1024 * 1024 * 1024, # 1GB
        'Mounts' => [{
          'Type' => 'volume',
          'Source' => 'postgres-data',
          'Target' => '/var/lib/postgresql/data'
        }]
      }
    )
  end

  def wait_for_database_ready
    30.times do
      begin
        result = @containers[:database].exec(['pg_isready', '-U', 'appuser'])
        return if result[2] == 0
      rescue Docker::Error::DockerError
        # Container not ready yet
      end
      sleep 2
    end
    raise 'Database failed to become ready'
  end
end
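The orchestrator above omits several helpers. A minimal sketch of create_application_container and wait_for_application_ready, assuming the application image exposes an HTTP health endpoint on port 3000 (the image name and endpoint are illustrative; the proxy helpers follow the same pattern):

class DockerOrchestrator
  private

  # Application container joins the same network under the alias "app"
  def create_application_container
    Docker::Container.create(
      'Image' => 'myapp:latest',           # illustrative image name
      'Env' => ['DATABASE_HOST=database'], # resolves via the network alias
      'NetworkingConfig' => {
        'EndpointsConfig' => {
          'application-tier' => { 'Aliases' => ['app'] }
        }
      },
      'HostConfig' => { 'RestartPolicy' => { 'Name' => 'unless-stopped' } }
    )
  end

  # Poll an in-container health command until it succeeds, mirroring
  # wait_for_database_ready above
  def wait_for_application_ready
    30.times do
      begin
        result = @containers[:app].exec(['curl', '-fsS', 'http://localhost:3000/health'])
        return if result[2] == 0
      rescue Docker::Error::DockerError
        # Container not ready yet
      end
      sleep 2
    end
    raise 'Application failed to become ready'
  end
end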
Image building with custom contexts and multi-stage builds enables optimized production images. The build process can include custom build arguments, labels, and target stage selection for complex Dockerfile scenarios.
require 'time' # for Time#iso8601

def build_production_image(app_path, version_tag)
  # Package the application directory as a gzipped tar build context
  build_context = create_build_context(app_path)

  image = Docker::Image.build_from_tar(
    build_context,
    'dockerfile' => 'Dockerfile.production',
    't' => "myapp:#{version_tag}",
    'target' => 'production',
    'buildargs' => {
      'RUBY_VERSION' => '3.0.2',
      'NODE_VERSION' => '16.14.0',
      'RAILS_ENV' => 'production',
      'BUNDLE_WITHOUT' => 'development:test'
    }.to_json,
    'labels' => {
      'version' => version_tag,
      'build.timestamp' => Time.now.iso8601,
      'git.commit' => `git rev-parse HEAD`.strip
    }.to_json
  ) do |chunk|
    # Stream build output for progress monitoring
    if chunk.match(/"stream":"(.+)"/)
      puts $1.gsub(/\\n/, "\n")
    end
  end

  # Tag for different environments
  image.tag('repo' => 'registry.company.com/myapp', 'tag' => version_tag)
  image.tag('repo' => 'registry.company.com/myapp', 'tag' => 'latest')
  image
end
def create_build_context(path)
  require 'tempfile'
  require 'zlib'
  require 'archive/tar/minitar' # provided by the minitar gem

  tarfile = Tempfile.new('docker-build-context')
  Zlib::GzipWriter.open(tarfile.path) do |gz|
    Archive::Tar::Minitar.pack(path, gz)
  end
  File.open(tarfile.path, 'rb')
end
Container resource monitoring and dynamic scaling require real-time metrics collection and automated responses to load changes. This involves monitoring CPU, memory, and network usage to make scaling decisions.
class ContainerScaler
  def initialize(base_image, target_containers: 3)
    @base_image = base_image
    @target_containers = target_containers
    @running_containers = []
    @load_balancer_config = []
  end

  def monitor_and_scale
    loop do
      current_load = calculate_average_load

      # calculate_average_load returns a percentage (0-100)
      if current_load > 80.0 && @running_containers.size < @target_containers * 2
        scale_up
      elsif current_load < 30.0 && @running_containers.size > @target_containers
        scale_down
      end

      update_load_balancer_config
      sleep 30 # Monitor every 30 seconds
    end
  end

  private

  def calculate_average_load
    return 0.0 if @running_containers.empty?

    total_cpu = @running_containers.sum do |container|
      stats = container.stats(stream: false)
      calculate_cpu_percentage(stats)
    end
    total_cpu / @running_containers.size
  end

  def scale_up
    new_container = Docker::Container.create(
      'Image' => @base_image,
      'Labels' => { 'scaling.group' => 'web-servers' },
      'HostConfig' => {
        'Memory' => 512 * 1024 * 1024,
        'CpuShares' => 1024,
        'RestartPolicy' => { 'Name' => 'on-failure' }
      }
    )
    new_container.start
    @running_containers << new_container
    puts "Scaled up: #{@running_containers.size} containers running"
  end

  def calculate_cpu_percentage(stats)
    cpu_delta = stats['cpu_stats']['cpu_usage']['total_usage'] -
                stats['precpu_stats']['cpu_usage']['total_usage']
    system_delta = stats['cpu_stats']['system_cpu_usage'] -
                   stats['precpu_stats']['system_cpu_usage']
    return 0.0 if system_delta <= 0
    (cpu_delta.to_f / system_delta) * 100.0
  end
end
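scale_down and update_load_balancer_config are not shown above. A minimal sketch, assuming the load balancer only needs the current container IP addresses:

class ContainerScaler
  private

  # Stop and remove the most recently added container
  def scale_down
    container = @running_containers.pop
    return unless container

    container.stop
    container.remove
    puts "Scaled down: #{@running_containers.size} containers running"
  end

  # Rebuild the backend list from each container's current network settings
  def update_load_balancer_config
    @load_balancer_config = @running_containers.map do |container|
      container.json['NetworkSettings']['IPAddress']
    end
  end
end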
Error Handling & Debugging
Docker operations generate various exception types that require specific handling strategies. Network timeouts, daemon connectivity issues, and resource constraints each produce different error patterns requiring distinct recovery approaches.
def robust_container_operation
  retry_count = 0
  max_retries = 3
  container_config = {
    'Image' => 'nginx:latest',
    'HostConfig' => { 'Memory' => 256 * 1024 * 1024 }
  }

  begin
    container = Docker::Container.create(container_config)
    container.start

    # Wait for container to be healthy
    wait_for_health_check(container)
  rescue Docker::Error::TimeoutError => e
    retry_count += 1
    if retry_count <= max_retries
      puts "Timeout error (attempt #{retry_count}): #{e.message}"
      sleep(2 ** retry_count) # Exponential backoff
      retry
    else
      raise "Container operation failed after #{max_retries} attempts: #{e.message}"
    end
  rescue Docker::Error::NotFoundError => e
    # Image doesn't exist locally, try pulling first
    puts "Image not found locally, attempting to pull: #{e.message}"
    Docker::Image.create('fromImage' => 'nginx:latest')
    retry
  rescue Docker::Error::ConflictError => e
    # Container name already exists
    puts "Container conflict: #{e.message}"
    existing_container = find_existing_container(e.message)
    existing_container&.remove(force: true)
    retry
  rescue Docker::Error::ServerError => e
    # Docker daemon errors
    if e.message.include?('insufficient memory')
      puts "Insufficient memory, reducing container limits"
      # Reduce memory allocation and retry with the updated config
      container_config['HostConfig']['Memory'] = 128 * 1024 * 1024
      retry
    else
      raise "Docker daemon error: #{e.message}"
    end
  rescue Excon::Error::Socket => e
    # Connection to Docker daemon failed
    puts "Cannot connect to Docker daemon: #{e.message}"
    check_docker_daemon_status
    raise "Docker daemon unavailable"
  end
end
def wait_for_health_check(container, timeout: 60)
  start_time = Time.now

  loop do
    container.refresh!
    status = container.info['State']['Status']
    health = container.info['State']['Health']

    return true if status == 'running' && (!health || health['Status'] == 'healthy')

    if Time.now - start_time > timeout
      logs = container.logs(stdout: true, stderr: true, tail: 50)
      raise "Container failed health check. Logs:\n#{logs}"
    end

    if status == 'exited'
      exit_code = container.info['State']['ExitCode']
      logs = container.logs(stdout: true, stderr: true)
      raise "Container exited with code #{exit_code}. Logs:\n#{logs}"
    end

    sleep 2
  end
end
Resource exhaustion debugging requires monitoring system resources and container limits to identify bottlenecks. Memory leaks, CPU spikes, and disk space issues manifest differently in containerized environments.
class ContainerDiagnostics
  class << self
    def diagnose_performance_issues(container)
      stats = container.stats(stream: false)
      info = container.info

      diagnose_memory_usage(stats, info)
      diagnose_cpu_usage(stats)
      diagnose_network_issues(stats)
      diagnose_disk_usage(container)
    rescue => e
      puts "Diagnostic error: #{e.message}"
      fallback_diagnostics(container)
    end

    private

    def diagnose_memory_usage(stats, info)
      memory_limit = info.dig('HostConfig', 'Memory') || 0
      memory_usage = stats.dig('memory_stats', 'usage') || 0

      if memory_limit > 0
        usage_percent = (memory_usage.to_f / memory_limit) * 100
        puts "Memory usage: #{usage_percent.round(2)}% (#{memory_usage / 1024 / 1024}MB / #{memory_limit / 1024 / 1024}MB)"

        if usage_percent > 90
          puts "WARNING: High memory usage detected"
          cache_usage = stats.dig('memory_stats', 'stats', 'cache') || 0
          puts "Cache usage: #{cache_usage / 1024 / 1024}MB"

          # failcnt is a scalar counter, not a nested hash
          if (stats.dig('memory_stats', 'failcnt') || 0) > 0
            puts "CRITICAL: Memory limit exceeded, container may be killed"
          end
        end
      end
    end

    def diagnose_cpu_usage(stats)
      cpu_usage = calculate_cpu_percentage(stats)
      cpu_throttling = stats.dig('cpu_stats', 'throttling_data', 'throttled_periods') || 0

      puts "CPU usage: #{cpu_usage.round(2)}%"

      if cpu_throttling > 0
        throttled_time = stats.dig('cpu_stats', 'throttling_data', 'throttled_time') || 0
        puts "WARNING: CPU throttling detected (#{cpu_throttling} periods, #{throttled_time}ns throttled)"
      end
    end

    def diagnose_disk_usage(container)
      # Check container filesystem usage
      exec_result = container.exec(['df', '-h', '/'])
      if exec_result[2] == 0
        df_output = exec_result[0].join("\n")
        puts "Disk usage:\n#{df_output}"

        # Parse usage percentage
        if df_output.match(/(\d+)%/)
          usage_percent = $1.to_i
          puts "WARNING: High disk usage (#{usage_percent}%)" if usage_percent > 85
        end
      end
    rescue Docker::Error::DockerError => e
      puts "Cannot check disk usage: #{e.message}"
    end
  end
end
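The diagnostics class references a few helpers that are not shown. A minimal sketch: calculate_cpu_percentage repeats the delta calculation from ContainerScaler, the network check flags dropped packets on the default interface, and the fallback reverts to plain inspection (thresholds are illustrative):

class ContainerDiagnostics
  class << self
    private

    # Same CPU delta calculation used in ContainerScaler above
    def calculate_cpu_percentage(stats)
      cpu_delta = stats.dig('cpu_stats', 'cpu_usage', 'total_usage').to_i -
                  stats.dig('precpu_stats', 'cpu_usage', 'total_usage').to_i
      system_delta = stats.dig('cpu_stats', 'system_cpu_usage').to_i -
                     stats.dig('precpu_stats', 'system_cpu_usage').to_i
      return 0.0 if system_delta <= 0
      (cpu_delta.to_f / system_delta) * 100.0
    end

    # Flag dropped or errored packets on the default interface
    def diagnose_network_issues(stats)
      rx_dropped = stats.dig('networks', 'eth0', 'rx_dropped') || 0
      tx_errors  = stats.dig('networks', 'eth0', 'tx_errors') || 0
      puts "WARNING: #{rx_dropped} dropped RX packets" if rx_dropped > 0
      puts "WARNING: #{tx_errors} TX errors" if tx_errors > 0
    end

    # Minimal fallback when the stats API itself fails
    def fallback_diagnostics(container)
      puts "Falling back to basic inspection"
      puts "Status: #{container.json['State']['Status']}"
    rescue Docker::Error::DockerError => e
      puts "Inspection also failed: #{e.message}"
    end
  end
end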
Log analysis and debugging techniques help identify application and infrastructure issues within containers. Structured log parsing and correlation across multiple containers provides comprehensive troubleshooting capabilities.
require 'time'

def analyze_container_logs(containers, since: Time.now - 3600)
  log_entries = []

  containers.each do |container|
    container_logs = container.logs(
      stdout: true,
      stderr: true,
      since: since.to_i,
      timestamps: true
    )

    container_logs.split("\n").each do |line|
      if parsed_entry = parse_log_entry(line, container.id)
        log_entries << parsed_entry
      end
    end
  end

  # Sort by timestamp for chronological analysis
  log_entries.sort_by! { |entry| entry[:timestamp] }

  # Analyze patterns
  analyze_error_patterns(log_entries)
  analyze_performance_patterns(log_entries)

  log_entries
end

def parse_log_entry(log_line, container_id)
  # Parse Docker log format: timestamp followed by log content
  match = log_line.match(/^(\d{4}-\d{2}-\d{2}T[\d:.]+Z)\s+(.*)$/)
  return nil unless match

  timestamp = Time.parse(match[1])
  content = match[2]

  # Extract log level if present
  level = case content
          when /ERROR|FATAL/i then 'ERROR'
          when /WARN/i then 'WARN'
          when /INFO/i then 'INFO'
          when /DEBUG/i then 'DEBUG'
          else 'UNKNOWN'
          end

  {
    timestamp: timestamp,
    container_id: container_id[0..11], # Short ID
    level: level,
    content: content
  }
end

def analyze_error_patterns(log_entries)
  error_entries = log_entries.select { |e| e[:level] == 'ERROR' }

  # Group by error type, extracting the error class from the log content
  error_groups = error_entries.group_by do |entry|
    entry[:content][/([A-Z][a-zA-Z]*Error|Exception)/, 1] || 'Unknown'
  end

  error_groups.each do |error_type, entries|
    puts "#{error_type}: #{entries.count} occurrences"

    if entries.count > 10
      puts "  WARNING: High frequency error pattern detected"
      puts "  First occurrence: #{entries.first[:timestamp]}"
      puts "  Latest occurrence: #{entries.last[:timestamp]}"
    end
  end
end
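analyze_performance_patterns is referenced above but not defined. A minimal sketch that flags bursts of WARN-level entries per container; the five-minute window and threshold are arbitrary choices:

def analyze_performance_patterns(log_entries)
  warn_entries = log_entries.select { |e| e[:level] == 'WARN' }

  # Bucket warnings into five-minute windows per container
  windows = warn_entries.group_by do |entry|
    [entry[:container_id], (entry[:timestamp].to_i / 300) * 300]
  end

  windows.each do |(container_id, window_start), entries|
    next if entries.count < 5
    puts "#{container_id}: #{entries.count} warnings in window starting #{Time.at(window_start)}"
  end
end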
Production Patterns
Production Docker deployments require robust health checking, graceful shutdown handling, and comprehensive monitoring to ensure reliability. Applications must handle container lifecycle events properly and maintain service availability during updates and scaling operations.
class ProductionContainerManager
  def initialize(image_name, replicas: 3)
    @image_name = image_name
    @replicas = replicas
    @containers = []
    @load_balancer = LoadBalancerConfig.new
    @monitoring = ContainerMonitoring.new
  end

  def deploy_service
    # Create containers with production configuration
    @replicas.times do |index|
      container = create_production_container(index)
      container.start

      # Wait for health check before adding to load balancer
      wait_for_readiness(container)

      @containers << container
      @load_balancer.add_backend(container)
    end

    # Setup monitoring and alerting
    @monitoring.start_monitoring(@containers)
    configure_log_forwarding

    puts "Service deployed with #{@containers.size} healthy containers"
  end

  def rolling_update(new_image)
    puts "Starting rolling update to #{new_image}"

    @containers.each_with_index do |old_container, index|
      # Create new container with updated image
      new_container = create_production_container(index, image: new_image)
      new_container.start

      # Wait for new container to be ready
      wait_for_readiness(new_container)

      # Add to load balancer before removing old container
      @load_balancer.add_backend(new_container)

      # Graceful shutdown of old container
      shutdown_gracefully(old_container)
      @load_balancer.remove_backend(old_container)

      # Clean up old container
      old_container.remove
      @containers[index] = new_container

      puts "Updated container #{index + 1}/#{@containers.size}"
    end

    puts "Rolling update completed successfully"
  end

  private

  def create_production_container(index, image: @image_name)
    Docker::Container.create(
      'Image' => image,
      'Env' => production_environment_variables(index),
      'Labels' => production_labels(index),
      'HostConfig' => {
        'Memory' => 1024 * 1024 * 1024, # 1GB limit
        'CpuShares' => 1024,
        'RestartPolicy' => { 'Name' => 'unless-stopped' },
        'LogConfig' => {
          'Type' => 'json-file',
          'Config' => {
            'max-size' => '100m',
            'max-file' => '3'
          }
        },
        'SecurityOpt' => ['no-new-privileges:true'],
        'ReadonlyRootfs' => true,
        'Tmpfs' => { '/tmp' => 'rw,noexec,nosuid,size=256m' }
      },
      'Healthcheck' => {
        'Test' => ['CMD-SHELL', 'curl -f http://localhost:3000/health || exit 1'],
        'Interval' => 30_000_000_000,   # 30 seconds in nanoseconds
        'Timeout' => 10_000_000_000,    # 10 seconds
        'Retries' => 3,
        'StartPeriod' => 60_000_000_000 # 60 seconds
      }
    )
  end

  def production_environment_variables(index)
    [
      'RAILS_ENV=production',
      'NODE_ENV=production',
      'PORT=3000',
      "INSTANCE_ID=#{index}",
      "HOSTNAME=app-#{index}",
      'DATABASE_URL=postgresql://user:pass@database:5432/prod',
      'REDIS_URL=redis://redis:6379/0',
      # Double quotes are required for interpolation
      "SECRET_KEY_BASE=#{ENV['SECRET_KEY_BASE']}",
      "HONEYBADGER_API_KEY=#{ENV['HONEYBADGER_API_KEY']}"
    ]
  end

  def shutdown_gracefully(container, timeout: 30)
    puts "Gracefully shutting down container #{container.id[0..11]}"

    # Send SIGTERM to allow graceful shutdown
    container.kill(signal: 'TERM')

    # Wait for graceful shutdown
    start_time = Time.now
    loop do
      container.refresh!
      break if container.info['State']['Status'] != 'running'

      if Time.now - start_time > timeout
        puts "Graceful shutdown timeout, forcing termination"
        container.kill(signal: 'KILL')
        break
      end

      sleep 1
    end
  end
end
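wait_for_readiness is referenced above but not shown. A minimal sketch that polls the health status reported by the Healthcheck configured at creation time; the timeout is illustrative:

class ProductionContainerManager
  private

  # Poll the health status reported by the container's Healthcheck
  def wait_for_readiness(container, timeout: 120)
    start_time = Time.now

    loop do
      container.refresh!
      health = container.info.dig('State', 'Health', 'Status')
      return true if health == 'healthy'

      raise "Container #{container.id[0..11]} failed to become healthy" if Time.now - start_time > timeout

      sleep 2
    end
  end
end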
Secret management and configuration handling in production require secure approaches to accessing sensitive data without exposing credentials in container images or environment variables.
require 'tempfile'

class SecretManager
  def initialize(vault_client)
    @vault_client = vault_client
    @secret_cache = {}
    @cache_ttl = 300 # 5 minutes
  end

  def create_container_with_secrets(image, secret_paths)
    # Mount secrets as temporary files instead of environment variables
    secret_mounts = prepare_secret_mounts(secret_paths)

    container = Docker::Container.create(
      'Image' => image,
      'HostConfig' => {
        'Mounts' => secret_mounts,
        # Harden rather than relax the default security profile
        'SecurityOpt' => ['no-new-privileges:true'],
        'CapDrop' => ['ALL'],
        'CapAdd' => ['CHOWN', 'SETUID', 'SETGID'],
        'ReadonlyRootfs' => true,
        'Tmpfs' => {
          '/secrets' => 'rw,noexec,nosuid,size=64m,mode=0700'
        }
      },
      'User' => 'appuser:appgroup'
    )

    # Rotate secrets periodically
    Thread.new { rotate_secrets_periodically(container, secret_paths) }

    container
  end

  private

  def prepare_secret_mounts(secret_paths)
    mounts = []

    secret_paths.each do |secret_path|
      secret_value = fetch_secret(secret_path)
      temp_file = write_secret_to_temp_file(secret_value)

      mounts << {
        'Type' => 'bind',
        'Source' => temp_file,
        'Target' => "/secrets/#{File.basename(secret_path)}",
        'ReadOnly' => true
      }
    end

    mounts
  end

  def fetch_secret(path)
    cache_key = "secret:#{path}"
    cached = @secret_cache[cache_key]

    if cached && (Time.now - cached[:timestamp]) < @cache_ttl
      return cached[:value]
    end

    secret_value = @vault_client.logical.read(path).data[:value]
    @secret_cache[cache_key] = {
      value: secret_value,
      timestamp: Time.now
    }

    secret_value
  end

  def write_secret_to_temp_file(secret_value)
    temp_file = Tempfile.new('docker-secret')
    temp_file.write(secret_value)
    temp_file.close

    # Set restrictive permissions
    File.chmod(0600, temp_file.path)
    temp_file.path
  end
end
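rotate_secrets_periodically is started in a background thread above. A minimal sketch, assuming the application re-reads its mounted secret files on SIGHUP (a common convention, not something Docker enforces); the interval is illustrative:

class SecretManager
  private

  # Refresh cached secrets on an interval and signal the application
  def rotate_secrets_periodically(container, secret_paths, interval: 3600)
    loop do
      sleep interval
      secret_paths.each do |path|
        @secret_cache.delete("secret:#{path}")
        fetch_secret(path)
      end
      container.kill(signal: 'HUP')
    end
  rescue Docker::Error::DockerError
    # Container is gone; stop rotating
  end
end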
Monitoring and alerting systems for production containers require comprehensive metrics collection, anomaly detection, and automated responses to service degradation.
class ProductionMonitoring
  def initialize(alert_manager, metrics_client)
    @alert_manager = alert_manager
    @metrics_client = metrics_client
    @monitoring_threads = []
    @alert_thresholds = {
      cpu_percent: 80.0,
      memory_percent: 85.0,
      error_rate: 5.0,
      response_time_p95: 2000.0 # milliseconds
    }
  end

  def start_monitoring(containers)
    @monitoring_threads << Thread.new { monitor_resource_usage(containers) }
    @monitoring_threads << Thread.new { monitor_application_health(containers) }
    @monitoring_threads << Thread.new { monitor_log_patterns(containers) }

    # Setup graceful shutdown
    trap('TERM') { shutdown_monitoring }
    trap('INT') { shutdown_monitoring }
  end

  private

  def monitor_resource_usage(containers)
    loop do
      containers.each do |container|
        next unless container_running?(container)

        stats = container.stats(stream: false)
        metrics = calculate_container_metrics(stats, container)

        # Send metrics to monitoring system
        @metrics_client.gauge('container.cpu.percent', metrics[:cpu_percent],
                              tags: ["container:#{container.id[0..11]}"])
        @metrics_client.gauge('container.memory.percent', metrics[:memory_percent],
                              tags: ["container:#{container.id[0..11]}"])

        # Check alert thresholds
        check_resource_alerts(container, metrics)
      end

      sleep 15 # Monitor every 15 seconds
    end
  rescue => e
    puts "Resource monitoring error: #{e.message}"
    @alert_manager.send_alert("Monitoring system error: #{e.message}", severity: 'critical')
  end

  def monitor_application_health(containers)
    loop do
      containers.each do |container|
        next unless container_running?(container)

        health_status = check_application_health(container)

        @metrics_client.gauge('container.health.status',
                              health_status[:healthy] ? 1 : 0,
                              tags: ["container:#{container.id[0..11]}"])

        unless health_status[:healthy]
          @alert_manager.send_alert(
            "Container #{container.id[0..11]} health check failed: #{health_status[:error]}",
            severity: 'warning'
          )
        end
      end

      sleep 30 # Health check every 30 seconds
    end
  end

  def check_application_health(container)
    result = container.exec(['curl', '-f', '-s', 'http://localhost:3000/health'])

    if result[2] == 0 # Exit code 0 means success
      response = result[0].join
      health_data = JSON.parse(response) rescue {}

      {
        healthy: health_data['status'] == 'ok',
        response_time: health_data['response_time_ms'],
        database_connected: health_data['database'] == 'connected',
        error: nil
      }
    else
      {
        healthy: false,
        error: result[1].join.strip
      }
    end
  rescue => e
    {
      healthy: false,
      error: e.message
    }
  end

  def check_resource_alerts(container, metrics)
    container_id = container.id[0..11]

    if metrics[:cpu_percent] > @alert_thresholds[:cpu_percent]
      @alert_manager.send_alert(
        "High CPU usage on container #{container_id}: #{metrics[:cpu_percent].round(2)}%",
        severity: 'warning'
      )
    end

    if metrics[:memory_percent] > @alert_thresholds[:memory_percent]
      @alert_manager.send_alert(
        "High memory usage on container #{container_id}: #{metrics[:memory_percent].round(2)}%",
        severity: 'critical'
      )
    end
  end
end
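Two helpers referenced above are omitted. A minimal sketch, reusing the stats fields from earlier examples:

class ProductionMonitoring
  private

  def container_running?(container)
    container.refresh!
    container.info['State']['Status'] == 'running'
  rescue Docker::Error::NotFoundError
    false
  end

  def calculate_container_metrics(stats, container)
    cpu_delta = stats.dig('cpu_stats', 'cpu_usage', 'total_usage').to_i -
                stats.dig('precpu_stats', 'cpu_usage', 'total_usage').to_i
    system_delta = stats.dig('cpu_stats', 'system_cpu_usage').to_i -
                   stats.dig('precpu_stats', 'system_cpu_usage').to_i

    memory_usage = stats.dig('memory_stats', 'usage').to_i
    memory_limit = stats.dig('memory_stats', 'limit').to_i

    {
      cpu_percent: system_delta > 0 ? (cpu_delta.to_f / system_delta) * 100.0 : 0.0,
      memory_percent: memory_limit > 0 ? (memory_usage.to_f / memory_limit) * 100.0 : 0.0
    }
  end
end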
Performance & Memory
Container resource optimization requires understanding Docker's resource allocation mechanisms, memory management patterns, and performance bottlenecks specific to containerized Ruby applications. Memory limits, CPU quotas, and I/O constraints affect application behavior differently than bare metal deployments.
class ContainerResourceOptimizer
  class << self
    def optimize_container_config(base_config, workload_type)
      # Deep-copy so the caller's nested HostConfig hash is not mutated
      optimized_config = Marshal.load(Marshal.dump(base_config))
      optimized_config['HostConfig'] ||= {}

      case workload_type
      when :web_server
        optimized_config['HostConfig'].merge!(
          'Memory' => 512 * 1024 * 1024,     # 512MB
          'MemorySwap' => 512 * 1024 * 1024, # Disable swap
          'CpuShares' => 1024,
          'CpuQuota' => 100000, # 1 CPU core
          'BlkioWeight' => 500,
          'Ulimits' => [
            { 'Name' => 'nofile', 'Soft' => 65536, 'Hard' => 65536 },
            { 'Name' => 'nproc', 'Soft' => 32768, 'Hard' => 32768 }
          ]
        )
      when :background_worker
        optimized_config['HostConfig'].merge!(
          'Memory' => 1024 * 1024 * 1024, # 1GB for processing
          'CpuShares' => 2048,  # Higher CPU priority
          'CpuQuota' => 200000, # 2 CPU cores
          'BlkioWeight' => 300  # Lower I/O priority
        )
      when :database
        optimized_config['HostConfig'].merge!(
          'Memory' => 2048 * 1024 * 1024, # 2GB
          'MemorySwappiness' => 1, # Minimize swapping
          'CpuShares' => 2048,
          'BlkioWeight' => 1000, # High I/O priority
          'ShmSize' => 256 * 1024 * 1024 # 256MB shared memory
        )
      end

      # Add resource monitoring labels
      optimized_config['Labels'] ||= {}
      optimized_config['Labels'].merge!(
        'resource.profile' => workload_type.to_s,
        'resource.memory_limit' => ((optimized_config['HostConfig']['Memory'] || 0) / 1024 / 1024).to_s,
        'resource.cpu_shares' => optimized_config['HostConfig']['CpuShares'].to_s
      )

      optimized_config
    end

    def benchmark_container_performance(container, duration: 60)
      puts "Starting #{duration}s performance benchmark..."

      metrics = {
        cpu_samples: [],
        memory_samples: [],
        network_rx: [],
        network_tx: [],
        disk_read: [],
        disk_write: []
      }

      start_time = Time.now
      sample_count = 0

      while (Time.now - start_time) < duration
        stats = container.stats(stream: false)

        metrics[:cpu_samples] << calculate_cpu_percentage(stats)
        metrics[:memory_samples] << calculate_memory_usage(stats)
        # Parenthesize so nil digs fall back to 0 before being pushed
        metrics[:network_rx] << (stats.dig('networks', 'eth0', 'rx_bytes') || 0)
        metrics[:network_tx] << (stats.dig('networks', 'eth0', 'tx_bytes') || 0)
        metrics[:disk_read] << (stats.dig('blkio_stats', 'io_service_bytes_recursive', 0, 'value') || 0)
        metrics[:disk_write] << (stats.dig('blkio_stats', 'io_service_bytes_recursive', 1, 'value') || 0)

        sample_count += 1
        sleep 2
      end

      analyze_performance_metrics(metrics, duration, sample_count)
    end

    private

    def calculate_cpu_percentage(stats)
      cpu_delta = stats.dig('cpu_stats', 'cpu_usage', 'total_usage').to_i -
                  stats.dig('precpu_stats', 'cpu_usage', 'total_usage').to_i
      system_delta = stats.dig('cpu_stats', 'system_cpu_usage').to_i -
                     stats.dig('precpu_stats', 'system_cpu_usage').to_i

      return 0.0 if system_delta <= 0 || cpu_delta < 0

      num_cpus = stats.dig('cpu_stats', 'online_cpus') || 1
      (cpu_delta.to_f / system_delta) * num_cpus * 100.0
    end

    def analyze_performance_metrics(metrics, duration, sample_count)
      # CPU analysis
      cpu_avg = metrics[:cpu_samples].sum / metrics[:cpu_samples].size
      cpu_max = metrics[:cpu_samples].max
      cpu_p95 = calculate_percentile(metrics[:cpu_samples], 95)

      # Memory analysis
      memory_avg = metrics[:memory_samples].sum / metrics[:memory_samples].size
      memory_max = metrics[:memory_samples].max
      memory_trend = calculate_trend(metrics[:memory_samples])

      # Network throughput
      network_rx_total = metrics[:network_rx].last - metrics[:network_rx].first
      network_tx_total = metrics[:network_tx].last - metrics[:network_tx].first
      network_rx_mbps = (network_rx_total * 8.0 / 1024 / 1024) / duration
      network_tx_mbps = (network_tx_total * 8.0 / 1024 / 1024) / duration

      puts "\n=== Performance Benchmark Results ==="
      puts "Duration: #{duration}s (#{sample_count} samples)"
      puts "\nCPU Usage:"
      puts "  Average: #{cpu_avg.round(2)}%"
      puts "  Maximum: #{cpu_max.round(2)}%"
      puts "  95th percentile: #{cpu_p95.round(2)}%"
      puts "\nMemory Usage:"
      puts "  Average: #{(memory_avg / 1024 / 1024).round(2)}MB"
      puts "  Maximum: #{(memory_max / 1024 / 1024).round(2)}MB"
      puts "  Trend: #{memory_trend > 0 ? 'increasing' : 'stable/decreasing'}"
      puts "\nNetwork Throughput:"
      puts "  RX: #{network_rx_mbps.round(2)} Mbps"
      puts "  TX: #{network_tx_mbps.round(2)} Mbps"

      # Return the computed figures rather than an empty hash
      {
        cpu: { average: cpu_avg, max: cpu_max, p95: cpu_p95 },
        memory: { average: memory_avg, max: memory_max, trend: memory_trend },
        network: { rx_mbps: network_rx_mbps, tx_mbps: network_tx_mbps }
      }
    end
  end
end
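The benchmark relies on three small statistics helpers that are not shown. A minimal sketch; the trend metric is a simple first-half/second-half comparison:

class ContainerResourceOptimizer
  class << self
    private

    def calculate_memory_usage(stats)
      stats.dig('memory_stats', 'usage') || 0
    end

    # Nearest-rank percentile over the collected samples
    def calculate_percentile(samples, percentile)
      return 0.0 if samples.empty?
      sorted = samples.sort
      index = ((percentile / 100.0) * (sorted.size - 1)).round
      sorted[index]
    end

    # Positive when the second half of the samples averages higher than the first
    def calculate_trend(samples)
      return 0.0 if samples.size < 2
      half = samples.size / 2
      first_avg = samples.first(half).sum.to_f / half
      second_avg = samples.last(samples.size - half).sum.to_f / (samples.size - half)
      second_avg - first_avg
    end
  end
end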
Memory leak detection and analysis in containerized applications requires monitoring allocation patterns, garbage collection behavior, and container-specific memory constraints that differ from traditional Ruby deployments.
class MemoryLeakDetector
  def initialize(container)
    @container = container
    @baseline_memory = nil
    @memory_samples = []
    @gc_samples = []
    @object_samples = []
  end

  def start_monitoring(sample_interval: 30)
    Thread.new do
      loop do
        collect_memory_sample
        collect_gc_sample if ruby_application?
        analyze_trends if @memory_samples.size >= 10
        sleep sample_interval
      end
    rescue => e
      puts "Memory monitoring error: #{e.message}"
    end
  end

  def generate_leak_report
    return unless @memory_samples.size >= 5

    # Calculate memory growth rate
    memory_growth = calculate_memory_growth_rate
    gc_efficiency = calculate_gc_efficiency

    # Derive the sample period from actual timestamps rather than a hardcoded interval
    sample_minutes = ((@memory_samples.last[:timestamp] - @memory_samples.first[:timestamp]) / 60).round

    puts "\n=== Memory Leak Analysis ==="
    puts "Sample period: #{@memory_samples.size} samples over #{sample_minutes} minutes"
    puts "Memory growth rate: #{memory_growth.round(2)} MB/hour"
    puts "GC efficiency: #{gc_efficiency.round(2)}%" if gc_efficiency

    if memory_growth > 10 # More than 10MB/hour growth
      puts "\n⚠️ POTENTIAL MEMORY LEAK DETECTED"
      identify_leak_sources
      provide_remediation_steps
    else
      puts "\n✅ No significant memory leak detected"
    end
  end

  private

  def collect_memory_sample
    stats = @container.stats(stream: false)
    memory_usage = stats.dig('memory_stats', 'usage') || 0
    timestamp = Time.now

    @baseline_memory ||= memory_usage

    sample = {
      timestamp: timestamp,
      memory_usage: memory_usage,
      memory_limit: stats.dig('memory_stats', 'limit') || 0,
      cache_usage: stats.dig('memory_stats', 'stats', 'cache') || 0,
      rss: stats.dig('memory_stats', 'stats', 'rss') || 0
    }

    @memory_samples << sample
    @memory_samples.shift if @memory_samples.size > 100 # Keep last 100 samples
  end

  def collect_gc_sample
    # Execute Ruby code in container to get GC stats (to_json needs the json stdlib)
    result = @container.exec(['ruby', '-e', 'require "json"; puts GC.stat.to_json'])
    return unless result[2] == 0 # Success exit code

    gc_stats = JSON.parse(result[0].join)

    @gc_samples << {
      timestamp: Time.now,
      heap_allocated_pages: gc_stats['heap_allocated_pages'],
      heap_sorted_length: gc_stats['heap_sorted_length'],
      old_objects: gc_stats['old_objects'],
      total_allocated_objects: gc_stats['total_allocated_objects'],
      major_gc_count: gc_stats['major_gc_count'],
      minor_gc_count: gc_stats['minor_gc_count']
    }

    @gc_samples.shift if @gc_samples.size > 100
  rescue => e
    # Silently continue if we can't get GC stats
  end

  def ruby_application?
    # Check if container is running a Ruby application
    result = @container.exec(['pgrep', 'ruby'])
    result[2] == 0
  rescue
    false
  end

  def calculate_memory_growth_rate
    return 0 if @memory_samples.size < 2

    first_sample = @memory_samples.first
    last_sample = @memory_samples.last

    time_diff_hours = (last_sample[:timestamp] - first_sample[:timestamp]) / 3600.0
    memory_diff_mb = (last_sample[:memory_usage] - first_sample[:memory_usage]) / 1024.0 / 1024.0

    return 0 if time_diff_hours <= 0
    memory_diff_mb / time_diff_hours
  end

  def calculate_gc_efficiency
    return nil if @gc_samples.size < 2

    recent_samples = @gc_samples.last(10)
    gc_frequency = calculate_gc_frequency(recent_samples)
    heap_growth = calculate_heap_growth(recent_samples)

    # Simple efficiency metric: lower heap growth with reasonable GC frequency
    return nil if gc_frequency == 0

    efficiency = [0, 100 - (heap_growth * gc_frequency / 10.0)].max
    [efficiency, 100].min
  end

  def identify_leak_sources
    puts "\nPotential leak sources:"

    # Check for excessive heap growth
    if @gc_samples.size >= 5
      heap_pages_growth = @gc_samples.last[:heap_allocated_pages] - @gc_samples[-5][:heap_allocated_pages]
      if heap_pages_growth > 100
        puts "• Excessive heap page allocation (#{heap_pages_growth} pages)"
      end

      old_objects_growth = @gc_samples.last[:old_objects] - @gc_samples[-5][:old_objects]
      if old_objects_growth > 10000
        puts "• High old object generation (#{old_objects_growth} objects)"
      end
    end

    # Check memory usage patterns
    cache_ratio = @memory_samples.last[:cache_usage].to_f / @memory_samples.last[:memory_usage]
    if cache_ratio < 0.1
      puts "• Low cache usage ratio (#{(cache_ratio * 100).round(2)}%)"
    end

    rss_ratio = @memory_samples.last[:rss].to_f / @memory_samples.last[:memory_usage]
    if rss_ratio > 0.8
      puts "• High RSS usage ratio (#{(rss_ratio * 100).round(2)}%)"
    end
  end
end
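The detector references a few more helpers. A minimal sketch of the GC-window calculations plus simple stand-ins for the reporting methods (the suggested remediation steps are illustrative):

class MemoryLeakDetector
  private

  # GC runs (major + minor) per sample across the window
  def calculate_gc_frequency(samples)
    first, last = samples.first, samples.last
    total_gcs = (last[:major_gc_count] - first[:major_gc_count]) +
                (last[:minor_gc_count] - first[:minor_gc_count])
    total_gcs.to_f / samples.size
  end

  # Net heap pages added across the window
  def calculate_heap_growth(samples)
    samples.last[:heap_allocated_pages] - samples.first[:heap_allocated_pages]
  end

  def analyze_trends
    growth = calculate_memory_growth_rate
    puts "Memory growth trending at #{growth.round(2)} MB/hour" if growth > 10
  end

  def provide_remediation_steps
    puts "\nSuggested next steps:"
    puts "• Capture a heap dump (ObjectSpace.dump_all) inside the container"
    puts "• Compare GC.stat snapshots before and after suspect workloads"
    puts "• Lower the container memory limit in staging to surface the leak sooner"
  end
end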
Performance profiling and bottleneck identification require container-aware tooling that accounts for resource limitations, shared kernel overhead, and the behavior patterns of containerized applications.
class ContainerProfiler
  def self.profile_application(container, duration: 120)
    puts "Starting comprehensive application profiling..."

    profiler = new(container)
    profile_data = profiler.collect_profile_data(duration)

    profiler.analyze_bottlenecks(profile_data)
    profiler.generate_optimization_recommendations(profile_data)
  end

  def initialize(container)
    @container = container
    @profile_data = {
      cpu_samples: [],
      memory_samples: [],
      io_samples: [],
      network_samples: [],
      application_metrics: []
    }
  end

  def collect_profile_data(duration)
    threads = []

    # System resource sampling
    threads << Thread.new { sample_system_resources(duration) }

    # Application-level profiling if possible
    threads << Thread.new { sample_application_metrics(duration) }

    # I/O and network profiling
    threads << Thread.new { sample_io_patterns(duration) }

    # Wait for all profiling threads
    threads.each(&:join)

    @profile_data
  end

  # Public so the class-level entry point can call them on the instance
  def analyze_bottlenecks(profile_data)
    puts "\n=== Performance Bottleneck Analysis ==="

    analyze_cpu_bottlenecks(profile_data[:cpu_samples])
    analyze_memory_bottlenecks(profile_data[:memory_samples])
    analyze_io_bottlenecks(profile_data[:io_samples])
    analyze_network_bottlenecks(profile_data[:network_samples])
  end

  def generate_optimization_recommendations(profile_data)
    puts "\n=== Optimization Recommendations ==="

    cpu_avg = profile_data[:cpu_samples].map { |s| s[:cpu_percent] }.sum /
              profile_data[:cpu_samples].size rescue 0
    memory_max = profile_data[:memory_samples].map { |s| s[:usage] }.max rescue 0

    if cpu_avg > 80
      puts "• Consider horizontal scaling - CPU utilization is high"
      puts "• Profile application code for CPU-intensive operations"
      puts "• Consider using more efficient algorithms or data structures"
    end

    if memory_max > 800 * 1024 * 1024 # 800MB
      puts "• Monitor for memory leaks - high memory usage detected"
      puts "• Consider implementing object pooling for frequently created objects"
      puts "• Review caching strategies to reduce memory pressure"
    end

    puts "• Enable production profiling tools like Stackprof or Ruby Prof"
    puts "• Consider APM tools like New Relic or Datadog for ongoing monitoring"
    puts "• Implement circuit breakers for external service calls"
  end

  private

  def sample_system_resources(duration)
    start_time = Time.now

    while (Time.now - start_time) < duration
      stats = @container.stats(stream: false)

      @profile_data[:cpu_samples] << {
        timestamp: Time.now,
        cpu_percent: calculate_cpu_percentage(stats),
        cpu_throttling: stats.dig('cpu_stats', 'throttling_data', 'throttled_periods') || 0,
        system_cpu_usage: stats.dig('cpu_stats', 'system_cpu_usage') || 0
      }

      @profile_data[:memory_samples] << {
        timestamp: Time.now,
        usage: stats.dig('memory_stats', 'usage') || 0,
        cache: stats.dig('memory_stats', 'stats', 'cache') || 0,
        rss: stats.dig('memory_stats', 'stats', 'rss') || 0,
        swap: stats.dig('memory_stats', 'stats', 'swap') || 0
      }

      @profile_data[:network_samples] << {
        timestamp: Time.now,
        rx_bytes: stats.dig('networks', 'eth0', 'rx_bytes') || 0,
        tx_bytes: stats.dig('networks', 'eth0', 'tx_bytes') || 0,
        rx_packets: stats.dig('networks', 'eth0', 'rx_packets') || 0,
        tx_packets: stats.dig('networks', 'eth0', 'tx_packets') || 0
      }

      sleep 5
    end
  end

  def sample_application_metrics(duration)
    return unless application_has_metrics_endpoint?

    start_time = Time.now

    while (Time.now - start_time) < duration
      begin
        result = @container.exec(['curl', '-s', 'http://localhost:3000/metrics'])

        if result[2] == 0
          metrics = parse_prometheus_metrics(result[0].join)
          @profile_data[:application_metrics] << {
            timestamp: Time.now,
            metrics: metrics
          }
        end
      rescue => e
        # Continue if metrics unavailable
      end

      sleep 10
    end
  end

  def analyze_cpu_bottlenecks(cpu_samples)
    return if cpu_samples.empty?

    cpu_values = cpu_samples.map { |s| s[:cpu_percent] }
    avg_cpu = cpu_values.sum / cpu_values.size
    max_cpu = cpu_values.max
    throttling_events = cpu_samples.sum { |s| s[:cpu_throttling] }

    puts "\nCPU Analysis:"
    puts "  Average utilization: #{avg_cpu.round(2)}%"
    puts "  Peak utilization: #{max_cpu.round(2)}%"
    puts "  Throttling events: #{throttling_events}"

    if avg_cpu > 70
      puts "  ⚠️ High average CPU utilization indicates CPU bottleneck"
    end

    if throttling_events > 0
      puts "  🚨 CPU throttling detected - consider increasing CPU limits"
    end

    if max_cpu > 95
      puts "  🚨 CPU saturation detected - optimize CPU-intensive operations"
    end
  end
end
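Several helpers above are left undefined. A minimal sketch of the sampling and parsing pieces, assuming a Prometheus-style text endpoint on port 3000; the remaining analyze_* methods can follow the pattern of analyze_cpu_bottlenecks and are stubbed here:

class ContainerProfiler
  private

  # Same CPU delta calculation used in earlier examples
  def calculate_cpu_percentage(stats)
    cpu_delta = stats.dig('cpu_stats', 'cpu_usage', 'total_usage').to_i -
                stats.dig('precpu_stats', 'cpu_usage', 'total_usage').to_i
    system_delta = stats.dig('cpu_stats', 'system_cpu_usage').to_i -
                   stats.dig('precpu_stats', 'system_cpu_usage').to_i
    return 0.0 if system_delta <= 0
    (cpu_delta.to_f / system_delta) * 100.0
  end

  # Probe the endpoint once before committing to a sampling loop
  def application_has_metrics_endpoint?
    result = @container.exec(['curl', '-s', '-o', '/dev/null', '-w', '%{http_code}',
                              'http://localhost:3000/metrics'])
    result[2] == 0 && result[0].join.strip == '200'
  rescue Docker::Error::DockerError
    false
  end

  # Parse "metric_name value" lines, skipping comments
  def parse_prometheus_metrics(body)
    body.each_line.with_object({}) do |line, metrics|
      next if line.start_with?('#')
      name, value = line.split(/\s+/).first(2)
      metrics[name] = value.to_f if name && value
    end
  end

  # Placeholder: blkio sampling could be added here in the same style
  def sample_io_patterns(duration); end

  def analyze_memory_bottlenecks(samples); end
  def analyze_io_bottlenecks(samples); end
  def analyze_network_bottlenecks(samples); end
end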
Reference
Core Classes and Methods
Class | Primary Purpose | Key Methods
---|---|---
Docker | Main entry point and configuration | .url=, .logger=, .authenticate!
Docker::Container | Container lifecycle management | .create, #start, #stop, #remove, #logs
Docker::Image | Image operations and building | .create, .build_from_dir, #tag, #remove
Docker::Network | Network management | .create, #connect, #disconnect, #remove
Docker::Volume | Volume operations | .create, #remove, #info
Container Methods
Method | Parameters | Returns | Description
---|---|---|---
.create(opts) | Hash of container config | Docker::Container | Creates new container with specified configuration
#start | None | self | Starts stopped container
#stop(timeout: 10) | timeout (Integer) | self | Gracefully stops running container
#kill(signal: 'KILL') | signal (String) | self | Sends signal to container process
#restart(timeout: 10) | timeout (Integer) | self | Restarts container with optional timeout
#remove(force: false) | force (Boolean) | true | Removes container, optionally forcing removal
#logs(opts = {}) | stdout, stderr, timestamps, follow | String/Stream | Retrieves container logs with filtering options
#exec(cmd, opts = {}) | cmd (Array), detach, tty | Array | Executes command inside running container
#stats(stream: true) | stream (Boolean) | Hash/Stream | Returns resource usage statistics
#info | None | Hash | Returns complete container inspection data
#wait(timeout = -1) | timeout (Integer) | Hash | Waits for container to stop, returns exit code
Image Methods
Method | Parameters | Returns | Description
---|---|---|---
.create(opts) | fromImage, repo, tag, registry | Docker::Image | Pulls image from registry or creates from repository
.build(dockerfile) | dockerfile (String) | Docker::Image | Builds image from Dockerfile content
.build_from_dir(path, opts) | path (String), build options | Docker::Image | Builds image from directory context
.build_from_tar(tar, opts) | tar (IO), build options | Docker::Image | Builds image from tar stream
#tag(opts) | repo, tag, force | true | Tags image with repository and tag
#push(creds, repo_tag) | credentials, repo_tag | String | Pushes image to registry
#remove(force: false) | force (Boolean) | true | Removes image from local storage
#history | None | Array | Returns image layer history
#info | None | Hash | Returns detailed image inspection data
Network Methods
Method | Parameters | Returns | Description
---|---|---|---
.create(name, opts) | name (String); Driver, IPAM, Options | Docker::Network | Creates custom network with specified driver
#connect(container, opts) | container, EndpointConfig | true | Connects container to network
#disconnect(container, force) | container, force | true | Disconnects container from network
#remove | None | true | Removes network
#info | None | Hash | Returns network inspection details
Volume Methods
Method | Parameters | Returns | Description
---|---|---|---
.create(name, opts) | name (String); Driver, DriverOpts, Labels | Docker::Volume | Creates named volume with optional driver
#remove(force: false) | force (Boolean) | true | Removes volume and all data
#info | None | Hash | Returns volume details including mount point
Configuration Options
Container Creation Options
Option | Type | Purpose | Example
---|---|---|---
'Image' | String | Base image name | 'nginx:alpine'
'Cmd' | Array | Default command | ['/bin/sh', '-c', 'echo hello']
'Env' | Array | Environment variables | ['NODE_ENV=production', 'PORT=3000']
'ExposedPorts' | Hash | Container ports to expose | {'80/tcp' => {}, '443/tcp' => {}}
'WorkingDir' | String | Working directory | '/app'
'User' | String | User and group | 'nobody:nogroup'
'Labels' | Hash | Metadata labels | {'version' => '1.0', 'env' => 'prod'}
HostConfig Options
Option | Type | Purpose | Example
---|---|---|---
'Memory' | Integer | Memory limit in bytes | 512 * 1024 * 1024
'CpuShares' | Integer | Relative CPU weight | 1024
'CpuQuota' | Integer | CPU quota in microseconds | 100000
'PortBindings' | Hash | Host port mappings | {'80/tcp' => [{'HostPort' => '8080'}]}
'Mounts' | Array | Volume and bind mounts | [{'Type' => 'bind', 'Source' => '/host', 'Target' => '/container'}]
'RestartPolicy' | Hash | Restart behavior | {'Name' => 'unless-stopped', 'MaximumRetryCount' => 3}
'LogConfig' | Hash | Logging configuration | {'Type' => 'json-file', 'Config' => {'max-size' => '10m'}}
Error Classes
Exception | Inheritance | When Raised
---|---|---
Docker::Error::DockerError | StandardError | Base class for all Docker errors
Docker::Error::ClientError | DockerError | Client-side errors (4xx HTTP responses)
Docker::Error::ServerError | DockerError | Server-side errors (5xx HTTP responses)
Docker::Error::TimeoutError | DockerError | Request timeouts and connection timeouts
Docker::Error::NotFoundError | ClientError | Resource not found (404 responses)
Docker::Error::ConflictError | ClientError | Resource conflicts (409 responses)
Docker::Error::UnauthorizedError | ClientError | Authentication failures (401 responses)
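Because these classes form a hierarchy, rescuing ClientError also catches NotFoundError, ConflictError, and UnauthorizedError. A minimal sketch (the registry address and credentials are illustrative; real code should cap the retries):

begin
  Docker::Image.create('fromImage' => 'private.registry.example.com/app:latest')
rescue Docker::Error::UnauthorizedError
  # 401: authenticate and try again
  Docker.authenticate!('username' => 'deploy', 'password' => ENV['REGISTRY_PASSWORD'],
                       'serveraddress' => 'private.registry.example.com')
  retry
rescue Docker::Error::ClientError => e
  # Any other 4xx, including NotFoundError and ConflictError
  warn "Client error: #{e.message}"
rescue Docker::Error::DockerError => e
  # Catch-all for timeouts and daemon-side failures
  warn "Docker error: #{e.message}"
end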
Connection Configuration
Setting | Default | Purpose
---|---|---
Docker.url | 'unix:///var/run/docker.sock' | Docker daemon connection URL
Docker.connection | Auto-configured | HTTP connection object
Docker.logger | nil | Logger instance for debugging
Docker.read_timeout | 60 | Read timeout in seconds
Docker.write_timeout | 60 | Write timeout in seconds
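A short sketch tying these settings together; Docker.version and Docker.info double as connectivity checks:

require 'docker'
require 'logger'

# Log every API request and response for debugging
Docker.logger = Logger.new($stdout)
Docker.logger.level = Logger::DEBUG

# Verify connectivity and inspect the daemon
puts Docker.version['Version']
puts Docker.info['Containers']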
Registry Authentication
# Configure registry credentials
Docker.authenticate!(
  'username' => 'myuser',
  'password' => 'mypassword',
  'email' => 'user@example.com',
  'serveraddress' => 'registry.example.com'
)
Build Context Options
Option | Type | Purpose
---|---|---
'dockerfile' | String | Dockerfile name
't' | String | Image tag
'buildargs' | String (JSON) | Build arguments
'target' | String | Multi-stage build target
'labels' | String (JSON) | Image labels
'nocache' | Boolean | Disable build cache
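A minimal sketch combining these options with build_from_dir; the path and tags are illustrative:

image = Docker::Image.build_from_dir(
  '/srv/app',
  'dockerfile' => 'Dockerfile',
  't' => 'myapp:ci',
  'target' => 'production',
  'buildargs' => { 'RAILS_ENV' => 'production' }.to_json,
  'nocache' => true
)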