Overview
Infrastructure as Code (IaC) treats infrastructure provisioning and management as a software development process. Rather than manually configuring servers, networks, and other infrastructure components through graphical interfaces or command-line operations, IaC defines infrastructure in code files that can be version controlled, tested, and automatically deployed.
The approach emerged from the need to manage increasingly complex infrastructure at scale. Traditional manual configuration creates several problems: configurations become inconsistent across environments, changes lack documentation, recovery from failures requires manual reconstruction, and scaling requires repetitive manual work. IaC addresses these issues by applying software engineering practices to infrastructure management.
IaC operates on a declarative or imperative model. Declarative IaC describes the desired end state—what infrastructure should exist—and the IaC tool determines how to achieve that state. Imperative IaC specifies explicit commands to execute in sequence to create the infrastructure. Both models serve different use cases and often work together in production environments.
# Declarative approach - describes desired state
server "web-01" do
  instance_type "t3.medium"
  ami "ami-0c55b159cbfafe1f0"
  security_groups ["web-sg"]
  tags(
    Name: "Web Server",
    Environment: "production"
  )
end
# Imperative approach - specifies actions
def provision_server
  instance = create_instance("t3.medium", "ami-0c55b159cbfafe1f0")
  attach_security_group(instance, "web-sg")
  add_tags(instance, Name: "Web Server", Environment: "production")
  wait_until_running(instance)
end
The infrastructure code lives in version control systems alongside application code. Teams review infrastructure changes through pull requests, apply automated testing to configuration changes, and maintain a complete history of infrastructure evolution. When disaster strikes, recovery involves executing the infrastructure code rather than following documented procedures that may be outdated.
Key Principles
Idempotence forms the foundation of reliable IaC. Running the same infrastructure code multiple times produces the same result without creating duplicate resources or causing errors. The infrastructure converges to the desired state regardless of its current state. This property allows safe re-execution after failures and makes deployments predictable.
# Idempotent resource definition
resource :database do
  action :create
  database_name "production_db"
  username "app_user"
  allocated_storage 100
  # Running this multiple times:
  # - First run: creates database
  # - Subsequent runs: no changes if database matches specification
  # - After drift: updates database to match specification
end
Version Control Integration treats infrastructure definitions as source code. All infrastructure changes flow through the same review and approval process as application code. Teams track who made changes, when they occurred, and why through commit messages. Rolling back problematic changes requires reverting commits and re-applying infrastructure code.
Immutability discourages modifying running infrastructure. Instead of updating existing servers, immutable infrastructure replaces old resources with new ones. This approach eliminates configuration drift—the gradual divergence of actual infrastructure from documented specifications. Each deployment creates fresh infrastructure from the code definition.
# Immutable server replacement pattern
class ServerDeployment
  def deploy(new_version)
    # Create new servers with updated configuration
    new_servers = create_servers(
      count: 3,
      image: "app-#{new_version}",
      config: load_config(new_version)
    )
    # Wait for health checks
    wait_for_healthy(new_servers)
    # Shift traffic to new servers
    update_load_balancer(new_servers)
    # Terminate old servers
    terminate_servers(@current_servers)
    @current_servers = new_servers
  end
end
Self-Documentation emerges from infrastructure code. The code itself documents the current infrastructure state. Reading the code reveals what resources exist, how they connect, and what configurations apply. Documentation cannot become outdated because the code represents the actual infrastructure.
Modularity breaks infrastructure into reusable components. Common patterns become modules that teams share across projects. A web application stack module might include load balancers, auto-scaling groups, databases, and networking—packaged for reuse with configurable parameters.
# Reusable infrastructure module
module WebStack
  class Configuration
    attr_accessor :app_name, :instance_count, :instance_type, :database_size

    def initialize(app_name)
      @app_name = app_name
      @instance_count = 2
      @instance_type = "t3.small"
      @database_size = 20
    end
  end

  def self.provision(config)
    vpc = create_vpc(config.app_name)
    db = create_database(vpc, config.database_size)
    instances = create_instances(vpc, config.instance_count, config.instance_type)
    lb = create_load_balancer(vpc, instances)
    { vpc: vpc, database: db, instances: instances, load_balancer: lb }
  end
end

# Using the module
config = WebStack::Configuration.new("my-app")
config.instance_count = 5
config.instance_type = "t3.medium"
infrastructure = WebStack.provision(config)
Environment Parity maintains consistency across development, staging, and production environments. The same infrastructure code deploys all environments with different parameters. This reduces environment-specific bugs and makes testing reliable—production problems can be reproduced in development.
State Management tracks the relationship between infrastructure code and actual resources. IaC tools maintain state files that map code declarations to real infrastructure identifiers. State files record which cloud resources correspond to which code definitions, enabling updates and deletions. State must be shared across team members and protected from corruption.
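The mapping a state file maintains can be sketched with a minimal local backend (the JSON layout and the LocalState class here are illustrative, not any real tool's format):

```ruby
require 'json'
require 'tmpdir'

# Minimal local state backend: maps logical resource names from the code
# to the real identifiers the cloud provider returned. Illustrative sketch,
# not the format any specific IaC tool uses.
class LocalState
  def initialize(path)
    @path = path
  end

  def read
    File.exist?(@path) ? JSON.parse(File.read(@path)) : {}
  end

  def record(logical_name, resource_id)
    state = read
    state[logical_name] = resource_id
    File.write(@path, JSON.pretty_generate(state))
  end
end

state = LocalState.new(File.join(Dir.tmpdir, 'demo-state.json'))
state.record('aws_instance.web', 'i-0abc123')
state.record('aws_vpc.main', 'vpc-0def456')
```

Real backends add what a local file cannot: locking, versioning, encryption, and shared access for the whole team.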
Implementation Approaches
Declarative Configuration describes the target infrastructure state without specifying how to achieve it. The IaC tool compares the desired state against current reality and determines necessary changes. This approach handles dependency ordering automatically and detects manual changes that cause drift.
Declarative tools work particularly well for infrastructure with complex dependencies. When a database depends on a VPC, which depends on network routing, the tool calculates the correct creation order. Removing a resource definition from the code causes the tool to delete that resource on the next run, provided nothing still depends on it.
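The dependency ordering described above can be sketched as a depth-first topological sort over declared dependencies (resource names are hypothetical):

```ruby
# Given resource => [dependencies], compute a creation order in which every
# resource appears after everything it depends on (depth-first topological sort).
def creation_order(deps)
  order = []
  visited = {}
  visit = lambda do |name|
    raise "Circular dependency at #{name}" if visited[name] == :visiting
    return if visited[name] == :done
    visited[name] = :visiting
    deps.fetch(name, []).each { |dep| visit.call(dep) }
    visited[name] = :done
    order << name
  end
  deps.keys.each { |name| visit.call(name) }
  order
end

deps = {
  'database' => ['vpc'],
  'vpc'      => ['routing'],
  'routing'  => [],
  'web'      => ['vpc', 'database']
}
order = creation_order(deps)
# 'routing' precedes 'vpc', which precedes 'database' and 'web'
```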
Imperative Scripting specifies exact steps to build infrastructure. Scripts execute commands in the order written. The developer controls the entire process but must handle dependencies, error conditions, and state tracking manually.
# Imperative infrastructure script
class InfrastructureProvisioner
  def provision_complete_stack
    # Explicit ordering required
    network = create_network
    raise "Network creation failed" unless network.successful?

    subnets = create_subnets(network.id)
    raise "Subnet creation failed" unless subnets.all?(&:successful?)

    security = create_security_groups(network.id)
    database = create_database(subnets.first.id, security.database_sg.id)

    instances = []
    3.times do |i|
      instance = create_instance(
        subnet: subnets[i % subnets.length].id,
        security_group: security.app_sg.id
      )
      instances << instance
    end

    load_balancer = create_load_balancer(subnets.map(&:id))
    register_instances(load_balancer, instances)

    { network: network, database: database, instances: instances }
  end
end
Mutable Infrastructure updates existing resources in place. When configuration changes, the IaC tool modifies running infrastructure. This approach reduces downtime but increases risk of configuration drift and failed updates leaving infrastructure in inconsistent states.
Immutable Infrastructure replaces resources rather than updating them. Each change creates new resources with the updated configuration, then deletes old resources. This approach eliminates drift but requires infrastructure designed for replacement—stateless applications, externalized data storage, and automated deployment pipelines.
Push vs Pull Models determine how configuration reaches infrastructure. Push models execute from a central location, sending configuration to target systems. Pull models install agents on infrastructure that periodically fetch and apply configuration from a central server.
Push models work well for cloud infrastructure where API calls provision resources. Pull models suit large server fleets where agents can apply configuration at scale and report status centrally.
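A push model can be sketched as a central process that assembles per-host commands (the hostnames and the apply-config command are hypothetical; actual SSH execution is left out and only the command assembly is shown):

```ruby
# Push model: a central process builds and (in a real setup) executes
# configuration commands against each target host over SSH.
class ConfigPusher
  def initialize(hosts, apply_command: 'sudo apply-config')
    @hosts = hosts
    @apply_command = apply_command
  end

  # Returns the argv for each host; a real implementation would hand
  # these to Open3.capture3 or a library such as net-ssh.
  def commands
    @hosts.map { |host| ['ssh', host, @apply_command] }
  end
end

pusher = ConfigPusher.new(['web-01.internal', 'web-02.internal'])
pusher.commands.each { |cmd| puts cmd.join(' ') }
```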
# Pull model agent
class ConfigurationAgent
  def initialize(server_url, interval: 300)
    @server_url = server_url
    @interval = interval
  end

  def run
    loop do
      begin
        desired_config = fetch_configuration
        current_config = read_local_configuration
        if configuration_changed?(current_config, desired_config)
          apply_configuration(desired_config)
          report_success
        end
      rescue => e
        report_failure(e)
      end
      sleep @interval
    end
  end

  private

  def fetch_configuration
    # Fetch the desired configuration for this host
    # (deserializing the response into a config object is omitted)
    HTTP.get("#{@server_url}/config/#{hostname}")
  end

  def apply_configuration(config)
    # Apply configuration changes to local system
    config.packages.each { |pkg| install_package(pkg) }
    config.services.each { |svc| restart_service(svc) }
    write_configuration_files(config.files)
  end
end
Multi-Layer Architecture separates infrastructure into layers with different change frequencies. The network layer changes rarely, the platform layer (Kubernetes clusters, database services) changes occasionally, and the application layer changes frequently. Each layer has separate infrastructure code managed by appropriate teams.
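The layering can be sketched as applying layers in order of stability, with each layer's outputs feeding the next layer's inputs (the layer names and stubbed apply lambdas are illustrative):

```ruby
# Layers applied in order of decreasing stability: outputs of one layer
# (VPC IDs, cluster endpoints) become inputs to the next layer.
LAYERS = [
  { name: 'network',     apply: ->(_inputs) { { vpc_id: 'vpc-123' } } },
  { name: 'platform',    apply: ->(inputs)  { { cluster: "eks-in-#{inputs[:vpc_id]}" } } },
  { name: 'application', apply: ->(inputs)  { { deployed_to: inputs[:cluster] } } }
]

def apply_layers(layers)
  layers.reduce({}) do |inputs, layer|
    inputs.merge(layer[:apply].call(inputs))
  end
end

result = apply_layers(LAYERS)
```

Each layer can live in its own repository with its own state file, so an application deploy never risks touching the network layer.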
Ruby Implementation
Ruby provides several tools and frameworks for infrastructure management, with Chef and Puppet being the most prominent. These tools use Ruby's expressive syntax to create domain-specific languages for infrastructure configuration.
Chef uses pure Ruby for infrastructure definitions called recipes. Resources represent infrastructure components—packages, services, files, users. Recipes combine resources with Ruby logic to define system configuration.
# Chef recipe for web server configuration
package 'nginx' do
  action :install
end

service 'nginx' do
  action [:enable, :start]
  supports restart: true, reload: true
end

template '/etc/nginx/nginx.conf' do
  source 'nginx.conf.erb'
  owner 'root'
  group 'root'
  mode '0644'
  variables(
    worker_processes: node['cpu']['total'],
    worker_connections: 1024
  )
  notifies :reload, 'service[nginx]', :delayed
end

directory '/var/www/myapp' do
  owner 'www-data'
  group 'www-data'
  mode '0755'
  recursive true
end

# Custom resource with Ruby logic
ruby_block 'configure_ssl' do
  block do
    require 'openssl'
    key = OpenSSL::PKey::RSA.new(2048)
    cert = OpenSSL::X509::Certificate.new
    # (certificate subject, validity, and self-signing omitted for brevity)
    File.write('/etc/nginx/ssl/server.key', key.to_pem)
    File.write('/etc/nginx/ssl/server.crt', cert.to_pem)
  end
  not_if { File.exist?('/etc/nginx/ssl/server.key') }
end
Chef recipes support full Ruby capabilities—conditionals, loops, functions, external libraries. This flexibility allows complex configuration logic but requires careful design to maintain idempotence.
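For example, a recipe can generate resources in a loop while each generated resource stays idempotent on its own (the package names here are illustrative):

```ruby
# Ruby loop generating idempotent Chef resources - each package resource
# converges independently and is a no-op when already installed
%w(nginx git htop).each do |pkg|
  package pkg do
    action :install
  end
end

# Conditional resource based on node attributes
if node['platform_family'] == 'debian'
  apt_update 'periodic' do
    frequency 86_400
    action :periodic
  end
end
```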
Fog provides a Ruby cloud services library that abstracts multiple cloud providers behind a common interface. It supports creating and managing cloud resources programmatically.
require 'fog/aws'

class CloudProvisioner
  def initialize(credentials)
    @compute = Fog::Compute.new(
      provider: 'AWS',
      aws_access_key_id: credentials[:access_key],
      aws_secret_access_key: credentials[:secret_key],
      region: 'us-east-1'
    )
  end

  def provision_web_cluster(count:, instance_type:)
    security_group = create_security_group
    instances = count.times.map do |i|
      @compute.servers.create(
        image_id: 'ami-0c55b159cbfafe1f0',
        flavor_id: instance_type,
        key_name: 'deployment-key',
        security_group_ids: [security_group.id],
        tags: {
          'Name' => "web-server-#{i}",
          'Role' => 'web',
          'Cluster' => 'production'
        }
      )
    end
    instances.each { |server| server.wait_for { ready? } }
    instances
  end

  private

  def create_security_group
    sg = @compute.security_groups.create(
      name: 'web-servers',
      description: 'Web server security group'
    )
    sg.authorize_port_range(80..80, ip_protocol: 'tcp')
    sg.authorize_port_range(443..443, ip_protocol: 'tcp')
    sg.authorize_port_range(22..22, ip_protocol: 'tcp', cidr_ip: '10.0.0.0/8')
    sg
  end
end
AWS SDK for Ruby provides direct access to AWS services for infrastructure automation. While not a full IaC framework, it enables building custom infrastructure tools.
require 'aws-sdk-ec2'
require 'base64'

class AutoScalingInfrastructure
  def initialize
    @ec2 = Aws::EC2::Client.new(region: 'us-west-2')
  end

  def deploy_stack(config)
    vpc = create_vpc(config[:vpc_cidr])
    subnet = create_subnet(vpc.vpc_id, config[:subnet_cidr])
    igw = create_internet_gateway(vpc.vpc_id)
    launch_template = create_launch_template(
      subnet_id: subnet.subnet_id,
      security_group: config[:security_group],
      user_data: config[:user_data]
    )
    {
      vpc_id: vpc.vpc_id,
      subnet_id: subnet.subnet_id,
      launch_template_id: launch_template.launch_template_id
    }
  end

  private

  def create_vpc(cidr_block)
    resp = @ec2.create_vpc(cidr_block: cidr_block)
    @ec2.modify_vpc_attribute(
      vpc_id: resp.vpc.vpc_id,
      enable_dns_hostnames: { value: true }
    )
    resp.vpc
  end

  # create_subnet and create_internet_gateway follow the same pattern (omitted)

  def create_launch_template(subnet_id:, security_group:, user_data:)
    @ec2.create_launch_template(
      launch_template_name: "web-server-#{Time.now.to_i}",
      launch_template_data: {
        image_id: 'ami-0c55b159cbfafe1f0',
        instance_type: 't3.medium',
        network_interfaces: [{
          subnet_id: subnet_id,
          groups: [security_group],
          device_index: 0,
          associate_public_ip_address: true
        }],
        user_data: Base64.encode64(user_data)
      }
    ).launch_template
  end
end
Ruby's metaprogramming capabilities enable creating custom DSLs for infrastructure. These DSLs provide readable syntax while maintaining the full power of Ruby underneath.
class InfrastructureDSL
  def self.define(&block)
    dsl = new
    dsl.instance_eval(&block)
    dsl.resources
  end

  attr_reader :resources

  def initialize
    @resources = []
  end

  def server(name, &block)
    config = ServerConfig.new(name)
    config.instance_eval(&block)
    @resources << config
  end

  def database(name, &block)
    config = DatabaseConfig.new(name)
    config.instance_eval(&block)
    @resources << config
  end
end

class ServerConfig
  attr_accessor :name, :instance_type, :ami, :count

  def initialize(name)
    @name = name
    @count = 1
  end

  def type(instance_type)
    @instance_type = instance_type
  end

  def image(ami)
    @ami = ami
  end

  def instances(count)
    @count = count
  end
end

class DatabaseConfig
  attr_accessor :name, :engine_name, :storage_size

  def initialize(name)
    @name = name
  end

  def engine(engine_name)
    @engine_name = engine_name
  end

  def size(storage_size)
    @storage_size = storage_size
  end
end

# Usage
infrastructure = InfrastructureDSL.define do
  server "web" do
    type "t3.medium"
    image "ami-12345"
    instances 3
  end

  database "main" do
    engine "postgresql"
    size 100
  end
end
Tools & Ecosystem
Terraform dominates the declarative IaC space across cloud providers. It uses HCL (HashiCorp Configuration Language) to define infrastructure and maintains state files tracking deployed resources. Terraform supports hundreds of providers—AWS, Azure, GCP, Kubernetes, databases, monitoring services.
Terraform's core workflow has two phases: plan (calculate required changes) and apply (execute them); a separate destroy command removes managed infrastructure. The plan phase shows exactly what will change before any modification is made. State files can be stored locally or remotely in services like S3, enabling team collaboration.
AWS CloudFormation provides AWS-specific declarative infrastructure management using JSON or YAML templates. CloudFormation integrates deeply with AWS services and provides stack-level operations—creating, updating, or deleting entire infrastructure stacks atomically. Change sets preview modifications before applying them.
CloudFormation handles dependencies automatically, rolls back failed deployments, and provides drift detection to identify manual changes. The service is free—users pay only for the resources it creates.
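Because templates are plain JSON or YAML documents, they can also be generated programmatically. A sketch that builds a minimal template as a Ruby hash (the logical resource name is hypothetical, and the aws-sdk-cloudformation call appears only as a comment):

```ruby
require 'json'

# Build a minimal CloudFormation template as plain Ruby data, then
# serialize it to the JSON the service expects.
def web_bucket_template(bucket_logical_id)
  {
    'AWSTemplateFormatVersion' => '2010-09-09',
    'Resources' => {
      bucket_logical_id => {
        'Type' => 'AWS::S3::Bucket',
        'Properties' => { 'VersioningConfiguration' => { 'Status' => 'Enabled' } }
      }
    },
    'Outputs' => {
      'BucketName' => { 'Value' => { 'Ref' => bucket_logical_id } }
    }
  }
end

template_body = JSON.pretty_generate(web_bucket_template('AppBucket'))
# With credentials configured, the stack would be created via:
#   Aws::CloudFormation::Client.new.create_stack(
#     stack_name: 'app-bucket', template_body: template_body)
```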
Pulumi allows writing infrastructure code in general-purpose programming languages including Python, JavaScript, Go, and C#. Rather than learning a DSL, developers use familiar languages and can apply standard programming practices—loops, conditionals, functions, classes, packages.
Pulumi manages state similarly to Terraform but provides features like testing infrastructure code with standard testing frameworks and using existing package ecosystems for code sharing.
Ansible takes an agentless imperative approach, connecting to servers via SSH and executing configuration tasks. Playbooks written in YAML describe tasks to execute. Ansible excels at configuration management across heterogeneous environments and server provisioning, though it also supports cloud resource creation.
Chef and Puppet provide agent-based configuration management. Agents installed on servers periodically fetch configuration from a central server and apply it locally. This pull model scales to thousands of servers and provides continuous configuration enforcement.
Chef uses Ruby DSL for configuration recipes. Puppet uses its own declarative language. Both tools maintain large ecosystems of community-contributed configurations called cookbooks (Chef) or modules (Puppet).
Kubernetes Operators extend Kubernetes to manage infrastructure beyond containerized applications. Operators use custom resource definitions and controllers to provision and manage databases, message queues, and other infrastructure components following Kubernetes patterns.
# Example: Wrapping Terraform from Ruby
require 'open3'
require 'json'

class TerraformWrapper
  def initialize(working_dir)
    @working_dir = working_dir
  end

  def plan(var_file: nil)
    cmd = ['terraform', 'plan', '-json']
    cmd += ['-var-file', var_file] if var_file
    execute_streaming(cmd) do |line|
      data = JSON.parse(line)
      yield data if block_given? && data['type'] == 'planned_change'
    end
  end

  def apply(auto_approve: false)
    cmd = ['terraform', 'apply']
    cmd << '-auto-approve' if auto_approve
    execute(cmd)
  end

  def output(name = nil)
    cmd = ['terraform', 'output', '-json']
    cmd << name if name
    stdout, _status = execute(cmd)
    JSON.parse(stdout)
  end

  private

  def execute(cmd)
    stdout, stderr, status = Open3.capture3(*cmd, chdir: @working_dir)
    raise "Terraform failed: #{stderr}" unless status.success?
    [stdout, status]
  end

  def execute_streaming(cmd)
    Open3.popen3(*cmd, chdir: @working_dir) do |stdin, stdout, _stderr, wait_thr|
      stdin.close
      stdout.each_line do |line|
        yield line.chomp
      end
      raise "Command failed" unless wait_thr.value.success?
    end
  end
end

# Usage
terraform = TerraformWrapper.new('/infrastructure')
terraform.plan(var_file: 'production.tfvars') do |change|
  puts "Planning to #{change['change']['action']} #{change['resource']['addr']}"
end
terraform.apply(auto_approve: true)
outputs = terraform.output
puts "Load balancer DNS: #{outputs['load_balancer_dns']['value']}"
Infrastructure Testing Tools verify infrastructure code correctness. Test Kitchen tests Chef cookbooks across multiple platforms. Terratest provides testing for Terraform, Packer, and Docker using Go. InSpec verifies infrastructure compliance.
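An InSpec control, for instance, is a Ruby DSL stating what the deployed system must look like (the package and port checked here are illustrative):

```ruby
# InSpec compliance control - run against a host or container with
# `inspec exec` to verify the deployed infrastructure
control 'web-server-1.0' do
  impact 0.8
  title 'Nginx is installed and listening'

  describe package('nginx') do
    it { should be_installed }
  end

  describe port(80) do
    it { should be_listening }
  end
end
```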
State Management Services provide remote state storage with locking, versioning, and team collaboration. Terraform Cloud, AWS S3 with DynamoDB, and Azure Blob Storage all serve this purpose. Remote state prevents conflicts when multiple team members work on infrastructure simultaneously.
Common Patterns
Module Composition creates reusable infrastructure building blocks. Modules encapsulate related resources and expose parameters for customization. A database module might include the database instance, security groups, parameter groups, and monitoring alarms—configured through input variables.
# Reusable database module pattern
module DatabaseModule
  class Configuration
    attr_accessor :identifier, :instance_class, :allocated_storage,
                  :engine, :engine_version, :backup_retention_days,
                  :multi_az, :vpc_security_group_ids

    def initialize(identifier)
      @identifier = identifier
      @instance_class = 'db.t3.micro'
      @allocated_storage = 20
      @engine = 'postgres'
      @engine_version = '13.7'
      @backup_retention_days = 7
      @multi_az = false
      @vpc_security_group_ids = []
    end

    def validate!
      raise "Identifier required" if identifier.nil?
      raise "Invalid instance class" unless instance_class =~ /^db\.\w+\.\w+$/
      raise "Storage must be positive" unless allocated_storage > 0
    end
  end

  def self.create(config)
    config.validate!
    # Provider-specific helpers (create_db_instance etc.) omitted
    {
      database_instance: create_db_instance(config),
      security_group: create_security_group(config),
      parameter_group: create_parameter_group(config),
      subnet_group: create_subnet_group(config),
      monitoring: create_cloudwatch_alarms(config)
    }
  end
end
Environment Promotion uses the same infrastructure code across environments with different variable files. Development uses small instance types and minimal redundancy. Production uses larger instances, multi-availability zone deployments, and enhanced monitoring. The infrastructure code remains identical.
# Environment-specific configurations
class EnvironmentConfig
  ENVIRONMENTS = {
    development: {
      instance_type: 't3.small',
      min_instances: 1,
      max_instances: 2,
      database_instance: 'db.t3.micro',
      multi_az: false
    },
    staging: {
      instance_type: 't3.medium',
      min_instances: 2,
      max_instances: 4,
      database_instance: 'db.t3.small',
      multi_az: false
    },
    production: {
      instance_type: 't3.large',
      min_instances: 4,
      max_instances: 20,
      database_instance: 'db.r5.xlarge',
      multi_az: true
    }
  }.freeze

  def self.for_environment(env)
    ENVIRONMENTS.fetch(env.to_sym) do
      raise "Unknown environment: #{env}"
    end
  end
end

# Provisioner uses environment config
class InfrastructureProvisioner
  def deploy(environment)
    config = EnvironmentConfig.for_environment(environment)
    create_auto_scaling_group(
      min_size: config[:min_instances],
      max_size: config[:max_instances],
      instance_type: config[:instance_type]
    )
    create_database(
      instance_class: config[:database_instance],
      multi_az: config[:multi_az]
    )
  end
end
Blue-Green Deployment maintains two identical production environments. Only one serves traffic at a time. Deploying new infrastructure creates or updates the inactive environment, tests it thoroughly, then switches traffic from the old (blue) to new (green) environment. If problems occur, switching back provides instant rollback.
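The switch itself reduces to orchestration logic. A sketch with provisioning, health checks, and traffic routing injected as stubs (all lambdas here are placeholders for real provider calls):

```ruby
# Blue-green switcher: provisions the idle environment, verifies it,
# flips traffic, and keeps the old environment up for instant rollback.
class BlueGreen
  attr_reader :live

  def initialize(provision:, health_check:, route_to:)
    @provision = provision
    @health_check = health_check
    @route_to = route_to
    @live = :blue
  end

  def deploy(version)
    idle = @live == :blue ? :green : :blue
    @provision.call(idle, version)
    raise "#{idle} failed health checks" unless @health_check.call(idle)
    @route_to.call(idle)
    @live = idle # old environment stays up briefly for instant rollback
  end
end

events = []
deployer = BlueGreen.new(
  provision:    ->(env, v) { events << [:provision, env, v] },
  health_check: ->(_env)   { true },
  route_to:     ->(env)    { events << [:route, env] }
)
deployer.deploy('2.0')
```

Note that a failed health check raises before traffic moves, so the live environment is never touched by a bad release.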
Workspace Pattern manages multiple infrastructure instances from single code. Terraform workspaces, for example, maintain separate state files for each workspace while using the same configuration files. This pattern suits multiple customers, regions, or testing environments.
State File Management requires careful handling. State files contain sensitive information—passwords, private keys, resource identifiers. They must never be committed to version control. Remote state backends with encryption and access controls protect state files. State locking prevents concurrent modifications that could corrupt state.
# State management wrapper
class StateManager
  def initialize(backend_type, config)
    @backend = create_backend(backend_type, config)
  end

  def with_lock
    lock_id = acquire_lock
    begin
      current_state = @backend.read_state
      yield current_state
    ensure
      release_lock(lock_id)
    end
  end

  def update_state(new_state)
    with_lock do |current_state|
      validate_state_transition(current_state, new_state)
      @backend.write_state(new_state)
      @backend.create_backup(current_state)
    end
  end

  private

  def acquire_lock
    timeout = 300
    start_time = Time.now
    loop do
      lock_id = @backend.try_lock
      return lock_id if lock_id
      elapsed = Time.now - start_time
      raise "Failed to acquire state lock after #{timeout}s" if elapsed > timeout
      sleep 5
    end
  end

  def validate_state_transition(old_state, new_state)
    # Ensure new state has higher version
    raise "State version must increase" unless new_state[:version] > old_state[:version]

    # Verify critical resources not accidentally deleted
    old_resources = old_state[:resources].map { |r| r[:id] }
    new_resources = new_state[:resources].map { |r| r[:id] }
    deleted = old_resources - new_resources
    critical_resources = old_state[:resources]
      .select { |r| r[:prevent_destroy] }
      .map { |r| r[:id] }
    dangerous_deletes = deleted & critical_resources
    raise "Attempting to delete protected resources: #{dangerous_deletes}" if dangerous_deletes.any?
  end
end
Immutable Infrastructure Pattern creates new infrastructure for each deployment rather than modifying existing resources. Application updates build new machine images with the latest code. Deployment launches instances from new images and terminates old instances after traffic migration. This eliminates configuration drift and simplifies rollback.
Common Pitfalls
State File Corruption occurs when multiple users modify infrastructure simultaneously without proper locking. Two team members running infrastructure changes concurrently can create inconsistent state—resources exist that state files do not track, or state references deleted resources. Remote state backends with locking prevent this problem. Always verify state locking is enabled before collaborative use.
Credential Leakage happens when infrastructure code contains hardcoded credentials or when state files with embedded secrets are improperly stored. Infrastructure code should never contain passwords, API keys, or access tokens. Use environment variables, secret management services, or encrypted variable files. State files require encryption at rest and strict access controls.
# Dangerous - credentials in code
def create_database
  Database.create(
    username: "admin",
    password: "hardcoded_password_123" # Never do this
  )
end

# Better - credentials from environment
def create_database
  Database.create(
    username: ENV.fetch('DB_USERNAME'),
    password: ENV.fetch('DB_PASSWORD')
  )
end

# Best - credentials from secret manager
def create_database
  secrets = SecretManager.get_secret('database/credentials')
  Database.create(
    username: secrets['username'],
    password: secrets['password']
  )
end
Circular Dependencies create deadlocks where resources depend on each other. A security group allows traffic from instances, and instances require the security group. IaC tools cannot determine creation order for circular dependencies. Break circular dependencies by using separate resource blocks or allowing creation before full configuration.
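One way to break such a cycle is to create the resources bare first, then attach the cross-referencing rules as separate operations once both exist. A sketch that records the resulting plan order (operation and resource names are hypothetical):

```ruby
# Plan that breaks a security-group <-> instance cycle: bare resources
# first, cross-references attached afterwards as standalone operations.
def plan_without_cycle
  ops = []
  ops << [:create_security_group, 'web-sg']             # no rules yet
  ops << [:create_instance, 'web-01', sg: 'web-sg']     # can now reference the SG
  ops << [:add_ingress_rule, 'web-sg', from: 'web-01']  # back-reference added last
  ops
end

ops = plan_without_cycle
```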
Resource Naming Conflicts occur when infrastructure code reuses names across environments or accounts. Cloud providers often require unique names globally or within accounts. Hard-coding resource names prevents deploying the same infrastructure code multiple times. Use name prefixes, suffixes, or generated identifiers to ensure uniqueness.
require 'securerandom'

class ResourceNaming
  def initialize(environment:, region:, app_name:)
    @environment = environment
    @region = region
    @app_name = app_name
  end

  def generate_name(resource_type, identifier = nil)
    parts = [@app_name, @environment, @region, resource_type.to_s.tr('_', '-')]
    parts << identifier if identifier
    parts << SecureRandom.hex(4) if requires_global_uniqueness?(resource_type)
    parts.join('-')
  end

  private

  def requires_global_uniqueness?(resource_type)
    [:s3_bucket, :iam_role, :certificate].include?(resource_type)
  end
end

# Usage prevents naming conflicts
naming = ResourceNaming.new(
  environment: 'production',
  region: 'us-east-1',
  app_name: 'myapp'
)
bucket_name = naming.generate_name(:s3_bucket, 'backups')
# => "myapp-production-us-east-1-s3-bucket-backups-f3a9c8d2" (random suffix varies)
Ignoring Dependencies causes resources to be created in wrong order. A server requiring a VPC fails if the VPC does not yet exist. Explicit dependency declarations ensure proper creation sequence. Some tools infer dependencies from references, but explicit dependencies clarify intent and prevent subtle bugs.
Incomplete Cleanup leaves orphaned resources consuming costs. Deleting infrastructure code without running destroy operations leaves cloud resources running. Manual resource deletion can orphan related resources. Always destroy infrastructure through IaC tools to ensure complete cleanup. Implement resource tagging to identify orphaned resources.
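Tag-based orphan detection can be sketched by diffing the provider's tagged resources against the IDs tracked in state (the tag convention and sample data are hypothetical):

```ruby
# Resources tagged as managed by IaC but absent from state are orphans:
# they keep costing money and nothing will ever update or delete them.
def find_orphans(cloud_resources, state_ids, managed_tag: 'managed-by')
  cloud_resources.select do |r|
    r[:tags][managed_tag] == 'iac' && !state_ids.include?(r[:id])
  end
end

cloud = [
  { id: 'i-1', tags: { 'managed-by' => 'iac' } },
  { id: 'i-2', tags: { 'managed-by' => 'iac' } },
  { id: 'i-3', tags: {} } # manually created, out of scope
]
orphans = find_orphans(cloud, ['i-1'])
```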
State Drift happens when infrastructure changes outside of IaC tools—manual console modifications, direct API calls, or other automation systems. State files no longer match reality. Some IaC tools detect drift and can refresh state from actual infrastructure. Regular drift detection prevents accumulated differences from causing deployment failures.
require 'set'

class DriftDetector
  def initialize(infrastructure_provider)
    @provider = infrastructure_provider
  end

  def detect_drift(expected_state)
    actual_state = @provider.get_current_state
    differences = compare_states(expected_state, actual_state)
    {
      has_drift: differences.values.any?(&:any?),
      added_resources: differences[:added],
      removed_resources: differences[:removed],
      modified_resources: differences[:modified]
    }
  end

  private

  def compare_states(expected, actual)
    expected_ids = expected.resources.map(&:id).to_set
    actual_ids = actual.resources.map(&:id).to_set

    added = actual_ids - expected_ids
    removed = expected_ids - actual_ids
    common_ids = expected_ids & actual_ids

    modified = common_ids.select do |id|
      expected_resource = expected.resources.find { |r| r.id == id }
      actual_resource = actual.resources.find { |r| r.id == id }
      resource_differs?(expected_resource, actual_resource)
    end

    {
      added: actual.resources.select { |r| added.include?(r.id) },
      removed: expected.resources.select { |r| removed.include?(r.id) },
      modified: modified.map do |id|
        {
          expected: expected.resources.find { |r| r.id == id },
          actual: actual.resources.find { |r| r.id == id }
        }
      end
    }
  end

  def resource_differs?(expected, actual)
    expected.configuration != actual.configuration
  end
end
Insufficient Testing deploys untested infrastructure changes directly to production. Infrastructure code needs testing just like application code. Unit tests verify individual modules. Integration tests deploy infrastructure to test environments. Validation tests check deployed infrastructure meets requirements. Policy-as-code tools enforce compliance requirements.
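Unit tests for the pure logic need no cloud access at all. A minimal Minitest sketch (the Naming module is a hypothetical extraction):

```ruby
require 'minitest/autorun'

# Pure naming logic extracted from provisioning code so it can be
# unit tested without touching any cloud API.
module Naming
  def self.bucket_name(app, env, suffix)
    [app, env, suffix].join('-').downcase
  end
end

class NamingTest < Minitest::Test
  def test_joins_parts_and_downcases
    assert_equal 'myapp-production-logs',
                 Naming.bucket_name('MyApp', 'Production', 'logs')
  end
end
```

The same pattern scales up: integration tests deploy the module to a sandbox account, and validation tests assert on the deployed resources.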
Poor Secret Management stores secrets in version control, state files, or logs. Secrets require special handling—encryption at rest and in transit, automatic rotation, access auditing. Use dedicated secret management services and inject secrets at runtime rather than embedding them in infrastructure code.
Reference
Core Principles
| Principle | Description | Implementation |
|---|---|---|
| Idempotence | Multiple executions produce same result | Design resources to converge to desired state |
| Version Control | Infrastructure code in source control | Track all changes through commits and pull requests |
| Immutability | Replace resources instead of modifying | Build new infrastructure from code for each change |
| Self-Documentation | Code documents infrastructure | Infrastructure code serves as authoritative documentation |
| Modularity | Reusable infrastructure components | Package common patterns as modules with parameters |
| Environment Parity | Consistent configuration across environments | Use same code with different variable files |
| State Management | Track infrastructure state centrally | Remote state backends with locking and versioning |
Implementation Models
| Model | Approach | Best For | Examples |
|---|---|---|---|
| Declarative | Describe desired end state | Cloud infrastructure with complex dependencies | Terraform, CloudFormation |
| Imperative | Specify execution steps | Procedural provisioning, migration scripts | Custom scripts, Ansible playbooks |
| Mutable | Modify existing resources | Traditional server management, minor updates | Configuration management tools |
| Immutable | Replace resources on change | Cloud-native applications, containerized workloads | AMI-based deployments, containers |
| Push | Execute from central location | On-demand provisioning, cloud resources | Terraform, CloudFormation |
| Pull | Agents fetch and apply configuration | Large server fleets, continuous configuration | Chef, Puppet |
Ruby Tools Comparison
| Tool | Type | Approach | Use Case |
|---|---|---|---|
| Chef | Configuration Management | Ruby DSL with agent | Server configuration at scale |
| Puppet | Configuration Management | Declarative DSL with agent | Enterprise configuration management |
| Fog | Cloud Library | Imperative API | Multi-cloud Ruby automation |
| AWS SDK | Cloud API | Imperative API | AWS-specific automation |
State Management Considerations
| Aspect | Requirement | Implementation |
|---|---|---|
| Storage | Encrypted, versioned, backed up | S3 with versioning, Terraform Cloud |
| Locking | Prevent concurrent modifications | DynamoDB locks, Consul |
| Access Control | Restrict state file access | IAM policies, RBAC |
| Sensitive Data | Encrypt secrets in state | State encryption, separate secret storage |
| Backup | Regular state file backups | Automated backup to separate storage |
| Sharing | Multi-user access | Remote backends with authentication |
Resource Lifecycle
| Phase | Actions | Considerations |
|---|---|---|
| Plan | Calculate required changes | Preview modifications before execution |
| Validate | Check configuration syntax | Catch errors before deployment |
| Apply | Execute infrastructure changes | Monitor progress, handle failures |
| Test | Verify infrastructure correctness | Automated testing after deployment |
| Monitor | Track infrastructure health | Detect drift, performance issues |
| Update | Modify existing infrastructure | Minimize disruption, maintain state |
| Destroy | Remove infrastructure | Complete cleanup, prevent orphaned resources |
Common Patterns
| Pattern | Purpose | Implementation |
|---|---|---|
| Module Composition | Reusable infrastructure components | Parameterized modules for common stacks |
| Blue-Green Deployment | Zero-downtime deployments | Parallel environments with traffic switching |
| Environment Promotion | Consistent multi-environment deployment | Shared code with environment-specific variables |
| Workspace Pattern | Multiple infrastructure instances | Separate state per workspace |
| Immutable Infrastructure | Eliminate configuration drift | Build new infrastructure for each change |
| GitOps | Infrastructure changes through git | Pull requests for infrastructure modifications |
Security Practices
| Practice | Purpose | Implementation |
|---|---|---|
| Secret Management | Protect sensitive credentials | Dedicated secret stores, runtime injection |
| Least Privilege | Minimal required permissions | Role-based access control, scoped permissions |
| Encryption | Protect data at rest and in transit | State encryption, TLS for communication |
| Audit Logging | Track infrastructure changes | Centralized logging, change attribution |
| Compliance Validation | Enforce security policies | Policy-as-code tools, automated scanning |
| Network Segmentation | Isolate infrastructure components | VPCs, security groups, network ACLs |