CrackedRuby

Overview

Multi-cloud strategy refers to the architectural approach of distributing workloads, data, and services across multiple cloud service providers rather than relying on a single vendor. Organizations deploy applications using combinations of AWS, Google Cloud Platform, Microsoft Azure, Oracle Cloud, and other providers to meet specific technical, business, and regulatory requirements.

The approach differs from hybrid cloud, which combines on-premises infrastructure with cloud resources. Multi-cloud specifically addresses the use of multiple public cloud providers simultaneously, whether for different applications, different components of the same application, or redundant deployments for high availability.

Multi-cloud adoption stems from several drivers: avoiding vendor lock-in prevents dependency on a single provider's pricing and feature roadmap; geographic distribution requirements necessitate providers with specific regional presence; regulatory compliance often mandates data residency in particular jurisdictions; and cost optimization becomes possible by selecting the most economical provider for each workload type.

The strategy introduces complexity in areas including identity management across providers, network connectivity between clouds, data synchronization and consistency, monitoring and observability, and operational tooling. Applications must abstract cloud-specific services or maintain provider-specific implementations behind unified interfaces.

# Example: Abstract cloud storage interface
class CloudStorage
  def self.for_provider(provider)
    case provider
    when :aws
      AwsStorage.new
    when :gcp
      GcpStorage.new
    when :azure
      AzureStorage.new
    else
      raise ArgumentError, "unsupported provider: #{provider}"
    end
  end
end

class AwsStorage
  def upload(bucket, key, data)
    s3_client.put_object(bucket: bucket, key: key, body: data)
  end
  
  def download(bucket, key)
    s3_client.get_object(bucket: bucket, key: key).body.read
  end
  
  private
  
  def s3_client
    @s3_client ||= Aws::S3::Client.new
  end
end

class GcpStorage
  def upload(bucket, key, data)
    storage.bucket(bucket).create_file(StringIO.new(data), key)
  end
  
  def download(bucket, key)
    downloaded = storage.bucket(bucket).file(key).download
    downloaded.rewind  # the client returns a StringIO positioned at EOF
    downloaded.read
  end
  
  private
  
  def storage
    @storage ||= Google::Cloud::Storage.new
  end
end

Key Principles

Provider Abstraction: Applications separate cloud-specific implementations from business logic through abstraction layers. The adapter pattern isolates provider-specific code, allowing workload migration without application rewrites. Abstractions cover compute (virtual machines, containers, serverless), storage (object, block, file), databases (relational, NoSQL, caching), messaging (queues, pub/sub), and identity (authentication, authorization).

Service Portability: Workloads must move between providers with minimal friction. Containerization using Docker and Kubernetes provides runtime consistency across clouds. Infrastructure as Code (IaC) tools describe infrastructure declaratively, enabling reproduction across providers. Data portability requires exportable formats and migration tooling. Application portability depends on avoiding provider-specific proprietary services or maintaining multiple implementations.
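
In practice, container portability often reduces to publishing the same image digest to each provider's registry so every cluster runs byte-identical code. A minimal sketch; the registry hosts, account IDs, and project names are illustrative assumptions, not real infrastructure:

```ruby
# Push targets: the same digest-pinned image is referenced in each
# provider's registry (hosts and project names are invented placeholders)
REGISTRIES = {
  aws:   '123456789012.dkr.ecr.us-east-1.amazonaws.com',
  gcp:   'us-central1-docker.pkg.dev/my-project/apps',
  azure: 'myregistry.azurecr.io'
}.freeze

# Build a fully qualified, digest-pinned image reference for one provider
def image_ref(provider, name, digest)
  "#{REGISTRIES.fetch(provider)}/#{name}@#{digest}"
end

image_ref(:gcp, 'web', 'sha256:0a1b2c')
# => "us-central1-docker.pkg.dev/my-project/apps/web@sha256:0a1b2c"
```

Pinning by digest rather than tag guarantees that all three clouds deploy exactly the same bits.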

Unified Operations: Managing multiple providers requires consistent operational practices. Centralized logging aggregates logs from all providers into a single system. Distributed tracing tracks requests across provider boundaries. Unified monitoring collects metrics from diverse sources. Single pane of glass dashboards provide visibility across the entire multi-cloud estate. Cost management tools aggregate spending across providers.
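
Centralized aggregation works best when every log line already identifies its origin. A hedged sketch of a structured formatter that stamps provider and region onto each entry; the field names are an assumption, not a standard:

```ruby
require 'json'
require 'time'
require 'logger'

# Emits one JSON object per log line, tagged with the cloud provider and
# region so a central aggregator can filter across clouds
class MultiCloudLogFormatter
  def initialize(provider:, region:)
    @provider = provider
    @region = region
  end

  def call(severity, time, _progname, msg)
    JSON.generate(
      timestamp: time.utc.iso8601,
      severity: severity,
      cloud_provider: @provider,
      cloud_region: @region,
      message: msg
    ) + "\n"
  end
end

logger = Logger.new($stdout)
logger.formatter = MultiCloudLogFormatter.new(provider: 'aws', region: 'us-east-1')
logger.info('order processed')
```

The same formatter class is configured with different provider/region values in each deployment, so the aggregation pipeline needs no per-cloud parsing logic.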

Network Architecture: Multi-cloud networks connect resources across providers while maintaining security and performance. Provider-specific VPCs or VNets operate in isolation. Cloud interconnects establish dedicated, high-bandwidth connections between providers. VPN tunnels provide encrypted connectivity over public internet. Service mesh architectures manage service-to-service communication across clouds. DNS-based routing directs traffic based on latency, health, or policy.

Identity Federation: Users and services authenticate once and access resources across multiple clouds. Identity providers (IdP) serve as the central authentication source. SAML or OAuth/OIDC protocols federate identity to cloud providers. Service accounts require synchronization or federation across clouds. Role-based access control (RBAC) policies must align across providers. Secrets management synchronizes credentials needed across environments.

Data Consistency: Applications maintain data consistency when distributed across clouds. Synchronous replication keeps data identical across regions but impacts write performance. Asynchronous replication provides eventual consistency with better performance. Conflict resolution strategies handle concurrent updates to replicated data. Primary-replica architectures designate a writable primary and read replicas. Multi-primary setups allow writes at any location but increase conflict-resolution complexity.
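
For asynchronously replicated records, a common (if lossy) conflict resolution strategy is last-write-wins. A minimal sketch; the `:updated_at` field is an assumed per-record timestamp, not a fixed schema:

```ruby
# Last-write-wins: when both replicas hold a record, the newer timestamp
# wins; records present on only one side are kept as-is
class LastWriteWins
  def self.resolve(local, remote)
    local[:updated_at] >= remote[:updated_at] ? local : remote
  end

  def self.merge_replicas(local_set, remote_set)
    (local_set.keys | remote_set.keys).each_with_object({}) do |key, merged|
      local  = local_set[key]
      remote = remote_set[key]
      merged[key] = local && remote ? resolve(local, remote) : (local || remote)
    end
  end
end
```

Last-write-wins silently discards the older of two concurrent updates; when both writes must survive, application-level merges or CRDTs are required.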

Failure Domains: Multi-cloud architectures treat entire cloud providers as failure domains. Active-active deployments run workloads simultaneously across providers with traffic distribution. Active-passive configurations maintain standby capacity in secondary clouds. Geographic distribution spreads risk across regions and continents. Circuit breakers detect provider failures and redirect traffic. Automated failover switches traffic without manual intervention.
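
The circuit-breaker-plus-failover behavior described above can be sketched in a few lines of Ruby; the threshold and cooldown values are illustrative defaults, not recommendations:

```ruby
# Per-provider circuit breaker: after `threshold` consecutive failures a
# provider is skipped until `reset_after` seconds have elapsed
class ProviderCircuitBreaker
  def initialize(threshold: 3, reset_after: 30)
    @threshold = threshold
    @reset_after = reset_after
    @failures = Hash.new(0)
    @opened_at = {}
  end

  def available?(provider)
    opened = @opened_at[provider]
    return true unless opened
    return false if Time.now - opened < @reset_after
    record_success(provider)  # half-open: allow a retry after the cooldown
    true
  end

  def record_failure(provider)
    @failures[provider] += 1
    @opened_at[provider] ||= Time.now if @failures[provider] >= @threshold
  end

  def record_success(provider)
    @failures[provider] = 0
    @opened_at.delete(provider)
  end

  # First provider whose circuit is currently closed
  def first_available(providers)
    providers.find { |p| available?(p) }
  end
end
```

Automated failover then becomes `breaker.first_available(%i[aws gcp azure])` wrapped around each outbound request.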

Implementation Approaches

Application-Level Multi-Cloud: Different applications run on different cloud providers based on requirements. A company might run its customer-facing web application on AWS for its CDN and edge capabilities, analytics platform on Google Cloud for BigQuery, and Windows-based legacy systems on Azure for Active Directory integration. This approach minimizes complexity since each application remains single-cloud internally. Migration happens at the application boundary. Teams develop provider-specific expertise per application. The trade-off: provider failure impacts specific applications completely rather than degrading overall service.

Component-Level Multi-Cloud: Individual application components deploy to different providers. A web application might use AWS for compute (EC2), Google Cloud for data warehouse (BigQuery), and Azure for AI services (Cognitive Services). The adapter pattern abstracts component interfaces. Service meshes manage inter-component communication. API gateways route requests across provider boundaries. This approach optimizes each component independently but increases operational complexity and introduces cross-provider latency.

Redundant Multi-Cloud: The same application deploys identically to multiple providers for redundancy. DNS-based load balancing distributes traffic. Database replication keeps data synchronized. Stateless services scale independently on each provider. Stateful components require data synchronization strategies. This approach maximizes availability against provider outages but doubles infrastructure costs and operational overhead.
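
DNS-based distribution for a redundant deployment can be approximated as weight normalization over the providers that currently pass health checks. A sketch with made-up weights standing in for DNS weighted records:

```ruby
# Normalize static traffic weights over the currently healthy providers,
# returning integer percentages (a stand-in for DNS weighted routing)
class TrafficDistributor
  def initialize(weights)
    @weights = weights  # e.g. { aws: 50, gcp: 30, azure: 20 }
  end

  def effective_weights(healthy)
    live = @weights.select { |provider, _| healthy.include?(provider) }
    total = live.values.sum.to_f
    return {} if total.zero?
    live.transform_values { |w| (w / total * 100).round }
  end
end

dist = TrafficDistributor.new(aws: 50, gcp: 30, azure: 20)
dist.effective_weights(%i[aws gcp azure])  # => { aws: 50, gcp: 30, azure: 20 }
dist.effective_weights(%i[gcp])            # => { gcp: 100 }
```

When a provider fails its health check, its share is redistributed proportionally to the survivors, which is the behavior most managed DNS services implement natively.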

Kubernetes-Based Multi-Cloud: Kubernetes provides the abstraction layer across clouds. Cloud-managed Kubernetes services (EKS, GKE, AKS) run in each provider. Cluster federation links Kubernetes clusters across clouds. Container images deploy identically across all clusters. Persistent volumes require cloud-specific storage classes. Load balancers and ingress controllers differ per provider. Service mesh products like Istio span clusters. This approach provides strong portability but requires Kubernetes expertise and abstracts away provider-specific optimizations.
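
Operationally, deploying identically across EKS, GKE, and AKS often reduces to targeting the right kubeconfig context per provider. A small sketch; the context names are invented placeholders:

```ruby
# Map each provider to a kubeconfig context (names are hypothetical) and
# build the kubectl argument vector for a command against that cluster
class ClusterRegistry
  CONTEXTS = {
    aws:   'eks-prod-us-east-1',
    gcp:   'gke_my-project_us-central1_prod',
    azure: 'aks-prod-eastus'
  }.freeze

  def self.kubectl_args(provider, *cmd)
    context = CONTEXTS.fetch(provider) do
      raise ArgumentError, "unknown provider: #{provider}"
    end
    ['kubectl', '--context', context, *cmd]
  end
end

ClusterRegistry.kubectl_args(:gcp, 'get', 'pods')
# => ["kubectl", "--context", "gke_my-project_us-central1_prod", "get", "pods"]
```

A deployment pipeline can then loop over `CONTEXTS.keys` and apply the same manifests to every cluster.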

Terraform Multi-Cloud: Infrastructure as Code defines resources across providers in a single codebase. Terraform providers abstract cloud-specific APIs. Modules encapsulate reusable infrastructure patterns. State management tracks resources across all clouds. Workspaces separate environments (development, staging, production). Remote backends store state in highly available storage. This approach maintains infrastructure consistency but requires careful state management and provider version coordination.

# Application-level approach: Provider-specific configuration
class ApplicationConfig
  def self.load
    case ENV['CLOUD_PROVIDER']
    when 'aws'
      {
        database: RDS.connection_string,
        cache: ElastiCache.endpoint,
        storage: S3.bucket_url,
        queue: SQS.queue_url
      }
    when 'gcp'
      {
        database: CloudSQL.connection_string,
        cache: Memorystore.endpoint,
        storage: CloudStorage.bucket_url,
        queue: PubSub.topic_path
      }
    when 'azure'
      {
        database: AzureSQL.connection_string,
        cache: RedisCache.endpoint,
        storage: BlobStorage.container_url,
        queue: ServiceBus.queue_path
      }
    end
  end
end

Design Considerations

Vendor Lock-In Risk: Each cloud provider offers proprietary services with unique features and pricing models. Managed databases like AWS Aurora or Google Cloud Spanner provide capabilities unavailable in portable alternatives. Serverless platforms (Lambda, Cloud Functions, Azure Functions) use provider-specific triggers and integrations. Machine learning services train models using proprietary APIs. Organizations must decide whether to accept lock-in for specific capabilities or maintain portability through abstraction.

Design decision: Use managed services and accept lock-in where they provide significant business value. Abstract critical path components that might need migration. Accept that 100% portability eliminates access to cloud-native innovations.

Cost vs. Complexity Trade-Off: Multi-cloud increases operational costs through multiple contracts, training requirements, tooling investments, and network egress fees. Complexity grows with the number of providers: each adds authentication systems, monitoring tools, deployment pipelines, and compliance requirements. Single-cloud simplicity reduces operational burden.

Design decision: Start single-cloud for new organizations. Add additional providers only when specific requirements justify the complexity: regulatory mandates, acquisition integrations, or genuine best-of-breed advantages. Measure total cost of ownership including operational overhead, not just infrastructure costs.

Consistency vs. Optimization: Applications can maintain identical configurations across clouds for consistency or optimize per provider. Consistent configurations simplify operations but leave performance on the table. Provider-specific optimizations improve performance and cost but increase maintenance burden.

Design decision: Maintain consistent core architectures (container orchestration, service mesh, observability). Optimize provider-specific components where differences are substantial: instance types, storage tiers, networking configurations. Document provider-specific optimizations to prevent configuration drift.

Build vs. Buy for Abstraction: Organizations can build custom abstraction layers or adopt third-party tools. Custom abstractions fit specific requirements exactly but require ongoing maintenance. Third-party tools (Terraform, Kubernetes, Crossplane) provide proven abstractions but may not cover all services or may lag provider innovations.

Design decision: Use established open-source abstractions (Kubernetes, Terraform) as the foundation. Build thin custom layers only for business-specific logic. Avoid building abstractions for commodity services like storage or compute.

Active-Active vs. Active-Passive: Active-active deployments run workloads on all providers simultaneously, requiring data synchronization and traffic distribution. Active-passive maintains hot standby capacity, reducing synchronization complexity but increasing waste. Active-active provides better resource utilization and faster regional response times. Active-passive simplifies operations and reduces data consistency challenges.

Design decision: Use active-passive for stateful components with complex consistency requirements. Deploy stateless services active-active for better utilization. Consider costs: active-active doubles infrastructure spending, active-passive wastes standby capacity.

Network Architecture Choices: Direct interconnects between cloud providers offer high bandwidth and low latency but incur fixed costs. VPN tunnels over internet provide flexibility but variable performance. Public internet routing costs nothing but exposes traffic and suffers unpredictable latency.

Design decision: Use direct interconnects (AWS Direct Connect, Azure ExpressRoute, Google Cloud Interconnect) for high-throughput data transfers. VPN tunnels for management traffic and occasional synchronization. Public internet with TLS for client-facing traffic where provider-native CDN edges serve requests.

# Consistency vs optimization: Compute instance selection
class ComputeInstance
  def self.optimal_instance_type(provider, workload_type)
    case [provider, workload_type]
    when [:aws, :compute_intensive]
      'c7i.4xlarge'  # AWS-optimized compute
    when [:aws, :memory_intensive]
      'r7i.4xlarge'  # AWS-optimized memory
    when [:gcp, :compute_intensive]
      'c3-standard-16'  # GCP-optimized compute
    when [:gcp, :memory_intensive]
      'n2-highmem-16'  # GCP-optimized memory
    when [:azure, :compute_intensive]
      'Standard_F16s_v2'  # Azure compute-optimized
    when [:azure, :memory_intensive]
      'Standard_E16s_v5'  # Azure memory-optimized
    end
  end
end

# Consistent approach: Standard instance types
class StandardInstance
  INSTANCE_SPECS = {
    small: { vcpu: 2, memory_gb: 8 },
    medium: { vcpu: 4, memory_gb: 16 },
    large: { vcpu: 8, memory_gb: 32 }
  }.freeze

  # Illustrative mappings; real deployments would maintain a catalog
  PROVIDER_TYPES = {
    aws:   { small: 'm7i.large', medium: 'm7i.xlarge', large: 'm7i.2xlarge' },
    gcp:   { small: 'n2-standard-2', medium: 'n2-standard-4', large: 'n2-standard-8' },
    azure: { small: 'Standard_D2s_v5', medium: 'Standard_D4s_v5', large: 'Standard_D8s_v5' }
  }.freeze

  def self.provision(provider, size)
    specs = INSTANCE_SPECS.fetch(size)
    # Map standard size to provider-specific instance type
    find_closest_match(provider, specs)
  end

  def self.find_closest_match(provider, specs)
    size = INSTANCE_SPECS.key(specs)
    PROVIDER_TYPES.fetch(provider).fetch(size)
  end
end

Ruby Implementation

Ruby applications interact with multiple cloud providers through official SDKs and community gems. Each provider maintains SDK gems with consistent patterns but provider-specific APIs.

AWS SDK: The aws-sdk gem family provides Ruby interfaces to AWS services. Each service has a dedicated gem (aws-sdk-s3, aws-sdk-ec2, aws-sdk-dynamodb). Clients initialize with region and credentials. Methods map to AWS API operations.

require 'aws-sdk-s3'
require 'aws-sdk-dynamodb'

class AwsService
  def initialize(region: 'us-east-1')
    @region = region
  end
  
  def s3_client
    @s3_client ||= Aws::S3::Client.new(region: @region)
  end
  
  def dynamodb_client
    @dynamodb_client ||= Aws::DynamoDB::Client.new(region: @region)
  end
  
  def upload_file(bucket, key, file_path)
    s3_client.put_object(
      bucket: bucket,
      key: key,
      body: File.binread(file_path),  # binary read avoids encoding corruption
      server_side_encryption: 'AES256'
    )
  end
  
  def store_metadata(table, item)
    dynamodb_client.put_item(
      table_name: table,
      item: item
    )
  end
end

Google Cloud SDK: The google-cloud gem family covers GCP services. Individual gems like google-cloud-storage, google-cloud-bigquery, and google-cloud-pubsub follow consistent patterns. Authentication uses service account JSON keys or application default credentials.

require 'google/cloud/storage'
require 'google/cloud/bigquery'

class GcpService
  def initialize(project_id:, credentials_path:)
    @project_id = project_id
    @credentials_path = credentials_path
  end
  
  def storage
    @storage ||= Google::Cloud::Storage.new(
      project_id: @project_id,
      credentials: @credentials_path
    )
  end
  
  def bigquery
    @bigquery ||= Google::Cloud::Bigquery.new(
      project_id: @project_id,
      credentials: @credentials_path
    )
  end
  
  def upload_file(bucket_name, file_path)
    bucket = storage.bucket(bucket_name)
    bucket.create_file(file_path, File.basename(file_path))
  end
  
  def query_data(dataset_id, sql)
    dataset = bigquery.dataset(dataset_id)
    results = dataset.query(sql)  # query with the dataset as default scope
    results.map(&:to_h)
  end
end

Azure SDK: The azure-storage-blob gem and related gems provide Azure storage access. Authentication uses storage account access keys, shared access signatures, or Azure Active Directory tokens.

require 'azure/storage/blob'

class AzureService
  def initialize(storage_account_name:, access_key:)
    @storage_account_name = storage_account_name
    @access_key = access_key
  end
  
  def blob_client
    @blob_client ||= Azure::Storage::Blob::BlobService.create(
      storage_account_name: @storage_account_name,
      storage_access_key: @access_key
    )
  end
  
  def upload_file(container, blob_name, file_path)
    content = File.binread(file_path)  # binary read for arbitrary blob content
    blob_client.create_block_blob(container, blob_name, content)
  end
  
  def list_blobs(container)
    blobs = blob_client.list_blobs(container)
    blobs.map(&:name)
  end
end

Multi-Cloud Abstraction: Ruby applications implement adapter patterns to unify cloud provider interfaces. The strategy pattern selects providers at runtime based on configuration.

class CloudProvider
  attr_reader :config
  
  def initialize(provider_type, config)
    @provider_type = provider_type
    @config = config
  end
  
  def storage_service
    @storage_service ||= case @provider_type
    when :aws
      AwsStorageAdapter.new(config)
    when :gcp
      GcpStorageAdapter.new(config)
    when :azure
      AzureStorageAdapter.new(config)
    else
      raise "Unsupported provider: #{@provider_type}"
    end
  end
  
  def database_service
    @database_service ||= case @provider_type
    when :aws
      AwsDatabaseAdapter.new(config)
    when :gcp
      GcpDatabaseAdapter.new(config)
    when :azure
      AzureDatabaseAdapter.new(config)
    else
      raise "Unsupported provider: #{@provider_type}"
    end
  end
end

class StorageAdapter
  def upload(bucket, key, data)
    raise NotImplementedError
  end
  
  def download(bucket, key)
    raise NotImplementedError
  end
  
  def delete(bucket, key)
    raise NotImplementedError
  end
  
  def list(bucket, prefix: nil)
    raise NotImplementedError
  end
end

class AwsStorageAdapter < StorageAdapter
  def initialize(config)
    @client = Aws::S3::Client.new(
      region: config[:region],
      credentials: config[:credentials]
    )
  end
  
  def upload(bucket, key, data)
    @client.put_object(bucket: bucket, key: key, body: data)
  end
  
  def download(bucket, key)
    @client.get_object(bucket: bucket, key: key).body.read
  end
  
  def delete(bucket, key)
    @client.delete_object(bucket: bucket, key: key)
  end
  
  def list(bucket, prefix: nil)
    resp = @client.list_objects_v2(bucket: bucket, prefix: prefix)
    resp.contents.map(&:key)
  end
end

Credential Management: Multi-cloud applications manage credentials for multiple providers securely. Environment variables separate provider credentials. Credential providers abstract credential retrieval.

class CredentialManager
  def self.aws_credentials
    Aws::Credentials.new(
      ENV['AWS_ACCESS_KEY_ID'],
      ENV['AWS_SECRET_ACCESS_KEY']
    )
  end
  
  def self.gcp_credentials
    ENV['GOOGLE_APPLICATION_CREDENTIALS']
  end
  
  def self.azure_credentials
    {
      storage_account_name: ENV['AZURE_STORAGE_ACCOUNT'],
      storage_access_key: ENV['AZURE_STORAGE_ACCESS_KEY']
    }
  end
  
  def self.for_provider(provider)
    case provider
    when :aws
      aws_credentials
    when :gcp
      gcp_credentials
    when :azure
      azure_credentials
    end
  end
end

# Usage
provider = CloudProvider.new(:aws, {
  region: 'us-west-2',
  credentials: CredentialManager.aws_credentials
})

storage = provider.storage_service
storage.upload('my-bucket', 'data.json', JSON.generate(data))

Error Handling Across Providers: Provider SDKs raise different exception types. Applications must handle provider-specific errors uniformly.

class MultiCloudError < StandardError; end
class ProviderUnavailableError < MultiCloudError; end
class ResourceNotFoundError < MultiCloudError; end
class PermissionDeniedError < MultiCloudError; end

class CloudErrorHandler
  def self.normalize_error(provider, error)
    case provider
    when :aws
      normalize_aws_error(error)
    when :gcp
      normalize_gcp_error(error)
    when :azure
      normalize_azure_error(error)
    end
  end
  
  def self.normalize_aws_error(error)
    case error
    when Aws::S3::Errors::NoSuchKey, Aws::DynamoDB::Errors::ResourceNotFoundException
      ResourceNotFoundError.new(error.message)
    when Aws::S3::Errors::AccessDenied
      PermissionDeniedError.new(error.message)
    when Aws::Errors::ServiceError
      ProviderUnavailableError.new("AWS service error: #{error.message}")
    else
      MultiCloudError.new(error.message)
    end
  end
  
  def self.normalize_gcp_error(error)
    case error
    when Google::Cloud::NotFoundError
      ResourceNotFoundError.new(error.message)
    when Google::Cloud::PermissionDeniedError
      PermissionDeniedError.new(error.message)
    when Google::Cloud::Error
      ProviderUnavailableError.new("GCP error: #{error.message}")
    else
      MultiCloudError.new(error.message)
    end
  end

  def self.normalize_azure_error(error)
    # The legacy Azure storage SDK raises HTTPError carrying a status_code
    case error
    when Azure::Core::Http::HTTPError
      case error.status_code
      when 404 then ResourceNotFoundError.new(error.description)
      when 403 then PermissionDeniedError.new(error.description)
      else ProviderUnavailableError.new("Azure error: #{error.description}")
      end
    else
      MultiCloudError.new(error.message)
    end
  end
end

# Usage in adapter
def download(bucket, key)
  @client.get_object(bucket: bucket, key: key).body.read
rescue => e
  raise CloudErrorHandler.normalize_error(:aws, e)
end

Tools & Ecosystem

Terraform: Infrastructure as Code tool supporting all major cloud providers. Providers abstract cloud APIs into declarative HCL syntax. Modules encapsulate reusable infrastructure patterns. State files track resource mappings across providers. Workspaces separate environments.

Configuration defines resources for multiple providers in a single codebase. Remote backends store state in S3, GCS, or Azure Storage. Provider version constraints ensure compatibility.

# Multi-cloud Terraform configuration
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
  
  backend "s3" {
    bucket = "terraform-state"
    key    = "multi-cloud/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "aws" {
  region = "us-east-1"
}

provider "google" {
  project = "my-gcp-project"
  region  = "us-central1"
}

provider "azurerm" {
  features {}
}

# AWS S3 bucket
resource "aws_s3_bucket" "data" {
  bucket = "multi-cloud-data-aws"
}

# GCP Storage bucket
resource "google_storage_bucket" "data" {
  name     = "multi-cloud-data-gcp"
  location = "US"
}

# Azure Storage account (blob containers are created within it)
resource "azurerm_storage_account" "data" {
  name                     = "multiclouddata"
  resource_group_name      = azurerm_resource_group.main.name
  location                 = "eastus"
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

Kubernetes: Container orchestration platform running on all major clouds. Managed services (EKS, GKE, AKS) provide control planes. Cluster federation links multiple clusters. Service mesh spans clusters across providers. Storage classes abstract provider-specific persistent volumes.

Pulumi: Infrastructure as Code using general-purpose programming languages (TypeScript, Python, Go, C#, and Java; there is currently no Ruby SDK). Cloud providers expose resources through language-native classes. State management tracks resources. Supports multi-cloud deployments with language-native abstractions. A Python sketch equivalent to the Terraform buckets above:

import pulumi
import pulumi_aws as aws
import pulumi_gcp as gcp

# AWS S3 bucket
aws_bucket = aws.s3.Bucket('aws-data', bucket='multi-cloud-data-aws')

# GCP Storage bucket
gcp_bucket = gcp.storage.Bucket('gcp-data', name='multi-cloud-data-gcp', location='US')

pulumi.export('aws_bucket_name', aws_bucket.id)
pulumi.export('gcp_bucket_name', gcp_bucket.name)

Crossplane: Kubernetes-native infrastructure management. Custom Resource Definitions (CRDs) represent cloud resources. Providers connect to AWS, GCP, Azure APIs. Compositions create reusable infrastructure patterns. Control plane reconciles desired state with actual cloud resources.

HashiCorp Vault: Secrets management across clouds. Dynamic secrets generate credentials on-demand. Cloud authentication backends integrate with provider IAM. Secret engines support AWS, GCP, Azure credential generation. Token renewal automates credential rotation.

Prometheus & Grafana: Monitoring stack aggregating metrics from multiple clouds. Prometheus scrapes metrics from applications and infrastructure. Cloud exporters collect provider-specific metrics. Grafana dashboards visualize multi-cloud environments. Alert rules trigger on cross-provider conditions.

Datadog / New Relic: Commercial observability platforms supporting multi-cloud. Agents collect metrics, logs, traces from all providers. Unified dashboards show infrastructure across clouds. Distributed tracing follows requests across provider boundaries. Cloud integrations pull provider-specific metrics.

CloudHealth / CloudCheckr: Multi-cloud cost management platforms. Aggregate spending across providers. Cost allocation tags categorize expenses. Right-sizing recommendations optimize instance types. Reserved instance planning reduces costs. Budget alerts prevent overruns.

Istio: Service mesh managing microservices communication. Runs on Kubernetes clusters across clouds. Mutual TLS secures service-to-service traffic. Traffic management routes requests between providers. Observability exports traces to centralized systems. Multi-cluster mesh spans provider boundaries.

Integration & Interoperability

Cross-Cloud Networking: Establishing connectivity between provider networks requires dedicated connections or VPN tunnels. AWS Direct Connect, Azure ExpressRoute, and Google Cloud Interconnect offer dedicated fiber connections. IPsec VPN tunnels connect VPCs across providers over internet. BGP routing protocols exchange routes between provider networks.

# VPN connection configuration manager
class VpnConnection
  attr_reader :local_provider, :remote_provider, :tunnel_config
  
  def initialize(local:, remote:)
    @local_provider = local
    @remote_provider = remote
    @tunnel_config = generate_tunnel_config
  end
  
  def establish
    case [@local_provider, @remote_provider]
    when [:aws, :gcp]
      setup_aws_to_gcp_tunnel
    when [:aws, :azure]
      setup_aws_to_azure_tunnel
    when [:gcp, :azure]
      setup_gcp_to_azure_tunnel
    end
  end
  
  private
  
  def setup_aws_to_gcp_tunnel
    # AWS VPN Gateway configuration
    vpn_gateway = aws_client.create_vpn_gateway(type: 'ipsec.1')
    customer_gateway = aws_client.create_customer_gateway(
      type: 'ipsec.1',
      public_ip: gcp_vpn_endpoint,
      bgp_asn: 65000
    )
    vpn_connection = aws_client.create_vpn_connection(
      type: 'ipsec.1',
      vpn_gateway_id: vpn_gateway.vpn_gateway_id,
      customer_gateway_id: customer_gateway.customer_gateway_id
    )
    
    # Configure the corresponding GCP side; the GCP tunnel peers with the
    # public IP of the AWS tunnel endpoint (simplified, illustrative calls)
    gcp_gateway = gcp_client.create_vpn_gateway(
      name: 'aws-tunnel',
      network: gcp_network
    )
    gcp_client.create_vpn_tunnel(
      name: 'aws-tunnel-1',
      target_vpn_gateway: gcp_gateway.name,
      peer_ip: aws_tunnel_outside_ip(vpn_connection),  # hypothetical helper
      shared_secret: tunnel_shared_secret              # hypothetical helper
    )
  end
end

Identity Federation: Single sign-on across providers uses SAML or OAuth. An identity provider (Okta, Azure AD, Google Workspace) serves as the authentication source. Cloud providers trust the IdP's assertions. SAML assertions contain user attributes and group memberships. Cloud providers map groups to roles. For workload identities, keyless federation (AWS SAML/OIDC roles, GCP workload identity federation) is preferable to minting long-lived service account keys.

require 'base64'

class IdentityFederation
  def initialize(idp_metadata_url)
    # SamlIdp stands in for an IdP client library (illustrative)
    @idp = SamlIdp.new(metadata_url: idp_metadata_url)
  end

  def federate_to_aws(user_email, role_arn, principal_arn)
    saml_assertion = @idp.generate_assertion(
      user: user_email,
      attributes: {
        # AWS expects "role_arn,principal_arn" in the Role attribute
        'https://aws.amazon.com/SAML/Attributes/Role' => "#{role_arn},#{principal_arn}"
      }
    )

    sts = Aws::STS::Client.new
    response = sts.assume_role_with_saml(
      role_arn: role_arn,
      principal_arn: principal_arn,
      saml_assertion: Base64.strict_encode64(saml_assertion)
    )

    response.credentials
  end
  
  def federate_to_gcp(user_email, service_account)
    token = @idp.generate_token(user: user_email)
    
    iam = Google::Apis::IamV1::IamService.new
    iam.authorization = token
    
    key = iam.create_service_account_key(
      "projects/-/serviceAccounts/#{service_account}",
      Google::Apis::IamV1::CreateServiceAccountKeyRequest.new
    )
    
    JSON.parse(Base64.decode64(key.private_key_data))
  end
end

Data Synchronization: Moving data between clouds requires transfer mechanisms. Object storage replication copies data between S3, GCS, and Azure Blob. Database replication streams changes between cloud databases. Message queues bridge pub/sub systems across providers.

class DataSynchronizer
  def initialize(source_provider:, target_provider:)
    @source = CloudProvider.new(source_provider, source_config)
    @target = CloudProvider.new(target_provider, target_config)
  end
  
  def sync_bucket(source_bucket, target_bucket, prefix: nil)
    objects = @source.storage_service.list(source_bucket, prefix: prefix)
    
    objects.each do |key|
      data = @source.storage_service.download(source_bucket, key)
      @target.storage_service.upload(target_bucket, key, data)
    end
  end
  
  def sync_database_table(source_table, target_table)
    # Read from source
    records = @source.database_service.scan_table(source_table)
    
    # Write to target in batches
    records.each_slice(100) do |batch|
      @target.database_service.batch_write(target_table, batch)
    end
  end
  
  def stream_messages(source_topic, target_topic)
    @source.messaging_service.subscribe(source_topic) do |message|
      @target.messaging_service.publish(target_topic, message)
    end
  end
end

API Gateway Aggregation: Multi-cloud applications expose unified APIs while routing to provider-specific backends. API gateways route requests based on geography, load, or capability. GraphQL federation combines provider-specific APIs into unified schema.

class MultiCloudApiGateway
  def initialize
    @routing_table = load_routing_config
  end
  
  def route_request(request)
    provider = select_provider(request)
    backend = backend_for_provider(provider)
    
    begin
      backend.handle(request)
    rescue ProviderUnavailableError
      failover_provider = next_available_provider(exclude: provider)
      backend_for_provider(failover_provider).handle(request)
    end
  end
  
  private
  
  def select_provider(request)
    # Route based on geography
    return :aws if request.origin_country == 'US'
    return :gcp if request.origin_country == 'JP'
    return :azure if request.origin_region == 'EU'
    
    # Route based on load
    least_loaded_provider
  end
  
  def backend_for_provider(provider)
    case provider
    when :aws
      AwsBackend.new(endpoint: aws_api_endpoint)
    when :gcp
      GcpBackend.new(endpoint: gcp_api_endpoint)
    when :azure
      AzureBackend.new(endpoint: azure_api_endpoint)
    end
  end
end

Observability Integration: Distributed tracing spans provider boundaries using OpenTelemetry. Trace context propagates through HTTP headers. Spans record cloud provider, region, and service. Centralized collection aggregates traces from all providers.

require 'opentelemetry/sdk'
require 'opentelemetry/instrumentation/all'

OpenTelemetry::SDK.configure do |c|
  c.service_name = 'multi-cloud-app'
  c.use_all
end

class TracedMultiCloudOperation
  def execute
    tracer = OpenTelemetry.tracer_provider.tracer('multi-cloud')
    
    tracer.in_span('multi_cloud_operation') do |span|
      # Provider attributes live on the provider-specific child spans;
      # setting them twice on this parent span would overwrite the first value
      aws_result = aws_operation

      span.add_event('aws_complete', attributes: {
        'result.size' => aws_result.size
      })

      gcp_result = gcp_operation

      combine_results(aws_result, gcp_result)
    end
  end
  
  private
  
  def aws_operation
    tracer = OpenTelemetry.tracer_provider.tracer('aws')
    tracer.in_span('aws_s3_read') do
      storage = AwsStorageAdapter.new(aws_config)
      storage.download('bucket', 'key')
    end
  end
  
  def gcp_operation
    tracer = OpenTelemetry.tracer_provider.tracer('gcp')
    tracer.in_span('gcp_bigquery') do
      bq = GcpService.new(project_id: 'project', credentials_path: 'creds.json')
      bq.query_data('dataset', 'SELECT * FROM table')
    end
  end
end

Real-World Applications

Global Content Delivery: Media companies distribute content across multiple clouds for geographic coverage. Video files are stored in S3 (Americas), GCS (Asia-Pacific), and Azure Blob (Europe). CloudFront, Cloud CDN, and Azure CDN serve regional traffic. DNS geo-routing directs users to the nearest provider. Origin servers pull from regional storage.

Content upload workflow stores files in the primary provider. Replication jobs copy to other regions. CDN invalidation purges stale cache entries across providers. Analytics aggregate viewing metrics from all CDNs.
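The upload-then-replicate workflow reduces to a small job over a shared storage interface. The classes below are illustrative stand-ins, not real SDK wrappers; a real job would put the S3, GCS, and Blob clients behind the same `upload`/`download` interface:

```ruby
# Minimal sketch of the replicate-and-invalidate workflow. InMemoryStorage
# stands in for provider-specific storage adapters.
class InMemoryStorage
  def initialize
    @objects = {}
  end

  def upload(key, data)
    @objects[key] = data
  end

  def download(key)
    @objects[key]
  end
end

class ContentReplicationJob
  # replicas: { region_name => storage adapter }
  def initialize(primary:, replicas:)
    @primary = primary
    @replicas = replicas
  end

  # Store in the primary region, copy to every replica, and return the
  # regions whose CDN caches now need invalidation.
  def publish(key, data)
    @primary.upload(key, data)
    @replicas.each_value { |store| store.upload(key, data) }
    @replicas.keys
  end
end

primary  = InMemoryStorage.new
replicas = { 'apac' => InMemoryStorage.new, 'eu' => InMemoryStorage.new }
job = ContentReplicationJob.new(primary: primary, replicas: replicas)
stale_regions = job.publish('videos/intro.mp4', 'bytes')
# stale_regions => ["apac", "eu"]
```

Returning the affected regions keeps cache invalidation decoupled from the copy itself, so each CDN's purge API can be called independently.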

Financial Services Compliance: Banks deploy different applications on different providers based on regulatory requirements. European customer data resides in GCP Europe regions for GDPR compliance. US trading systems run on AWS for SEC requirements. Asian operations use Azure for local data residency laws.

Data never crosses regulatory boundaries. Identity federation authenticates users once while limiting each session to region-appropriate systems. Audit logging streams to a centralized SIEM from all providers. Encryption at rest uses provider-managed or customer-managed keys as each regulation requires.
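One way to enforce that data stays inside its regulatory boundary is to resolve the provider from the customer's jurisdiction and reject everything else. The region-to-provider mapping below is hypothetical, not a statement of where any real bank deploys:

```ruby
# Sketch of jurisdiction-based provider routing with hard denial of
# cross-boundary access. The mapping is illustrative; real deployments
# derive it from compliance policy.
class ResidencyRouter
  REGION_PROVIDERS = {
    'EU'   => :gcp_europe,
    'US'   => :aws_us,
    'APAC' => :azure_asia
  }.freeze

  def provider_for(customer_region)
    REGION_PROVIDERS.fetch(customer_region) do
      raise ArgumentError, "no compliant provider for #{customer_region}"
    end
  end

  # Raise rather than silently reroute when a request would cross a boundary.
  def authorize!(customer_region, target_provider)
    return true if provider_for(customer_region) == target_provider

    raise SecurityError, "cross-boundary access denied for #{customer_region}"
  end
end

router = ResidencyRouter.new
router.provider_for('EU')          # => :gcp_europe
router.authorize!('US', :aws_us)   # => true
```

Failing closed with an exception means a misrouted request surfaces as an error rather than a silent compliance breach.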

Hybrid Analytics Platform: Enterprises run data warehouses on multiple clouds. Transactional data lives in AWS RDS. Google BigQuery performs large-scale analytics. Azure Synapse runs machine learning workloads. Data pipelines replicate between systems.

class HybridAnalyticsPipeline
  def initialize
    @aws_db = Aws::RDS::Client.new(region: 'us-east-1')
    @bigquery = Google::Cloud::Bigquery.new(project_id: 'analytics-project')
    # There is no official Synapse Ruby gem; this client is a hypothetical
    # wrapper around the Synapse REST API
    @synapse = Azure::Synapse::Client.new(workspace_url: ENV['SYNAPSE_WORKSPACE_URL'])
  end
  
  def daily_pipeline
    # Extract from AWS RDS
    transactions = extract_transactions_from_rds
    
    # Load to BigQuery for analytics
    load_to_bigquery(transactions)
    
    # Run BigQuery analysis
    aggregated = run_bigquery_analysis
    
    # Load results to Synapse for ML
    load_to_synapse(aggregated)
    
    # Train model in Azure
    train_model_in_synapse
  end
  
  private
  
  def extract_transactions_from_rds
    # Export RDS data to S3
    s3 = Aws::S3::Client.new
    # ... export logic
  end
  
  def load_to_bigquery(data)
    dataset = @bigquery.dataset('transactions')
    table = dataset.table('daily')
    table.insert(data)
  end
  
  def run_bigquery_analysis
    sql = <<~SQL
      SELECT 
        customer_id,
        SUM(amount) as total_spent,
        COUNT(*) as transaction_count
      FROM transactions.daily
      WHERE date = CURRENT_DATE()
      GROUP BY customer_id
    SQL
    
    @bigquery.query(sql).map(&:to_h)
  end
  
  def load_to_synapse(data)
    # Load to Azure Synapse for ML processing
    @synapse.load_data(pool: 'ml_pool', table: 'customer_features', data: data)
  end
end

Disaster Recovery Architecture: SaaS platforms maintain hot standby across providers. Primary workloads run on AWS. Azure maintains synchronized replicas. Database replication keeps data current. Health checks monitor primary availability. Automated failover switches traffic on failure.

DNS with health checks points to the active provider. Application sessions replicate to both providers. Stateless services scale independently. Database failover requires promoting a replica to primary.
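The health-check-driven failover described above reduces to a small state machine: traffic stays on the primary until it fails several consecutive checks, mirroring how DNS health-check policies behave. Provider symbols and the threshold are illustrative:

```ruby
# Sketch of DNS-style health-check failover. A real setup would feed
# record_check from periodic probes and update DNS on a flip.
class FailoverController
  FAILURE_THRESHOLD = 3

  def initialize(primary:, standby:)
    @primary = primary
    @standby = standby
    @consecutive_failures = 0
  end

  # Record one health-check result and return the provider that
  # should currently receive traffic.
  def record_check(healthy)
    @consecutive_failures = healthy ? 0 : @consecutive_failures + 1
    active_provider
  end

  def active_provider
    @consecutive_failures >= FAILURE_THRESHOLD ? @standby : @primary
  end
end

controller = FailoverController.new(primary: :aws, standby: :azure)
controller.record_check(false)   # => :aws (one failure tolerated)
controller.record_check(false)   # => :aws
controller.record_check(false)   # => :azure (threshold reached, fail over)
controller.record_check(true)    # => :aws (primary recovered)
```

Requiring consecutive failures avoids flapping between providers on a single transient timeout.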

E-commerce Platform: Online retailers use provider-specific strengths. AWS hosts web servers and API gateways. Google Cloud runs recommendation engine using AI Platform. Azure handles payment processing with PCI-compliant infrastructure. Kubernetes clusters on each provider run shared microservices.

Service mesh connects microservices across clouds. API calls traverse VPN tunnels between providers. Product catalog replicates to all providers. Order processing spans multiple clouds: shopping cart (AWS) → payment (Azure) → fulfillment (GCP).

class EcommercePlatform
  def initialize
    @catalog_service = AwsCatalogService.new
    @recommendation_service = GcpRecommendationService.new
    @payment_service = AzurePaymentService.new
  end
  
  def purchase_flow(user_id, items)
    # Get current inventory from AWS
    available_items = @catalog_service.check_availability(items)
    
    # Get personalized recommendations from GCP
    recommendations = @recommendation_service.recommend(
      user_id: user_id,
      items: items
    )
    
    # Process payment through Azure
    payment_result = @payment_service.process_payment(
      user_id: user_id,
      items: available_items,
      amount: calculate_total(available_items)
    )
    
    if payment_result.success?
      # Update inventory on AWS
      @catalog_service.reserve_items(available_items)
      
      # Record analytics on GCP
      @recommendation_service.record_purchase(user_id, available_items)
      
      # Send confirmation, surfacing the GCP recommendations as upsells
      send_confirmation(payment_result.transaction_id, recommendations)
    end
    
    payment_result
  end
end

Research Computing: Universities distribute compute-intensive workloads across clouds for cost optimization. Spot instances on AWS run batch jobs. Preemptible VMs on GCP handle training workloads. Azure low-priority VMs process simulations. Job scheduler allocates work based on current pricing.

Data sets replicate to object storage on all providers. Compute jobs pull input from local storage. Results aggregate to central data lake. Cost tracking attributes spending to research grants.
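Price-based allocation can be sketched as a scheduler that places each batch job on the currently cheapest interruptible pool and estimates the cost for grant attribution. Prices, provider symbols, and job fields here are made up for illustration:

```ruby
# Sketch of price-aware job placement across spot/preemptible pools.
class SpotScheduler
  # prices: { provider => current price per vCPU-hour, in USD }
  def initialize(prices)
    @prices = prices
  end

  def cheapest_provider
    @prices.min_by { |_provider, price| price }.first
  end

  # Assign every job to the cheapest pool and estimate its cost so
  # spending can be attributed back to a research grant.
  def assign(jobs)
    jobs.map do |job|
      provider = cheapest_provider
      { job: job[:name],
        provider: provider,
        grant: job[:grant],
        estimated_cost: (job[:vcpu_hours] * @prices[provider]).round(4) }
    end
  end
end

scheduler = SpotScheduler.new(aws: 0.012, gcp: 0.009, azure: 0.010)
plan = scheduler.assign([{ name: 'sim-42', grant: 'NSF-001', vcpu_hours: 100 }])
# plan => [{ job: "sim-42", provider: :gcp, grant: "NSF-001", estimated_cost: 0.9 }]
```

A production scheduler would refresh prices continuously and also weigh interruption rates, since the cheapest pool is often the most frequently preempted.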

Reference

Provider Service Mapping

| Service Category | AWS | Google Cloud | Azure |
|---|---|---|---|
| Compute | EC2, Lambda, Fargate | Compute Engine, Cloud Functions, Cloud Run | Virtual Machines, Functions, Container Instances |
| Object Storage | S3 | Cloud Storage | Blob Storage |
| Block Storage | EBS | Persistent Disk | Managed Disks |
| File Storage | EFS | Filestore | Files |
| Relational Database | RDS, Aurora | Cloud SQL, Spanner | SQL Database, PostgreSQL |
| NoSQL Database | DynamoDB | Firestore, Bigtable | Cosmos DB |
| Data Warehouse | Redshift | BigQuery | Synapse Analytics |
| Container Registry | ECR | Container Registry | Container Registry |
| Kubernetes | EKS | GKE | AKS |
| Load Balancing | ELB, ALB | Cloud Load Balancing | Load Balancer, Application Gateway |
| DNS | Route 53 | Cloud DNS | DNS |
| CDN | CloudFront | Cloud CDN | CDN |
| Message Queue | SQS | Pub/Sub | Service Bus, Queue Storage |
| Caching | ElastiCache | Memorystore | Cache for Redis |
| Identity | IAM, Cognito | Identity Platform, IAM | Active Directory, Azure AD |
| Monitoring | CloudWatch | Cloud Monitoring | Monitor, Application Insights |
| Logging | CloudWatch Logs | Cloud Logging | Log Analytics |

Ruby SDK Gems

| Provider | Core Gem | Service Gems |
|---|---|---|
| AWS | aws-sdk | aws-sdk-s3, aws-sdk-ec2, aws-sdk-dynamodb, aws-sdk-sqs |
| Google Cloud | google-cloud | google-cloud-storage, google-cloud-bigquery, google-cloud-pubsub |
| Azure | azure | azure-storage-blob, azure-identity, azure-messaging-servicebus |

Network Connectivity Options

| Connection Type | AWS | Google Cloud | Azure | Bandwidth | Latency |
|---|---|---|---|---|---|
| Direct Connect | Direct Connect | Cloud Interconnect | ExpressRoute | 1-100 Gbps | Low |
| VPN | VPN Gateway | Cloud VPN | VPN Gateway | Up to 1.25 Gbps | Medium |
| Internet | Public endpoints | Public endpoints | Public endpoints | Variable | High |
| Cross-Cloud Direct | Direct Connect to Azure/GCP | Partner Interconnect | ExpressRoute to AWS/GCP | 1-10 Gbps | Low |

Authentication Methods

| Method | AWS | Google Cloud | Azure |
|---|---|---|---|
| Access Keys | IAM Access Keys | Service Account Keys | Storage Account Keys |
| Temporary Credentials | STS AssumeRole | Token Exchange | Managed Identity |
| Identity Federation | SAML, OIDC | Workload Identity, OIDC | Azure AD SSO, SAML |
| Service Accounts | IAM Roles | Service Accounts | Managed Identities |

Cost Optimization Strategies

| Strategy | Description | Implementation |
|---|---|---|
| Spot Instances | Use interruptible VMs | AWS Spot, GCP Preemptible, Azure Spot |
| Reserved Capacity | Commit for discounts | Reserved Instances, Committed Use, Reserved VM Instances |
| Right-Sizing | Match instance to workload | Regular analysis, auto-scaling |
| Storage Tiering | Archive infrequent data | S3 Glacier, Archive Storage, Cool Blob |
| Data Transfer | Minimize egress | Regional architecture, CDN |
| Resource Tagging | Track spending | Consistent tag strategy |

Multi-Cloud Architecture Patterns

| Pattern | Use Case | Complexity | Cost |
|---|---|---|---|
| Application-Level | Different apps per provider | Low | Medium |
| Component-Level | Services across providers | High | Medium |
| Redundant | Same app all providers | Medium | High |
| Kubernetes | Container-based portability | High | Medium |
| Serverless | Function-level distribution | Medium | Low |

Failover Strategy Comparison

| Strategy | RTO | RPO | Cost | Complexity |
|---|---|---|---|---|
| Active-Active | Minutes | Near-zero | High | High |
| Active-Passive | Hours | Minutes | Medium | Medium |
| Backup-Restore | Days | Hours | Low | Low |
| Pilot Light | Hours | Hours | Medium | Medium |

Data Synchronization Methods

| Method | Latency | Consistency | Use Case |
|---|---|---|---|
| Synchronous Replication | Low | Strong | Financial transactions |
| Asynchronous Replication | Medium | Eventual | Content distribution |
| Batch Transfer | High | Eventual | Analytics pipelines |
| Change Data Capture | Low | Eventual | Database sync |
| Message Queue Bridge | Medium | Eventual | Event streaming |