name: hybrid-cloud-architect
type: reference
description: "Designs hybrid cloud architectures connecting on-premises infrastructure with public cloud services. Use when designing systems spanning on-prem and cloud, or when the user mentions hybrid cloud or multi-environment architecture."
effort: 4
allowed-tools: Read, Glob, Grep, Write, Edit, Bash
user-invocable: true
when_to_use: "When designing complex multi-cloud or hybrid cloud solutions across AWS, Azure, GCP, and private clouds"

Hybrid Cloud Architect

Designs hybrid and multi-cloud architectures that bridge on-premises infrastructure (OpenStack, VMware, bare metal) with public cloud services (AWS, Azure, GCP).

When to Use

Designing systems that span on-premises and cloud environments
Planning workload placement across private and public clouds
Migrating from on-prem to hybrid architecture
User mentions hybrid cloud, multi-cloud, or cross-environment architecture

When NOT to Use

Single-cloud deployment (use cloud-architect instead)
Pure infrastructure provisioning without architecture decisions (use devops-deploy)
Application-level architecture without infrastructure concerns (use backend-architect)

Workflow

1. Assess Requirements

Gather constraints before designing:

Dimension	Questions
Compliance	Data sovereignty? Regulatory frameworks (HIPAA, PCI-DSS, GDPR)?
Performance	Latency requirements? Data gravity? Real-time vs batch?
Budget	TCO targets? Existing licenses? CapEx vs OpEx preference?
Skills	Team expertise in cloud platforms? OpenStack experience?
Timeline	Migration urgency? Phased approach acceptable?

2. Classify Workloads

For each workload, determine placement:

Criteria	On-Prem	Public Cloud	Edge
Data sovereignty	Yes	Only if region-locked	Yes
Low latency (under 10 ms)	Yes	Only if co-located	Yes
Elastic scaling	No	Yes	No
Cost-sensitive steady-state	Yes	No	-
Managed services needed	No	Yes	No
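
The placement table above can be read as a simple scoring rule. The following is a minimal sketch, not a definitive method: the `Workload` fields, the environment labels, and the choice to treat conditional and "-" cells as "no fit" are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload descriptor; field names are illustrative."""
    name: str
    data_sovereignty: bool = False        # data must stay in a controlled location
    needs_low_latency: bool = False       # under 10 ms to users or devices
    needs_elastic_scaling: bool = False
    steady_state: bool = False            # predictable, cost-sensitive load
    needs_managed_services: bool = False


# One entry per row of the placement table: (attribute, fit per environment).
# Assumption: conditional cells (region-locked, co-located) and "-" count as no fit.
CRITERIA = [
    ("data_sovereignty",       {"on-prem": True,  "public-cloud": False, "edge": True}),
    ("needs_low_latency",      {"on-prem": True,  "public-cloud": False, "edge": True}),
    ("needs_elastic_scaling",  {"on-prem": False, "public-cloud": True,  "edge": False}),
    ("steady_state",           {"on-prem": True,  "public-cloud": False, "edge": False}),
    ("needs_managed_services", {"on-prem": False, "public-cloud": True,  "edge": False}),
]


def score_placement(workload: Workload) -> dict[str, int]:
    """Count how many of the workload's active criteria each environment satisfies."""
    scores = {"on-prem": 0, "public-cloud": 0, "edge": 0}
    for attribute, fit in CRITERIA:
        if getattr(workload, attribute):
            for env, ok in fit.items():
                scores[env] += int(ok)
    return scores


def candidate_environments(workload: Workload) -> list[str]:
    """Return the environments with the highest score; ties are all returned."""
    scores = score_placement(workload)
    best = max(scores.values())
    return [env for env, score in scores.items() if score == best]
```

A tie (for example, a workload with no criteria set) returns every environment, which signals that the decision needs human judgment and a written justification in the placement matrix.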

3. Design Connectivity

Choose connectivity based on requirements:

Options:

VPN: Low cost, lower bandwidth, good for non-critical traffic
Dedicated (Direct Connect / ExpressRoute / Interconnect): High bandwidth, low latency, SLA-backed
SD-WAN: Multi-site, dynamic path selection, cost optimization
Service mesh: For cross-cloud microservices communication (Istio, Linkerd)
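
As a rough illustration of how the options above might be combined, the sketch below picks one underlay (VPN, dedicated link, or SD-WAN) and optionally adds a service mesh as an overlay. The function name, its parameters, and the "more than two sites" threshold are assumptions, not fixed rules.

```python
from __future__ import annotations


def choose_connectivity(
    needs_sla: bool,
    high_bandwidth: bool,
    site_count: int,
    cross_cloud_microservices: bool,
) -> list[str]:
    """Return connectivity options, underlay first, following the list above."""
    if needs_sla or high_bandwidth:
        # Direct Connect / ExpressRoute / Interconnect
        options = ["dedicated"]
    elif site_count > 2:
        # Assumption: more than two sites justifies dynamic path selection.
        options = ["sd-wan"]
    else:
        options = ["vpn"]
    if cross_cloud_microservices:
        # The mesh runs on top of whichever underlay was chosen.
        options.append("service-mesh")
    return options
```

In practice these options are often layered, for example a dedicated link with a VPN as backup, so treat the result as a starting point for the connectivity plan.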

4. Design Security Architecture

Apply zero-trust across environments:

Identity federation: AD/LDAP to cloud IAM (SAML/OIDC)
Network segmentation: Micro-segmentation, security groups across clouds
Encryption: In-transit (TLS) + at-rest, key management per environment
Secret management: Centralized (Vault) or cloud-native (KMS/Key Vault)
Compliance: Per-environment compliance controls, audit logging
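
One way to keep the security model consistent is to check that every environment declares each control in the list above. This is a minimal sketch; the control identifiers and the shape of the `declared` mapping are hypothetical.

```python
from __future__ import annotations

# Hypothetical identifiers for the controls listed above.
REQUIRED_CONTROLS = frozenset({
    "identity_federation",
    "network_segmentation",
    "encryption_in_transit",
    "encryption_at_rest",
    "secret_management",
    "audit_logging",
})


def missing_controls(declared: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per environment, the required controls it has not declared."""
    gaps: dict[str, set[str]] = {}
    for env, controls in declared.items():
        missing = REQUIRED_CONTROLS - controls
        if missing:
            gaps[env] = set(missing)
    return gaps
```

An empty result means every environment covers the full list; any non-empty entry belongs in the security model deliverable as an open gap.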

5. Design Data Strategy

Pattern	Use When	Tools
Active-active replication	RPO=0, RTO less than 1min	Database-native replication, Kafka
Active-passive	RPO less than 15min, RTO less than 1hr	Cross-cloud backup, DNS failover
Data mesh	Domain ownership, distributed teams	Data catalogs, federated queries
Edge preprocessing	IoT, real-time analytics	Edge compute to cloud aggregation

6. Define Infrastructure as Code

Multi-cloud IaC strategy:

Terraform/OpenTofu: Cross-cloud resource provisioning
Ansible: Configuration management
Pulumi/CDK: Complex orchestration logic
OPA/Conftest: Policy as Code
GitOps (ArgoCD/Flux): Multi-environment deployment

State management:

Remote state with locking (S3+DynamoDB, Azure Storage, GCS)
Separate state per environment, shared modules
State migration plan for cross-cloud moves
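
The "separate state per environment" rule can be sketched as a naming scheme in which each environment and stack pair gets its own state object. The key layout and the `terraform.tfstate` file name are assumptions chosen for illustration; adapt them to the backend in use.

```python
from __future__ import annotations

from itertools import product


def state_keys(environments: list[str], stacks: list[str]) -> dict[tuple[str, str], str]:
    """Map each (environment, stack) pair to its own state object key."""
    return {
        (env, stack): f"{env}/{stack}/terraform.tfstate"
        for env, stack in product(environments, stacks)
    }
```

Because the environment is the first path segment, access to the state of one environment can be restricted by key prefix while all environments still share the same modules.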

7. Design Observability

Unified monitoring across environments:

Metrics: Prometheus + Thanos / Grafana Mimir (cross-cloud)
Logs: Centralized logging (ELK/Loki) with per-environment collectors
Traces: Distributed tracing (Jaeger/Tempo) across service boundaries
Alerting: Unified alerting with environment-aware routing
Cost monitoring: Per-cloud cost dashboards, anomaly detection
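
Environment-aware alert routing can be illustrated with a small lookup keyed on alert labels. The label names (`environment`, `severity`) and the receiver names are hypothetical; a real deployment would express the same idea in the alerting tool's own routing configuration.

```python
from __future__ import annotations

# Hypothetical routing table: (environment, severity) -> receiver.
ROUTES = {
    ("on-prem", "critical"): "datacenter-oncall",
    ("aws", "critical"): "cloud-oncall",
    ("azure", "critical"): "cloud-oncall",
    ("gcp", "critical"): "cloud-oncall",
}
DEFAULT_RECEIVER = "platform-team"


def route_alert(labels: dict[str, str]) -> str:
    """Pick a receiver from the alert's environment and severity labels."""
    key = (labels.get("environment", ""), labels.get("severity", ""))
    return ROUTES.get(key, DEFAULT_RECEIVER)
```

Alerts without an explicit route fall back to a single default receiver, so a missing or misspelled label never drops an alert silently.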

8. Plan Disaster Recovery

Tier	Strategy	RPO	RTO	Cost
Tier 1	Active-active multi-cloud	0	less than 1min	High
Tier 2	Active-passive cross-cloud	less than 15min	less than 1hr	Medium-High
Tier 3	Backup + manual failover	less than 24hr	less than 4hr	Medium
Tier 4	Backup only	less than 24hr	less than 24hr	Low
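
To show how the tier table might be applied, the sketch below returns the cheapest tier whose RPO and RTO bounds fit within a workload's requirements. Treating the "less than" bounds as inclusive upper limits, and the names `DrTier` and `cheapest_tier`, are assumptions.

```python
from __future__ import annotations

from datetime import timedelta
from typing import NamedTuple, Optional


class DrTier(NamedTuple):
    name: str
    strategy: str
    rpo: timedelta  # worst-case data loss for this tier
    rto: timedelta  # worst-case recovery time for this tier


# Rows of the table above, ordered from cheapest to most expensive.
TIERS = [
    DrTier("Tier 4", "Backup only", timedelta(hours=24), timedelta(hours=24)),
    DrTier("Tier 3", "Backup + manual failover", timedelta(hours=24), timedelta(hours=4)),
    DrTier("Tier 2", "Active-passive cross-cloud", timedelta(minutes=15), timedelta(hours=1)),
    DrTier("Tier 1", "Active-active multi-cloud", timedelta(0), timedelta(minutes=1)),
]


def cheapest_tier(required_rpo: timedelta, required_rto: timedelta) -> Optional[DrTier]:
    """Return the cheapest tier that meets both requirements, or None if none does."""
    for tier in TIERS:
        if tier.rpo <= required_rpo and tier.rto <= required_rto:
            return tier
    return None
```

For example, a workload that tolerates one hour of data loss and two hours of downtime lands on Tier 2, because Tiers 3 and 4 allow up to 24 hours of data loss. A `None` result means the requirement is stricter than any tier in the table.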

DR automation:

Automated failover triggers (health checks, circuit breakers)
Runbook automation for failover procedures
Regular DR testing schedule (quarterly minimum)
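
An automated failover trigger driven by health checks could look like the sketch below: it fires once after a configurable number of consecutive failures and ignores isolated blips. The class name and the default threshold of three are assumptions.

```python
class FailoverTrigger:
    """Fire once after N consecutive failed health checks."""

    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.failed_over = False

    def record(self, healthy: bool) -> bool:
        """Record one health-check result; return True when failover should start."""
        if healthy:
            # A single success resets the streak, so transient errors do not add up.
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold and not self.failed_over:
            self.failed_over = True
            return True
        return False
```

The `failed_over` flag keeps the trigger from firing repeatedly while the primary stays down; failing back is left to the runbook rather than automated here.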

Output

Deliver:

Architecture diagram: showing all environments, connectivity, data flow
Workload placement matrix: workload to environment with justification
Connectivity plan: network topology, bandwidth, latency requirements
Security model: identity, network, data security per environment
Cost estimate: TCO comparison, per-environment breakdown
Migration plan: phased approach with rollback procedures (if applicable)

Platform-Specific Notes

OpenStack Integration

Services: Nova (compute), Neutron (network), Cinder (block storage), Swift (object), Keystone (identity)
Hybrid identity: Keystone federation with cloud IAM
Networking: Provider networks, VLAN/VXLAN for multi-tenant isolation

AWS Hybrid

Outposts: AWS hardware in on-prem data center
EKS Anywhere: Kubernetes on-prem with EKS compatibility
Direct Connect: Dedicated network connection

Azure Hybrid

Azure Arc: Manage resources across environments from Azure
Azure Stack: On-prem Azure services
ExpressRoute: Dedicated private connection

GCP Hybrid

Anthos: Multi-cloud Kubernetes management
Distributed Cloud: GCP services on-prem
Cloud Interconnect: Dedicated network connection
