Problem
The current `aws-main.tf` chains all AWS resource creation into a single tightly coupled module: EC2, EBS, Security Groups, Route53, RKE2, Rancher import, NFS, PostgreSQL, and ActiveMQ all share one Terraform state and one sequential `depends_on` chain.
This means:
- Adding a DNS subdomain requires Terraform to evaluate all EC2 and SG state, so even a small DNS change is risky
- EBS volumes are inline `ebs_block_device` blocks inside `aws_instance`, so volumes can't be resized or added without touching the instance resource
- Security groups for all 4 node types (nginx, control-plane, etcd, worker) are defined in one combined variable blob, so they can't be updated independently
- RKE2 setup and Rancher import are in the same module, so the import can't be re-run without re-running RKE2 setup
- There is no isolation between the DNS, compute, cluster, and setup layers
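To illustrate the EBS point above, here is a minimal sketch of a volume managed as standalone resources instead of an inline `ebs_block_device` block. The variable names, volume size, volume type, and device name are all assumptions for illustration and are not taken from the existing `aws-main.tf`.

```hcl
# Hypothetical inputs: in a decoupled layout these would come from the
# compute component (for example through a tag-based data source lookup).
variable "instance_id" {
  type        = string
  description = "ID of the EC2 instance to attach the volume to (assumed input)"
}

variable "availability_zone" {
  type        = string
  description = "Availability zone of the instance; the volume must be in the same zone"
}

# Standalone volume: size and type can change without editing aws_instance.
resource "aws_ebs_volume" "data" {
  availability_zone = var.availability_zone
  size              = 100 # GiB, illustrative value
  type              = "gp3"
}

# The attachment is its own resource, so adding another volume means
# adding another volume/attachment pair rather than modifying the instance.
resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdf" # illustrative device name
  volume_id   = aws_ebs_volume.data.id
  instance_id = var.instance_id
}
```

With this shape, resizing or adding a volume produces a plan that only involves the volume and attachment resources. Inline `ebs_block_device` blocks and standalone attachments should not be mixed on the same instance, so a migration would likely need to move each instance over in one step.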
Proposed Solution
Decouple into 3 layers, with each component in a layer getting its own Terraform root and state file:
```
Layer 1 — Networking (rarely changes)
└── security-groups
Layer 2 — Compute (changes when scaling)
├── ec2
├── ebs
└── dns
Layer 3 — Cluster setup (runs after compute)
├── nginx
├── rke2
├── rancher-import
├── nfs
├── postgresql
└── activemq
```
Each component has its own state file (GPG-encrypted, committed to git). Components discover dependencies via AWS data sources (tag-based lookup) rather than state sharing.
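As a rough sketch of the tag-based lookup described above, the `ec2` component could find the security group created by the `security-groups` component through a data source instead of reading that component's state. The tag keys (`Cluster`, `Role`), variable names, and instance type are hypothetical and would need to match whatever tagging convention is actually adopted.

```hcl
# Hypothetical inputs for the ec2 component.
variable "cluster_name" {
  type        = string
  description = "Cluster identifier used in resource tags (assumed convention)"
}

variable "worker_ami" {
  type        = string
  description = "AMI ID for worker nodes (assumed input)"
}

variable "worker_subnet_id" {
  type        = string
  description = "Subnet ID for worker nodes (assumed input)"
}

# Discover the worker security group by tags. No state from the
# security-groups component is read here.
data "aws_security_group" "worker" {
  tags = {
    Cluster = var.cluster_name
    Role    = "worker"
  }
}

resource "aws_instance" "worker" {
  ami                    = var.worker_ami
  instance_type          = "t3.large" # illustrative value
  subnet_id              = var.worker_subnet_id
  vpc_security_group_ids = [data.aws_security_group.worker.id]

  # Tag the instance the same way so later components (ebs, dns, rke2)
  # can discover it with their own data sources.
  tags = {
    Cluster = var.cluster_name
    Role    = "worker"
  }
}
```

The `aws_security_group` data source fails the plan if the tags match zero groups or more than one, so the tag set has to identify exactly one group per node type. That failure also acts as a check that Layer 1 has been applied before Layer 2 runs.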
Sub-tasks