Tools: AWS Cloud Migration Checklist: Moving a Legacy App to the Cloud

Phase 1: Pre-Migration Audit (Do Not Skip This)

Phase 2: Architecture Decisions

Phase 3: Infrastructure as Code (Non-Negotiable)

Phase 4: The Actual Migration

Phase 5: Post-Migration Validation

Cloud migration projects fail for predictable reasons. Not because the technology is hard: AWS is well-documented and the tooling is mature. They fail because teams skip the audit phase, underestimate data migration complexity, and try to lift-and-shift architectures that were never designed for the cloud.

I've led cloud migrations for US businesses across FinTech, healthcare, and SaaS, from single-server monoliths to multi-region systems. This checklist captures what actually needs to happen before, during, and after migration.

Phase 1: Pre-Migration Audit (Do Not Skip This)

Before touching AWS, you need a complete picture of what you're moving.

Application inventory:

Phase 2: Architecture Decisions

Choose your migration strategy:

For most US SaaS companies I work with, replatform is the right starting point. You get the reliability and scaling benefits of managed services without a full rewrite.

Key AWS service decisions:

Phase 3: Infrastructure as Code (Non-Negotiable)

If you're clicking through the AWS console to set up production infrastructure, you're building technical debt with every click. Use Terraform or AWS CDK from day one. Infrastructure as code means your entire environment is reproducible, reviewable in PRs, and recoverable after a disaster.

Phase 4: The Actual Migration

Database migration approach:

For minimal downtime, use the strangler pattern:

Zero-downtime deployment setup:

The health check is critical. ECS won't route traffic to a new container until it passes, so a bad deployment never receives production traffic and, with the deployment circuit breaker's rollback option enabled, rolls back automatically instead of taking down production.

The Cost Trap

AWS costs can surprise teams coming from fixed-price hosting. Two common traps:

Data transfer costs: Moving data out of AWS is expensive. If your app serves large files to US users, compare the cost of serving them through a CloudFront distribution against direct S3 transfer before you launch.

RDS instance sizing: Teams often start with a db.r6g.4xlarge "to be safe" and pay $800+/month for a database handling 10 req/sec. Start smaller, enable Performance Insights, and scale based on actual metrics, not fear.
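To make the two cost traps concrete, here is a rough back-of-the-envelope sketch in Python. Every rate in it is a placeholder chosen for illustration, not a current AWS price, and the function names are hypothetical. Pull real figures for your region from the AWS pricing pages or the Pricing Calculator before drawing conclusions.

```python
HOURS_PER_MONTH = 730  # average hours in a month, a common billing approximation


def monthly_transfer_cost(gb_out_per_month: float, rate_per_gb: float) -> float:
    """Estimated monthly data-transfer-out cost at a flat per-GB rate."""
    return gb_out_per_month * rate_per_gb


def monthly_instance_cost(hourly_rate: float, multi_az: bool = False) -> float:
    """Estimated monthly cost of an always-on database instance.

    Assumes Multi-AZ roughly doubles the instance charge (a simplification).
    """
    multiplier = 2 if multi_az else 1
    return hourly_rate * HOURS_PER_MONTH * multiplier


# Placeholder rates for illustration only; NOT real AWS prices.
EXAMPLE_DIRECT_S3_RATE = 0.09    # $/GB, hypothetical
EXAMPLE_CLOUDFRONT_RATE = 0.085  # $/GB, hypothetical
EXAMPLE_LARGE_DB_HOURLY = 1.20   # $/hour, hypothetical oversized instance
EXAMPLE_SMALL_DB_HOURLY = 0.15   # $/hour, hypothetical right-sized instance

if __name__ == "__main__":
    gb_out = 5_000  # hypothetical monthly egress in GB
    direct = monthly_transfer_cost(gb_out, EXAMPLE_DIRECT_S3_RATE)
    cdn = monthly_transfer_cost(gb_out, EXAMPLE_CLOUDFRONT_RATE)
    print(f"Direct S3 egress:  ${direct:,.2f}/month")
    print(f"CloudFront egress: ${cdn:,.2f}/month")

    large = monthly_instance_cost(EXAMPLE_LARGE_DB_HOURLY)
    small = monthly_instance_cost(EXAMPLE_SMALL_DB_HOURLY)
    print(f"Oversized DB:   ${large:,.2f}/month")
    print(f"Right-sized DB: ${small:,.2f}/month")
```

The point of the sketch is the shape of the comparison rather than the numbers: egress scales with traffic, while an oversized instance is a fixed monthly charge you pay whether or not the load ever arrives.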
Cloud migration done right leaves you with infrastructure that's more reliable, more scalable, and often cheaper than what you started with. Done wrong, it's an expensive, slower version of what you had before.

If you're planning a cloud migration for a US business application, this is core work I do, from architecture planning through live cutover. More at waqarhabib.com/services/cloud-migration.

Key AWS service decisions:

```text
Compute:
  Stateless web servers → ECS (Fargate) or EC2 Auto Scaling Groups
  Scheduled jobs        → ECS Scheduled Tasks or Lambda (if under 15 min)
  Long-running workers  → ECS with SQS trigger

Database:
  PostgreSQL / MySQL    → RDS with Multi-AZ enabled
  Redis cache           → ElastiCache (Redis)
  File storage          → S3 with CloudFront CDN
  Search                → OpenSearch Service

Networking:
  Load balancing        → Application Load Balancer (ALB)
  DNS                   → Route 53
  CDN                   → CloudFront
  Secrets               → AWS Secrets Manager (never hardcode credentials)
```

Infrastructure as code example:

```hcl
# terraform/main.tf (example): ECS cluster + RDS
resource "aws_ecs_cluster" "app" {
  name = "${var.app_name}-${var.environment}"

  setting {
    name  = "containerInsights"
    value = "enabled"
  }
}

resource "aws_db_instance" "postgres" {
  identifier              = "${var.app_name}-${var.environment}"
  engine                  = "postgres"
  engine_version          = "15.3"
  instance_class          = var.db_instance_class
  allocated_storage       = var.db_storage_gb
  multi_az                = var.environment == "production"
  deletion_protection     = var.environment == "production"
  backup_retention_period = 7

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]
}
```

ECS task definition snippet:

```json
{
  "family": "app-production",
  "containerDefinitions": [
    {
      "name": "web",
      "image": "your-account.dkr.ecr.us-east-1.amazonaws.com/app:latest",
      "healthCheck": {
        "command": ["CMD-SHELL", "curl -f http://localhost:3000/health || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": 60
      }
    }
  ],
  "requiresCompatibilities": ["FARGATE"]
}
```

Pre-migration audit checklist:

- [ ] List every service, process, and scheduled job running on your current infrastructure
- [ ] Map all external dependencies: third-party APIs, payment processors, email providers
- [ ] Document every port, protocol, and network path between services
- [ ] Identify stateful vs stateless components: They migrate differently

- [ ] Catalog every database: type, size, read/write patterns, peak load times
- [ ] Identify data with compliance constraints (PII, PHI, PCI): This affects your AWS region choices and service selections
- [ ] Measure your RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements: These drive your backup and replication strategy
- [ ] Document any data that cannot be in certain geographic regions (US-only requirements are common for government and healthcare clients)

- [ ] Capture 30 days of traffic patterns: requests/second, peak times, geographic distribution
- [ ] Profile database query patterns: identify slow queries, N+1 problems, missing indexes
- [ ] Measure current response times as your benchmark: You need to beat these after migration

Database migration steps:

- Set up RDS instance and run a full dump/restore to establish the baseline
- Enable continuous replication from old DB to RDS (AWS DMS handles this)
- Run both databases in parallel, validate data consistency
- Switch application read traffic to RDS, keep writes going to old DB
- Switch write traffic to RDS
- Monitor for 48 hours
- Decommission old database

Post-migration validation checklist:

- [ ] Response times at or below pre-migration baseline
- [ ] Error rates within normal range for 72 hours
- [ ] Database query performance profiled on RDS: Slow query log enabled
- [ ] CloudWatch alarms configured for: CPU, memory, database connections, error rates, 5xx responses
- [ ] Cost Explorer reviewed: Confirm you're not running over-provisioned instances
- [ ] Security: all services in private subnets, no public RDS endpoints, WAF in front of ALB
- [ ] Backup restore tested: Not just that backups run, but that you can actually restore from them
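
The first two post-migration checks (response times at or below baseline, error rates within normal range) are easy to automate once you have exported latency samples and request counts. The sketch below is one possible way to do that in plain Python. The function names, the choice of p95 as the comparison metric, and the 1% error-rate threshold are all assumptions for illustration; substitute whatever "normal range" means for your application.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class ValidationResult:
    passed: bool
    reasons: list[str]


def p95(samples: list[float]) -> float:
    """95th percentile of latency samples (requires at least 2 values)."""
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return quantiles(samples, n=100, method="inclusive")[94]


def validate_migration(
    baseline_ms: list[float],
    current_ms: list[float],
    total_requests: int,
    failed_requests: int,
    max_error_rate: float = 0.01,  # hypothetical threshold, tune per application
) -> ValidationResult:
    """Compare post-migration latency and error rate against the baseline."""
    reasons: list[str] = []

    baseline_p95 = p95(baseline_ms)
    current_p95 = p95(current_ms)
    if current_p95 > baseline_p95:
        reasons.append(
            f"p95 latency {current_p95:.1f} ms exceeds baseline {baseline_p95:.1f} ms"
        )

    error_rate = failed_requests / total_requests if total_requests else 0.0
    if error_rate > max_error_rate:
        reasons.append(
            f"error rate {error_rate:.2%} exceeds threshold {max_error_rate:.2%}"
        )

    return ValidationResult(passed=not reasons, reasons=reasons)
```

A check like this could run on a schedule during the 72-hour observation window, with the baseline samples taken from the 30 days of traffic captured in the pre-migration audit.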