Retail Group Migrates 40 Servers to AWS: 6 Weeks, Zero Customer-Facing Outages
How a national retail group migrated 40 on-premises VMware servers to AWS in six weeks, achieving a 35% infrastructure cost reduction and 99.9% uptime.
Key Outcomes
- ✓ 35% infrastructure cost reduction within 6 months
- ✓ 99.9% uptime achieved post-migration
- ✓ Disaster recovery time reduced from days to minutes
- ✓ Staff self-sufficient within 4 weeks of training
The Challenge
A national retail group with 12 locations had been running on the same VMware on-premises infrastructure for seven years. The setup had served them well, but by 2023 the cracks were showing:
- Hardware approaching end-of-support with no replacement roadmap
- A disaster recovery plan that existed on paper but had never been tested
- Maintenance overhead that consumed a significant share of internal IT time
- No ability to scale compute up or down for seasonal peaks (Christmas, EOFY)
The business had evaluated cloud migration twice before but backed away due to cost concerns and perceived complexity. This time, they wanted a concrete plan — not a vendor presentation.
Our Approach
Phase 1: Workload Inventory and Classification (Weeks 1–2)
Before any migration work began, we audited all 40 servers to understand what was running, what depended on what, and how each workload should be migrated.
The classification broke down as:
- 28 servers: Rehost (lift and shift to EC2)
- 8 servers: Replatform (move to AWS managed services — RDS for databases, EFS for file shares)
- 3 servers: Retire (no longer actively used, confirmed with the business)
- 1 server: Retain on-premises (legacy POS integration requiring local network access)
This inventory phase also uncovered four undocumented dependencies: applications that called other servers via hardcoded IP addresses and would have broken as soon as those servers moved. Finding them early avoided remediation work during cutover.
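Hardcoded addresses of this kind can often be surfaced with a simple text scan before migration. The following is a minimal sketch in Python, not the tooling used on this project: the function name `find_hardcoded_ips` and the sample configuration are hypothetical, and it assumes the dependencies appear as private IPv4 literals in readable config or script files.

```python
import ipaddress
import re

# Candidate dotted-quad pattern; ipaddress validates each match afterwards.
IPV4_CANDIDATE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def find_hardcoded_ips(text: str) -> list[tuple[int, str]]:
    """Return (line_number, address) pairs for private IPv4 literals in text."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in IPV4_CANDIDATE.findall(line):
            try:
                addr = ipaddress.ip_address(match)
            except ValueError:
                continue  # looks like an address but is not valid, e.g. 999.1.1.1
            # Private, non-loopback addresses usually point at other internal servers.
            if addr.is_private and not addr.is_loopback:
                hits.append((line_no, match))
    return hits


# Hypothetical config file contents, for illustration only.
SAMPLE_CONFIG = """\
db_host = 10.0.4.17
cache_host = cache.internal.example
report_url = http://192.168.1.20:8080/reports
app_version = 3.12.0
"""

for line_no, address in find_hardcoded_ips(SAMPLE_CONFIG):
    print(f"line {line_no}: {address}")
```

A scan like this only finds addresses stored as plain text. Dependencies buried in compiled binaries or database rows still need network flow analysis or conversations with application owners.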
Phase 2: Environment Build and Testing (Weeks 2–4)
We built the target AWS environment in parallel with the existing infrastructure:
- VPC architecture with public and private subnets, security groups, and NACLs
- IAM roles and policies following least-privilege principles
- RDS instances for the two primary databases (MSSQL → RDS for SQL Server)
- CloudFront distribution for the customer-facing web assets
- S3 buckets with lifecycle policies for document storage
All migrated workloads were tested in the AWS environment before any production cutover.
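To illustrate the S3 lifecycle policies mentioned in the list above, here is a minimal sketch that builds a lifecycle configuration as a plain Python dictionary. The prefix and day counts are assumptions chosen for the example, not the retail group's actual retention rules, and `build_lifecycle_configuration` is a hypothetical helper. The resulting structure follows the shape that S3 expects for a lifecycle configuration (for example, as the `LifecycleConfiguration` argument to boto3's `put_bucket_lifecycle_configuration`).

```python
import json


def build_lifecycle_configuration(prefix: str, ia_after_days: int,
                                  glacier_after_days: int,
                                  expire_after_days: int) -> dict:
    """Build an S3 lifecycle configuration with two transitions and an expiry."""
    # S3 requires objects to be at least 30 days old before moving to STANDARD_IA.
    if not 30 <= ia_after_days < glacier_after_days < expire_after_days:
        raise ValueError("expected 30 <= ia < glacier < expire (days)")
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{prefix.strip('/') or 'all'}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical values: tier documents down after 30 and 90 days, delete after 7 years.
config = build_lifecycle_configuration("documents/", 30, 90, 2555)
print(json.dumps(config, indent=2))
```

Keeping the policy in code rather than setting it by hand in the console makes it reviewable and repeatable across buckets.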
Phase 3: Migration Execution (Weeks 4–6)
Migration was sequenced to move low-risk, non-customer-facing workloads first. This gave the team confidence and surfaced any unexpected issues before the critical systems were moved.
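The sequencing rule can be expressed as a simple sort. The sketch below is illustrative only: the workload names, the 1 to 3 risk scale, and the `migration_order` function are hypothetical, and a real plan would also account for dependencies between systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    risk: int              # 1 = low, 3 = high (hypothetical scale)
    customer_facing: bool


def migration_order(workloads: list[Workload]) -> list[Workload]:
    """Internal systems first, then customer-facing; lowest risk first within each."""
    # False sorts before True, so non-customer-facing workloads lead the list.
    return sorted(workloads, key=lambda w: (w.customer_facing, w.risk, w.name))


# Hypothetical inventory entries, for illustration only.
INVENTORY = [
    Workload("web-storefront", risk=3, customer_facing=True),
    Workload("file-share", risk=1, customer_facing=False),
    Workload("reporting-db", risk=2, customer_facing=False),
    Workload("loyalty-api", risk=2, customer_facing=True),
]

for position, workload in enumerate(migration_order(INVENTORY), start=1):
    print(position, workload.name)
```

With this sample inventory the internal file share moves first and the customer-facing storefront moves last.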
Cutover for customer-facing systems was executed during a Sunday evening maintenance window. The longest planned downtime for any individual system was 47 minutes.
No unplanned customer-facing outages occurred.
Phase 4: Training and Handover (Ongoing)
All internal IT staff completed AWS Cloud Practitioner training. We documented runbooks for the 20 most common operational tasks and held three structured knowledge transfer sessions.
By week 10, the internal team was managing the environment independently.
The Outcomes
Cost: Infrastructure costs dropped 35% within six months. The largest savings came from right-sizing over-provisioned on-premises servers and eliminating hardware maintenance contracts.
Reliability: The new environment achieved 99.9% uptime in the first six months. The business ran its most successful Christmas trading period on record without a single infrastructure incident.
Resilience: Disaster recovery changed from a theoretical plan to an automated, tested capability. RTO (Recovery Time Objective) dropped from an estimated 24–48 hours to under 30 minutes.
Scalability: The retail group now scales compute automatically during peak periods — something that was impossible with fixed on-premises hardware.
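To put the reliability figure in context, an uptime percentage can be converted into a downtime budget. This is a rough worked calculation, assuming six months is about 182 days and that uptime is measured across the whole period; `downtime_budget_minutes` is a hypothetical helper, not part of any monitoring tool used here.

```python
def downtime_budget_minutes(uptime_percent: float, period_days: int) -> float:
    """Minutes of downtime permitted by an uptime percentage over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


# Assumption: six months is treated as 182 days.
budget = downtime_budget_minutes(99.9, 182)
print(f"{budget:.0f} minutes (~{budget / 60:.1f} hours)")
```

Under these assumptions, 99.9% uptime over six months allows a little over four hours of total downtime. The previous estimated recovery time of 24 to 48 hours would have used up that allowance several times over in a single incident.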
Key Lessons
The inventory phase is not optional. Every time we've seen cloud migrations fail or run over budget, it's because the team skipped or rushed the workload inventory. The four hidden dependencies we found would have caused significant outages if discovered during cutover.
Sequencing matters. Moving low-risk workloads first builds team confidence, surfaces issues early, and allows the business to validate the process before the critical systems are moved.
Training is part of the migration. A successful technical migration that leaves the internal team unable to operate the environment is a failed project. Build training into the plan from the start rather than treating it as an afterthought.