Cloud migration is often described as “move servers to AWS/Azure/GCP”. In reality, successful migration is a program: you standardize security and networking, map dependencies, choose the right strategy per application, migrate in waves, then optimize cost and reliability after go-live.
This guide walks you through a step-by-step approach that suits small businesses and larger teams alike. It is intentionally practical: what to do, what decisions to make, and what to avoid.
Best success indicator
A “successful” migration is not just workloads running in the cloud. It’s when ownership is clear, monitoring is in place, security controls are enforced by default, and costs are predictable.
1. What Cloud Migration Is (And What It Is Not)
Cloud migration is the process of moving applications, data, and supporting infrastructure from on-premises (or another environment) into a cloud platform. Depending on your strategy, you might:
- Move “as-is” (fast, minimal change),
- Modernize parts of the stack (managed databases, containers),
- Re-architect to cloud-native patterns (event-driven, autoscaling),
- Or replace a system with SaaS entirely.
What cloud migration is not: a purely technical copy/paste exercise. Most migration problems come from missing ownership, missing dependency understanding, and missing operational readiness.
2. When You Should Migrate (And When You Shouldn’t)
Cloud migration can be a strong move when you want:
- Faster delivery: faster provisioning, standardized environments, automation.
- Elastic capacity: scale up/down without buying hardware.
- Better resilience: multi-zone designs, managed services, improved backup patterns.
- Security baseline: centralized IAM, logging, and policy enforcement.
- Modern tooling: CI/CD, observability, IaC, managed databases/queues.
You should pause (or narrow scope) if:
- You cannot define why you are migrating beyond “everyone does it”.
- You lack application owners and clear accountability.
- Your dependencies are unknown and undocumented.
- You are moving a system that should be retired or replaced (SaaS) instead.
Common trap
“Lift-and-shift everything” often moves technical debt into the cloud and then adds cloud billing on top. Treat lift-and-shift as a tactic for specific apps, not a default strategy for the whole portfolio.
3. Step 1: Define Goals, Scope & Success Metrics
Start with outcomes. Cloud migration should solve specific problems, not create new ones. Align early on:
- Business goals: time-to-market, stability, cost predictability, compliance, exit a data center.
- Scope: which apps are in scope, which are out, and why.
- Constraints: regulatory, data residency, latency, integration requirements.
- Success metrics: measurable KPIs you can report (not just “migrated”).
Examples of practical success metrics:
- Deployment frequency increases from monthly to weekly.
- Mean time to recover (MTTR) improves by X% after standardizing monitoring and runbooks.
- Infrastructure provisioning time drops from days to hours via IaC.
- Cloud spend is tagged to owners and stays within agreed budgets.
Scope statement example
“Migrate 12 customer-facing services and 3 internal services to the cloud in 3 waves. Replace the legacy ticketing tool with SaaS. Retire 2 unused batch jobs. Keep the mainframe workload as-is for now.”
4. Step 2: Discovery, Inventory & Dependency Mapping
Discovery is the foundation. You cannot plan waves or cutovers if you do not know what talks to what. Build an application inventory with:
- Owner: technical and business owner for each system.
- Criticality: customer-facing, internal, revenue-impacting, compliance-impacting.
- Architecture: runtime, database, integrations, batch jobs, queues, file shares.
- Dependencies: upstream/downstream services, network flows, identity providers.
- Non-functional needs: latency, RTO/RPO, availability targets, data residency.
- Operational readiness: monitoring, runbooks, alerting, on-call, logging.
Practical output
Create a “dependency map” and a “migration wave plan”. Those two artifacts prevent most surprise outages.
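As a rough illustration of how a dependency map can feed a wave plan, the sketch below groups applications so that each one moves only after everything it depends on. The application names and the "dependencies move first" ordering rule are assumptions made for this example; a real program may order waves differently, for instance by risk or business calendar.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each app maps to the apps it depends on.
dependencies = {
    "web-frontend": {"orders-api", "auth-service"},
    "orders-api": {"orders-db", "auth-service"},
    "auth-service": {"directory"},
    "orders-db": set(),
    "directory": set(),
}


def plan_groups(deps):
    """Group apps so that each app appears after all of its dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    groups = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # apps whose dependencies are all placed
        groups.append(ready)
        sorter.done(*ready)
    return groups


for number, apps in enumerate(plan_groups(dependencies), start=1):
    print(f"Group {number}: {', '.join(apps)}")
```

With the sample data, the directory and the orders database land in the first group and the web frontend in the last. A circular dependency raises an error, which is itself a useful discovery finding.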
5. Step 3: Choose a Strategy (The 6Rs)
A widely used framework is the 6Rs. Each application gets a strategy based on risk, complexity, and value.
- Rehost (Lift-and-shift): move as-is to cloud VMs. Fast, but may not reduce cost or complexity.
- Replatform: small optimizations (e.g., move to managed database, change runtime settings).
- Refactor: redesign for cloud-native (containers, event-driven, microservices). Higher effort, higher potential benefit.
- Repurchase: replace with SaaS (e.g., CRM, HR, helpdesk).
- Retire: decommission unused apps.
- Retain: keep as-is for now (technical/legal constraints).
Decision rule
If an app has unclear ownership, unknown dependencies, and poor documentation, it is a bad candidate for early waves, regardless of which “R” you pick.
6. Step 4: Build a Secure Cloud Landing Zone
A landing zone is the standardized foundation where workloads will live. Without it, teams create inconsistent networks, ad-hoc IAM policies, and fragmented logging.
A minimal landing zone typically includes:
- Account/subscription structure: separation for prod/non-prod, shared services, security.
- Identity integration: SSO, MFA, role-based access model.
- Networking baseline: VPC/VNet design, subnets, routing, egress strategy.
- Logging & monitoring: centralized logs, metrics, audit trails.
- Policy enforcement: guardrails (no public storage buckets, required encryption, tagging).
- Infrastructure as Code (IaC): repeatable deployments, change control, drift detection.
What “good” looks like
Your first application migration should feel boring because the landing zone already provides networking, security controls, and observability by default.
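The guardrails in the list above (no public storage buckets, required encryption, tagging) can be expressed as simple checks. The sketch below is only an illustration of the idea: the StorageBucket type, its fields, and the required tag names are hypothetical, and in practice such rules are usually enforced through the cloud provider's own policy tooling rather than hand-written scripts.

```python
from dataclasses import dataclass, field


@dataclass
class StorageBucket:
    """Hypothetical, simplified view of a storage bucket's configuration."""
    name: str
    public: bool
    encrypted: bool
    tags: dict = field(default_factory=dict)


# Assumed tag policy for this example.
REQUIRED_TAGS = ("owner", "environment")


def guardrail_violations(bucket):
    """Return a list of human-readable guardrail violations for one bucket."""
    violations = []
    if bucket.public:
        violations.append("public access is enabled")
    if not bucket.encrypted:
        violations.append("encryption at rest is disabled")
    for tag in REQUIRED_TAGS:
        if not bucket.tags.get(tag):
            violations.append(f"missing required tag: {tag}")
    return violations


bucket = StorageBucket(name="reports", public=True, encrypted=True,
                       tags={"owner": "data-team"})
for problem in guardrail_violations(bucket):
    print(f"{bucket.name}: {problem}")
```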
7. Step 5: Identity, Security & Compliance Controls
Security must be designed before migrations start, otherwise teams ship workloads and “fix security later”. Establish a baseline:
- IAM model: least privilege, role-based access, no shared admin accounts.
- MFA + SSO: enforced for privileged access.
- Secrets management: do not hardcode secrets; use managed vaults.
- Encryption: at rest and in transit, with key management defined.
- Audit logging: who did what, when, from where.
- Compliance mapping: data classification, retention, access reviews.
Red flag
If you cannot answer “who owns this cloud resource and why does it exist?”, you will eventually have a security and cost problem. Enforce tagging and ownership from day one.
8. Step 6: Networking, Connectivity, DNS & Certificates
Networking is where migrations often stall because it impacts everything: identity, integration, latency, and cutovers. Plan explicitly:
- Connectivity: VPN or dedicated connectivity between on-prem and cloud (if needed).
- Addressing: IP ranges that do not overlap with existing networks.
- Routing: clear ingress/egress rules and where NAT/firewalls live.
- DNS plan: internal vs external zones, TTL strategy for cutover.
- TLS certificates: issuance, renewals, and where certs are stored.
Cutover-friendly DNS tip
Lower DNS TTL ahead of cutover windows (when appropriate) so you can switch traffic faster. Then raise TTL again after stability is confirmed.
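For the addressing item above, overlapping IP ranges can be caught before any network is built. The following sketch uses Python's standard ipaddress module; the network names and CIDR ranges are made up for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlaps(named_ranges):
    """Return pairs of network names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr)
                for name, cidr in named_ranges.items()}
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan: existing on-prem ranges plus proposed cloud ranges.
ranges = {
    "on-prem-office": "10.0.0.0/16",
    "on-prem-datacenter": "10.1.0.0/16",
    "cloud-prod": "10.1.128.0/20",
    "cloud-nonprod": "10.2.0.0/16",
}

for name_a, name_b in find_overlaps(ranges):
    print(f"Overlap: {name_a} and {name_b}")
```

Here the proposed cloud-prod range sits inside the on-prem datacenter range, so routing between the two over a VPN or dedicated link would be ambiguous.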
9. Step 7: Data Migration (Databases, Files, Backups)
Data migration needs its own plan because it is often the biggest risk. Choose a pattern depending on your system:
- Offline migration: stop writes, export/import, then start in cloud (simple, but downtime).
- Online replication: continuous sync, then short cutover window (more complex, less downtime).
- Hybrid period: some read replicas or dual-write patterns (highest complexity, used selectively).
Define non-negotiables before moving data:
- RPO/RTO: acceptable data loss and recovery time.
- Validation: row counts, checksums, sample queries, app-level validation.
- Backup strategy: backup frequency, retention, restore drills.
- Security: encryption, access controls, auditability.
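The validation item above (row counts and checksums) can be sketched as follows. This is a minimal example that assumes rows from both sides are already available as tuples with identical types and formatting; real database drivers may return different types for the same value (for example decimals or timestamps), so values usually need normalizing before hashing.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of row tuples.

    The checksum does not depend on row order, because the per-row
    digests are sorted before being combined.
    """
    digests = [
        hashlib.sha256(repr(row).encode("utf-8")).hexdigest()
        for row in rows
    ]
    combined = hashlib.sha256("".join(sorted(digests)).encode("utf-8"))
    return len(digests), combined.hexdigest()


def compare_tables(source_rows, target_rows):
    """Compare source and target by row count and order-independent checksum."""
    source_count, source_sum = table_fingerprint(source_rows)
    target_count, target_sum = table_fingerprint(target_rows)
    return {
        "row_count_match": source_count == target_count,
        "checksum_match": source_sum == target_sum,
    }


# Hypothetical sample data standing in for query results.
source = [(1, "alice"), (2, "bob")]
target = [(2, "bob"), (1, "alice")]
print(compare_tables(source, target))
```

Matching counts with a mismatched checksum point to changed or corrupted values rather than missing rows, which narrows down where to look.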
Do not skip restore tests
Backups are not real until you have successfully restored them. Schedule restore drills early, not after an incident.
10. Step 8: Migrate Applications in Waves (Execution)
Migrate in waves (also called batches) rather than moving everything at once. A wave groups systems that can move together with manageable risk.
A typical wave structure:
- Wave 0 (foundation): landing zone, identity, networking, logging, CI/CD baseline.
- Wave 1 (learning): low-risk apps, internal tooling, non-critical services.
- Wave 2 (scale): core services with more dependencies.
- Wave 3 (critical): high criticality systems after patterns are proven.
Wave readiness checklist
Each app should have an owner, dependency map, runbook, monitoring, a tested rollback plan, and an agreed cutover window. If any of these are missing, delay the app—not the whole program.
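The readiness checklist lends itself to an automated gate. The sketch below assumes each app is described by a dictionary with one entry per checklist item; the field names are hypothetical and would come from whatever inventory tool you use.

```python
# Checklist items from the wave readiness callout, as assumed field names.
REQUIRED_ITEMS = (
    "owner",
    "dependency_map",
    "runbook",
    "monitoring",
    "rollback_tested",
    "cutover_window",
)


def missing_items(app):
    """Return the checklist items that are absent or empty for one app."""
    return [item for item in REQUIRED_ITEMS if not app.get(item)]


def split_wave(apps):
    """Separate ready apps from apps to delay, with the reasons for each delay."""
    ready, delayed = [], {}
    for app in apps:
        gaps = missing_items(app)
        if gaps:
            delayed[app["name"]] = gaps
        else:
            ready.append(app["name"])
    return ready, delayed


# Hypothetical wave with one complete app and one incomplete app.
wave = [
    {"name": "wiki", "owner": "it-team", "dependency_map": True, "runbook": True,
     "monitoring": True, "rollback_tested": True, "cutover_window": "Sat 02:00"},
    {"name": "billing", "owner": "finance-eng", "dependency_map": True,
     "runbook": False, "monitoring": True, "rollback_tested": False},
]

ready, delayed = split_wave(wave)
print("Ready:", ready)
print("Delayed:", delayed)
```

Only the incomplete app is held back; the rest of the wave proceeds, which mirrors the rule of delaying the app rather than the program.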
11. Step 9: Testing, Cutover & Rollback Planning
The cutover plan is where migration becomes real. You need a plan you can execute under pressure. Minimum testing layers:
- Functional testing: core user journeys.
- Integration testing: dependencies, webhooks, queues, third-party APIs.
- Performance testing: baseline latency, load handling, DB performance.
- Security testing: IAM checks, secret usage, exposure scanning.
- Operational testing: alerts, dashboards, on-call handoffs, runbooks.
A practical cutover runbook should include:
- Who is on the call, roles, and decision authority.
- Exact steps with timestamps (freeze window, DB sync, DNS switch, smoke tests).
- Clear “go/no-go” criteria.
- Rollback triggers and rollback steps that are actually tested.
Cutover example (high-level)
1) Freeze changes / deploy freeze
2) Final data sync (or stop writes + export/import)
3) Switch traffic (DNS / load balancer / routing)
4) Run smoke tests (top 10 user journeys)
5) Monitor dashboards + logs
6) If error rate > threshold for X minutes: rollback
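Step 6 is easier to act on under pressure when the trigger is unambiguous. The sketch below shows one way to express "error rate above threshold for X minutes", assuming you have one error-rate sample per minute; the threshold and duration values are placeholders, not recommendations.

```python
def should_roll_back(error_rates, threshold, sustained_minutes):
    """Return True if the error rate stayed above the threshold for the
    required number of consecutive per-minute samples.

    error_rates: per-minute error rates (0.0 to 1.0), oldest first.
    """
    consecutive = 0
    for rate in error_rates:
        # Count consecutive breaches; any healthy sample resets the count.
        consecutive = consecutive + 1 if rate > threshold else 0
        if consecutive >= sustained_minutes:
            return True
    return False


# Hypothetical samples taken after the traffic switch.
samples = [0.01, 0.06, 0.07, 0.08]
print(should_roll_back(samples, threshold=0.05, sustained_minutes=3))
```

Requiring consecutive breaches avoids rolling back on a single noisy minute, while still bounding how long a degraded state is tolerated.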
12. Step 10: Operate, Monitor & Optimize (FinOps + Reliability)
The first 30–90 days after a wave is where value is either realized or lost. Focus on two tracks: operational excellence and cost management.
Operational excellence
- Observability: logs, metrics, traces, SLOs.
- Incident response: on-call, runbooks, postmortems, action tracking.
- Patch and vulnerability management: clear responsibilities.
- Backup/restore drills: scheduled and documented.
FinOps basics (cost control)
- Tagging standards: owner, environment, system, cost center.
- Budgets and alerts: proactive notifications before surprises.
- Right-sizing: reduce over-provisioned compute and storage.
- Autoscaling: match capacity to demand where possible.
- Commitment discounts: reserved instances/savings plans for steady workloads.
Cloud cost myth
The cloud is not automatically cheaper. It becomes cost-effective when resources are managed actively and teams treat cost as an engineering metric.
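Tagging and budget alerts work together: once spend can be attributed to an owner, it can be compared against that owner's budget. The sketch below assumes billing data is available as simple line items with a cost and a tags dictionary; the field names, the sample figures, and the 80% alert level are illustrative assumptions.

```python
from collections import defaultdict


def spend_by_owner(line_items):
    """Sum cost per 'owner' tag; spend without an owner goes to UNTAGGED."""
    totals = defaultdict(float)
    for item in line_items:
        owner = item.get("tags", {}).get("owner") or "UNTAGGED"
        totals[owner] += item["cost"]
    return dict(totals)


def budget_alerts(totals, budgets, alert_ratio=0.8):
    """Flag owners who have no budget or have used alert_ratio of it or more."""
    alerts = []
    for owner, spent in sorted(totals.items()):
        budget = budgets.get(owner)
        if budget is None:
            alerts.append((owner, "no budget defined"))
        elif spent >= budget * alert_ratio:
            alerts.append((owner, f"{spent / budget:.0%} of budget used"))
    return alerts


# Hypothetical monthly line items and budgets.
items = [
    {"cost": 600.0, "tags": {"owner": "platform"}},
    {"cost": 300.0, "tags": {"owner": "platform"}},
    {"cost": 100.0, "tags": {"owner": "web"}},
    {"cost": 250.0, "tags": {}},
]
budgets = {"platform": 1000.0, "web": 500.0}

for owner, message in budget_alerts(spend_by_owner(items), budgets):
    print(f"{owner}: {message}")
```

Untagged spend surfaces as its own line with no budget, which makes missing ownership visible instead of hiding it in a shared total.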
13. Governance & Change Management (People Side)
Cloud migration changes how teams work. Treat it like an operating model change, not just infrastructure work:
- Define ownership: who owns each service end-to-end (build, run, cost).
- Standardize delivery: CI/CD, IaC patterns, template repos, review rules.
- Enable teams: training, documentation, internal office hours.
- Govern lightly, enforce automatically: policies and guardrails as code.
Migration program rhythm
Weekly steering (scope, risks), daily/bi-weekly delivery standups (execution), monthly cost/security reviews (governance).
14. Common Migration Mistakes (And How to Avoid Them)
- Skipping discovery: leads to hidden dependencies and failed cutovers. Fix: inventory + dependency mapping is non-negotiable.
- No landing zone: inconsistent networks, fragmented logging, weak controls. Fix: build a baseline foundation before wave migrations.
- Migrating critical apps first: high blast radius before patterns are proven. Fix: start with learning waves and increase complexity gradually.
- Ignoring operations: apps run, but nobody can support them. Fix: monitoring, runbooks, and on-call readiness before cutover.
- Cost surprise: over-provisioned resources, no tagging, no accountability. Fix: FinOps basics from day one.
15. Cloud Migration Checklist
Use this as a quick sanity-check before and during migrations:
- Goals: outcomes, scope, and success metrics agreed.
- Inventory: owners + dependencies documented for in-scope apps.
- Strategy: each app assigned an R (6Rs) with rationale.
- Landing zone: identity, networking, logging, policies, IaC patterns ready.
- Security: IAM baseline, secrets management, encryption, audit logs.
- Networking: connectivity, routing, DNS, certs planned and tested.
- Data: migration pattern chosen, validation and restore tests planned.
- Wave plan: wave sequencing, cutover windows, stakeholder comms.
- Testing: functional, integration, performance, and ops readiness checks.
- Cutover: runbook + rollback triggers tested and approved.
- Post-migration: monitoring, incident process, cost controls in place.
Fast win
Even if you postpone migration, building a landing zone and standardizing CI/CD + IaC usually delivers immediate operational benefits.
16. FAQ: Cloud Migration
What are the 6Rs of cloud migration?
Rehost, Replatform, Refactor, Repurchase, Retire, and Retain. They help you pick the right approach per application instead of forcing one strategy.
How do I choose which applications to migrate first?
Start with low-risk, well-understood apps with clear owners and limited dependencies. Use early waves to validate networking, security, monitoring, and cutover processes.
What is a landing zone and why does it matter?
A landing zone is the standardized cloud foundation (identity, networking, logging, governance). It reduces risk by making migrations consistent and secure by default.
How do I reduce downtime during migration?
Use online replication where possible, plan short cutover windows, rehearse cutovers, and use a rollback plan with clear triggers. Also plan DNS/traffic switching carefully.
Why do cloud costs spike after migration?
Common reasons include over-provisioning, lack of autoscaling, missing tagging/ownership, and paying for idle resources. Fix with budgets, alerts, right-sizing, and regular cost reviews.
Key cloud terms (quick glossary)
- Landing Zone: a standardized cloud foundation for identity, networking, logging, security guardrails, and governance.
- 6Rs: six common migration strategies: Rehost, Replatform, Refactor, Repurchase, Retire, Retain.
- RTO / RPO: Recovery Time Objective (how fast you must recover) and Recovery Point Objective (acceptable data loss window).
- Cutover: the controlled switch from the old environment to the new one, typically involving traffic routing and data finalization.
- Rollback: a planned reversal to the previous system if the cutover fails or stability thresholds are breached.
- FinOps: a discipline for managing cloud cost with shared accountability across engineering, finance, and product teams.
- Infrastructure as Code (IaC): managing infrastructure through versioned code (templates/modules) instead of manual console changes.