Zero-downtime deployment pattern in Terraform - Time & Space Complexity
We want to understand how deployment time grows as we increase the number of servers or instances.
Specifically, how does zero-downtime deployment affect the number of operations Terraform performs?
Analyze the time complexity of this Terraform snippet for zero-downtime deployment.
    # Blue group: the currently live fleet.
    # (min_size, max_size, and subnet/AZ arguments are omitted for brevity.)
    resource "aws_autoscaling_group" "blue" {
      name_prefix          = "blue-"
      desired_capacity     = var.instance_count
      launch_configuration = aws_launch_configuration.blue.id
    }

    # Green group: the new fleet that takes over traffic.
    resource "aws_autoscaling_group" "green" {
      name_prefix          = "green-"
      desired_capacity     = var.instance_count
      launch_configuration = aws_launch_configuration.green.id
    }

    # DNS alias pointing at the green load balancer.
    # aws_lb.green is assumed to be defined elsewhere; an autoscaling group
    # does not export load balancer DNS attributes itself.
    resource "aws_route53_record" "app" {
      zone_id = var.zone_id
      name    = var.app_name
      type    = "A"

      alias {
        name                   = aws_lb.green.dns_name
        zone_id                = aws_lb.green.zone_id
        evaluate_target_health = true
      }
    }
This code creates two groups of servers (blue and green) and points the DNS record at the green group. Repointing the record switches traffic between the groups with zero downtime.
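One possible way to make the switch explicit is a variable that selects the live color. This is only a sketch: `active_color`, `aws_lb.blue`, and `aws_lb.green` are hypothetical names that do not appear in the original snippet, and the record shown here would replace the `aws_route53_record.app` definition above.

```hcl
# Hypothetical toggle: which color currently receives traffic.
variable "active_color" {
  type    = string
  default = "green"

  validation {
    condition     = contains(["blue", "green"], var.active_color)
    error_message = "The active_color value must be \"blue\" or \"green\"."
  }
}

locals {
  # aws_lb.blue and aws_lb.green are assumed to be defined elsewhere,
  # one load balancer in front of each autoscaling group.
  active_dns_name = var.active_color == "blue" ? aws_lb.blue.dns_name : aws_lb.green.dns_name
  active_zone_id  = var.active_color == "blue" ? aws_lb.blue.zone_id : aws_lb.green.zone_id
}

# Would replace the aws_route53_record.app definition from the snippet.
resource "aws_route53_record" "app" {
  zone_id = var.zone_id
  name    = var.app_name
  type    = "A"

  alias {
    name                   = local.active_dns_name
    zone_id                = local.active_zone_id
    evaluate_target_health = true
  }
}
```

With this layout, running `terraform apply -var="active_color=blue"` would repoint the record without touching either group.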
Look at what Terraform does repeatedly when scaling or switching.
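- Primary operation: Creating or updating each server instance in the autoscaling groups. Terraform manages the group resources, AWS launches the individual instances, and Terraform waits for them to come into service.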
- How many times: Once per instance, for both blue and green groups, so roughly twice the number of instances.
As the number of instances increases, Terraform must manage more resources.
| Input Size (n) | Approx. API Calls/Operations |
|---|---|
| 10 | About 20 instance operations (10 blue + 10 green) |
| 100 | About 200 instance operations |
| 1000 | About 2000 instance operations |
Pattern observation: For n instances per group, the number of operations is about 2n, because two groups are managed. The constant factor 2 is dropped in Big-O notation.
Time Complexity: O(n)
This means the deployment time grows linearly with the number of instances managed in both groups.
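The arithmetic behind the table can be sketched directly in Terraform. The `groups` local and the output name are illustrative assumptions; only `var.instance_count` comes from the snippet above.

```hcl
variable "instance_count" {
  type        = number
  description = "Instances per autoscaling group (n)"
  default     = 10
}

locals {
  groups = 2 # blue and green

  # Roughly one instance operation per instance in each group: 2 * n.
  approx_instance_operations = local.groups * var.instance_count
}

output "approx_instance_operations" {
  # With the default of 10 this evaluates to 20, matching the first table row.
  value = local.approx_instance_operations
}
```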
[X] Wrong: "Zero-downtime deployment means the deployment time stays the same no matter how many instances there are."
[OK] Correct: Even with zero downtime, Terraform must create or update each instance, so the total work grows with the number of instances.
Understanding how deployment time scales helps you design systems that stay responsive and reliable as they grow.
What if we changed from two autoscaling groups (blue and green) to a single group with rolling updates? How would the time complexity change?
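For comparison, a single group with rolling updates might look like the sketch below. It assumes the AWS provider's `instance_refresh` block on `aws_autoscaling_group`; the resource name `app`, the variable `subnet_ids`, and the 90% threshold are hypothetical choices, and `aws_launch_configuration.app` is assumed to be defined elsewhere.

```hcl
resource "aws_autoscaling_group" "app" {
  name_prefix          = "app-"
  min_size             = var.instance_count
  max_size             = var.instance_count
  desired_capacity     = var.instance_count
  vpc_zone_identifier  = var.subnet_ids # hypothetical variable
  launch_configuration = aws_launch_configuration.app.name

  # Replace instances in batches when the launch configuration changes,
  # instead of standing up a second full group.
  instance_refresh {
    strategy = "Rolling"

    preferences {
      # Illustrative value: keep at least 90% of capacity healthy.
      min_healthy_percentage = 90
    }
  }
}
```

Counting how many instances exist at once under this layout, compared with the blue/green layout, is a good starting point for the question above.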
Practice
Solution
Step 1: Understand zero-downtime deployment purpose
Zero-downtime deployment means updating apps without stopping them or causing service interruptions.
Step 2: Compare options with this goal
Only "Update applications without stopping them or causing downtime" describes updating without stopping or downtime, matching the goal.
Final Answer:
Update applications without stopping them or causing downtime -> Option B
Quick Check:
Zero-downtime = no stopping, no downtime [OK]
- Thinking deployment must stop all tasks
- Assuming manual traffic switch is required
- Believing updates only happen off-hours
Solution
Step 1: Identify settings related to task counts during update
Terraform uses settings like max_percent and min_healthy_percent (spelled deployment_maximum_percent and deployment_minimum_healthy_percent on the aws_ecs_service resource) to control task numbers during deployment.
Step 2: Understand min_healthy_percent role
min_healthy_percent ensures a minimum percentage of tasks stay healthy and running during updates, preventing downtime.
Final Answer:
min_healthy_percent -> Option A
Quick Check:
min_healthy_percent controls running tasks during update [OK]
- Confusing max_percent with min_healthy_percent
- Using desired_count which sets total tasks, not update behavior
- Selecting task_definition which defines task specs
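To see where each of these settings lives, here is a minimal sketch of an ECS service in Terraform. The cluster and task definition references (`aws_ecs_cluster.main`, `aws_ecs_task_definition.app`) and all values are assumptions for illustration.

```hcl
resource "aws_ecs_service" "app" {
  name    = "app"
  cluster = aws_ecs_cluster.main.id # assumed to exist elsewhere

  # What to run: the task spec, not the update behavior.
  task_definition = aws_ecs_task_definition.app.arn

  # How many tasks to run in steady state.
  desired_count = 4

  # How the rolling update behaves while tasks are being replaced.
  deployment_minimum_healthy_percent = 100
  deployment_maximum_percent         = 200
}
```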
deployment_minimum_healthy_percent = 75
deployment_maximum_percent = 200
What does this configuration ensure during deployment?
Solution
Step 1: Interpret deployment_minimum_healthy_percent
This means at least 75% of the desired task count must stay healthy and running during deployment.
Step 2: Interpret deployment_maximum_percent
This allows up to 200% of the desired tasks to run temporarily, enabling new tasks to start before old ones stop.
Final Answer:
At least 75% of tasks stay running; up to 200% of tasks can run temporarily -> Option D
Quick Check:
Min healthy 75%, max 200% = safe rolling update [OK]
- Thinking percentages mean exact task counts
- Assuming deployment stops tasks before starting new ones
- Confusing min and max percentages
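Because the percentages are relative to the desired count, it can help to convert them into task counts. The sketch below assumes a hypothetical desired count of 10 and follows the ECS documentation's rounding (minimum rounded up, maximum rounded down); the local names are illustrative.

```hcl
locals {
  desired_count       = 10 # hypothetical service size
  min_healthy_percent = 75
  max_percent         = 200

  # 10 * 75 / 100 = 7.5, rounded up: at least 8 tasks stay running.
  min_running = ceil(local.desired_count * local.min_healthy_percent / 100)

  # 10 * 200 / 100 = 20: at most 20 tasks may run at once.
  max_running = floor(local.desired_count * local.max_percent / 100)
}

output "task_bounds" {
  value = {
    min_running = local.min_running
    max_running = local.max_running
  }
}
```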
deployment_minimum_healthy_percent = 100 and deployment_maximum_percent = 100 in Terraform for ECS service. What issue will this cause?
Solution
Step 1: Analyze min and max percent both at 100%
Min healthy 100% means all old tasks must stay running; max 100% means no extra tasks can start.
Step 2: Understand deployment impact
New tasks cannot start until old ones stop, but old ones cannot stop because min healthy is 100%, causing deployment to fail.
Final Answer:
Deployment will fail because no new tasks can start before old ones stop -> Option C
Quick Check:
Min 100% + Max 100% blocks rolling update [OK]
- Assuming deployment will succeed without downtime
- Thinking max 100% allows extra tasks
- Ignoring min healthy effect on stopping old tasks
Solution
Step 1: Evaluate each option for zero-downtime support
- deployment_minimum_healthy_percent = 50, deployment_maximum_percent = 150: allows only 50% healthy tasks, risking downtime.
- deployment_minimum_healthy_percent = 100, deployment_maximum_percent = 100: blocks new tasks starting before old ones stop.
- deployment_minimum_healthy_percent = 0, deployment_maximum_percent = 200: allows zero healthy tasks, risking downtime.
- deployment_minimum_healthy_percent = 75, deployment_maximum_percent = 125: keeps 75% healthy and allows up to 125% of tasks, enabling a smooth rolling update.
Step 2: Choose best balance for zero-downtime
deployment_minimum_healthy_percent = 75 with deployment_maximum_percent = 125 ensures enough healthy tasks remain while allowing new tasks to start before old ones stop, supporting zero downtime.
Final Answer:
deployment_minimum_healthy_percent = 75 and deployment_maximum_percent = 125 -> Option A
Quick Check:
Min healthy 75% + max 125% = safe rolling update [OK]
- Choosing min healthy too low risking downtime
- Choosing min and max both 100% blocking updates
- Allowing zero healthy tasks during deployment
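The four combinations evaluated in the solution above can be compared side by side by converting each into task counts. This is a rough sketch: the desired count of 4, the map keys, and the local names are assumptions, and the rounding follows the ECS documentation (minimum rounded up, maximum rounded down).

```hcl
locals {
  desired_count = 4 # hypothetical service size

  # Keys are "min/max" labels for the four option pairs.
  options = {
    "50/150"  = { min_pct = 50, max_pct = 150 }
    "100/100" = { min_pct = 100, max_pct = 100 }
    "0/200"   = { min_pct = 0, max_pct = 200 }
    "75/125"  = { min_pct = 75, max_pct = 125 }
  }

  bounds = {
    for name, o in local.options : name => {
      min_running = ceil(local.desired_count * o.min_pct / 100)
      max_running = floor(local.desired_count * o.max_pct / 100)
    }
  }
}

output "deployment_bounds" {
  # For desired_count = 4:
  #   "50/150"  -> min 2, max 6
  #   "100/100" -> min 4, max 4 (no room to start or stop anything)
  #   "0/200"   -> min 0, max 8
  #   "75/125"  -> min 3, max 5
  value = local.bounds
}
```

The "100/100" row makes the deadlock visible: the floor and the ceiling are the same number, so the scheduler has no room to add a new task or remove an old one.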
