High availability design patterns in Azure - Time & Space Complexity
We want to understand how the effort to keep a system always running grows as we add more components or users.
How does the work needed to maintain high availability change when the system grows?
Analyze the time complexity of deploying multiple instances with load balancing and failover.
// Create a load balancer
resource lb 'Microsoft.Network/loadBalancers@2022-05-01' = {
  name: 'myLoadBalancer'
  location: resourceGroup().location
  properties: {
    frontendIPConfigurations: [...]
    backendAddressPools: [...]
  }
}

// Deploy multiple VM instances
var vmCount = 5
resource vms 'Microsoft.Compute/virtualMachines@2022-08-01' = [for i in range(0, vmCount): {
  name: 'vm${i}'
  location: resourceGroup().location
  properties: { ... }
}]

// Attach VMs to load balancer backend pool
// Set up health probes and failover rules
This sequence sets up a load balancer and multiple virtual machines to share traffic and provide failover.
Look at what happens multiple times as we add more VMs.
- Primary operation: Creating each virtual machine instance.
- How many times: Once per VM, so equal to the number of VMs.
- Supporting operations: Attaching each VM to the load balancer backend pool also repeats per VM.
As you add more VMs, the number of creation and attachment steps grows directly with the number of VMs.
| Input Size (n) | Approx. API Calls/Operations |
|---|---|
| 10 | About 10 VM creations + 10 attachments |
| 100 | About 100 VM creations + 100 attachments |
| 1000 | About 1000 VM creations + 1000 attachments |
Pattern observation: The work grows in a straight line with the number of VMs added.
Time Complexity: O(n)
This means the time to set up high availability grows directly with how many instances you add.
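The counting above can be sketched as a small Python model. This is only an illustration, not an Azure API call: the function name `setup_operations` is hypothetical, and it assumes one load balancer creation plus one creation and one attachment per VM, as in the table.

```python
def setup_operations(vm_count: int) -> int:
    """Count provisioning steps for one load balancer plus vm_count VMs."""
    load_balancer_steps = 1      # created once, regardless of VM count (assumption)
    vm_creations = vm_count      # one creation per VM
    pool_attachments = vm_count  # one backend pool attachment per VM
    return load_balancer_steps + vm_creations + pool_attachments


for n in (10, 100, 1000):
    # The total is 2n + 1, which grows linearly with n: O(n)
    print(n, setup_operations(n))
```

The constant term and the factor of 2 do not change the growth rate, which is why the result is still written as O(n).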
[X] Wrong: "Adding more VMs won't increase setup time because the load balancer handles them all at once."
[OK] Correct: Each VM still needs to be created and connected individually, so the total work grows with the number of VMs.
Understanding how setup effort grows helps you design systems that stay reliable as they grow, a key skill in cloud architecture.
"What if we used a managed service that automatically scales instances? How would the time complexity change?"
Practice
Solution
Step 1: Understand the role of Azure Load Balancer
Azure Load Balancer distributes incoming network traffic across multiple VMs to prevent any single VM from becoming a bottleneck.
Step 2: Compare with other services
Azure Blob Storage stores data, Azure Functions run code, and Cosmos DB is a database service; none distribute traffic.
Final Answer:
Azure Load Balancer -> Option C
Quick Check:
Traffic distribution = Azure Load Balancer [OK]
- Confusing storage or compute services with traffic distribution
- Choosing Azure Functions for load balancing
- Selecting database services for availability patterns
Solution
Step 1: Identify the correct Azure CLI command for VM Scale Set creation
The command to create a VM Scale Set is az vmss create, not az vm create.
Step 2: Check the parameters
Parameters like --name, --resource-group, --image, and --instance-count are correctly used in az vmss create --name MyScaleSet --resource-group MyResourceGroup --image UbuntuLTS --instance-count 3.
Final Answer:
az vmss create --name MyScaleSet --resource-group MyResourceGroup --image UbuntuLTS --instance-count 3 -> Option D
Quick Check:
VM Scale Set creation uses az vmss create [OK]
- Using 'az vm create' instead of 'az vmss create'
- Incorrect parameter names like --count instead of --instance-count
- Mixing resource group parameter names
frontendIPConfiguration:
  name: LoadBalancerFrontEnd
  publicIPAddress:
    id: /subscriptions/xxx/resourceGroups/rg/providers/Microsoft.Network/publicIPAddresses/myPublicIP
backendAddressPools:
  - name: BackendPool
loadBalancingRules:
  - name: HTTPRule
    frontendIPConfiguration: LoadBalancerFrontEnd
    backendAddressPool: BackendPool
    protocol: Tcp
    frontendPort: 80
    backendPort: 80
    enableFloatingIP: false
    idleTimeoutInMinutes: 4
    loadDistribution: Default

What will happen if one VM in the backend pool becomes unhealthy?
Solution
Step 1: Understand Azure Load Balancer health probe behavior
Azure Load Balancer relies on health probes to detect unhealthy VMs and stop sending traffic to them. This snippet does not show a health probe, but in practice one is necessary for proper load balancing.
Step 2: Analyze the effect of missing health probes
Without health probes, the Load Balancer cannot detect unhealthy VMs, so it continues sending traffic to all VMs in the backend pool. Best practice is to configure health probes to avoid this.
Final Answer:
Traffic will automatically stop going to the unhealthy VM -> Option A (this assumes a health probe is configured, as Step 1 describes)
Quick Check:
Health probes detect unhealthy VMs and stop traffic [OK]
- Assuming Load Balancer auto-detects unhealthy VMs without probes
- Thinking Load Balancer restarts VMs
- Confusing port redirection with load balancing
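The difference between the two steps above can be sketched with a toy Python model. It is not the Azure API: `eligible_backends` and the VM names are hypothetical, and treating a rule without a probe as "every VM stays in rotation" is a modeling assumption that follows Step 2.

```python
from typing import Dict, List, Optional


def eligible_backends(pool: List[str],
                      probe_status: Optional[Dict[str, bool]]) -> List[str]:
    """Return the VMs that may receive new traffic.

    probe_status maps VM name -> last probe result.
    None models a rule that has no health probe at all.
    """
    if probe_status is None:
        # No probe: nothing can mark a VM unhealthy, so all VMs stay in rotation.
        return list(pool)
    # With a probe: only VMs whose last probe succeeded receive traffic.
    return [vm for vm in pool if probe_status.get(vm, False)]


pool = ["vm0", "vm1", "vm2"]
print(eligible_backends(pool, {"vm0": True, "vm1": False, "vm2": True}))
print(eligible_backends(pool, None))
```

With a probe, the failing `vm1` drops out of the list; without one, it keeps receiving traffic even though it is down.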
Solution
Step 1: Understand Active-Passive with Traffic Manager Priority routing
Priority routing sends traffic to the primary endpoint unless it is unhealthy, then fails over to secondary.
Step 2: Identify impact of misconfigured health probes
If health probes are misconfigured, Traffic Manager cannot detect endpoint health and will not fail over properly, causing downtime.
Final Answer:
Traffic Manager is set to Priority routing but health probes are misconfigured -> Option B
Quick Check:
Priority routing + bad probes = failover fails [OK]
- Confusing routing methods in Traffic Manager
- Blaming Load Balancer or VM Scale Set for Traffic Manager failover
- Ignoring health probe configuration
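To see why bad probes break failover, here is a minimal Python sketch of priority-based selection. It is a simplified model, not Traffic Manager's implementation: the `Endpoint` class, its fields, and `pick_endpoint` are hypothetical names, and the model assumes routing decisions depend only on what monitoring reports.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    priority: int           # lower number = preferred
    actually_up: bool       # real state of the endpoint
    reported_healthy: bool  # what monitoring believes


def pick_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Choose the lowest-priority-number endpoint that monitoring reports healthy."""
    candidates = [e for e in endpoints if e.reported_healthy]
    return min(candidates, key=lambda e: e.priority) if candidates else None


# Primary is down and monitoring notices: traffic fails over to the secondary.
good = [Endpoint("primary", 1, False, False), Endpoint("secondary", 2, True, True)]
print(pick_endpoint(good).name)

# Primary is down but a misconfigured probe still reports it healthy: no failover.
bad = [Endpoint("primary", 1, False, True), Endpoint("secondary", 2, True, True)]
chosen = pick_endpoint(bad)
print(chosen.name, "up" if chosen.actually_up else "down")
```

In the second case the router keeps choosing the primary because it only sees the reported state, which is the downtime scenario described in Step 2.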
Solution
Step 1: Understand geo-redundancy requirements
To survive a full region failure, the app must be deployed in multiple regions with traffic routed between them.
Step 2: Evaluate options for traffic routing and data replication
Performance routing in Traffic Manager directs users to the closest healthy region. Azure SQL Geo-Replication ensures database availability across regions.
Step 3: Compare with other options
Priority routing is for Active-Passive, not best for geo-load balancing. Single region deployments cannot survive region failure. Application Gateway is regional and does not provide geo-failover.
Final Answer:
Deploy the app in two regions with Azure Traffic Manager using Performance routing and Azure SQL Geo-Replication -> Option A
Quick Check:
Geo-redundancy needs multi-region + performance routing + geo-replication [OK]
- Choosing Priority routing for geo-load balancing
- Relying on single region with backup for high availability
- Confusing Application Gateway with global traffic routing
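The idea of "closest healthy region" can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Traffic Manager is implemented: `pick_region`, the region names, and the latency numbers are all hypothetical.

```python
from typing import Dict, Optional


def pick_region(latency_ms: Dict[str, float],
                healthy: Dict[str, bool]) -> Optional[str]:
    """Return the healthy region with the lowest latency, or None if none is healthy."""
    live = {region: ms for region, ms in latency_ms.items()
            if healthy.get(region, False)}
    return min(live, key=live.get) if live else None


latency = {"westeurope": 20.0, "eastus": 95.0}  # illustrative values only

# Both regions healthy: the user goes to the lower-latency region.
print(pick_region(latency, {"westeurope": True, "eastus": True}))

# The closest region fails: traffic moves to the remaining healthy region.
print(pick_region(latency, {"westeurope": False, "eastus": True}))
```

The second call shows why a second region is required: with a single region there would be nothing left to route to after the failure.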
