Process Flow - High availability configuration

Start: User requests service

↓

Load Balancer receives request

↓

Check: Are multiple instances available?

No → Create more instances

Yes ↓

Distribute request to healthy instance

↓

Instance processes request

↓

Monitor instance health

↓

If instance fails, redirect traffic to healthy instances

↓

Continue serving requests without downtime

The flow shows how a load balancer directs user requests to multiple instances, monitors their health, and ensures traffic is only sent to healthy instances to maintain service availability.

Execution Sample

GCP

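# Zone and region values below are assumed examples; omit them if the provider block sets defaults.
resource "google_compute_instance_template" "template" {
  name         = "my-instance-template"
  machine_type = "e2-medium"

  disk {
    boot        = true
    auto_delete = true
    # Instance templates take the image directly via source_image;
    # initialize_params belongs to google_compute_instance boot disks.
    # debian-10 has reached end of life, so a current image family is used.
    source_image = "debian-cloud/debian-12"
  }

  network_interface {
    network = "default"
    access_config {} # assigns an ephemeral external IP
  }
}

resource "google_compute_instance_group_manager" "igm" {
  name               = "my-instance-group"
  base_instance_name = "my-instance"
  zone               = "us-central1-a" # assumed example zone
  target_size        = 2

  # The template is attached through a version block.
  version {
    instance_template = google_compute_instance_template.template.self_link
  }
}

resource "google_compute_health_check" "hc" {
  name = "my-health-check"

  tcp_health_check {
    port = 80
  }
}

resource "google_compute_region_backend_service" "backend" {
  name                  = "my-backend-service"
  region                = "us-central1" # assumed; must contain the zone above
  load_balancing_scheme = "INTERNAL"

  # The block is named "backend" (singular).
  backend {
    group          = google_compute_instance_group_manager.igm.instance_group
    balancing_mode = "CONNECTION" # required for internal passthrough load balancing
  }

  health_checks = [google_compute_health_check.hc.self_link]
}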

This Terraform code creates an instance template, a managed instance group with two instances, a health check to monitor instance health, and a backend service linked to the group for high availability.
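
The sample defines the backend side only. For clients to reach the instances, a load balancer also needs a front end. A minimal sketch of one, assuming the backend service uses the internal scheme (the default for a regional backend service) and that the region, port, and the resource name "fr" are acceptable placeholder values:

```hcl
# Sketch only: front end for an internal passthrough load balancer.
# The name "fr", the region, and the port list are assumed example values.
resource "google_compute_forwarding_rule" "fr" {
  name                  = "my-forwarding-rule"
  region                = "us-central1"
  load_balancing_scheme = "INTERNAL"
  ip_protocol           = "TCP"
  ports                 = ["80"]
  network               = "default"

  # Requests arriving at the rule's IP are handed to the backend service,
  # which forwards them only to instances that pass the health check.
  backend_service = google_compute_region_backend_service.backend.id
}
```

An internet-facing setup would use a different load balancing scheme and additional resources, so treat this as one possible completion rather than the only one.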

Process Table

Step	Action	Resource State	Result
1	Create instance template	Template created	Ready for instances
2	Create instance group manager with target_size=2	2 instances created	Instances running
3	Create health check on port 80	Health check active	Monitors instance health
4	Create backend service linked to instance group	Backend service active	Load balancer can route traffic
5	Load balancer receives request	Instances healthy	Request routed to instance 1
6	Instance 1 fails health check	Instance 1 marked unhealthy	Traffic rerouted to instance 2
7	Instance 1 recovers	Instance 1 healthy again	Traffic balanced between instances
8	Scale instance group to 3	3 instances running	More capacity for requests
9	Load balancer distributes requests	All instances healthy	Requests balanced across 3 instances
10	End of walkthrough	All systems healthy	High availability maintained

💡 The walkthrough ends here: the service stays available because healthy instances keep serving requests.
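
Step 8 can be done by hand (changing target_size from 2 to 3) or left to an autoscaler. The following is a sketch of the second option, not part of the original sample; the autoscaler name, zone, replica limits, and CPU target are assumed values chosen to match the table:

```hcl
# Sketch only: let GCP grow the group from 2 to 3 instances under load.
# Name, zone, limits, and the 60% CPU target are assumed example values.
resource "google_compute_autoscaler" "autoscaler" {
  name   = "my-autoscaler"
  zone   = "us-central1-a"
  target = google_compute_instance_group_manager.igm.id

  autoscaling_policy {
    min_replicas    = 2  # never drop below the two instances from step 2
    max_replicas    = 3  # allows the third instance from step 8
    cooldown_period = 60 # seconds to wait before trusting a new instance's metrics

    cpu_utilization {
      target = 0.6 # add instances when average CPU exceeds 60%
    }
  }
}
```

When an autoscaler manages the group, target_size is normally left out of the instance group manager so the two do not compete over the instance count.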

Status Tracker

Variable	Start	After Step 2	After Step 6	After Step 7	After Step 8	Final
instance_count	0	2	2	2	3	3
instance_1_health	n/a	healthy	unhealthy	healthy	healthy	healthy
instance_2_health	n/a	healthy	healthy	healthy	healthy	healthy
instance_3_health	n/a	n/a	n/a	n/a	healthy	healthy
traffic_distribution	none	balanced between 2	all to instance 2	balanced between 2	balanced between 3	balanced between 3
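
In the tracker, instance 1 goes from unhealthy back to healthy. The load balancer's health check only stops routing traffic to a failed instance; it does not repair it. One way to make recovery automatic is auto-healing on the managed instance group. The sketch below is a variant of the igm resource from the sample (it would replace that resource, not sit beside it); the zone and the 300-second delay are assumed values:

```hcl
# Sketch only: the sample's instance group manager with auto-healing added.
# Zone and initial_delay_sec are assumed example values.
resource "google_compute_instance_group_manager" "igm" {
  name               = "my-instance-group"
  base_instance_name = "my-instance"
  zone               = "us-central1-a"
  target_size        = 2

  version {
    instance_template = google_compute_instance_template.template.self_link
  }

  # Recreate any instance that keeps failing the health check.
  auto_healing_policies {
    health_check      = google_compute_health_check.hc.id
    initial_delay_sec = 300 # give a new instance time to boot before probing counts
  }
}
```

Reusing the load-balancing health check keeps the sketch short. In practice a separate, more tolerant health check is often used for auto-healing so that brief slowdowns do not trigger instance recreation.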

Key Moments - 3 Insights

Why does traffic stop going to instance 1 at step 6?

What happens when the instance group scales from 2 to 3 instances at step 8?

How does the health check affect high availability?
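
On the health check question, one practical detail is worth a sketch: probes can only mark an instance healthy if they are allowed to reach it. Google Cloud health check probes come from the ranges 130.211.0.0/22 and 35.191.0.0/16, so a firewall rule like the following is usually needed. The rule name is an assumed placeholder, and the port matches the sample's health check:

```hcl
# Sketch only: allow Google Cloud health check probes to reach port 80.
# The rule name is an assumed example value.
resource "google_compute_firewall" "allow_health_checks" {
  name    = "allow-health-checks"
  network = "default"

  allow {
    protocol = "tcp"
    ports    = ["80"]
  }

  # Source ranges used by Google Cloud health check probers.
  source_ranges = ["130.211.0.0/22", "35.191.0.0/16"]
}
```

Without such a rule, every instance can appear unhealthy even though it is running, and the load balancer has no healthy instance to route to.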

Visual Quiz - 3 Questions

Test your understanding

Look at the Process Table at step 6. What is the health status of instance 1?

A. Healthy

B. Unhealthy

C. Unknown

D. Terminated

Concept Snapshot

High availability uses multiple instances behind a load balancer.
Health checks monitor instance status.
Unhealthy instances are removed from traffic rotation.
Scaling adds more instances for capacity.
This setup keeps the service available when individual instances fail.
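
How quickly an unhealthy instance leaves the rotation depends on the health check's timing settings. The sketch below is a variant of the sample's hc resource with those settings written out; the values shown are the provider defaults, made explicit here for illustration:

```hcl
# Sketch only: the sample's health check with its timing made explicit.
resource "google_compute_health_check" "hc" {
  name = "my-health-check"

  check_interval_sec  = 5 # probe every 5 seconds
  timeout_sec         = 5 # a probe fails if no response arrives within 5 seconds
  healthy_threshold   = 2 # consecutive successes before an instance counts as healthy
  unhealthy_threshold = 2 # consecutive failures before an instance counts as unhealthy

  tcp_health_check {
    port = 80
  }
}
```

With these values, an instance that stops responding is marked unhealthy after two failed probes in a row. Lower thresholds react faster but are more likely to flag a healthy instance during a short hiccup.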

Full Transcript

High availability configuration in cloud infrastructure means setting up multiple instances of a service so that if one fails, others can take over. A load balancer receives user requests and sends them to healthy instances only. Health checks continuously monitor each instance's status. If an instance becomes unhealthy, the load balancer stops sending traffic to it and redirects to healthy ones. Scaling the instance group adds more instances to handle more requests. This process ensures the service stays available without interruption.
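
The sample places both instances in a single zone, so a zone-wide outage would take them down together. A regional managed instance group spreads instances across zones. This is a sketch of that alternative, not part of the original sample; the resource name, region, and zone list are assumed values:

```hcl
# Sketch only: spread the group's instances across two zones in one region.
# Name, region, and zones are assumed example values.
resource "google_compute_region_instance_group_manager" "regional_igm" {
  name                      = "my-regional-instance-group"
  base_instance_name        = "my-instance"
  region                    = "us-central1"
  distribution_policy_zones = ["us-central1-a", "us-central1-b"]
  target_size               = 2

  version {
    instance_template = google_compute_instance_template.template.self_link
  }
}
```

The backend service would then point at this group's instance_group attribute instead of the zonal one.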