Request-based auto scaling in GCP - Step-by-Step Execution

Process Flow - Request-based auto scaling

Incoming Requests

↓

Monitor Request Rate

↓

Compare with Threshold

↓

Scale Up or Down

↓

Adjust Instances

↓

Handle Requests with New Capacity

↓

Repeat Monitoring

The system watches incoming requests, checks whether the request count is above an upper threshold or below a lower threshold, then adds or removes an instance so that capacity follows the load.

Execution Sample

1. Monitor request count every minute
2. If requests > 1000, add 1 instance
3. If requests < 500, remove 1 instance
4. Keep instances between 1 and 5

This simple rule adjusts the number of instances based on request volume to keep performance steady.
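The four rules above can be sketched as a single function. This is a minimal illustration in Python, not GCP's autoscaler API: the function name `next_instance_count` and the constant names are hypothetical, and the thresholds and limits are copied from the rules.

```python
# Thresholds and limits taken from the four rules above.
SCALE_UP_THRESHOLD = 1000    # add an instance when requests exceed this
SCALE_DOWN_THRESHOLD = 500   # remove an instance when requests fall below this
MIN_INSTANCES = 1
MAX_INSTANCES = 5


def next_instance_count(requests: int, instances: int) -> int:
    """Return the instance count after one monitoring interval.

    Hypothetical helper for illustration only.
    """
    if requests > SCALE_UP_THRESHOLD:
        instances += 1
    elif requests < SCALE_DOWN_THRESHOLD:
        instances -= 1
    # Clamp so the count never leaves the allowed range.
    return max(MIN_INSTANCES, min(MAX_INSTANCES, instances))


print(next_instance_count(1200, 2))  # above the upper threshold: scale up
print(next_instance_count(5000, 5))  # already at the maximum: no change
```

Because both comparisons are strict, a count of exactly 1000 or exactly 500 leaves the number of instances unchanged.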

Process Table

Minute	Request Count	Condition (>1000?)	Condition (<500?)	Action	Instances Before	Instances After
1	800	No	No	No change	2	2
2	1200	Yes	No	Scale Up +1	2	3
3	1500	Yes	No	Scale Up +1	3	4
4	400	No	Yes	Scale Down -1	4	3
5	300	No	Yes	Scale Down -1	3	2
6	2000	Yes	No	Scale Up +1	2	3
7	600	No	No	No change	3	3
8	450	No	Yes	Scale Down -1	3	2
9	100	No	Yes	Scale Down -1	2	1
10	1100	Yes	No	Scale Up +1	1	2
11	400	No	Yes	Scale Down -1	2	1
12	3000	Yes	No	Scale Up +1	1	2
13	3500	Yes	No	Scale Up +1	2	3
14	4000	Yes	No	Scale Up +1	3	4
15	4500	Yes	No	Scale Up +1	4	5
16	5000	Yes	No	No change (max reached)	5	5
17	200	No	Yes	Scale Down -1	5	4
18	100	No	Yes	Scale Down -1	4	3
19	50	No	Yes	Scale Down -1	3	2
20	400	No	Yes	Scale Down -1	2	1
21	300	No	Yes	No change (min reached)	1	1

💡 At minute 21, instances are at minimum (1), so no further scale down occurs despite low requests.

Status Tracker

Variable	Start	After 1	After 2	After 3	After 4	After 5	After 6	After 7	After 8	After 9	After 10	After 11	After 12	After 13	After 14	After 15	After 16	After 17	After 18	After 19	After 20	After 21
Instances	2	2	3	4	3	2	3	3	2	1	2	1	2	3	4	5	5	4	3	2	1	1
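The Instances row above can be reproduced by replaying the request counts from the Process Table through the scaling rule. The sketch below assumes a starting count of 2 instances, as in the table; the names `step` and `simulate` are hypothetical and chosen only for this example.

```python
def step(requests, instances, low=500, high=1000, min_n=1, max_n=5):
    """Apply the scaling rule once and clamp to the allowed range."""
    if requests > high:
        instances += 1
    elif requests < low:
        instances -= 1
    return max(min_n, min(max_n, instances))


def simulate(request_counts, start=2):
    """Return the instance count at the start and after each minute."""
    history = [start]
    for requests in request_counts:
        history.append(step(requests, history[-1]))
    return history


# Request counts for minutes 1 to 21, copied from the Process Table.
REQUEST_COUNTS = [
    800, 1200, 1500, 400, 300, 2000, 600, 450, 100, 1100, 400,
    3000, 3500, 4000, 4500, 5000, 200, 100, 50, 400, 300,
]

history = simulate(REQUEST_COUNTS)
for minute, count in enumerate(history):
    label = "Start" if minute == 0 else f"After {minute}"
    print(f"{label}: {count}")
```

The first entry of `history` is the starting count, and each later entry corresponds to one "After N" column of the Status Tracker.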

Key Moments - 3 Insights

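Why does the number of instances not go below 1 even when requests are very low?
The minimum is set to 1 (rule 4), so a scale-down at 1 instance is skipped, as at minute 21 in the Process Table.

What happens when the request count is exactly 1000 or 500?
Neither condition is met, because the rules use strict comparisons (> 1000 and < 500), so the instance count does not change.

Why does scaling up stop at 5 instances even if requests keep increasing?
The maximum is set to 5 (rule 4), so at minute 16 the count stays at 5 even though requests reach 5000.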

Visual Quiz - 3 Questions

Test your understanding

Look at the Process Table at minute 4. What action is taken, and why?

A. Scale Up by 1 because requests > 1000

B. No change because requests are between 500 and 1000

C. Scale Down by 1 because requests < 500

D. Scale Up by 2 because requests are very high

Concept Snapshot

Request-based auto scaling watches incoming requests.
If requests go above a high threshold, it adds instances.
If requests fall below a low threshold, it removes instances.
Instances stay within set minimum and maximum limits.
This keeps service responsive and cost-effective.

Full Transcript

Request-based auto scaling adjusts the number of server instances based on how many requests come in. The system checks the request count every minute. If the count is higher than the upper threshold, it adds an instance to handle the load. If the count is lower than the lower threshold, it removes an instance to save resources. The number of instances never goes below the minimum or above the maximum, which keeps the service available and the cost bounded. This process repeats continuously to match capacity with demand.