
Request-based auto scaling in GCP - Step-by-Step Execution

Process Flow - Request-based auto scaling
Incoming Requests
Monitor Request Rate
Compare with Threshold
Scale Up or Down
Adjust Instances
Handle Requests with New Capacity
Repeat Monitoring
The system watches incoming requests, checks if they are above or below a set limit, then adds or removes instances to handle the load efficiently.
Execution Sample
1. Monitor request count every minute
2. If requests > 1000, add 1 instance
3. If requests < 500, remove 1 instance
4. Keep instances between 1 and 5
This simple rule adjusts the number of instances based on request volume to keep performance steady.
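The four rules above can be sketched as a single function. This is a minimal illustration of the rule as written here, not a call into any GCP API; the constant names and `next_instance_count` are hypothetical names chosen for this example.

```python
# Thresholds and limits taken from the four rules above.
SCALE_UP_THRESHOLD = 1000    # add an instance when requests exceed this
SCALE_DOWN_THRESHOLD = 500   # remove an instance when requests fall below this
MIN_INSTANCES = 1
MAX_INSTANCES = 5


def next_instance_count(requests_per_minute: int, instances: int) -> int:
    """Return the instance count after one monitoring interval."""
    if requests_per_minute > SCALE_UP_THRESHOLD:
        instances += 1
    elif requests_per_minute < SCALE_DOWN_THRESHOLD:
        instances -= 1
    # Rule 4: keep the result between the minimum and maximum.
    return max(MIN_INSTANCES, min(MAX_INSTANCES, instances))


print(next_instance_count(1200, 2))  # 3 (scale up)
print(next_instance_count(400, 4))   # 3 (scale down)
print(next_instance_count(800, 2))   # 2 (no change)
```

Applying the clamp last means the function never returns a value outside the 1 to 5 range, whatever the request count is.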
Process Table
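| Minute | Request Count | Condition (>1000?) | Condition (<500?) | Action | Instances Before | Instances After |
|---|---|---|---|---|---|---|
| 1 | 800 | No | No | No change | 2 | 2 |
| 2 | 1200 | Yes | No | Scale Up +1 | 2 | 3 |
| 3 | 1500 | Yes | No | Scale Up +1 | 3 | 4 |
| 4 | 400 | No | Yes | Scale Down -1 | 4 | 3 |
| 5 | 300 | No | Yes | Scale Down -1 | 3 | 2 |
| 6 | 2000 | Yes | No | Scale Up +1 | 2 | 3 |
| 7 | 600 | No | No | No change | 3 | 3 |
| 8 | 450 | No | Yes | Scale Down -1 | 3 | 2 |
| 9 | 100 | No | Yes | Scale Down -1 | 2 | 1 |
| 10 | 1100 | Yes | No | Scale Up +1 | 1 | 2 |
| 11 | 400 | No | Yes | Scale Down -1 | 2 | 1 |
| 12 | 3000 | Yes | No | Scale Up +1 | 1 | 2 |
| 13 | 3500 | Yes | No | Scale Up +1 | 2 | 3 |
| 14 | 4000 | Yes | No | Scale Up +1 | 3 | 4 |
| 15 | 4500 | Yes | No | Scale Up +1 | 4 | 5 |
| 16 | 5000 | Yes | No | No change (max reached) | 5 | 5 |
| 17 | 200 | No | Yes | Scale Down -1 | 5 | 4 |
| 18 | 100 | No | Yes | Scale Down -1 | 4 | 3 |
| 19 | 50 | No | Yes | Scale Down -1 | 3 | 2 |
| 20 | 400 | No | Yes | Scale Down -1 | 2 | 1 |
| 21 | 300 | No | Yes | No change (min reached) | 1 | 1 |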
💡 At minute 21, instances are at minimum (1), so no further scale down occurs despite low requests.
Status Tracker
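| Variable | Start | After 1 | After 2 | After 3 | After 4 | After 5 | After 6 | After 7 | After 8 | After 9 | After 10 | After 11 | After 12 | After 13 | After 14 | After 15 | After 16 | After 17 | After 18 | After 19 | After 20 | After 21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Instances | 2 | 2 | 3 | 4 | 3 | 2 | 3 | 3 | 2 | 1 | 2 | 1 | 2 | 3 | 4 | 5 | 5 | 4 | 3 | 2 | 1 | 1 |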
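The Instances values in the Status Tracker can be reproduced by replaying the 21 request counts from the Process Table through the rule. The sketch below assumes the starting count of 2 shown in the tracker; `simulate` and its parameter names are hypothetical and exist only for this illustration.

```python
def simulate(request_counts, start_instances=2, up=1000, down=500, lo=1, hi=5):
    """Replay per-minute request counts and return the instance history.

    The first element is the starting count; each later element is the
    count after that minute's scaling decision.
    """
    history = [start_instances]
    instances = start_instances
    for requests in request_counts:
        if requests > up:
            instances = min(hi, instances + 1)   # capped at the maximum
        elif requests < down:
            instances = max(lo, instances - 1)   # floored at the minimum
        history.append(instances)
    return history


# Request counts for minutes 1 to 21, copied from the Process Table.
REQUESTS = [800, 1200, 1500, 400, 300, 2000, 600, 450, 100, 1100, 400,
            3000, 3500, 4000, 4500, 5000, 200, 100, 50, 400, 300]

history = simulate(REQUESTS)
print(history)
# [2, 2, 3, 4, 3, 2, 3, 3, 2, 1, 2, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 1]
```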
Key Moments - 3 Insights
Why does the number of instances not go below 1 even when requests are very low?
The system has a minimum limit of 1 instance to ensure the service is always available. In the Process Table, minutes 9, 11, and 20 scale down to exactly 1 instance, and at minute 21 the count stays at 1 even though requests are still below 500.
What happens when the request count is exactly 1000 or 500?
The conditions check strictly greater than 1000 and strictly less than 500, so at exactly 1000 or 500 no scaling action occurs. Any count from 500 to 1000 inclusive is treated the same way, as at minute 7 (600 requests) in the Process Table.
Why does scaling up stop at 5 instances even if requests keep increasing?
There is a maximum limit of 5 instances to control costs and resource use, as seen at minute 16 where requests are high but instances remain at 5.
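The three insights above can be checked with a small helper that names the action the rule would take. It is a sketch of the rule described in this walkthrough, with the hypothetical function `scaling_action` returning labels that mirror the Action column of the Process Table.

```python
def scaling_action(requests, instances, up=1000, down=500, lo=1, hi=5):
    """Describe the scaling decision for one monitoring interval."""
    if requests > up:
        # Above the high threshold: grow unless already at the maximum.
        return "scale up" if instances < hi else "no change (max reached)"
    if requests < down:
        # Below the low threshold: shrink unless already at the minimum.
        return "scale down" if instances > lo else "no change (min reached)"
    # Between the thresholds, boundaries included.
    return "no change"


print(scaling_action(300, 1))    # no change (min reached)
print(scaling_action(1000, 3))   # no change
print(scaling_action(500, 3))    # no change
print(scaling_action(5000, 5))   # no change (max reached)
```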
Visual Quiz - 3 Questions
Test your understanding
Look at the Process Table at minute 4. What action is taken and why?
A. Scale Up by 1 because requests > 1000
B. No change because requests are between 500 and 1000
C. Scale Down by 1 because requests < 500
D. Scale Up by 2 because requests are very high
💡 Hint
Check the 'Request Count' and 'Condition (<500?)' columns at minute 4 in the Process Table.
At which minute does the number of instances first reach the maximum limit?
A. Minute 13
B. Minute 15
C. Minute 16
D. Minute 14
💡 Hint
Look at the 'Instances After' column in the Process Table and find when it first hits 5.
If the minimum number of instances were set to 2 instead of 1, what would happen at minute 21?
A. Instances would stay at 2
B. Instances would scale down to 0
C. Instances would stay at 1
D. Instances would scale up to 3
💡 Hint
Refer to the Instances row in the Status Tracker and the minimum-limit rule explained in Key Moments.
Concept Snapshot
Request-based auto scaling watches incoming requests.
If requests go above a high threshold, it adds instances.
If requests fall below a low threshold, it removes instances.
Instances stay within set minimum and maximum limits.
This keeps service responsive and cost-effective.
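To make the thresholds and limits in this snapshot adjustable, the rule can be wrapped in a small configuration object. This is one possible sketch, not GCP's own configuration format; the class `ScalingPolicy` and its field names are hypothetical, and the defaults are the values used throughout this walkthrough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical container for the thresholds and limits of the rule."""
    scale_up_above: int = 1000
    scale_down_below: int = 500
    min_instances: int = 1
    max_instances: int = 5

    def step(self, requests: int, instances: int) -> int:
        """Return the instance count after one monitoring interval."""
        if requests > self.scale_up_above:
            instances += 1
        elif requests < self.scale_down_below:
            instances -= 1
        return max(self.min_instances, min(self.max_instances, instances))


default_policy = ScalingPolicy()
higher_floor = ScalingPolicy(min_instances=2)

print(default_policy.step(300, 2))  # 1
print(higher_floor.step(300, 2))    # 2
```

Keeping the limits in one place makes it easy to compare how different minimums or maximums change the outcome for the same request counts.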
Full Transcript
Request-based auto scaling is a way to adjust the number of server instances based on how many requests come in. The system checks the request count every minute. If the count is higher than a set limit, it adds an instance to handle the load. If the count is lower than another limit, it removes an instance to save resources. The number of instances never goes below a minimum or above a maximum to keep the service stable and cost-controlled. This process repeats continuously to match capacity with demand.