GCP · cloud · ~30 mins

Request-based auto scaling in GCP - Mini Project: Build & Apply

Request-based Auto Scaling on Google Cloud Run

📖 Scenario: You are managing a web application deployed on Google Cloud Run. The app needs to automatically adjust the number of container instances based on incoming HTTP request load to save costs and maintain performance.

🎯 Goal: Build a Cloud Run service configuration that enables request-based auto scaling by setting a maximum number of container instances and a per-instance concurrency limit.

📋 What You'll Learn

Create a Cloud Run service configuration dictionary with the exact service name and image URL

Add a configuration variable for maximum container instances

Set the concurrency limit to control how many requests each container handles

Complete the service configuration with the auto scaling settings

💡 Why This Matters

🌍 Real World

Cloud Run adds container instances as request load rises and removes them as it falls, so you pay for capacity only while it is needed and the app stays responsive during traffic spikes.
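The relationship between request load and instance count can be approximated with a short calculation. The sketch below is a simplified model, not Cloud Run's actual algorithm: it assumes the instance count depends only on the number of in-flight requests and the per-instance concurrency limit, while the real autoscaler also weighs other signals such as CPU utilization. The function name estimate_instances is hypothetical.

```python
import math


def estimate_instances(concurrent_requests: int, concurrency: int, max_instances: int) -> int:
    """Roughly estimate how many instances a given request load needs.

    Simplified model (an assumption for illustration): one instance per
    `concurrency` in-flight requests, capped at `max_instances`.
    """
    if concurrency <= 0 or max_instances <= 0:
        raise ValueError("concurrency and max_instances must be positive")
    if concurrent_requests <= 0:
        # An idle service can scale to zero (assumes no minimum instance count is set).
        return 0
    needed = math.ceil(concurrent_requests / concurrency)
    # The configured maximum is a hard ceiling on the instance count.
    return min(needed, max_instances)


print(estimate_instances(200, 80, 5))   # 3
print(estimate_instances(1000, 80, 5))  # 5 (capped by max_instances)
```

With the values used in this project, 200 simultaneous requests fit on 3 instances, and any load above 400 simultaneous requests is held at the 5-instance cap.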

💼 Career

Understanding request-based auto scaling is essential for cloud engineers and developers managing scalable web applications on Google Cloud.


Create initial Cloud Run service configuration

Create a dictionary called cloud_run_service with these exact entries: "name": "projects/my-project/locations/us-central1/services/my-service" and "image": "gcr.io/my-project/my-app-image:latest".


# Create the cloud_run_service dictionary with name and image
# Your code here

Hint: Use a Python dictionary with the keys name and image, exactly as shown.
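The name value in this step follows the layout projects/<project>/locations/<location>/services/<service>. As a minimal sketch of how that string breaks down, the helper below splits it into its three parts. The function parse_service_name and the pattern are hypothetical and written only for this lesson's example value.

```python
import re

# Pattern for the resource-name layout used in this lesson:
# projects/<project>/locations/<location>/services/<service>
_NAME_PATTERN = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/services/(?P<service>[^/]+)$"
)


def parse_service_name(name: str) -> dict:
    """Split a full service name into project, location, and service."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected service name: {name!r}")
    return match.groupdict()


parts = parse_service_name(
    "projects/my-project/locations/us-central1/services/my-service"
)
print(parts)
# {'project': 'my-project', 'location': 'us-central1', 'service': 'my-service'}
```

Seeing the parts separately makes it easier to spot a typo in the project ID or region before the value goes into the dictionary.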

Add maximum container instances configuration

Add a key max_instances to the cloud_run_service dictionary and set it to 5. This value caps how many container instances the service may run at once.


cloud_run_service = {
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest"
}
# Add max_instances key with value 5 to cloud_run_service
# Your code here

Hint: Add the max_instances key with the value 5 inside the dictionary.
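In plain Python there are two common ways to add a key to a dictionary, and both produce the same result. The sketch below compares them; the variable names literal_form and assigned_form are only for illustration, and the lesson's checker may expect the key to be written inside the dictionary literal, as the hint says.

```python
# Option 1: write the key directly in the dictionary literal.
literal_form = {
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest",
    "max_instances": 5,
}

# Option 2: assign the key after the dictionary already exists.
assigned_form = {
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest",
}
assigned_form["max_instances"] = 5

# Dictionaries compare equal when they hold the same keys and values.
print(literal_form == assigned_form)  # True
```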

Set concurrency limit for request handling

Add a key concurrency to the cloud_run_service dictionary and set it to 80 to control how many requests each container can handle simultaneously.


cloud_run_service = {
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest",
    "max_instances": 5
}
# Add concurrency key with value 80 to cloud_run_service
# Your code here

Hint: Add the concurrency key with the value 80 inside the dictionary.
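Together, max_instances and concurrency set an upper bound on how many requests the service can work on at the same time. A minimal sketch of that arithmetic follows; the helper name max_in_flight_requests is hypothetical.

```python
def max_in_flight_requests(config: dict) -> int:
    """Upper bound on simultaneous requests: instances times requests per instance."""
    return config["max_instances"] * config["concurrency"]


cloud_run_service = {
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest",
    "max_instances": 5,
    "concurrency": 80,
}

print(max_in_flight_requests(cloud_run_service))  # 400
```

With 5 instances handling 80 requests each, the ceiling is 400 simultaneous requests. Traffic above that may be queued or rejected, so both values are worth sizing against expected peak load.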

Complete the Cloud Run service configuration with auto scaling settings

Add a nested dictionary under the key autoscaling in cloud_run_service with the exact entry "target_concurrency": 80.


cloud_run_service = {
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest",
    "max_instances": 5,
    "concurrency": 80
}
# Add autoscaling dictionary with target_concurrency 80
# Your code here

Hint: Inside cloud_run_service, add the autoscaling key with a dictionary value containing "target_concurrency": 80.
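The dictionary in this project is a teaching model rather than a file Cloud Run reads directly. As a rough sketch of how its values could map onto a real deployment, the helper below builds an argument list for gcloud run deploy using the --project, --region, --image, --max-instances, and --concurrency flags. The function to_gcloud_args is hypothetical, and the autoscaling key is left out because it is this lesson's own structure with no flag assumed for it.

```python
def to_gcloud_args(config: dict) -> list:
    """Build a `gcloud run deploy` argument list from the lesson's dictionary."""
    # Name layout: projects/<project>/locations/<location>/services/<service>
    _, project, _, location, _, service = config["name"].split("/")
    return [
        "gcloud", "run", "deploy", service,
        "--project", project,
        "--region", location,
        "--image", config["image"],
        "--max-instances", str(config["max_instances"]),
        "--concurrency", str(config["concurrency"]),
    ]


args = to_gcloud_args({
    "name": "projects/my-project/locations/us-central1/services/my-service",
    "image": "gcr.io/my-project/my-app-image:latest",
    "max_instances": 5,
    "concurrency": 80,
})

# Print the command for review; nothing is executed here.
print(" ".join(args))
```

Building the command as a list keeps each value a separate argument, which avoids quoting problems if the list is later passed to a process runner.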