GCP · cloud · ~30 mins

Concurrency and scaling in GCP - Mini Project: Build & Apply

Concurrency and Scaling with Google Cloud Run
📖 Scenario: You are building a simple web service that needs to handle multiple users at the same time. To do this efficiently, you will use Google Cloud Run, which automatically scales your service based on demand.
In this project, you will create a Cloud Run service configuration that sets concurrency and scaling limits, controlling how many requests each instance can handle and how many instances can run.
🎯 Goal: Create a Cloud Run service configuration YAML file that sets the concurrency to 10 requests per instance and limits the maximum number of instances to 5.
📋 What You'll Learn
Create a YAML file named cloudrun-service.yaml for Cloud Run service configuration.
Set the concurrency to exactly 10 in the configuration.
Set the maximum number of instances to exactly 5.
Include the required metadata and spec fields for a valid Cloud Run service.
💡 Why This Matters
🌍 Real World
Cloud Run is used to deploy containerized web services that automatically scale based on traffic. Setting concurrency and max instances helps control performance and cost.
💼 Career
Understanding concurrency and scaling in Cloud Run is essential for cloud engineers and developers to build efficient, cost-effective serverless applications.
1
Create the basic Cloud Run service YAML structure
Create a YAML file named cloudrun-service.yaml with the basic structure for a Cloud Run service. Include apiVersion set to serving.knative.dev/v1, kind set to Service, and a metadata section with name set to my-service.
Hint: Start by defining the API version, kind, and metadata with the service name.
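
A minimal sketch of what cloudrun-service.yaml might look like after this step, using only the values named above:

```yaml
# cloudrun-service.yaml, step 1: API version, kind, and service name only
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
```

This is not yet deployable on its own, since the service still lacks a spec describing what to run.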

2
Add the spec and template sections for the service
Add a spec section with a template inside it. Under template, add a nested spec, and inside that a containers list with one container. Set the container's image to gcr.io/cloudrun/hello.
Hint: Remember to nest spec and template properly and add the container image.
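
The nesting described in this step can be sketched as follows. Note that there are two spec keys at different levels: the outer one belongs to the service, the inner one to the revision template.

```yaml
# cloudrun-service.yaml, step 2: service spec with one container
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:                  # service-level spec
  template:
    spec:              # revision-level spec
      containers:
        - image: gcr.io/cloudrun/hello   # list with a single container
```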

3
Set concurrency to 10 requests per instance
Inside the template section, add a metadata block containing annotations. Add the annotation autoscaling.knative.dev/concurrency with the value "10", quoted so that YAML treats it as a string rather than a number.
Hint: Annotations go under template.metadata. The concurrency value must be a string, so write it in quotes.
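
A sketch of the file after this step, following the annotation key exactly as the exercise gives it. One caveat worth verifying before real use: the Cloud Run YAML reference documents per-instance concurrency as the containerConcurrency field under template.spec, so treat the annotation key below as this exercise's convention rather than a confirmed Cloud Run setting.

```yaml
# cloudrun-service.yaml, step 3: concurrency annotation added
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    metadata:
      annotations:
        # Key taken from the exercise text; value quoted so it stays a string.
        autoscaling.knative.dev/concurrency: "10"
    spec:
      containers:
        - image: gcr.io/cloudrun/hello
```

Without the quotes, YAML would parse 10 as an integer, and annotation values are required to be strings.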

4
Limit maximum instances to 5
Add a second annotation alongside autoscaling.knative.dev/concurrency inside template.metadata.annotations. Set autoscaling.knative.dev/maxScale to "5", quoted as a string, to limit the maximum number of instances.
Hint: Make sure the maxScale value is a string and sits alongside the concurrency annotation.
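
Putting the four steps together, the finished file might look like the sketch below. Every value comes from the steps above; the concurrency annotation key is reproduced as the exercise states it and should be checked against the current Cloud Run documentation.

```yaml
# cloudrun-service.yaml, step 4: complete configuration
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/concurrency: "10"   # requests per instance (key as given in the exercise)
        autoscaling.knative.dev/maxScale: "5"       # upper bound on running instances
    spec:
      containers:
        - image: gcr.io/cloudrun/hello
```

With 10 requests per instance and at most 5 instances, the service would handle up to 10 × 5 = 50 requests at the same time. A file like this is typically applied with gcloud run services replace cloudrun-service.yaml, assuming a project and region are already configured.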