GCP | Cloud | ~30 mins

Concurrency and scaling in GCP - Mini Project: Build & Apply


Concurrency and Scaling with Google Cloud Run

📖 Scenario: You are building a simple web service that must handle many users at the same time. To do this efficiently, you will use Google Cloud Run, which automatically adds and removes instances of your service based on demand.

In this project, you will create a Cloud Run service configuration that sets concurrency and scaling limits: how many requests each instance can handle at once, and how many instances can run.

🎯 Goal: Create a Cloud Run service configuration YAML file that sets the concurrency to 10 requests per instance and limits the maximum number of instances to 5.
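The two limits in the goal combine into a ceiling on how many requests the service can handle at once. The sketch below works through that arithmetic. It is a simplified model, not Cloud Run's actual autoscaling algorithm, which also weighs other signals such as CPU utilization; the function names are hypothetical and exist only for illustration.

```python
import math


def max_in_flight(concurrency: int, max_instances: int) -> int:
    """Upper bound on requests handled at once across all instances."""
    return concurrency * max_instances


def instances_needed(in_flight: int, concurrency: int, max_instances: int) -> int:
    """Instances required for a given load, capped at the configured maximum.

    Simplified assumption: load is spread evenly and only request
    concurrency drives scaling.
    """
    wanted = math.ceil(in_flight / concurrency)
    return min(wanted, max_instances)


# Values from this project: 10 requests per instance, at most 5 instances.
print(max_in_flight(10, 5))         # 50
print(instances_needed(23, 10, 5))  # 3
print(instances_needed(80, 10, 5))  # 5 (capped by the instance limit)
```

Under this model, any load above 50 simultaneous requests exceeds what the configured service can process at once.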

📋 What You'll Learn

Create a YAML file named cloudrun-service.yaml for Cloud Run service configuration.

Set the concurrency to exactly 10 in the configuration.

Set the maximum number of instances to exactly 5.

Include the required metadata and spec fields for a valid Cloud Run service.

💡 Why This Matters

🌍 Real World

Cloud Run is used to deploy containerized web services that automatically scale based on traffic. Setting concurrency and max instances helps control performance and cost.

💼 Career

Understanding concurrency and scaling in Cloud Run is essential for cloud engineers and developers to build efficient, cost-effective serverless applications.


Create the basic Cloud Run service YAML structure

Create a YAML file named cloudrun-service.yaml with the basic structure for a Cloud Run service. Include apiVersion set to serving.knative.dev/v1, kind set to Service, and a metadata section with name set to my-service.

GCP

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service

Hint: Start by defining the API version, kind, and metadata with the service name.

Add the spec and template sections for the service

Add a spec section containing a template. Under template, add a nested spec with a containers list that holds one container, and set that container's image to gcr.io/cloudrun/hello.

GCP

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containers:
      - image: gcr.io/cloudrun/hello

Hint: Remember to nest spec and template properly and add the container image.

Set concurrency to 10 requests per instance

Inside template.spec, alongside containers, add the containerConcurrency field with the integer value 10. This field sets the maximum number of requests a single instance handles at once.

GCP

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    spec:
      containerConcurrency: 10
      containers:
      - image: gcr.io/cloudrun/hello

Hint: containerConcurrency goes under template.spec, at the same level as containers. Its value is an integer, so do not quote it.

Limit maximum instances to 5

Inside the template section, add metadata with annotations. Add the annotation autoscaling.knative.dev/maxScale with the value 5 as a string to limit the maximum number of instances.

GCP

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "5"
    spec:
      containerConcurrency: 10
      containers:
      - image: gcr.io/cloudrun/hello

Hint: Annotations go under template.metadata. Annotation values must be strings, so quote the 5.
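To picture what a per-instance concurrency limit of 10 means, the sketch below simulates a single instance with an asyncio semaphore. This is only an illustration of the idea of a concurrency cap, not how Cloud Run enforces it; `simulate_instance` and the fake 10 ms of work are assumptions made for the example.

```python
import asyncio

CONCURRENCY_LIMIT = 10  # per-instance limit used in this project


async def simulate_instance(total_requests: int, limit: int = CONCURRENCY_LIMIT) -> int:
    """Push fake requests through one 'instance' and return the peak in-flight count."""
    gate = asyncio.Semaphore(limit)  # admits at most `limit` requests at a time
    in_flight = 0
    peak = 0

    async def handle_request() -> None:
        nonlocal in_flight, peak
        async with gate:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # pretend to do some work
            in_flight -= 1

    await asyncio.gather(*(handle_request() for _ in range(total_requests)))
    return peak


# 25 requests arrive at once, but no more than 10 are ever processed together.
peak_in_flight = asyncio.run(simulate_instance(total_requests=25))
print(peak_in_flight)  # 10
```

In the simulation the extra requests simply wait their turn. On Cloud Run, load beyond one instance's limit is what prompts additional instances to start, up to the configured maximum of 5.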