
How to Set Concurrency in Cloud Run for Google Cloud

To set concurrency in Cloud Run, pass the --concurrency flag to the gcloud run deploy command or change the setting in the Cloud Console. The value caps how many requests each container instance handles at the same time, which determines how many instances Cloud Run needs to serve a given load.
📐

Syntax

The concurrency setting controls how many requests a single Cloud Run container instance can process simultaneously.

Use the following syntax with the Google Cloud CLI:

  • gcloud run deploy SERVICE_NAME --concurrency=NUMBER

Where:

  • SERVICE_NAME is your Cloud Run service name.
  • NUMBER is the maximum concurrent requests per container instance (1 to 1000).
bash
gcloud run deploy SERVICE_NAME --concurrency=NUMBER
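
The same flag can also be applied to a service that is already deployed, and the stored value can be read back afterwards. The sketch below wraps both steps in two small helper functions. The function names set_concurrency and get_concurrency are hypothetical, and my-service and us-central1 are placeholder values; it assumes the gcloud CLI is installed and authenticated.

```shell
#!/bin/sh
# Placeholder values; override them through the environment.
SERVICE="${SERVICE:-my-service}"
REGION="${REGION:-us-central1}"

# Hypothetical helper: change concurrency on an existing service.
# Updating the setting creates a new revision of the service.
set_concurrency() {
  gcloud run services update "$SERVICE" --region="$REGION" --concurrency="$1"
}

# Hypothetical helper: print the concurrency stored in the service template.
get_concurrency() {
  gcloud run services describe "$SERVICE" --region="$REGION" \
    --format='value(spec.template.spec.containerConcurrency)'
}
```

For example, running set_concurrency 20 followed by get_concurrency should print the value that was just applied.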
💻

Example

This example deploys a Cloud Run service named hello-world with concurrency set to 10. This means each container instance will handle up to 10 requests at the same time.

bash
gcloud run deploy hello-world --image=gcr.io/cloudrun/hello --concurrency=10 --region=us-central1 --platform=managed
Output
Deploying container to Cloud Run service [hello-world] in project [YOUR_PROJECT_ID] region [us-central1]
Done.
Service [hello-world] revision [hello-world-00001-abc] has been deployed and is serving 100 percent of traffic.
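
To get a feel for what a concurrency of 10 means under load, a rough lower bound on the number of instances is the number of simultaneous requests divided by the concurrency, rounded up. The sketch below does that arithmetic. The function name min_instances and the figure of 95 simultaneous requests are made up for illustration, and real autoscaling depends on other factors as well, so treat the result as an estimate rather than a prediction.

```shell
#!/bin/sh
# min_instances IN_FLIGHT CONCURRENCY
# Prints ceil(IN_FLIGHT / CONCURRENCY): the fewest instances that can hold
# IN_FLIGHT simultaneous requests at the given concurrency setting.
min_instances() {
  in_flight=$1
  concurrency=$2
  echo $(( (in_flight + concurrency - 1) / concurrency ))
}

min_instances 95 10   # concurrency from the example above: prints 10
min_instances 95 1    # one request per instance: prints 95
min_instances 95 80   # the default concurrency: prints 2
```

The comparison shows why a low concurrency value tends to require more instances for the same traffic.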
⚠️

Common Pitfalls

Watch for these issues when setting concurrency:

  • Setting concurrency higher than your container can handle causes slow response times, because the requests compete for the same CPU and memory.
  • Setting concurrency to 1 disables request multiplexing. Every simultaneous request then needs its own instance, which can increase costs.
  • Omitting the --concurrency flag on a new service leaves concurrency at the default of 80, which may not suit your app.
  • Choosing a value that does not match how many parallel requests your container can actually handle.
bash
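# Wrong: 0 is outside the valid range of 1 to 1000
gcloud run deploy my-service --concurrency=0

# Right: 1 is the lowest valid value (one request at a time per instance)
gcloud run deploy my-service --concurrency=1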
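
An out-of-range value like the one above can be caught before calling gcloud at all. The following is a minimal sketch of such a check, assuming the 1 to 1000 range stated in this article. The function name is_valid_concurrency and the CONCURRENCY variable are hypothetical, and the check covers only plain numeric values.

```shell
#!/bin/sh
# is_valid_concurrency VALUE
# Succeeds only if VALUE is a whole number between 1 and 1000.
is_valid_concurrency() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 1000 ]
}

# Hypothetical usage: print the command only when the value passes the check.
CONCURRENCY="${CONCURRENCY:-1}"
if is_valid_concurrency "$CONCURRENCY"; then
  echo "OK: gcloud run deploy my-service --concurrency=$CONCURRENCY"
else
  echo "Invalid concurrency: $CONCURRENCY (expected 1 to 1000)" >&2
fi
```

The usage part only echoes the command. In a real script, the echo would be replaced by the actual gcloud run deploy call with your image and region.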
📊

Quick Reference

| Option | Description | Notes |
|---|---|---|
| --concurrency=NUMBER | Sets max concurrent requests per container instance | Valid range: 1 to 1000 |
| Default | 80 | If not set, concurrency defaults to 80 |
| Concurrency=1 | One request at a time per instance | Good for non-thread-safe apps |
| Concurrency>1 | Multiple requests at once | Better resource use if app supports it |

Key Takeaways

  • Use the --concurrency flag with gcloud run deploy to set concurrency.
  • Concurrency controls how many requests a container instance handles at once.
  • Set concurrency based on your app's ability to handle parallel requests.
  • Default concurrency is 80 if not specified.
  • Concurrency=1 means one request at a time, which can increase costs.