How to Set Concurrency in Cloud Run for Google Cloud
To set concurrency in
Cloud Run, use the --concurrency flag with the gcloud run deploy command or set it in the Cloud Console. This controls how many requests each container instance can handle at the same time, improving resource use and performance.Syntax
The concurrency setting controls how many requests a single Cloud Run container instance can process simultaneously.
Use the following syntax with the Google Cloud CLI:
gcloud run deploy SERVICE_NAME --concurrency=NUMBER
Where:
SERVICE_NAMEis your Cloud Run service name.NUMBERis the maximum concurrent requests per container instance (1 to 1000).
bash
gcloud run deploy SERVICE_NAME --concurrency=NUMBER
Example
This example deploys a Cloud Run service named hello-world with concurrency set to 10. This means each container instance will handle up to 10 requests at the same time.
bash
gcloud run deploy hello-world --image=gcr.io/cloudrun/hello --concurrency=10 --region=us-central1 --platform=managedOutput
Deploying container to Cloud Run service [hello-world] in project [YOUR_PROJECT_ID] region [us-central1]
Done.
Service [hello-world] revision [hello-world-00001-abc] has been deployed and is serving 100 percent of traffic.
Common Pitfalls
Common mistakes when setting concurrency include:
- Setting concurrency too high can cause slow response times if your container cannot handle many requests at once.
- Setting concurrency to 1 disables request multiplexing, which can increase costs.
- Forgetting to specify the
--concurrencyflag defaults concurrency to 80. - Not matching concurrency settings with your container's actual ability to handle parallel requests.
bash
Wrong: gcloud run deploy my-service --concurrency=0 Right: gcloud run deploy my-service --concurrency=1
Quick Reference
| Option | Description | Notes |
|---|---|---|
| --concurrency=NUMBER | Sets max concurrent requests per container instance | Valid range: 1 to 1000 |
| Default | 80 | If not set, concurrency defaults to 80 |
| Concurrency=1 | One request at a time per instance | Good for non-thread-safe apps |
| Concurrency>1 | Multiple requests at once | Better resource use if app supports it |
Key Takeaways
Use the --concurrency flag with gcloud run deploy to set concurrency.
Concurrency controls how many requests a container instance handles at once.
Set concurrency based on your app's ability to handle parallel requests.
Default concurrency is 80 if not specified.
Concurrency=1 means one request at a time, which can increase costs.