What happens when a Cloud Run service receives a sudden spike of 1,000 simultaneous requests?
Think about how serverless platforms handle sudden traffic.
Cloud Run automatically scales out instances based on incoming request volume (up to the service's configured maximum), so it can absorb sudden spikes without manual intervention.
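How far Cloud Run scales out under a spike is bounded by the per-instance concurrency and the instance cap. A minimal sketch, assuming a service named `api` and an image already pushed to your project's registry (both placeholders):

```shell
# Each instance serves up to 80 requests at once; Cloud Run adds
# instances (up to 50) as concurrent load grows.
gcloud run deploy api \
  --image=gcr.io/my-project/api:latest \
  --concurrency=80 \
  --max-instances=50 \
  --region=us-central1
```

With concurrency 80, roughly 1,000 simultaneous requests would need about 13 instances, well within the 50-instance cap here.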
You want to deploy a new version of your Cloud Run service and gradually send 20% of traffic to it while keeping 80% on the current version. How can you achieve this?
Cloud Run supports gradual rollouts using built-in features.
Cloud Run allows you to split traffic between revisions of the same service by percentage, enabling gradual rollouts without extra infrastructure.
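A common pattern is to deploy the new revision with no traffic, then shift a percentage to it. A sketch, where `api` and the image tag are placeholders:

```shell
# Deploy the new revision but keep 100% of traffic on the current one.
gcloud run deploy api \
  --image=gcr.io/my-project/api:v2 \
  --no-traffic \
  --tag=candidate \
  --region=us-central1

# Shift 20% of traffic to the newest revision; the remaining 80%
# stays on the previously serving revision.
gcloud run services update-traffic api \
  --to-revisions=LATEST=20 \
  --region=us-central1
```

If the new revision misbehaves, the same `update-traffic` command can route traffic back without a redeploy.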
You want to restrict access to your Cloud Run service so only authenticated users from your Google Workspace domain can invoke it. Which configuration achieves this?
Think about how Cloud Run integrates with IAM and Google identities.
Cloud Run supports IAM-based authentication. Requiring authentication on the service and granting the Cloud Run Invoker role (roles/run.invoker) to your Google Workspace domain restricts access to authenticated users from that domain.
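The two pieces of that configuration can be sketched as follows; `api` and `example.com` are placeholders for your service and Workspace domain:

```shell
# Reject unauthenticated requests at the service level.
gcloud run deploy api \
  --image=gcr.io/my-project/api:latest \
  --no-allow-unauthenticated \
  --region=us-central1

# Allow any authenticated identity in the Workspace domain to invoke it.
gcloud run services add-iam-policy-binding api \
  --member="domain:example.com" \
  --role="roles/run.invoker" \
  --region=us-central1
```

Callers must then present a valid identity token (for example via `Authorization: Bearer`) from an account in that domain.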
If you set the concurrency of a Cloud Run service to 1, what is the effect on instance scaling and request handling?
Consider how concurrency affects instance utilization and scaling.
Concurrency set to 1 means each instance processes one request at a time, so Cloud Run scales out more instances to handle multiple requests concurrently.
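Setting single-request concurrency is a one-flag change on an existing service. A sketch, with `api` as a placeholder service name:

```shell
# Each instance now handles exactly one request at a time;
# N concurrent requests force Cloud Run toward N instances.
gcloud run services update api \
  --concurrency=1 \
  --region=us-central1
```

This is useful for CPU-bound or non-thread-safe workloads, at the cost of more instances (and potentially more cold starts) under load.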
Your Cloud Run service experiences high latency on the first request after a period of inactivity. Which approach best reduces this cold start delay?
Think about how to keep instances ready without manual pings.
Setting a minimum number of instances keeps some instances warm even when the service is idle, reducing cold start latency without scheduled pings or other manual workarounds.
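Keeping instances warm is likewise a single flag. A sketch, again using the placeholder service name `api`:

```shell
# Keep at least one instance running at all times so the first
# request after idle does not pay a cold start.
gcloud run services update api \
  --min-instances=1 \
  --region=us-central1
```

Note that minimum instances are billed while idle, so this trades a small steady cost for consistent first-request latency.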