In Google Cloud Run, when request-based auto scaling is enabled, what happens if your service suddenly receives a large number of requests?
Think about how Cloud Run balances load and scales instances smoothly.
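Cloud Run uses request-based auto scaling to add instances as the number of concurrent requests per instance approaches the configured concurrency limit. It does not create every instance up front, and it does not reject requests outright: requests wait briefly while new instances start, although they can fail with HTTP 429 if the service is already at its max-instances limit and the backlog cannot be served.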
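As a rough illustration of the scaling rule in this explanation, the sketch below estimates how many instances a given number of in-flight requests would call for. The function name `instances_needed` and the traffic numbers are hypothetical, and the real autoscaler also considers other signals such as CPU utilization, so this is a simplified model rather than Cloud Run's actual algorithm.

```python
import math


def instances_needed(in_flight_requests: int, concurrency: int) -> int:
    """Estimate instances required so that no instance exceeds its concurrency limit."""
    if concurrency < 1:
        raise ValueError("concurrency must be at least 1")
    return math.ceil(in_flight_requests / concurrency)


# Hypothetical traffic ramp: in-flight requests sampled at successive moments.
ramp = [10, 200, 800, 1600]

# With an assumed concurrency of 80, the instance count grows with the load
# instead of jumping straight to the final value.
plan = [instances_needed(load, concurrency=80) for load in ramp]
print(plan)  # [1, 3, 10, 20]
```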
You want to limit the maximum number of instances your Cloud Run service can scale to during high traffic. Which configuration setting controls this limit?
Think about which setting limits the upper bound of instances.
The max-instances setting limits how many instances Cloud Run can create, controlling maximum scale.
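To make the effect of the cap concrete, here is a minimal sketch that applies a max-instances limit to a simple concurrency-based estimate. The helper `scale_with_cap` and its inputs are hypothetical. The "overflow" figure only counts requests beyond total capacity; it does not model how Cloud Run queues or rejects them.

```python
import math
from typing import Tuple


def scale_with_cap(
    in_flight_requests: int, concurrency: int, max_instances: int
) -> Tuple[int, int]:
    """Return (instances used, requests beyond total capacity) under a max-instances cap."""
    wanted = math.ceil(in_flight_requests / concurrency)
    used = min(wanted, max_instances)  # the cap is the upper bound on scale
    overflow = max(0, in_flight_requests - used * concurrency)
    return used, overflow


print(scale_with_cap(500, concurrency=80, max_instances=10))   # (7, 0): under the cap
print(scale_with_cap(2000, concurrency=80, max_instances=10))  # (10, 1200): cap reached
```

With the `gcloud` CLI, this limit is set through the `--max-instances` flag on `gcloud run deploy`.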
You have a Cloud Run service that processes image uploads. To optimize cost and performance, which architecture choice best supports efficient request-based auto scaling?
Think about balancing instance count and concurrency for cost and performance.
A higher concurrency setting combined with a moderate max-instances limit lets each instance serve several uploads at once, which uses resources efficiently and keeps scaling smooth.
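The trade-off can be estimated with Little's law (in-flight requests = arrival rate × average latency). The sketch below compares the steady-state instance count at different concurrency settings. The function name, the rate of 100 requests per second, and the 0.5 s latency are assumed values for illustration only.

```python
import math


def steady_state_instances(
    requests_per_second: float, avg_latency_seconds: float, concurrency: int
) -> int:
    """Estimate instances needed at steady state for a given concurrency setting."""
    in_flight = requests_per_second * avg_latency_seconds  # Little's law
    return max(1, math.ceil(in_flight / concurrency))


# Assumed workload: 100 requests/s, each taking 0.5 s on average.
results = {c: steady_state_instances(100, 0.5, c) for c in (1, 10, 80)}
print(results)  # {1: 50, 10: 5, 80: 1}
```

Fewer instances for the same traffic usually means lower cost, but image processing tends to be CPU-heavy, so latency should be measured before raising concurrency very far.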
When using request-based auto scaling in Cloud Run, what is a key security best practice to protect your service from sudden traffic spikes caused by malicious requests?
Think about how to prevent bad traffic from causing resource waste.
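Placing Cloud Armor in front of the service (through an external Application Load Balancer) filters malicious traffic before it reaches Cloud Run, which protects the service and keeps unwanted requests from driving up instance count and cost.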
You want to optimize your Cloud Run service to balance cost and latency during unpredictable traffic. Which combination of settings and practices achieves this best?
Think about trade-offs between keeping instances warm and controlling max scale.
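Moderate concurrency and max-instances values, combined with a minimum number of instances kept warm, balance latency and cost effectively: the warm instances avoid cold starts, and the cap bounds spending during spikes.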
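One way to express this combination is as a single deploy command. The sketch below only builds the argument list for `gcloud run deploy` using its `--concurrency`, `--min-instances`, and `--max-instances` flags. It does not run anything. The service name, image path, and numeric values are placeholders chosen for illustration, not recommendations.

```python
from typing import List


def build_deploy_command(
    service: str,
    image: str,
    concurrency: int,
    min_instances: int,
    max_instances: int,
) -> List[str]:
    """Build a `gcloud run deploy` argument list with the scaling settings applied."""
    if not 0 <= min_instances <= max_instances:
        raise ValueError("expected 0 <= min_instances <= max_instances")
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--concurrency={concurrency}",      # requests handled per instance
        f"--min-instances={min_instances}",  # instances kept warm to avoid cold starts
        f"--max-instances={max_instances}",  # upper bound on scale, and on cost
    ]


# Placeholder service and image names; the values are assumptions for the example.
cmd = build_deploy_command(
    "image-api", "gcr.io/my-project/image-api",
    concurrency=40, min_instances=1, max_instances=20,
)
print(" ".join(cmd))
```

The list form can be passed to `subprocess.run` if the command should actually be executed from a script.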