
Auto-scaling inference endpoints in MLOps - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding Auto-scaling Triggers

Which metric is most commonly used to trigger auto-scaling of inference endpoints in a cloud environment?

A. Time of day
B. Disk space usage
C. Number of active users on the website
D. CPU utilization percentage
💡 Hint

Think about what resource usage directly affects the ability to handle inference requests.

💻 Command Output
intermediate
Interpreting Auto-scaling CLI Output

Given the following CLI output from an auto-scaling tool monitoring an inference endpoint, what is the current number of active instances?

Endpoint: model-v1
Instances: 3
CPU Utilization: 75%
Scaling Status: Stable

A. 3
B. 75
C. 1
D. 0
💡 Hint

Look for the line that indicates how many instances are running.
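Reading a status report like the one in this problem can also be done programmatically. The following is a minimal sketch, assuming the tool prints one `Key: value` pair per line exactly as shown in the question; the function name `parse_status` is hypothetical and not part of any real CLI.

```python
def parse_status(text: str) -> dict[str, str]:
    """Parse 'Key: value' lines into a dictionary (assumed output format)."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip any line that has no colon
            fields[key.strip()] = value.strip()
    return fields


# Sample output copied from the question above.
status_output = """Endpoint: model-v1
Instances: 3
CPU Utilization: 75%
Scaling Status: Stable"""

status = parse_status(status_output)
instance_count = int(status["Instances"])
cpu_percent = int(status["CPU Utilization"].rstrip("%"))
print(instance_count, cpu_percent)
```

Keeping the instance count and the CPU percentage in separate fields avoids confusing the two numbers that appear in the output.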

Configuration
advanced
Configuring Auto-scaling Policy

Which YAML snippet correctly configures an auto-scaling policy that scales out when CPU usage exceeds 70% and scales in when it falls below 30%?

A
autoScaling:
  minInstances: 1
  maxInstances: 5
  targetCPUUtilizationPercentage: 50
B
autoScaling:
  minInstances: 1
  maxInstances: 5
  cpuUtilization:
    scaleOut: 70
    scaleIn: 30
C
autoScaling:
  minInstances: 1
  maxInstances: 5
  scaleOutCPUThreshold: 70
  scaleInCPUThreshold: 30
D
autoScaling:
  minInstances: 1
  maxInstances: 5
  targetCPUUtilizationPercentage: 70
💡 Hint

Look for explicit scale out and scale in thresholds.
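To make the difference between a single target value and two explicit thresholds concrete, here is a rough sketch of how a two-threshold policy might decide the next instance count. The class `ThresholdPolicy` and the function `desired_instances` are hypothetical illustrations, not the API of any particular cloud provider, and the one-instance step size is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ThresholdPolicy:
    # Hypothetical policy fields mirroring the values in the question.
    min_instances: int = 1
    max_instances: int = 5
    scale_out_cpu: float = 70.0  # scale out when CPU is above this
    scale_in_cpu: float = 30.0   # scale in when CPU is below this


def desired_instances(policy: ThresholdPolicy, current: int, cpu_percent: float) -> int:
    """Return the next instance count, moving by one instance at a time (assumed step)."""
    if cpu_percent > policy.scale_out_cpu:
        target = current + 1
    elif cpu_percent < policy.scale_in_cpu:
        target = current - 1
    else:
        target = current  # between the thresholds: no change
    # Never leave the configured [min_instances, max_instances] range.
    return max(policy.min_instances, min(policy.max_instances, target))


policy = ThresholdPolicy()
print(desired_instances(policy, current=3, cpu_percent=75.0))
print(desired_instances(policy, current=3, cpu_percent=50.0))
print(desired_instances(policy, current=3, cpu_percent=20.0))
```

The gap between the two thresholds keeps the endpoint from oscillating when CPU usage hovers around a single value.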

Troubleshoot
advanced
Diagnosing Auto-scaling Failure

An inference endpoint is not scaling out despite high CPU usage. Which of the following is the most likely cause?

A. The endpoint has no traffic
B. The maxInstances limit is set to 1
C. The CPU usage is below the scale-out threshold
D. The auto-scaling feature is disabled in the cloud provider
💡 Hint

Check if the scaling limits allow more instances to be created.
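A troubleshooting pass over the usual suspects can be written as a short checklist. The sketch below is only an illustration: `scale_out_blocker` and its parameters are hypothetical names, and a real platform would expose these settings through its own console or API.

```python
from typing import Optional


def scale_out_blocker(
    cpu_percent: float,
    scale_out_cpu: float,
    current_instances: int,
    max_instances: int,
    autoscaling_enabled: bool = True,
) -> Optional[str]:
    """Return the first reason a scale-out cannot happen, or None if nothing blocks it."""
    if not autoscaling_enabled:
        return "auto-scaling is disabled"
    if cpu_percent <= scale_out_cpu:
        return "CPU usage is not above the scale-out threshold"
    if current_instances >= max_instances:
        return "instance count is already at maxInstances"
    return None


# High CPU, scaling enabled, but the limit leaves no room for a second instance.
print(scale_out_blocker(cpu_percent=90.0, scale_out_cpu=70.0,
                        current_instances=1, max_instances=1))
```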

🔀 Workflow
expert
Auto-scaling Workflow for Inference Endpoint

Arrange the steps in the correct order for an auto-scaling workflow of an inference endpoint.

A. 1, 2, 3, 4
B. 2, 1, 3, 4
C. 1, 3, 2, 4
D. 1, 2, 4, 3
💡 Hint

Think about monitoring first, then triggering, then adding instances, then routing traffic.
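The sequence described in the hint can be simulated in a few lines. This is a toy sketch under stated assumptions: the function `autoscale_step`, the instance names, the 70% threshold, and the round-robin routing are all hypothetical and stand in for whatever a real platform does.

```python
from itertools import cycle


def autoscale_step(instances, cpu_percent, scale_out_cpu=70.0, max_instances=5):
    """Run one pass: check the observed metric, trigger if needed, add an instance."""
    # Monitor: cpu_percent is the metric observed for the endpoint.
    # Trigger: fire only when the metric crosses the threshold and capacity remains.
    if cpu_percent > scale_out_cpu and len(instances) < max_instances:
        # Add an instance (here just a new name in a new list).
        instances = instances + [f"instance-{len(instances) + 1}"]
    return instances


instances = autoscale_step(["instance-1"], cpu_percent=85.0)

# Route traffic: spread requests across all instances, including the new one.
router = cycle(instances)
routed = [next(router) for _ in range(4)]
print(routed)
```

Routing comes last because a new instance can only receive requests once it exists.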