Which metric is most commonly used to trigger auto-scaling of inference endpoints in a cloud environment?
Think about what resource usage directly affects the ability to handle inference requests.
CPU utilization is a direct indicator of how busy the inference endpoint is. When CPU usage is high, scaling out (adding instances) lets the endpoint handle more concurrent requests.
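The decision described above can be sketched as a simple threshold check. This is a minimal illustration, not a real autoscaler's API; the thresholds and the `desired_instances` helper are assumptions.

```python
SCALE_OUT_CPU = 70.0  # scale out above this utilization (%)
SCALE_IN_CPU = 30.0   # scale in below this utilization (%)

def desired_instances(current: int, cpu_percent: float,
                      min_instances: int = 1, max_instances: int = 10) -> int:
    """Return the instance count after one scaling decision."""
    if cpu_percent > SCALE_OUT_CPU:
        return min(current + 1, max_instances)  # busy: add an instance
    if cpu_percent < SCALE_IN_CPU:
        return max(current - 1, min_instances)  # idle: remove an instance
    return current  # within the stable band: no change
```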
Given the following CLI output from an auto-scaling tool monitoring an inference endpoint, what is the current number of active instances?
Endpoint: model-v1
Instances: 3
CPU Utilization: 75%
Scaling Status: Stable
Look for the line that indicates how many instances are running.
The line 'Instances: 3' shows that there are currently 3 active instances serving the endpoint.
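Reading this kind of status output programmatically is a common follow-up step. The sketch below parses the `Key: Value` lines from the sample; the field names mirror the output shown, but the monitoring tool itself is unspecified.

```python
def parse_status(output: str) -> dict:
    """Split 'Key: Value' lines into a dict of string fields."""
    fields = {}
    for line in output.strip().splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields

sample = """\
Endpoint: model-v1
Instances: 3
CPU Utilization: 75%
Scaling Status: Stable
"""
status = parse_status(sample)
active = int(status["Instances"])  # the current number of active instances
```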
Which YAML snippet correctly configures an auto-scaling policy to scale out when CPU usage exceeds 70% and scale in when below 30%?
Look for explicit scale out and scale in thresholds.
Option C explicitly sets both scaleOutCPUThreshold (70%) and scaleInCPUThreshold (30%), which are required to define when the endpoint scales out and when it scales in.
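A policy with both thresholds can be modeled and sanity-checked as below. The key names scaleOutCPUThreshold and scaleInCPUThreshold come from the question; representing the config as a Python mapping and the validation rule are assumptions for illustration.

```python
# Policy equivalent to the YAML in Option C, as a Python mapping.
policy = {
    "scaleOutCPUThreshold": 70,  # scale out when CPU exceeds 70%
    "scaleInCPUThreshold": 30,   # scale in when CPU drops below 30%
}

def validate(policy: dict) -> bool:
    """Both thresholds must be present, with the scale-in threshold
    strictly below the scale-out threshold (a stable band between them)."""
    out = policy.get("scaleOutCPUThreshold")
    inn = policy.get("scaleInCPUThreshold")
    return out is not None and inn is not None and inn < out
```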
An inference endpoint is not scaling out despite high CPU usage. Which of the following is the most likely cause?
Check if the scaling limits allow more instances to be created.
If maxInstances is set to 1, the autoscaler cannot add instances no matter how high CPU usage climbs, so scale-out never occurs.
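The failure mode can be shown with a small check: high CPU alone is not enough to scale out if the instance cap is already reached. The function name and parameters are illustrative, not a specific autoscaler's API.

```python
def can_scale_out(current: int, cpu_percent: float,
                  cpu_threshold: float, max_instances: int) -> bool:
    """Scale out only when CPU is over threshold AND the cap allows growth."""
    return cpu_percent > cpu_threshold and current < max_instances
```

With maxInstances set to 1, `can_scale_out(1, 95.0, 70.0, 1)` is False even at 95% CPU, which is exactly the stuck behavior described.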
Arrange the steps in the correct order for an auto-scaling workflow of an inference endpoint.
Think about monitoring first, then triggering, then adding instances, then routing traffic.
The workflow starts with monitoring metrics, then triggers scaling, adds instances, and finally distributes requests.
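The four stages above can be sketched as an ordered pipeline. The stage names paraphrase the answer; the function is a placeholder, not a real orchestration API.

```python
def autoscaling_workflow() -> list:
    """Return the auto-scaling stages in the order they run."""
    return [
        "monitor metrics",       # 1. collect CPU utilization from the endpoint
        "trigger scaling",       # 2. compare metrics against the thresholds
        "add instances",         # 3. provision new replicas on scale-out
        "distribute requests",   # 4. route traffic across all instances
    ]
```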