Prediction distribution monitoring in MLOps - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding prediction distribution monitoring purpose

What is the main goal of prediction distribution monitoring in an ML system?

A. To detect changes in the prediction distribution that may affect model performance
B. To track the number of API calls made to the prediction service
C. To monitor the hardware usage of the ML deployment environment
D. To optimize the training speed of the machine learning model
💡 Hint

Think about what can cause a model to perform worse after deployment.

💻 Command Output
intermediate
Interpreting output of a distribution drift detection tool

Given the following output from a drift detection tool monitoring prediction probabilities, what does it indicate?

{"drift_detected": true, "p_value": 0.01, "metric": "kolmogorov_smirnov"}
A. The tool failed to run due to a syntax error
B. The model predictions are exactly the same as baseline
C. The prediction distribution has significantly changed compared to baseline
D. The p_value indicates no significant change in distribution
💡 Hint

Recall that a low p-value means strong evidence against the null hypothesis.
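
As a rough illustration of how a tool could arrive at output like the JSON above, here is a minimal pure-Python sketch of a two-sample Kolmogorov-Smirnov check. The function names, the `alpha` parameter, and the sample data are hypothetical, and the p-value uses a common asymptotic approximation rather than an exact computation. A production pipeline would more likely call a statistics library.

```python
import math
from bisect import bisect_right


def ks_statistic(baseline, current):
    """Largest gap between the two empirical CDFs."""
    a, b = sorted(baseline), sorted(current)
    d = 0.0
    # Both CDFs are step functions, so the largest gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_p_value(d, n, m, terms=100):
    """Asymptotic two-sided p-value (an approximation, not exact)."""
    ne = n * m / (n + m)  # effective sample size
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.2:
        return 1.0  # the series below is numerically 1 in this region
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def ks_drift_report(baseline, current, alpha=0.05):
    """Hypothetical report shaped like the tool output in the question."""
    d = ks_statistic(baseline, current)
    p = ks_p_value(d, len(baseline), len(current))
    return {"drift_detected": p < alpha, "p_value": p,
            "metric": "kolmogorov_smirnov"}


# Synthetic data: baseline spread over [0, 1), current squeezed into [0.5, 1).
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]

report = ks_drift_report(baseline, shifted)   # drift expected
same = ks_drift_report(baseline, baseline)    # no drift expected
```

The null hypothesis is that both samples come from the same distribution, so a p-value below `alpha` is reported as drift.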

Configuration
advanced
Configuring a monitoring job for prediction distribution

Which configuration snippet correctly sets up a monitoring job to track prediction probability distribution using a Kolmogorov-Smirnov test every hour?

A
monitoring_job:
  frequency: every_minute
  metric: kolmogorov_smirnov
  data_source: model_weights
  alert_threshold: 0.05
B
monitoring_job:
  frequency: daily
  metric: accuracy
  data_source: input_features
  alert_threshold: 0.01
C
monitoring_job:
  frequency: hourly
  metric: mean_squared_error
  data_source: prediction_probabilities
  alert_threshold: 0.1
D
monitoring_job:
  frequency: hourly
  metric: kolmogorov_smirnov
  data_source: prediction_probabilities
  alert_threshold: 0.05
💡 Hint

Focus on frequency, metric type, and data source relevant to prediction distribution.

Troubleshoot
advanced
Diagnosing missing alerts in prediction distribution monitoring

An ML engineer notices no alerts are triggered despite clear changes in prediction distribution. Which is the most likely cause?

A. The alert threshold is set too high, preventing alerts from triggering
B. The prediction service is running on outdated hardware
C. The monitoring job frequency is set to daily instead of hourly
D. The model training data was too large
💡 Hint

Consider how alert thresholds affect sensitivity.
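
How a threshold suppresses alerts depends on what it is compared against. The sketch below is hypothetical (the function names do not come from any particular tool) and contrasts two common conventions: a p-value threshold, where an alert fires when the p-value falls below it, and a distance threshold, where an alert fires when the drift statistic rises above it.

```python
def alert_on_p_value(p_value: float, threshold: float) -> bool:
    """Alert when the test's p-value drops below the threshold."""
    return p_value < threshold


def alert_on_distance(distance: float, threshold: float) -> bool:
    """Alert when a drift distance (e.g. a KS statistic) exceeds the threshold."""
    return distance > threshold


# The same observed drift, evaluated against different threshold settings.
fires_p = alert_on_p_value(0.01, threshold=0.05)      # alert fires
silent_p = alert_on_p_value(0.01, threshold=0.001)    # threshold too strict
fires_d = alert_on_distance(0.3, threshold=0.1)       # alert fires
silent_d = alert_on_distance(0.3, threshold=0.9)      # threshold too high
```

Under the distance convention, a threshold set too high silences alerts. Under the p-value convention, the same effect comes from a threshold set too low.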

🔀 Workflow
expert
Steps to implement prediction distribution monitoring in production

What is the correct order of steps to implement prediction distribution monitoring for a deployed ML model?

A. 1,2,3,4
B. 1,3,2,4
C. 2,1,3,4
D. 3,1,2,4
💡 Hint

Think about what you need before deploying monitoring and alerting.

Practice

1. What is the main purpose of prediction distribution monitoring in MLOps?
easy
A. To monitor the training data quality only
B. To track changes in the model's output predictions over time
C. To improve the speed of model training
D. To increase the size of the prediction dataset

Solution

  1. Step 1: Understand prediction distribution monitoring

    It focuses on watching the outputs (predictions) of a model to detect changes or shifts.
  2. Step 2: Differentiate from other monitoring types

    It is not about training data quality or training speed but about output behavior over time.
  3. Final Answer:

    To track changes in the model's output predictions over time -> Option B
  4. Quick Check:

    Prediction monitoring = track output changes
Hint: Focus on what is monitored: model outputs, not inputs or speed
Common Mistakes:
  • Confusing prediction monitoring with data quality monitoring
  • Thinking it speeds up training
  • Assuming it increases dataset size
2. Which of the following is the correct way to calculate the distribution of predictions in Python using NumPy?
easy
A. np.sort(predictions, bins=10)
B. np.mean(predictions, bins=10)
C. np.sum(predictions, bins=10)
D. np.histogram(predictions, bins=10)

Solution

  1. Step 1: Identify the function for distribution calculation

    NumPy's np.histogram calculates the frequency distribution of values in bins.
  2. Step 2: Check other options

    np.mean calculates average, np.sum sums values, and np.sort sorts values, none calculate distribution.
  3. Final Answer:

    np.histogram(predictions, bins=10) -> Option D
  4. Quick Check:

    Distribution = histogram
Hint: Use np.histogram to get frequency counts in bins
Common Mistakes:
  • Using mean or sum instead of histogram for distribution
  • Trying to sort to get distribution
  • Passing wrong arguments to functions
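
For monitoring, it often helps to fix the bin range so that histograms from different time windows are directly comparable. A minimal sketch, assuming the predictions are probabilities in [0, 1] and the input is non-empty. The helper name `prediction_distribution` is hypothetical.

```python
import numpy as np


def prediction_distribution(predictions, bins=10):
    """Fraction of predictions in each equal-width bin on [0, 1]."""
    counts, edges = np.histogram(predictions, bins=bins, range=(0.0, 1.0))
    # Normalize counts so windows of different sizes can be compared.
    return counts / counts.sum(), edges


fractions, edges = prediction_distribution([0.05, 0.15, 0.15, 0.95], bins=10)
print(fractions)  # one fraction per bin, summing to 1
```

Without the explicit `range`, NumPy would derive the bin edges from each window's own minimum and maximum, so the bins of two windows would not line up.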
3. Given the following Python code snippet for monitoring prediction distribution, what will be the output?
import numpy as np
predictions = np.array([0.1, 0.4, 0.35, 0.8, 0.9])
hist, bins = np.histogram(predictions, bins=3)
print(hist)
medium
A. [3 1 1]
B. [1 2 2]
C. [2 1 2]
D. [2 2 1]

Solution

  1. Step 1: Understand bin edges

    With bins=3 and no explicit range, np.histogram splits the data range 0.1 to 0.9 into 3 equal-width bins of about 0.267 each: approx [0.1, 0.367), [0.367, 0.633), [0.633, 0.9].
  2. Step 2: Apply the edge rule

    Each bin includes its left edge and excludes its right edge, except the last bin, which includes both edges. So 0.9 falls in bin 3.
  3. Step 3: Count predictions in each bin

    Bin 1: 0.1, 0.35 -> 2 counts
    Bin 2: 0.4 -> 1 count
    Bin 3: 0.8, 0.9 -> 2 counts
  4. Final Answer:

    [2 1 2] -> Option C
  5. Quick Check:

    Histogram counts = [2,1,2]
Hint: Remember np.histogram includes left edge, excludes right edge except last bin
Common Mistakes:
  • Miscounting values on bin edges
  • Assuming bins include right edge
  • Confusing bin counts order
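
Printing the bin edges alongside the counts makes this kind of question easy to check. With `bins=3` and no explicit `range`, NumPy spaces the edges evenly between the minimum and maximum of the data. The snippet reuses the array from the question.

```python
import numpy as np

predictions = np.array([0.1, 0.4, 0.35, 0.8, 0.9])
hist, edges = np.histogram(predictions, bins=3)

print(hist)   # counts per bin: [2 1 2]
print(edges)  # four edges, approximately 0.1, 0.367, 0.633, 0.9
```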
4. You have this monitoring code snippet that throws an error:
import numpy as np
predictions = [0.2, 0.5, 0.7]
hist, bins = np.histogram(predictions, bins='five')
print(hist)
What is the cause of the error?
medium
A. The bins parameter must be an integer, a sequence of bin edges, or a supported estimator name; 'five' is none of these
B. The predictions list must be a NumPy array, not a list
C. The print statement syntax is incorrect
D. np.histogram does not accept more than 3 values

Solution

  1. Step 1: Check bins parameter type

    np.histogram expects bins as an integer, a sequence of bin edges, or the name of a supported bin-width estimator such as 'auto'. The string 'five' is none of these, so NumPy raises a ValueError.
  2. Step 2: Verify other parts

    Predictions can be a list or an array, the print syntax is correct, and np.histogram accepts input of any length.
  3. Final Answer:

    The bins parameter must be an integer, a sequence of bin edges, or a supported estimator name; 'five' is none of these -> Option A
  4. Quick Check:

    Bins must be an int, a sequence of edges, or a supported estimator name
Hint: A spelled-out number such as 'five' is not a valid bins value; pass the integer 5 instead
Common Mistakes:
  • Thinking list input causes error
  • Blaming print syntax
  • Assuming np.histogram limits input size
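
The difference between a supported estimator name and an arbitrary string can be seen directly. A short sketch using the list from the question:

```python
import numpy as np

predictions = [0.2, 0.5, 0.7]

# A supported estimator name lets NumPy choose the number of bins.
hist_auto, _ = np.histogram(predictions, bins="auto")

# An integer requests that many equal-width bins.
hist_five, _ = np.histogram(predictions, bins=5)

# An unrecognized string is rejected with a ValueError.
try:
    np.histogram(predictions, bins="five")
    error = None
except ValueError as exc:
    error = exc

print(type(error).__name__)
```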
5. You want to detect if your model's prediction distribution has shifted significantly from the baseline. Which approach is best to implement in your monitoring pipeline?
hard
A. Calculate the KL divergence between baseline and current prediction distributions regularly
B. Only check if the average prediction value changes
C. Retrain the model every day regardless of prediction changes
D. Ignore distribution changes and focus on input data monitoring

Solution

  1. Step 1: Understand distribution shift detection

    KL divergence quantifies how much one probability distribution differs from a reference distribution, which makes it well suited to detecting prediction shifts.
  2. Step 2: Evaluate other options

    Checking only average misses distribution shape changes; retraining blindly wastes resources; ignoring prediction changes misses key signals.
  3. Final Answer:

    Calculate the KL divergence between baseline and current prediction distributions regularly -> Option A
  4. Quick Check:

    Use KL divergence for distribution shift detection
Hint: Use KL divergence to compare distributions, not just averages
Common Mistakes:
  • Monitoring only average values
  • Retraining without monitoring
  • Ignoring prediction distribution shifts
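
A minimal sketch of such a check, assuming the predictions are probabilities in [0, 1]. The function name `kl_divergence`, the bin count, and the smoothing constant `eps` are illustrative choices rather than a standard, and the data is synthetic. The smoothing avoids division by zero when a bin is empty.

```python
import numpy as np


def kl_divergence(baseline, current, bins=10, eps=1e-6):
    """KL(current || baseline) over shared equal-width bins on [0, 1]."""
    p_counts, _ = np.histogram(current, bins=bins, range=(0.0, 1.0))
    q_counts, _ = np.histogram(baseline, bins=bins, range=(0.0, 1.0))
    # Smooth, then normalize the counts into probability vectors.
    p = (p_counts + eps) / (p_counts + eps).sum()
    q = (q_counts + eps) / (q_counts + eps).sum()
    return float(np.sum(p * np.log(p / q)))


rng = np.random.default_rng(0)
baseline = rng.beta(2, 5, size=5000)  # synthetic scores skewed toward 0
shifted = rng.beta(5, 2, size=5000)   # synthetic scores skewed toward 1

kl_same = kl_divergence(baseline, baseline)    # identical windows give 0
kl_shifted = kl_divergence(baseline, shifted)  # clearly larger under shift
print(kl_same, kl_shifted)
```

KL divergence is not symmetric, so the monitoring job should fix which window plays the role of the reference. The alert threshold on the resulting value has to be tuned per model.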