Challenge - 5 Problems

Model Drift Master

🧠 Conceptual

intermediate

What is model drift in machine learning?

Which of the following best describes model drift?

A. When the data distribution changes, causing the model's predictions to become less accurate

B. When the model's performance improves over time without retraining

C. When the model is trained on more data than originally planned

D. When the model's architecture is changed during training

❓ Metrics

intermediate

Detecting model drift using metrics

You have a classification model deployed in production. Which metric change would most likely indicate model drift?

A. Training loss decreases during model training

B. Precision and recall remain constant

C. Sudden drop in F1-score on new data compared to training data

D. Accuracy increases steadily over time
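
Metric monitoring of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming ground-truth labels for recent production data eventually become available. The helper names `f1_score_binary` and `f1_dropped`, the toy labels, and the `max_drop=0.1` tolerance are all hypothetical choices made for illustration, not part of any particular library.

```python
def f1_score_binary(y_true, y_pred):
    """F1-score for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_dropped(baseline_f1, current_f1, max_drop=0.1):
    """Return True when F1 fell by more than max_drop (an assumed tolerance)."""
    return (baseline_f1 - current_f1) > max_drop


# Toy data: predictions on a baseline window and on a recent window.
baseline = f1_score_binary([1, 1, 0, 0], [1, 1, 0, 0])
recent = f1_score_binary([1, 1, 0, 0], [1, 0, 1, 0])
print(baseline, recent, f1_dropped(baseline, recent))
```

In practice the tolerance would be tuned to the normal week-to-week variation of the metric rather than fixed in advance.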

❓ Predict Output

advanced

Output of drift detection code snippet

What is the output of this Python code that compares feature distributions to detect drift?

```python
import numpy as np
from scipy.stats import ks_2samp

reference = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
new_data = np.array([2, 3, 4, 5, 6, 7, 8, 9, 10])

stat, p_value = ks_2samp(reference, new_data)

if p_value < 0.05:
    print('Drift detected')
else:
    print('No drift detected')
```

A. TypeError

B. Drift detected

C. SyntaxError

D. No drift detected
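
As background, the statistic returned by a two-sample Kolmogorov-Smirnov test is the largest vertical gap between the two empirical cumulative distribution functions (ECDFs). The sketch below computes only that statistic in plain Python, on made-up samples. `ks_statistic` is a hypothetical helper written for illustration, and it does not compute a p-value.

```python
def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the ECDFs of two samples."""

    def ecdf(sample, x):
        # Fraction of the sample that is less than or equal to x.
        return sum(1 for v in sample if v <= x) / len(sample)

    # The gap can only change at observed values, so checking those is enough.
    points = sorted(set(sample_a) | set(sample_b))
    return max(abs(ecdf(sample_a, x) - ecdf(sample_b, x)) for x in points)


# Made-up samples: identical, partially overlapping, and fully separated.
print(ks_statistic([1, 2, 3, 4], [1, 2, 3, 4]))
print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
print(ks_statistic([1, 2, 3, 4], [11, 12, 13, 14]))
```

A statistic near 0 means the ECDFs almost coincide, and a statistic of 1 means the samples do not overlap at all. Whether a given gap counts as significant also depends on the sample sizes.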

❓ Hyperparameter

advanced

Choosing parameters for drift detection sensitivity

In a drift detection system using a statistical test, which parameter controls how sensitive the system is to detecting drift?

A. The significance level (alpha) threshold for the test

B. The number of features in the dataset

C. The learning rate of the model

D. The batch size of data processed
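
To make the decision step of a test-based detector concrete, the sketch below compares per-feature p-values against a threshold and reports which features would be flagged. The feature names, the p-values, and the helper `flag_drifted_features` are invented for illustration. In a real system the p-values would come from a statistical test such as `ks_2samp`.

```python
def flag_drifted_features(p_values, alpha):
    """Return the sorted names of features whose p-value is below alpha."""
    return sorted(name for name, p in p_values.items() if p < alpha)


# Hypothetical p-values from per-feature drift tests.
p_values = {"age": 0.03, "income": 0.20, "tenure": 0.008}

for alpha in (0.10, 0.05, 0.01):
    print(alpha, flag_drifted_features(p_values, alpha))
```

Running the same p-values through several thresholds shows how the set of flagged features changes with the threshold alone.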

🔧 Debug

expert

Why does this drift detection code fail to detect drift?

Consider this code snippet for detecting drift using the population stability index (PSI). Why does it fail to detect drift when the new data distribution changes significantly?

```python
import numpy as np

def psi(expected, actual, buckets=10):
    def scale_range(input, min, max):
        input = input - np.min(input)
        input = input / np.max(input) * (max - min)
        input = input + min
        return input

    breakpoints = np.arange(0, buckets + 1) / buckets * 100
    expected_percents = np.histogram(scale_range(expected, 0, 100), bins=breakpoints)[0] / len(expected)
    actual_percents = np.histogram(scale_range(actual, 0, 100), bins=breakpoints)[0] / len(actual)
    psi_value = np.sum((expected_percents - actual_percents) * np.log(expected_percents / actual_percents))
    return psi_value

reference = np.random.normal(0, 1, 1000)
new_data = np.random.normal(5, 1, 1000)
print(psi(reference, new_data))
```

A. The PSI formula is incorrect and should use absolute differences

B. The scale_range function rescales each array to its own minimum and maximum, which removes the shift between the two distributions before binning

C. The breakpoints array is not sorted, causing histogram errors

D. The new_data and reference arrays have different lengths, causing division errors
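
For comparison, a common way to compute PSI is to derive the bucket edges from the reference sample only and then count both samples against those same edges. The plain-Python sketch below follows that approach under stated assumptions: equal-width buckets, at least two distinct reference values, out-of-range values clamped into the end buckets, and a small floor `eps` to avoid taking the logarithm of zero. The name `psi_shared_bins` and the toy data are hypothetical.

```python
import math


def psi_shared_bins(expected, actual, buckets=10, eps=1e-6):
    """PSI with bucket edges taken from the reference (expected) sample only."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / buckets  # assumes hi > lo

    def bucket_shares(values):
        counts = [0] * buckets
        for v in values:
            # Values outside the reference range are clamped into the end buckets.
            idx = int((v - lo) / width)
            idx = max(0, min(buckets - 1, idx))
            counts[idx] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e = bucket_shares(expected)
    a = bucket_shares(actual)
    return sum((ei - ai) * math.log(ei / ai) for ei, ai in zip(e, a))


# Toy data: an evenly spread reference sample and a copy shifted by 5.
reference = [i / 10 for i in range(100)]
shifted = [x + 5 for x in reference]
print(psi_shared_bins(reference, reference))
print(psi_shared_bins(reference, shifted))
```

Because both samples share one set of edges, a shift in location shows up as mass moving between buckets instead of being normalized away.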
