You have a machine learning model deployed for predicting house prices. Over time, the model's accuracy drops significantly. What is the most likely reason you need to retrain the model?
Think about what happens when the world changes but the model stays the same.
When the distribution of incoming data shifts over time (data drift), the patterns the model learned no longer match reality and its predictions become less accurate. Retraining on recent data lets the model adapt.
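As a minimal sketch of this effect (the prices and the "model" here are invented for illustration), a predictor fit to old data degrades once the price distribution shifts:

```python
# Hypothetical illustration of data drift: a trivial model fit on old
# house prices performs much worse once the market distribution shifts.
old_prices = [200, 210, 190, 205, 195]          # training-era prices
new_prices = [300, 310, 290, 305, 295]          # market after drift

# The "model" simply predicts the mean of its training data
prediction = sum(old_prices) / len(old_prices)

def mean_abs_error(prices, pred):
    return sum(abs(p - pred) for p in prices) / len(prices)

print(mean_abs_error(old_prices, prediction))   # small error on old data
print(mean_abs_error(new_prices, prediction))   # much larger error after drift
```

Retraining on `new_prices` would pull the prediction back toward the current market.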
What will be the output of the following Python code snippet that simulates incremental retraining?
```python
class Model:
    def __init__(self):
        self.data_count = 0

    def train(self, new_data):
        self.data_count += len(new_data)

    def predict(self):
        return self.data_count

model = Model()
model.train([1, 2, 3])
model.train([4, 5])
print(model.predict())
```
Count how many items were added in total.
The model is trained twice: first with 3 items, then with 2. Each call adds the batch size to `data_count`, so `predict` returns the running total, 5.
You have a model that receives continuous streaming data. Which retraining strategy is best to keep the model updated without retraining from scratch every time?
Think about efficiency and keeping the model fresh with new data.
Incremental learning updates the model with new data without full retraining, making it efficient for streaming data.
After retraining a classification model, which metric change best indicates the retraining improved the model?
Look for improvements on data the model hasn't seen before.
Higher accuracy and lower loss on held-out validation data (data the model never trained on) show that retraining improved generalization, not just the fit to the training set.
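A short sketch of that comparison (assuming scikit-learn; the synthetic data and the "before"/"after" models are invented for illustration) evaluates both model versions on the same untouched validation split:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
X = rng.normal(size=(300, 2))
y = (X[:, 0] - X[:, 1] > 0).astype(int)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# "Before retraining": a model fit on only a small slice of the data
before = LogisticRegression().fit(X_train[:20], y_train[:20])

# "After retraining": the same model class fit on the full training set
after = LogisticRegression().fit(X_train, y_train)

# Compare both on the SAME held-out validation set neither model saw
print(f"before: {before.score(X_val, y_val):.2f}")
print(f"after:  {after.score(X_val, y_val):.2f}")
```

The key point is that both scores come from data excluded from training; only then does an improvement mean better generalization.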
Consider this retraining code snippet. What is the main issue causing overly optimistic validation results?
```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = [[i] for i in range(100)]
y = [0] * 50 + [1] * 50

# Split data
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Retrain model
model = LogisticRegression()
model.fit(X_train + X_val, y_train + y_val)

# Evaluate
score = model.score(X_val, y_val)
print(f"Validation accuracy: {score}")
```
Check if validation data is used during training.
The call `model.fit(X_train + X_val, y_train + y_val)` trains on the validation data too, so scoring on `X_val` measures performance on examples the model has already seen. This data leakage makes the reported validation accuracy overly optimistic rather than a true estimate of generalization.
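A corrected version of the same snippet fits only on the training split, so the validation score is an honest estimate:

```python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = [[i] for i in range(100)]
y = [0] * 50 + [1] * 50

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit ONLY on the training split; the validation set stays unseen
model = LogisticRegression()
model.fit(X_train, y_train)

score = model.score(X_val, y_val)
print(f"Validation accuracy: {score}")
```

Keeping the validation set out of `fit` is the one change needed; everything else in the original snippet was fine.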