Challenge - 5 Problems
Evaluation Metrics Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
❓ Predict Output
Intermediate · 2:00 remaining
Calculate accuracy from predictions and true labels
Given the true labels and predicted labels below, what is the accuracy of the model?
NLP
true_labels = [1, 0, 1, 1, 0, 0, 1]
pred_labels = [1, 0, 0, 1, 0, 1, 1]
correct = sum(t == p for t, p in zip(true_labels, pred_labels))
accuracy = correct / len(true_labels)
print(round(accuracy, 2))
💡 Hint
Count how many predictions match the true labels, then divide by total labels.
✗ Incorrect
Accuracy is the number of correct predictions divided by total predictions. Here, 5 out of 7 predictions are correct, so accuracy is 5/7 ≈ 0.71.
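The calculation can be cross-checked step by step; this is an illustrative sketch, not part of the challenge code:

```python
# Cross-check of the accuracy calculation above.
true_labels = [1, 0, 1, 1, 0, 0, 1]
pred_labels = [1, 0, 0, 1, 0, 1, 1]

# Compare element-wise; the mismatches are at indices 2 and 5.
matches = [t == p for t, p in zip(true_labels, pred_labels)]
correct = sum(matches)                    # 5 correct predictions
accuracy = correct / len(true_labels)     # 5 / 7
print(correct, round(accuracy, 2))        # 5 0.71
```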
🧠 Conceptual
Intermediate · 1:30 remaining
Understanding the F1 score
Which statement best describes the F1 score in classification tasks?
💡 Hint
Think about how F1 combines two important metrics to balance false positives and false negatives.
✗ Incorrect
F1 score combines precision and recall using the harmonic mean, giving a single metric that balances both false positives and false negatives.
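A small sketch of why the harmonic mean matters: unlike the arithmetic mean, it stays low whenever either precision or recall is low (the helper name `f1_score` here is illustrative, not a library call):

```python
# F1 as the harmonic mean of precision and recall.
def f1_score(precision: float, recall: float) -> float:
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# The harmonic mean punishes imbalance: high precision with
# low recall still yields a low F1.
print(round(f1_score(0.9, 0.1), 2))  # 0.18
print(round(f1_score(0.5, 0.5), 2))  # 0.5
```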
❓ Predict Output
Advanced · 2:00 remaining
Confusion matrix values from predictions
Given the true and predicted labels below, what is the value of True Positives (TP) in the confusion matrix?
NLP
true_labels = [0, 1, 1, 0, 1, 0, 1, 1]
pred_labels = [0, 1, 0, 0, 1, 1, 1, 0]
TP = sum(1 for t, p in zip(true_labels, pred_labels) if t == 1 and p == 1)
print(TP)
💡 Hint
Count how many times the true label and predicted label are both 1.
✗ Incorrect
True Positives are cases where the true label is 1 and the prediction is also 1. Here, that happens three times, so the code prints 3.
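For context, the same counting pattern extends to all four confusion-matrix cells; a quick sketch for the labels above:

```python
# All four confusion-matrix cells for the labels above.
true_labels = [0, 1, 1, 0, 1, 0, 1, 1]
pred_labels = [0, 1, 0, 0, 1, 1, 1, 0]

pairs = list(zip(true_labels, pred_labels))
TP = sum(1 for t, p in pairs if t == 1 and p == 1)  # true 1, predicted 1
TN = sum(1 for t, p in pairs if t == 0 and p == 0)  # true 0, predicted 0
FP = sum(1 for t, p in pairs if t == 0 and p == 1)  # true 0, predicted 1
FN = sum(1 for t, p in pairs if t == 1 and p == 0)  # true 1, predicted 0
print(TP, TN, FP, FN)  # 3 2 1 2
```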
❓ Model Choice
Advanced · 1:30 remaining
Choosing metric for imbalanced data
You have a dataset where 95% of samples belong to class A and 5% to class B. Which metric is best to evaluate your model's performance on class B?
💡 Hint
Think about which metric balances false positives and false negatives well for rare classes.
✗ Incorrect
Accuracy can be misleading with imbalanced data. F1 score balances precision and recall, making it better for rare classes.
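A sketch of the failure mode: a model that always predicts the majority class scores 95% accuracy on this split yet has an F1 of 0 on the rare class (labels 0 and 1 stand in for classes A and B):

```python
# Why accuracy misleads on imbalanced data: predict the majority
# class (0 = class A) for every sample.
true_labels = [0] * 95 + [1] * 5
pred_labels = [0] * 100

accuracy = sum(t == p for t, p in zip(true_labels, pred_labels)) / 100

tp = sum(1 for t, p in zip(true_labels, pred_labels) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(true_labels, pred_labels) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(true_labels, pred_labels) if t == 1 and p == 0)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
print(accuracy, f1)  # 0.95 0.0
```

High accuracy here says nothing about class B; F1 exposes that the model never finds it.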
❓ Metrics
Expert · 2:30 remaining
Calculate macro F1 score from class-wise precision and recall
Given the following precision and recall for three classes, what is the macro F1 score?
Class 1: precision=0.8, recall=0.6
Class 2: precision=0.7, recall=0.7
Class 3: precision=0.9, recall=0.5
💡 Hint
Calculate F1 for each class, then average them.
✗ Incorrect
F1 for each class = 2 * (precision * recall) / (precision + recall): Class 1 ≈ 0.69, Class 2 = 0.70, Class 3 ≈ 0.64. Averaging the three F1 scores gives a macro F1 of ≈ 0.68.
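The macro-averaging steps can be sketched directly from the per-class values above:

```python
# Macro F1 for the three classes: per-class F1, then a plain average.
scores = [(0.8, 0.6), (0.7, 0.7), (0.9, 0.5)]  # (precision, recall) per class
f1s = [2 * p * r / (p + r) for p, r in scores]  # ~[0.69, 0.70, 0.64]
macro_f1 = sum(f1s) / len(f1s)
print(round(macro_f1, 2))  # 0.68
```

Macro averaging weights every class equally, regardless of how many samples each class has.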