
Evaluation metrics (accuracy, F1, confusion matrix) in NLP - ML Experiment: Train & Evaluate

Experiment - Evaluation metrics (accuracy, F1, confusion matrix)
Problem: You have trained a text classification model to identify positive and negative movie reviews. The model's training accuracy is 90%, but you want to understand how well it performs on unseen data using evaluation metrics.
Current Metrics: Training accuracy: 90%, Validation accuracy: 85%
Issue: Accuracy alone does not give a full picture of model performance, especially if classes are imbalanced. You need to compute the F1 score and the confusion matrix to better evaluate the model.
Your Task
Calculate accuracy, F1 score, and confusion matrix on the validation set to better understand model performance.
Use sklearn metrics functions only.
Do not retrain or change the model.
Use the provided validation predictions and true labels.
Solution
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

# Example true labels and predicted labels for validation set
true_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# Calculate accuracy
acc = accuracy_score(true_labels, predicted_labels)

# Calculate F1 score (binary classification)
f1 = f1_score(true_labels, predicted_labels)

# Calculate confusion matrix
cm = confusion_matrix(true_labels, predicted_labels)

print(f"Accuracy: {acc:.2f}")
print(f"F1 Score: {f1:.2f}")
print("Confusion Matrix:")
print(cm)
Added code to compute accuracy, F1 score, and confusion matrix using sklearn.
Used example true and predicted labels to demonstrate metric calculations.
Printed results clearly for easy interpretation.
Results Interpretation

Before: Only accuracy was known (85% on validation).

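After: On the 10 example labels, accuracy is 80%, the F1 score is 0.80, and the confusion matrix shows 4 true negatives, 4 true positives, 1 false positive, and 1 false negative. These labels are illustrative, which is why the accuracy differs from the 85% reported for the validation set.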

Accuracy alone can be misleading. F1 score and confusion matrix provide deeper insight into model performance, especially for imbalanced classes.
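To make the imbalance point concrete, here is a minimal sketch in plain Python. The labels are hypothetical (95 negatives, 5 positives) and are not part of the experiment above; the "model" simply predicts the negative class every time.

```python
# Hypothetical imbalanced validation set: 95 negative (0) and 5 positive (1) reviews.
true_labels = [0] * 95 + [1] * 5
# A degenerate "model" that always predicts the majority (negative) class.
predicted_labels = [0] * 100

pairs = list(zip(true_labels, predicted_labels))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / len(pairs)

# F1 written in terms of counts; guard against a zero denominator.
denominator = 2 * tp + fp + fn
f1 = 2 * tp / denominator if denominator else 0.0

print(f"Accuracy: {accuracy:.2f}")
print(f"F1 Score: {f1:.2f}")
```

Accuracy looks strong because most samples are negative, while the F1 score for the positive class exposes that no positive review was ever found.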
Bonus Experiment
Try calculating precision and recall along with the existing metrics to get even more insight.
💡 Hint
Use precision_score and recall_score from sklearn.metrics with the same true and predicted labels.
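One possible way to approach the bonus experiment is sketched below. It reuses the example labels from the solution and assumes the default settings of precision_score and recall_score, which treat label 1 as the positive class.

```python
from sklearn.metrics import precision_score, recall_score

# Same example labels as in the solution above
true_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# Precision: of the reviews predicted positive, how many really are positive
precision = precision_score(true_labels, predicted_labels)

# Recall: of the truly positive reviews, how many the model found
recall = recall_score(true_labels, predicted_labels)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
```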

Practice

1. What does the accuracy metric measure in a classification model?
easy
A. The proportion of correct predictions out of all predictions
B. The balance between precision and recall
C. The number of false positives only
D. The total number of classes in the dataset

Solution

  1. Step 1: Understand accuracy definition

    Accuracy is defined as the number of correct predictions divided by the total number of predictions made.
  2. Step 2: Compare options with definition

    Only option A, the proportion of correct predictions out of all predictions, matches this definition.
  3. Final Answer:

    The proportion of correct predictions out of all predictions -> Option A
  4. Quick Check:

    Accuracy = Correct predictions / Total predictions [OK]
Hint: Accuracy = correct predictions divided by total predictions [OK]
Common Mistakes:
  • Confusing accuracy with F1 score
  • Thinking accuracy measures only false positives
  • Believing accuracy counts number of classes
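As an illustration of the definition, this short sketch counts matching predictions by hand. It reuses the example labels from the experiment's solution and needs no libraries.

```python
# Example labels from the experiment's solution
true_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# Count the positions where the prediction equals the true label
correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)

# Accuracy = correct predictions / total predictions
accuracy = correct / len(true_labels)

print(f"{correct} of {len(true_labels)} correct, accuracy = {accuracy:.2f}")
```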
2. Which of the following is the correct formula for F1 score?
easy
A. Precision + Recall
B. 2 * (Precision * Recall) / (Precision + Recall)
C. True Positives / Total Samples
D. True Negatives / (True Negatives + False Positives)

Solution

  1. Step 1: Recall F1 score formula

    F1 score is the harmonic mean of precision and recall, calculated as 2 times their product divided by their sum.
  2. Step 2: Match formula with options

    Option B, 2 * (Precision * Recall) / (Precision + Recall), is the harmonic mean of precision and recall. The other options are a plain sum, a fraction of true positives, and the specificity formula.
  3. Final Answer:

    2 * (Precision * Recall) / (Precision + Recall) -> Option B
  4. Quick Check:

    F1 = 2PR/(P+R) [OK]
Hint: F1 score = 2 * Precision * Recall / (Precision + Recall) [OK]
Common Mistakes:
  • Adding precision and recall instead of taking their harmonic mean
  • Using true positives over total samples as F1
  • Confusing F1 with specificity
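The formula can be sketched as a small function. The helper names f1_from_pr and f1_from_counts are hypothetical, chosen for this illustration. The second helper uses the equivalent count-based form 2TP / (2TP + FP + FN), and the counts are those from the experiment's example (TP = 4, FP = 1, FN = 1).

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * (precision * recall) / (precision + recall)


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Equivalent form of F1 written directly in terms of counts."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Counts from the experiment's example labels
tp, fp, fn = 4, 1, 1
precision = tp / (tp + fp)
recall = tp / (tp + fn)

f1_a = f1_from_pr(precision, recall)
f1_b = f1_from_counts(tp, fp, fn)

print(f"F1 from precision/recall: {f1_a:.2f}")
print(f"F1 from counts:           {f1_b:.2f}")
```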
3. Given the confusion matrix below for a binary classifier:
[[50, 10],
 [5, 35]]

What is the accuracy of the model?
medium
A. 75%
B. 70%
C. 90%
D. 85%

Solution

  1. Step 1: Identify confusion matrix values

    Using scikit-learn's layout [[TN, FP], [FN, TP]] (rows are true labels, columns are predicted labels): True Negatives (TN) = 50, False Positives (FP) = 10, False Negatives (FN) = 5, True Positives (TP) = 35.
  2. Step 2: Calculate accuracy

    Accuracy = (TP + TN) / (TP + FP + FN + TN) = (35 + 50) / (35 + 10 + 5 + 50) = 85 / 100 = 0.85 or 85%.
  3. Final Answer:

    85% -> Option D
  4. Quick Check:

    Accuracy = (TP+TN)/Total = 85/100 = 85% [OK]
Hint: Accuracy = (TP + TN) / total samples [OK]
Common Mistakes:
  • Adding false positives or false negatives to numerator
  • Calculating only TP / total samples
  • Mixing up TP and TN values
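The same calculation can be sketched in code. Correct predictions sit on the diagonal of the confusion matrix, so accuracy is the diagonal sum divided by the total, whichever class is treated as positive.

```python
# Confusion matrix from the question
cm = [[50, 10],
      [5, 35]]

# Correct predictions are the diagonal entries
correct = sum(cm[i][i] for i in range(len(cm)))

# Total predictions are all entries combined
total = sum(sum(row) for row in cm)

accuracy = correct / total
print(f"Accuracy: {correct}/{total} = {accuracy:.2f}")
```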
4. You have this confusion matrix:
[[40, 20],
 [10, 30]]

Which line of code correctly calculates precision for the positive class?
medium
A. precision = TP / (TP + FP)
B. precision = TP / (TP + FN)
C. precision = TN / (TN + FP)
D. precision = TP / (TP + TN)

Solution

  1. Step 1: Recall precision formula

    Precision is the ratio of true positives to all predicted positives: TP / (TP + FP).
  2. Step 2: Match formula with options

    Option A uses TP / (TP + FP), which is precision. Option B, TP / (TP + FN), is the recall formula. Option C, TN / (TN + FP), is specificity, and option D divides by TP + TN, which is not a standard metric.
  3. Final Answer:

    precision = TP / (TP + FP) -> Option A
  4. Quick Check:

    Precision = TP / (TP + FP) [OK]
Hint: Precision = true positives / predicted positives [OK]
Common Mistakes:
  • Using TP / (TP + FN) which is recall
  • Confusing TN with TP in precision
  • Dividing by TP + TN instead of TP + FP
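To apply the formula to the matrix in this question, the sketch below assumes scikit-learn's layout, [[TN, FP], [FN, TP]], with rows as true labels and columns as predicted labels. The question itself does not state a layout, so treat that as an assumption.

```python
# Confusion matrix from the question, assumed to follow scikit-learn's
# layout: rows = true labels (0, 1), columns = predicted labels (0, 1)
cm = [[40, 20],
      [10, 30]]

(tn, fp), (fn, tp) = cm

# Precision: true positives out of everything predicted positive
precision = tp / (tp + fp)

# Recall, shown for contrast: true positives out of all actual positives
recall = tp / (tp + fn)

print(f"Precision: {precision:.2f}")
print(f"Recall: {recall:.2f}")
```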
5. A model has precision = 0.8 and recall = 0.5. What is the F1 score? Choose the closest value.
hard
A. 0.70
B. 0.65
C. 0.62
D. 0.75

Solution

  1. Step 1: Recall F1 score formula

    F1 = 2 * (Precision * Recall) / (Precision + Recall) = 2 * (0.8 * 0.5) / (0.8 + 0.5).
  2. Step 2: Calculate F1 score

    Numerator: 2 * 0.4 = 0.8. Denominator: 0.8 + 0.5 = 1.3. F1 = 0.8 / 1.3 ≈ 0.615, which rounds to 0.62.
  3. Final Answer:

    0.62 -> Option C
  4. Quick Check:

    F1 ≈ 0.62 from 0.8 precision and 0.5 recall [OK]
Hint: F1 is harmonic mean: 2PR/(P+R), plug values carefully [OK]
Common Mistakes:
  • Taking the arithmetic mean of precision and recall instead of the harmonic mean
  • Mixing up precision and recall values
  • Rounding too early causing wrong final answer
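A quick sketch of this calculation, which also shows the arithmetic mean for comparison, since that is the value behind the tempting wrong answer of 0.65:

```python
precision = 0.8
recall = 0.5

# Harmonic mean (the F1 score)
f1 = 2 * (precision * recall) / (precision + recall)

# Arithmetic mean, shown only to illustrate the common mistake
arithmetic_mean = (precision + recall) / 2

print(f"F1 (harmonic mean): {f1:.3f}")
print(f"Arithmetic mean:    {arithmetic_mean:.3f}")
```

The harmonic mean is pulled toward the smaller of the two values, so F1 falls below the simple average whenever precision and recall differ.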