
Evaluation metrics (accuracy, F1, confusion matrix) in NLP - ML Experiment: Train & Evaluate

Problem: You have trained a text classification model to identify positive and negative movie reviews. The model's training accuracy is 90%, but you want to understand how well it performs on unseen data using evaluation metrics.
Current Metrics: Training accuracy: 90%, Validation accuracy: 85%
Issue: Accuracy alone does not give a full picture of model performance, especially if classes are imbalanced. You need to compute the F1 score and confusion matrix to evaluate the model more thoroughly.
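To see why accuracy alone can mislead, consider a quick sketch with a hypothetical imbalanced validation set (90 negative reviews, 10 positive) and a trivial model that always predicts "negative":

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical imbalanced validation set: 90 negative, 10 positive reviews
true_labels = [0] * 90 + [1] * 10
# A trivial model that always predicts the majority class "negative"
predicted_labels = [0] * 100

acc = accuracy_score(true_labels, predicted_labels)
# zero_division=0 avoids a warning when no positives are predicted
f1 = f1_score(true_labels, predicted_labels, zero_division=0)

print(f"Accuracy: {acc:.2f}")  # 0.90
print(f"F1 Score: {f1:.2f}")   # 0.00
```

The model scores 90% accuracy while never identifying a single positive review, which the F1 score of 0.00 immediately exposes.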
Your Task
Calculate accuracy, F1 score, and confusion matrix on the validation set to better understand model performance.
Use sklearn metrics functions only.
Do not retrain or change the model.
Use the provided validation predictions and true labels.
Solution
from sklearn.metrics import accuracy_score, f1_score, confusion_matrix

# Example true labels and predicted labels for validation set
true_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# Calculate accuracy
acc = accuracy_score(true_labels, predicted_labels)

# Calculate F1 score (binary classification)
f1 = f1_score(true_labels, predicted_labels)

# Calculate confusion matrix
cm = confusion_matrix(true_labels, predicted_labels)

print(f"Accuracy: {acc:.2f}")
print(f"F1 Score: {f1:.2f}")
print("Confusion Matrix:")
print(cm)
Added code to compute accuracy, F1 score, and confusion matrix using sklearn.
Used example true and predicted labels to demonstrate metric calculations.
Printed results clearly for easy interpretation.
Results Interpretation

Before: Only accuracy was known (85% on validation).

After: On the sample validation labels, accuracy is 80%, F1 score is 0.80, and the confusion matrix shows 4 true negatives, 4 true positives, 1 false positive, and 1 false negative.

Accuracy alone can be misleading. F1 score and confusion matrix provide deeper insight into model performance, especially for imbalanced classes.
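Reading the confusion matrix correctly depends on knowing sklearn's layout: rows are true labels and columns are predictions. For a binary problem this means ravel() unpacks the matrix as (tn, fp, fn, tp). Using the same example labels as the solution above:

```python
from sklearn.metrics import confusion_matrix

true_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

# sklearn convention: rows = true labels, columns = predicted labels,
# so ravel() on a 2x2 matrix yields (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(true_labels, predicted_labels).ravel()

print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=4, FP=1, FN=1, TP=4
```

This unpacking makes it easy to compute any derived metric by hand and to sanity-check the counts reported above.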
Bonus Experiment
Try calculating precision and recall along with the existing metrics to get even more insight.
💡 Hint
Use precision_score and recall_score from sklearn.metrics with the same true and predicted labels.
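As a sketch of the bonus experiment, precision and recall can be computed on the same labels used in the solution:

```python
from sklearn.metrics import precision_score, recall_score

# Same validation labels as in the solution above
true_labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
predicted_labels = [1, 0, 1, 0, 0, 0, 1, 1, 1, 0]

precision = precision_score(true_labels, predicted_labels)  # TP / (TP + FP) = 4/5
recall = recall_score(true_labels, predicted_labels)        # TP / (TP + FN) = 4/5

print(f"Precision: {precision:.2f}")  # 0.80
print(f"Recall: {recall:.2f}")        # 0.80
```

Here precision and recall are both 0.80, which is why the F1 score (their harmonic mean) is also 0.80.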