When classes are imbalanced, accuracy can be misleading because a model can score well simply by predicting the majority class almost every time. Precision, Recall, and F1-score are more informative. Recall tells us how many actual minority cases we catch, and Precision tells us how many predicted minority cases are correct. F1-score, the harmonic mean of the two, balances both. These metrics help us understand if the model is truly learning the minority class or just ignoring it.
Imbalanced class handling (SMOTE, class weights) in ML Python - Model Metrics & Evaluation
Which metric matters for Imbalanced Class Handling and WHY
Confusion Matrix Example
Actual \ Predicted  | Positive | Negative
--------------------|----------|----------
Positive (Minority) | 40 (TP)  | 10 (FN)
Negative (Majority) | 20 (FP)  | 930 (TN)
Total samples = 1000
TP = 40, FN = 10, FP = 20, TN = 930
From this matrix:
- Precision = 40 / (40 + 20) ≈ 0.67
- Recall = 40 / (40 + 10) = 0.80
- F1-score = 2 * (0.67 * 0.80) / (0.67 + 0.80) ≈ 0.73
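The arithmetic above can be checked with a few lines of plain Python. This is a minimal sketch that uses only the four counts from the confusion matrix; the variable names are illustrative. It also computes accuracy, which comes out high here because the 930 true negatives dominate the total.

```python
# Counts taken from the confusion matrix above.
tp, fn, fp, tn = 40, 10, 20, 930

precision = tp / (tp + fp)                      # correct among predicted minority
recall = tp / (tp + fn)                         # caught among actual minority
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + fn + fp + tn)      # dominated by the majority class

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"f1={f1:.2f} accuracy={accuracy:.2f}")
# precision=0.67 recall=0.80 f1=0.73 accuracy=0.97
```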
Precision vs Recall Tradeoff with Examples
In imbalanced data, improving one metric often lowers the other:
- High Precision, Low Recall: The model is very sure when it predicts minority class but misses many actual minority cases. Example: A spam filter that rarely marks good emails as spam but misses many spam emails.
- High Recall, Low Precision: The model catches most minority cases but also has many false alarms. Example: A cancer detector that finds almost all cancer cases but sometimes wrongly flags healthy people.
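SMOTE and class weights shift this tradeoff toward the minority class, either by giving the model more minority examples (SMOTE) or by penalizing minority errors more heavily (class weights). This typically raises Recall, sometimes at the cost of some Precision.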
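One way to see the tradeoff is to move the decision threshold over a fixed set of predicted scores. The sketch below is illustrative only: the scores and labels are made-up values, and `precision_recall` is a hypothetical helper written for this example, not a library function.

```python
# Hypothetical predicted probabilities for the minority class and the true
# labels (1 = minority, 0 = majority). The values are made up for illustration.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]


def precision_recall(threshold):
    """Return (precision, recall) when predicting 1 for scores >= threshold."""
    predicted = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# A strict threshold favors precision; a loose one favors recall.
for threshold in (0.85, 0.50, 0.15):
    p, r = precision_recall(threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
# threshold=0.85 precision=1.00 recall=0.40
# threshold=0.50 precision=0.67 recall=0.80
# threshold=0.15 precision=0.56 recall=1.00
```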
What Good vs Bad Metric Values Look Like
For imbalanced class problems:
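- Good: Precision and Recall both above roughly 0.7 and F1-score above 0.7, showing balanced detection and correctness. Treat these as rules of thumb; acceptable values depend on the cost of false alarms versus missed cases.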
- Bad: High accuracy (e.g., 95%) but very low Recall (e.g., 10%) on minority class, meaning the model ignores minority cases.
Good models catch minority cases well without too many false alarms.
Common Pitfalls in Metrics for Imbalanced Classes
- Accuracy Paradox: High accuracy can hide poor minority class detection.
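- Data Leakage: If the same or near-duplicate examples end up in both the training and test sets (for example, when resampling is done before the split), metrics look better but the model won't generalize.
- Overfitting: Using SMOTE incorrectly, such as applying it to the test set or before cross-validation splits, can cause the model to memorize synthetic samples, inflating metrics.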
- Ignoring Class Distribution: Not adjusting metrics or thresholds for imbalance leads to misleading results.
Self Check
Your model has 98% accuracy but only 12% recall on the minority (fraud) class. Is it good for production?
Answer: No. The model misses 88% of fraud cases, which is dangerous. Despite high accuracy, it fails to detect most fraud. You should improve recall using techniques like SMOTE or class weights.
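To make the self-check concrete, here is one set of counts that produces exactly these figures. The counts are hypothetical (10,000 transactions, 200 of them fraud) and were chosen only so the arithmetic matches; many other combinations would give the same percentages.

```python
# Hypothetical confusion-matrix counts: 10,000 transactions, 200 fraud.
tp, fn = 24, 176      # fraud cases caught / missed
fp, tn = 24, 9776     # legitimate cases flagged / passed

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)
missed_rate = fn / (tp + fn)

print(f"accuracy={accuracy:.2f} recall={recall:.2f} missed={missed_rate:.2f}")
# accuracy=0.98 recall=0.12 missed=0.88
```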
Key Result
For imbalanced classes, focus on Precision, Recall, and F1-score rather than accuracy to truly measure minority class detection.
Practice
1. What is the main purpose of using SMOTE in machine learning?
easy
Solution
Step 1: Understand SMOTE's role in imbalanced data
SMOTE stands for Synthetic Minority Over-sampling Technique; it creates new synthetic samples for the minority class.
Step 2: Compare options with SMOTE's function
Only the option "To create synthetic samples for minority classes to balance the dataset" correctly describes SMOTE's purpose: balancing classes by adding synthetic minority samples.
Final Answer:
To create synthetic samples for minority classes to balance the dataset -> Option A
Quick Check:
SMOTE = Synthetic samples for minority [OK]
Hint: SMOTE = make new minority samples to balance [OK]
Common Mistakes:
- Thinking SMOTE removes majority samples
- Confusing SMOTE with feature engineering
- Assuming SMOTE shuffles data
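The idea of "synthetic minority samples" can be sketched in plain Python: pick a minority point, pick one of its nearest minority neighbours, and place a new point somewhere on the line segment between them. This is a simplified illustration of the interpolation idea, not imbalanced-learn's implementation; the function name `smote_like_oversample`, the sample points, and the choice of `k=2` are all assumptions made for this example.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=42):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified sketch)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of base by squared Euclidean distance,
        # excluding base itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class points with two features each.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2]]
new_points = smote_like_oversample(minority, n_new=4)
print(len(new_points))  # 4
```

Because each new point lies between two existing minority points, the synthetic samples stay inside the region the minority class already occupies rather than duplicating rows exactly.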
2. Which of the following is the correct way to set class weights in scikit-learn's LogisticRegression?
easy
Solution
Step 1: Recall scikit-learn parameter for class weights
The correct parameter name is class_weight and it accepts 'balanced' to auto-adjust weights.
Step 2: Match options with correct syntax
Only LogisticRegression(class_weight='balanced') uses the exact parameter class_weight='balanced'.
Final Answer:
LogisticRegression(class_weight='balanced') -> Option A
Quick Check:
Parameter name is class_weight [OK]
Hint: Use class_weight='balanced' exactly in model init [OK]
Common Mistakes:
- Using wrong parameter names like weight_class
- Misspelling class_weight
- Passing weights instead of class_weight
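For intuition about what 'balanced' does, scikit-learn documents the weight for each class as n_samples / (n_classes * count of that class). The sketch below reproduces that formula in plain Python so the numbers are visible; `balanced_class_weights` is a hypothetical helper written for this example, and the 950/50 label counts are made up.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weights from the documented 'balanced' formula:
    n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 950 majority (0) and 50 minority (1).
y = [0] * 950 + [1] * 50
weights = balanced_class_weights(y)
print(weights[1])            # 10.0
print(round(weights[0], 3))  # 0.526
```

The rarer the class, the larger its weight, so each minority error costs the model more during training.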
3. Given this code snippet using SMOTE, what lengths will X_resampled and y_resampled have?
from imblearn.over_sampling import SMOTE
X = [[1], [2], [3], [4], [5], [6]]
y = [0, 0, 0, 1, 1, 1]
smote = SMOTE(random_state=42)
X_resampled, y_resampled = smote.fit_resample(X, y)
print(len(X_resampled), len(y_resampled))
medium
Solution
Step 1: Count original class samples
Class 0 has 3 samples and class 1 has 3 samples, so the dataset is balanced initially.
Step 2: Understand SMOTE behavior on balanced data
SMOTE creates synthetic samples to bring the minority class up to the majority class size. Here both classes are equal, so no new samples are needed.
Step 3: Check actual output
Since the classes are equal, no new samples are added, so the output length remains 6.
Final Answer:
6 6 -> Option B
Quick Check:
Balanced classes, no new samples added [OK]
Hint: With default settings, SMOTE adds samples only if classes are imbalanced [OK]
Common Mistakes:
- Assuming SMOTE always doubles data
- Ignoring original class counts
- Confusing sample count with feature count
4. You wrote this code to apply class weights but the model accuracy is very low. What is the likely error?
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(class_weight={'0':1, '1':10})
model.fit(X_train, y_train)
medium
Solution
Step 1: Check class_weight dictionary keys
Keys in the class_weight dictionary must match the type of the labels in y_train. Labels are usually the integers 0 and 1, not the strings '0' and '1'.
Step 2: Understand impact of wrong keys
If the keys are strings but the labels are integers, the weights cannot be matched to the classes, so they are not applied as intended; recent scikit-learn versions typically raise a ValueError in this case.
Final Answer:
Class weights keys should be integers, not strings -> Option C
Quick Check:
Keys must match label types [OK]
Hint: Match class_weight keys to label data types exactly [OK]
Common Mistakes:
- Using string keys instead of integer keys
- Thinking class_weight can't be a dict
- Believing weights must sum to 1
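The key-type mismatch is easy to reproduce without any library, because a Python dict treats the integer 0 and the string '0' as different keys. The labels below are hypothetical; the check simply lists which classes have no matching entry in each dictionary.

```python
# Hypothetical integer labels, as in a typical binary classification target.
y_train = [0, 0, 1, 0, 1]

string_keys = {'0': 1, '1': 10}   # keys are strings: do not match the labels
int_keys = {0: 1, 1: 10}          # keys are integers: match the labels

classes = sorted(set(y_train))
unmatched_with_strings = [c for c in classes if c not in string_keys]
unmatched_with_ints = [c for c in classes if c not in int_keys]

print(unmatched_with_strings)  # [0, 1]
print(unmatched_with_ints)     # []
```

A quick check like this on the label dtype before building the dictionary avoids the problem.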
5. You have a dataset with 95% class 0 and 5% class 1. You want to train a model that handles this imbalance. Which approach is best to improve minority class recall?
hard
Solution
Step 1: Understand dataset imbalance
With 95% vs 5%, the minority class is very small and the model may ignore it.
Step 2: Combine SMOTE and class weights
SMOTE creates synthetic minority samples to balance the data, while class_weight='balanced' tells the model to weight minority-class errors more heavily during training.
Step 3: Why combining is best
Among the listed options, using both is the one that addresses the imbalance most directly. Note that if SMOTE has already fully balanced the training set, class_weight='balanced' computes weights of about 1 for each class and adds little, so compare the combination against each method alone on a validation set.
Final Answer:
Use SMOTE to create synthetic minority samples and set class_weight='balanced' in the model -> Option D
Quick Check:
Combine oversampling + class weights for best minority recall [OK]
Hint: Combine SMOTE and class_weight='balanced' for best results [OK]
Common Mistakes:
- Using only one method and expecting best recall
- Ignoring imbalance completely
- Assuming oversampling alone fixes all issues
