Challenge - 5 Problems
Imbalanced Data Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
Difficulty: intermediate
Why use SMOTE for imbalanced data?
Imagine you have a dataset where one class is much smaller than the other. Why would you use SMOTE (Synthetic Minority Over-sampling Technique) instead of just duplicating minority class samples?
💡 Hint
Think about how creating new data points can help the model generalize better than just copying existing ones.
✔ Answer
SMOTE generates new synthetic minority class samples by interpolating between existing ones. This helps the model learn a more general decision boundary rather than memorizing duplicates.
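The core interpolation step can be illustrated in a few lines. This is a minimal sketch of the idea, not imblearn's actual implementation; the function name `smote_like_sample` is made up for illustration:

```python
import random

def smote_like_sample(a, b, rng=random):
    """Create one synthetic point on the line segment between two
    minority-class samples a and b (the core SMOTE step)."""
    gap = rng.random()  # random fraction in [0, 1)
    return [ai + gap * (bi - ai) for ai, bi in zip(a, b)]

minority = [[1.0, 2.0], [2.0, 3.0]]
synthetic = smote_like_sample(minority[0], minority[1])
# Each coordinate lies between the two originals, so the new point
# is novel rather than an exact copy of an existing sample.
```

Because the synthetic point is strictly interpolated, it fills in the region between minority samples instead of stacking weight on existing ones, which is what duplication does.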
❓ Predict Output
Difficulty: intermediate
Output of class weight usage in logistic regression
What will be the output of the following code snippet regarding the model's class weight attribute?
Python
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(class_weight={0: 1, 1: 5})
print(model.class_weight)
💡 Hint
Check what value is assigned to class_weight in the model initialization.
✔ Answer
The class_weight parameter is explicitly set to a dictionary {0:1, 1:5}, so printing model.class_weight will output this dictionary.
❓ Metrics
Difficulty: advanced
Choosing the best metric for imbalanced classification
You trained a model on a dataset with 95% of class 0 and 5% of class 1. Which metric is best to evaluate your model's performance on the minority class?
💡 Hint
Think about which metric helps find most of the minority class samples.
✔ Answer
Recall for the minority class measures how many actual positive samples are correctly identified, which is crucial in imbalanced data to avoid missing minority cases.
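To make this concrete, here is a hand-computed minority-class recall on made-up predictions (a small sketch; the helper `minority_recall` is defined here for illustration, not taken from a library):

```python
def minority_recall(y_true, y_pred, minority=1):
    """Recall for the minority class: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == minority)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == minority and p != minority)
    return tp / (tp + fn)

# 95% class 0, 5% class 1: accuracy can look great even when
# the model misses half of the minority samples.
y_true = [0] * 18 + [1, 1]
y_pred = [0] * 18 + [1, 0]  # misses one of the two positives
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                         # 0.95
print(minority_recall(y_true, y_pred))  # 0.5
```

Accuracy rewards the model for the easy majority class; recall on the minority class exposes the missed positives.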
🔧 Debug
Difficulty: advanced
Why does this SMOTE code raise an error?
What error will this code raise and why?
from imblearn.over_sampling import SMOTE
X = [[1, 2], [3, 4], [5, 6]]
y = [0,0,1]
smote = SMOTE(sampling_strategy='minority')
X_res, y_res = smote.fit_resample(X, y)
💡 Hint
SMOTE needs enough samples in the minority class to create synthetic samples.
✔ Answer
SMOTE requires at least k_neighbors + 1 samples in the minority class (6 with the default k_neighbors=5), because it interpolates between a sample and its k nearest minority neighbors. Here the minority class has only one sample, so fit_resample raises a ValueError.
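The precondition can be checked before resampling. This is a sketch of the rule, not imblearn's actual validation code, and `can_run_smote` is a hypothetical helper name:

```python
from collections import Counter

def can_run_smote(y, k_neighbors=5):
    """SMOTE interpolates between a minority sample and its k nearest
    minority neighbors, so the minority class needs at least
    k_neighbors + 1 samples (sketch of the precondition only)."""
    minority_count = min(Counter(y).values())
    return minority_count >= k_neighbors + 1

print(can_run_smote([0, 0, 1]))        # False: one minority sample
print(can_run_smote([0] * 10 + [1] * 6))  # True: 6 >= 5 + 1
```

With too few minority samples, lowering k_neighbors (e.g. SMOTE(k_neighbors=1)) or collecting more data are the usual ways around the error.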
❓ Model Choice
Difficulty: expert
Best approach for highly imbalanced multi-class classification
You have a multi-class dataset with 4 classes, where one class is only 1% of data. You want to improve model performance on the rare class. Which approach is best?
💡 Hint
Think about how to handle imbalance without losing data or misleading metrics.
✔ Answer
Using class weights allows the model to pay more attention to the rare class during training without changing data distribution or removing samples.
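As an illustration, the common "balanced" heuristic (the same formula scikit-learn uses for class_weight='balanced') assigns each class a weight of n_samples / (n_classes * count_c). A minimal sketch with a hypothetical 4-class distribution:

```python
from collections import Counter

def balanced_class_weights(y):
    """Weights via the 'balanced' heuristic:
    w_c = n_samples / (n_classes * count_c)."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

# 4 classes; class 3 is only 1% of the data
y = [0] * 40 + [1] * 30 + [2] * 29 + [3] * 1
weights = balanced_class_weights(y)
# The rare class gets by far the largest weight (25.0 here), so its
# misclassifications count much more heavily in the training loss.
```

Passing such a dictionary as class_weight reweights the loss without duplicating or discarding any samples.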