Challenge - 5 Problems

🧠 Conceptual

intermediate

Why use SMOTE for imbalanced data?

Imagine you have a dataset where one class is much smaller than the other. Why would you use SMOTE (Synthetic Minority Over-sampling Technique) instead of just duplicating minority class samples?

A. SMOTE creates new synthetic samples by interpolating between existing minority samples and their nearest minority-class neighbors, which helps the model learn better decision boundaries.

B. SMOTE removes majority class samples to balance the dataset, reducing training time.

C. SMOTE duplicates minority samples exactly to increase their count without changing the data distribution.

D. SMOTE randomly deletes samples from both classes to balance the dataset.
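
To make the contrast between duplicating and synthesizing samples concrete, here is a minimal pure-Python sketch of the interpolation idea behind SMOTE. It is an illustration only, not imblearn's implementation; the helper name `smote_like_sample`, the toy points, and `k=2` are all assumptions made for this example.

```python
import random

def smote_like_sample(minority, k=2, rng=None):
    """Create one synthetic point by interpolating between a minority
    sample and one of its k nearest minority neighbours (hypothetical helper)."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment base -> neighbour, in [0, 1)
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]

# Toy minority-class points (made up for illustration).
minority = [[1.0, 2.0], [2.0, 3.0], [3.0, 3.5], [2.5, 1.5]]
synthetic = [smote_like_sample(minority, rng=random.Random(seed)) for seed in range(5)]
print(synthetic)
```

Each synthetic point lies on a line segment between two real minority points, so it is new data rather than an exact copy.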

❓ Predict Output

intermediate

Output of class weight usage in logistic regression

What will be the output of the following code snippet regarding the model's class weight attribute?

ML Python

from sklearn.linear_model import LogisticRegression
model = LogisticRegression(class_weight={0:1, 1:5})
print(model.class_weight)

A. {0: 1, 1: 5}

B. balanced

C. None

D. Raises TypeError
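
For intuition about what a weight dictionary such as {0: 1, 1: 5} does during training, the following sketch scales each sample's log loss by the weight of its true class. This is a simplified illustration of the idea, not scikit-learn's internal code; the function `weighted_log_loss` and the predicted probabilities are hypothetical.

```python
import math

def weighted_log_loss(y_true, p_pos, class_weight):
    """Sum of per-sample log losses, each scaled by its class weight."""
    total = 0.0
    for y, p in zip(y_true, p_pos):
        # Probability the model assigned to the true class.
        p_true = p if y == 1 else 1.0 - p
        total += class_weight[y] * -math.log(p_true)
    return total

y_true = [0, 0, 0, 1]
p_pos = [0.2, 0.2, 0.2, 0.2]  # hypothetical predicted P(class 1)

unweighted = weighted_log_loss(y_true, p_pos, {0: 1, 1: 1})
weighted = weighted_log_loss(y_true, p_pos, {0: 1, 1: 5})
print(unweighted, weighted)
```

With the heavier weight on class 1, the single misjudged positive dominates the loss, so an optimizer is pushed harder to fix it.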

❓ Metrics

advanced

Choosing the best metric for imbalanced classification

You trained a model on a dataset with 95% of class 0 and 5% of class 1. Which metric is best to evaluate your model's performance on the minority class?

A. Precision for class 1, to measure how many predicted positives are correct.

B. Mean Squared Error, because it measures prediction error.

C. Accuracy, because it shows overall correct predictions.

D. Recall for class 1, to measure how many actual positives are found.
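
The metrics named in the options can be compared on the 95%/5% split from the question. The sketch below scores a trivial model that always predicts class 0. The helper `binary_metrics` is written for this example, and returning 0.0 when a denominator is zero is a convention chosen here, not a universal rule.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

y_true = [0] * 95 + [1] * 5   # 95% class 0, 5% class 1
always_zero = [0] * 100       # a model that never predicts class 1

acc, prec, rec = binary_metrics(y_true, always_zero)
print(acc, prec, rec)
```

The model scores 0.95 accuracy while finding none of the class 1 samples, which is why accuracy alone can mislead on imbalanced data.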

🔧 Debug

advanced

Why does this SMOTE code raise an error?

What error will this code raise and why?

from imblearn.over_sampling import SMOTE
X = [[1,2],[3,4],[5,6]]
y = [0,0,1]
smote = SMOTE(sampling_strategy='minority')
X_res, y_res = smote.fit_resample(X, y)

A. ValueError: Expected 2D array, got 1D array instead.

B. No error, code runs successfully.

C. ValueError: At least 6 samples are needed to perform SMOTE.

D. TypeError: 'list' object has no attribute 'fit'.
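
When debugging code like this, it helps to compare the size of the smallest class with SMOTE's `k_neighbors` parameter, which defaults to 5 in imblearn. The pure-Python sketch below uses a hypothetical helper, `max_usable_k_neighbors`, that is not part of imblearn. It assumes that interpolation needs each minority sample to have at least one other minority sample as a neighbour.

```python
from collections import Counter

def max_usable_k_neighbors(y, requested=5):
    """Largest k_neighbors <= requested that the smallest class can support,
    or 0 if that class has fewer than two samples (hypothetical helper)."""
    smallest = min(Counter(y).values())
    # A sample cannot be its own neighbour, so at most smallest - 1 remain.
    return max(0, min(requested, smallest - 1))

y = [0, 0, 1]                      # labels from the snippet above
k = max_usable_k_neighbors(y)
print(k)                           # 0 means there is nothing to interpolate with

# With more minority samples a smaller k becomes usable, e.g. it could be
# passed as SMOTE(k_neighbors=k) once k >= 1.
print(max_usable_k_neighbors([0] * 10 + [1] * 4))
```

A result of 0 signals that the minority class needs more real samples before SMOTE-style interpolation is possible at all.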

❓ Model Choice

expert

Best approach for highly imbalanced multi-class classification

You have a multi-class dataset with 4 classes, where one class is only 1% of data. You want to improve model performance on the rare class. Which approach is best?

A. Apply SMOTE oversampling to all classes equally before training.

B. Use class weights in the model to give higher importance to the rare class.

C. Remove the rare class samples to simplify the problem.

D. Use accuracy as the only metric to evaluate the model.
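
As a rough illustration of how class weights scale with rarity, the sketch below applies the formula that scikit-learn documents for class_weight='balanced', namely n_samples / (n_classes * count), to a made-up 4-class label list in which one class is 1% of the data. The helper name `balanced_class_weights` and the class counts are assumptions for this example.

```python
from collections import Counter

def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

# Hypothetical labels: three common classes and one rare class (1% of 1000).
y = [0] * 330 + [1] * 330 + [2] * 330 + [3] * 10
weights = balanced_class_weights(y)
print(weights)
```

Here the rare class receives a weight of 25.0 against roughly 0.76 for each common class, so its errors count far more during training.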

Practice

(1/5)

1. What is the main purpose of using SMOTE in machine learning?

easy

A. To create synthetic samples for minority classes to balance the dataset

B. To reduce the size of the majority class by removing samples

C. To increase the number of features in the dataset

D. To randomly shuffle the dataset before training

Imbalanced class handling (SMOTE, class weights) in ML Python - Practice Problems & Coding Challenges

Solution

Step 1: Understand SMOTE's role in imbalanced data

Step 2: Compare options with SMOTE's function

Final Answer:

Quick Check:

Solution

Step 1: Recall scikit-learn parameter for class weights

Step 2: Match options with correct syntax

Final Answer:

Quick Check:

Solution

Step 1: Count original class samples

Step 2: Understand SMOTE behavior on balanced data

Step 3: Check actual output

Final Answer:

Quick Check:

Solution

Step 1: Check class_weight dictionary keys

Step 2: Understand impact of wrong keys

Final Answer:

Quick Check:

Solution

Step 1: Understand dataset imbalance

Step 2: Combine SMOTE and class weights

Step 3: Why combining is best

Final Answer:

Quick Check: