Challenge - 5 Problems
Stratified K-fold Master
🧠 Conceptual
intermediate
Why use Stratified K-fold instead of regular K-fold?
Imagine you have a dataset with two classes: 90% are class A and 10% are class B. You want to split the data into 5 folds for cross-validation.
Why is Stratified K-fold better than regular K-fold in this case?
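One way to explore this question is to measure the share of class B in each test fold under both splitters. The sketch below is only an illustration: the toy labels, their sorted order, and the helper `minority_share_per_fold` are assumptions made for the example, not part of the challenge.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Hypothetical toy data: 90 samples of class A (0) followed by 10 of class B (1).
# The sorted order is an assumption chosen to make the contrast easy to see.
y = np.array([0] * 90 + [1] * 10)
X = np.arange(len(y)).reshape(-1, 1)


def minority_share_per_fold(splitter):
    """Return the fraction of class-B samples in each test fold."""
    return [float(y[test_idx].mean()) for _, test_idx in splitter.split(X, y)]


kfold_shares = minority_share_per_fold(KFold(n_splits=5, shuffle=False))
stratified_shares = minority_share_per_fold(StratifiedKFold(n_splits=5, shuffle=False))

print("KFold:          ", kfold_shares)       # [0.0, 0.0, 0.0, 0.0, 0.5]
print("StratifiedKFold:", stratified_shares)  # [0.1, 0.1, 0.1, 0.1, 0.1]
```

With shuffling enabled, regular K-fold would usually look less extreme than this, but the class-B share could still vary from fold to fold.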
❓ Predict Output
intermediate
Output of StratifiedKFold split indices
Given this code, which train indices are printed for the first fold?
from sklearn.model_selection import StratifiedKFold
import numpy as np
X = np.array([[i] for i in range(10)])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
skf = StratifiedKFold(n_splits=2, shuffle=False)
for fold, (train_index, test_index) in enumerate(skf.split(X, y)):
    if fold == 0:
        print(train_index.tolist())
❓ Model Choice
advanced
Choosing the best cross-validation method for imbalanced data
You have a dataset with 95% of samples in class 0 and 5% in class 1. You want to evaluate a classification model's performance reliably.
Which cross-validation method is best to use?
❓ Hyperparameter
advanced
Effect of increasing n_splits in StratifiedKFold
What is the effect of increasing the number of splits (n_splits) in StratifiedKFold on the training and validation sets?
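A quick way to check your intuition is to print the fold sizes for a few values of n_splits. This is a minimal sketch under stated assumptions: the balanced 100-sample toy data and the helper `fold_sizes` are hypothetical and exist only for this illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical balanced toy data: 100 samples, 50 per class.
y = np.array([0] * 50 + [1] * 50)
X = np.zeros((len(y), 1))


def fold_sizes(n_splits):
    """Return (train_size, validation_size) of the first fold for a given n_splits."""
    skf = StratifiedKFold(n_splits=n_splits)
    train_idx, val_idx = next(skf.split(X, y))
    return len(train_idx), len(val_idx)


sizes = {k: fold_sizes(k) for k in (2, 5, 10)}
for k, (n_train, n_val) in sizes.items():
    print(f"n_splits={k}: train={n_train}, validation={n_val}")
# n_splits=2: train=50, validation=50
# n_splits=5: train=80, validation=20
# n_splits=10: train=90, validation=10
```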
🔧 Debug
expert
Why does this StratifiedKFold code raise an error?
Consider this code snippet:
from sklearn.model_selection import StratifiedKFold
import numpy as np
X = np.array([[i] for i in range(6)])
y = np.array([0, 0, 1, 1, 1, 1])
skf = StratifiedKFold(n_splits=5)
for train_index, test_index in skf.split(X, y):
    print("TRAIN:", train_index, "TEST:", test_index)
Running this code raises a ValueError. What is the cause?
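To see how StratifiedKFold reacts to different n_splits values on these labels, the sketch below tries several and reports the outcome. It assumes a recent scikit-learn release, where, as far as I know, a ValueError is raised only when n_splits exceeds the member count of every class, while a single under-populated class produces a UserWarning instead. The helper `try_split` is hypothetical.

```python
import warnings

import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.zeros((6, 1))
y = np.array([0, 0, 1, 1, 1, 1])  # class counts: 2 and 4


def try_split(n_splits):
    """Run the split and report ('ok' | 'warning' | 'error', message)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            # split() is a generator, so it must be consumed for the checks to run.
            list(StratifiedKFold(n_splits=n_splits).split(X, y))
        except ValueError as exc:
            return "error", str(exc)
        user_warnings = [w for w in caught if issubclass(w.category, UserWarning)]
        if user_warnings:
            return "warning", str(user_warnings[0].message)
    return "ok", ""


for k in (2, 3, 5):
    status, message = try_split(k)
    print(f"n_splits={k}: {status} {message}")
```

Under that assumption, n_splits=2 splits cleanly, n_splits=3 warns about the least populated class, and n_splits=5 raises.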