ML Python programming (~20 mins)

Cross-validation (K-fold) in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual (intermediate)
Why use K-fold cross-validation?

Imagine you have a small dataset and want to estimate how well your machine learning model will perform on new data. Why is K-fold cross-validation a better choice than a single train-test split?

A. It uses all data points for both training and testing, reducing bias in performance estimates.
B. It trains the model multiple times on the same training set to improve accuracy.
C. It splits data randomly once, which is faster and more reliable than multiple splits.
D. It only tests the model on the largest portion of data to get the best score.
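For intuition (illustrative only, not part of the question), a minimal sketch showing that across the K folds every sample lands in a test set exactly once, so all data points contribute to both training and evaluation:

```python
from sklearn.model_selection import KFold
import numpy as np

X = np.arange(10)  # 10 toy samples
kf = KFold(n_splits=5, shuffle=False)

test_indices = []
for train_idx, test_idx in kf.split(X):
    # collect the test fold from each of the 5 splits
    test_indices.extend(test_idx.tolist())

print(sorted(test_indices))  # every index 0..9 appears exactly once
```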
Predict Output (intermediate)
Output of K-fold split indices

What will be the output of the following Python code that uses KFold from scikit-learn?

ML Python
from sklearn.model_selection import KFold
import numpy as np

X = np.array([10, 20, 30, 40, 50])
kf = KFold(n_splits=2, shuffle=False)

splits = []
for train_index, test_index in kf.split(X):
    splits.append((train_index.tolist(), test_index.tolist()))

print(splits)
A. [([3, 4], [0, 1, 2]), ([0, 1, 2], [3, 4])]
B. [([0, 1, 2], [3, 4]), ([3, 4], [0, 1, 2])]
C. [([1, 2, 3], [0, 4]), ([0, 4], [1, 2, 3])]
D. [([2, 3, 4], [0, 1]), ([0, 1], [2, 3, 4])]
Model Choice (advanced)
Choosing K for K-fold cross-validation

You have a dataset with 1000 samples. You want to use K-fold cross-validation to estimate model performance. Which choice of K balances bias and variance best?

A. K = 5, because it provides a good balance between bias and variance for many datasets.
B. K = 1000, because leave-one-out cross-validation always gives the best estimate.
C. K = 1, because using the whole dataset for training and testing is most accurate.
D. K = 2, because fewer folds reduce computation time and variance.
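To see 5-fold cross-validation in action on a 1000-sample dataset, here is a sketch using scikit-learn's cross_val_score; the synthetic dataset and the logistic regression model are stand-ins chosen for illustration, not part of the question:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# synthetic stand-in for the 1000-sample dataset in the question
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# cv=5 trains and evaluates the model 5 times, once per held-out fold
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())
```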
Metrics (advanced)
Calculating average accuracy from K-fold results

You performed 4-fold cross-validation and got these accuracy scores for each fold: [0.82, 0.85, 0.80, 0.83]. What is the correct average accuracy to report?

A. 0.8250
B. 0.83
C. 0.825
D. 0.83 with standard deviation 0.02
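You can compute the summary statistics yourself with NumPy; this sketch uses the sample standard deviation (ddof=1), which is one common reporting convention:

```python
import numpy as np

scores = np.array([0.82, 0.85, 0.80, 0.83])
mean = scores.mean()
std = scores.std(ddof=1)  # sample standard deviation
print(f"{mean:.4f} +/- {std:.4f}")
```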
🔧 Debug (expert)
Why does this K-fold code raise an error?

Consider this Python code snippet using KFold. It raises an error. What is the cause?

ML Python
from sklearn.model_selection import KFold
import numpy as np

X = np.array([1, 2, 3])
kf = KFold(n_splits=5)

for train_index, test_index in kf.split(X):
    print('Train:', train_index, 'Test:', test_index)
A. IndexError because test indices exceed array length.
B. ValueError because n_splits (5) cannot be greater than the number of samples (3).
C. TypeError because KFold expects a list, not a numpy array.
D. RuntimeError because KFold requires shuffle=True for small datasets.
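After answering, one possible fix (an assumption for illustration, not the only option) is to cap n_splits at the number of samples; the variant below runs without error:

```python
from sklearn.model_selection import KFold
import numpy as np

X = np.array([1, 2, 3])
kf = KFold(n_splits=3)  # n_splits must not exceed len(X)

n_folds = 0
for train_index, test_index in kf.split(X):
    print('Train:', train_index, 'Test:', test_index)
    n_folds += 1
```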