Handling missing values in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Why is it important to handle missing values before training a machine learning model?

Choose the best reason why missing values must be handled before training a model.

A. Missing values always improve model accuracy by adding randomness.
B. Missing values can cause errors or unexpected behavior in many machine learning algorithms.
C. Models automatically ignore missing values, so handling them is optional.
D. Missing values reduce the size of the dataset, which speeds up training.
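
As background for this question, here is a minimal sketch of how a single missing entry behaves in ordinary numeric code. The `feature` and `weights` arrays are hypothetical values chosen for illustration and are not part of the challenge.

```python
import numpy as np

# Hypothetical feature column with one missing entry (not from the challenge).
feature = np.array([1.0, np.nan, 3.0])
weights = np.array([0.5, 0.5, 0.5])

# A single NaN propagates through ordinary arithmetic.
plain_mean = np.mean(feature)
weighted_sum = np.dot(feature, weights)

# NaN-aware functions skip the missing entry instead.
safe_mean = np.nanmean(feature)

print(plain_mean, weighted_sum, safe_mean)
```

Many estimators check their input for exactly this reason and refuse NaN values instead of silently producing NaN results.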
Predict Output
intermediate
What is the output of this code that fills missing values with the column mean?

Given the following code, what will be the resulting DataFrame?

ML Python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, 3], 'B': [4, 5, np.nan]})
df_filled = df.fillna(df.mean())
print(df_filled)
A.
     A    B
0  1.0  4.0
1  2.0  5.0
2  3.0  4.5
B.
     A    B
0  1.0  4.0
1  NaN  5.0
2  3.0  NaN
C.
     A    B
0  1.0  4.0
1  2.0  NaN
2  3.0  4.5
D.
     A    B
0  1.0  4.0
1  3.0  5.0
2  3.0  4.5
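
To reason about mean imputation in general, the sketch below applies the same `fillna(df.mean())` pattern to a different, hypothetical DataFrame (the columns `x` and `y` are made up for illustration). It relies on `DataFrame.mean()` skipping NaN by default and on `fillna` matching the resulting Series to columns by name.

```python
import numpy as np
import pandas as pd

# Hypothetical data, deliberately different from the question above.
df = pd.DataFrame({
    "x": [10.0, np.nan, 20.0, 30.0],
    "y": [np.nan, 2.0, 4.0, np.nan],
})

# Column means are computed over the observed values only (skipna=True).
col_means = df.mean()

# Each NaN is replaced by the mean of its own column.
filled = df.fillna(col_means)

print(col_means)
print(filled)
```

Because each column is filled from its own mean, a missing value in one column never affects how another column is filled.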
Model Choice
advanced
Which model type is most robust to missing values without imputation?

Choose the model that can handle missing values internally without needing to fill them first.

A. Decision Trees
B. Linear Regression
C. K-Nearest Neighbors
D. Support Vector Machines
Metrics
advanced
How does improper handling of missing values affect model evaluation metrics?

What is the most likely effect on accuracy if rows with missing values are dropped from the test set but not from the training set?

A. Accuracy will be unaffected because missing values are only in training data.
B. Accuracy will be artificially low because the model sees fewer test samples.
C. Accuracy will be exactly the same as if missing values were imputed.
D. Accuracy will be artificially high because the test set is cleaner than training data.
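
To make the two handling strategies concrete, the sketch below contrasts dropping incomplete test rows with imputing them from statistics learned on the training set. The arrays are hypothetical; the point is only to show which test rows end up being scored under each approach.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical train/test features (not from the challenge).
X_train = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, 6.0]])
X_test = np.array([[2.0, np.nan], [3.0, 3.0], [np.nan, 1.0], [4.0, 5.0]])

# Strategy 1: drop test rows with any missing value.
# Only the fully observed rows would be evaluated.
complete_rows = ~np.isnan(X_test).any(axis=1)
X_test_dropped = X_test[complete_rows]

# Strategy 2: fit the imputer on the training set only, then apply it to
# the test set, so every test row is kept and evaluated.
imputer = SimpleImputer(strategy="mean").fit(X_train)
X_test_imputed = imputer.transform(X_test)

print(X_test_dropped.shape, X_test_imputed.shape)
```

With the first strategy the metric is computed on a different, smaller population than the one the model would face in practice.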
🔧 Debug
expert
What error, if any, does this code raise when imputing missing values with scikit-learn's SimpleImputer?

Examine the code below and select what happens when it is run.

ML Python
from sklearn.impute import SimpleImputer
import numpy as np

X = np.array([[1, 2], [np.nan, 3], [7, 6]])
imputer = SimpleImputer(strategy='median')
X_imputed = imputer.fit_transform(X)
print(X_imputed)
A. ValueError: Cannot use median strategy with non-numeric data
B. TypeError: 'SimpleImputer' object is not callable
C.
No error; output is [[1. 2.]
 [4. 3.]
 [7. 6.]]
D. AttributeError: 'numpy.ndarray' object has no attribute 'fit_transform'
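
For readers unfamiliar with the API, here is a minimal sketch of how `SimpleImputer` with `strategy='median'` works on a different, hypothetical array. After fitting, the per-column fill values are available in the `statistics_` attribute.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical data, deliberately different from the question above.
X_demo = np.array([
    [2.0, 10.0],
    [np.nan, 20.0],
    [8.0, np.nan],
    [4.0, 30.0],
])

imputer = SimpleImputer(strategy="median")

# fit_transform is a method of the imputer: it learns one median per column
# from the observed values, then fills the NaN entries with it.
X_demo_imputed = imputer.fit_transform(X_demo)

print(imputer.statistics_)
print(X_demo_imputed)
```

The median is taken per column over the non-missing entries, so the fill value for one column is independent of the others.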