Challenge - 5 Problems
Data Preparation Mastery
🧠 Conceptual
intermediate
Why does data cleaning take so much time in ML projects?
In machine learning, data cleaning is a major part of data preparation. Why does it usually take the most time?
🧠 Conceptual
intermediate
What is the main reason feature engineering is time-consuming?
Feature engineering is a key step in data preparation. Why does it often consume a lot of time?
❓ Metrics
advanced
Measuring data quality impact on model accuracy
You have two datasets: Dataset A is clean and Dataset B has 20% missing values. You train the same model on both. Which metric difference best shows the impact of data quality?
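One way to explore this question is to hold everything fixed except the training data and compare a metric on the same clean test set. The sketch below is only an illustration under stated assumptions: the data is synthetic (`make_classification`), the model is a logistic regression, the missing values in Dataset B are filled with the column mean so the model can train at all, and accuracy is used as the metric. None of these choices come from the challenge itself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption: the real datasets are not given).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Dataset A: the clean training features.
X_train_a = X_train

# Dataset B: the same features with roughly 20% of cells set to NaN.
X_train_b = X_train.copy()
missing_mask = rng.random(X_train_b.shape) < 0.20
X_train_b[missing_mask] = np.nan


def fit_and_score(X_tr):
    """Train the same pipeline and score it on the shared clean test set."""
    model = make_pipeline(
        SimpleImputer(strategy="mean"),  # no-op for A, fills NaNs for B
        LogisticRegression(max_iter=1000),
    )
    model.fit(X_tr, y_train)
    return accuracy_score(y_test, model.predict(X_test))


acc_a = fit_and_score(X_train_a)
acc_b = fit_and_score(X_train_b)
print(f"accuracy (A, clean):       {acc_a:.3f}")
print(f"accuracy (B, 20% missing): {acc_b:.3f}")
print(f"difference (A - B):        {acc_a - acc_b:.3f}")
```

Because both models are scored on the same test set, the difference in the metric can be attributed to the training data rather than to the evaluation split.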
🔧 Debug
advanced
Why does this data scaling code cause poor model results?
Consider this Python snippet for scaling features before training:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.fit_transform(X_test)
Why might this cause poor model performance?
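For comparison, here is a minimal sketch of the usual pattern: fit the scaler on the training data only, then reuse those fitted statistics to transform the test data. The arrays are synthetic placeholders (assumed here, since the snippet does not define `X_train` or `X_test`), and the test set is drawn with a shifted mean to make the effect visible.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical data: the test set has a different mean than the training set.
X_train = rng.normal(loc=10.0, scale=2.0, size=(200, 3))
X_test = rng.normal(loc=12.0, scale=2.0, size=(50, 3))

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from train
X_test_scaled = scaler.transform(X_test)        # reuse the train mean/std

# Refitting on the test set, as in the snippet, for comparison only.
X_test_refit = StandardScaler().fit_transform(X_test)

print("test mean, train statistics:", X_test_scaled.mean(axis=0).round(2))
print("test mean, refit on test:   ", X_test_refit.mean(axis=0).round(2))
```

Refitting forces the test features to mean 0 and unit variance by their own statistics, so the same raw value maps to different scaled values in train and test. With `transform` alone, both sets share one mapping, and any real shift between them is preserved.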
❓ Model Choice
expert
Choosing the best approach to handle missing data in a large dataset
You have a large dataset with 30% missing values scattered randomly. Which approach is best to prepare data for a machine learning model?
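To make the trade-off concrete, the sketch below contrasts two common options on synthetic data: dropping every row that has a missing value, and filling missing values with a simple imputer. The dataset shape (10,000 rows, 5 columns), the per-cell missingness, and the median strategy are all assumptions for illustration; the challenge does not specify them, and other imputation methods may suit a real dataset better.

```python
import numpy as np
from sklearn.impute import SimpleImputer

rng = np.random.default_rng(0)

# Hypothetical dataset with about 30% of cells missing at random.
X = rng.normal(size=(10_000, 5))
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.30] = np.nan

# Option 1: drop rows with any missing value.
complete_rows = ~np.isnan(X_missing).any(axis=1)
kept_fraction = complete_rows.mean()
print(f"rows kept after dropping incomplete rows: {kept_fraction:.1%}")

# Option 2: keep every row and fill gaps with the column median.
imputer = SimpleImputer(strategy="median")
X_imputed = imputer.fit_transform(X_missing)
print("shape after imputation:", X_imputed.shape)
```

When missingness is spread independently across cells, the chance that a row is fully observed shrinks with every added column (about 0.7^5, roughly 17%, for five columns), so row deletion can discard most of a dataset that imputation would keep.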