
Handling missing values in ML Python - Cheat Sheet & Quick Revision

Recall & Review
beginner
What are missing values in a dataset?
Missing values are data points that are not recorded or are absent in a dataset. They can happen due to errors, skipped questions, or data loss.
beginner
Name two simple methods to handle missing values.
Two simple methods are: 1) removing rows or columns that contain missing values, and 2) filling missing values with a summary statistic of the column, such as its mean, median, or mode.
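Both methods can be sketched in a few lines of pandas. This is a minimal illustration, not a recommended pipeline: the DataFrame, its column names (`age`, `city`), and its values are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Pune", "Delhi", None, "Delhi"],
})

# Method 1: drop every row that contains at least one missing value.
dropped = df.dropna()

# Method 2: fill numeric gaps with the median and categorical gaps with the mode.
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna(filled["city"].mode()[0])

print(dropped)
print(filled)
```

Here `dropna()` keeps only the two complete rows, while the filled copy keeps all four.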
intermediate
Why is it sometimes better to fill missing values instead of removing them?
Filling missing values keeps more data for learning, which can improve model accuracy. Removing too many rows or columns can lose important information.
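A small sketch of how quickly dropping rows can shrink a dataset, assuming a made-up DataFrame in which only 3 of 12 cells are missing but the gaps fall in different rows (the feature names `f1` to `f3` are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical data: three of the four rows are each missing one feature.
df = pd.DataFrame({
    "f1": [np.nan, 2.0, 3.0, 4.0],
    "f2": [1.0, np.nan, 3.0, 4.0],
    "f3": [1.0, 2.0, np.nan, 4.0],
})

rows_after_drop = len(df.dropna())           # only fully observed rows survive
rows_after_fill = len(df.fillna(df.mean()))  # every row is kept

print(rows_after_drop, rows_after_fill)
```

In this toy case a quarter of the cells are missing, yet dropping rows discards three quarters of the data, whereas filling keeps every row.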
beginner
What is mean imputation?
Mean imputation replaces missing values in a numeric column with the average (mean) of the available values in that column.
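As a minimal sketch, mean imputation on a single numeric column might look like this in pandas. The `income` series and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column with two missing entries.
income = pd.Series([30.0, np.nan, 50.0, 40.0, np.nan], name="income")

# Series.mean() skips NaN by default, so this is (30 + 50 + 40) / 3.
col_mean = income.mean()

# Replace each missing entry with that mean.
imputed = income.fillna(col_mean)

print(col_mean)
print(imputed.tolist())
```

In a train/test workflow, one common choice is to compute the mean on the training split only and reuse it for the test split; scikit-learn's `SimpleImputer(strategy="mean")` follows that fit-then-transform pattern.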
intermediate
What is a potential downside of using mean imputation?
Mean imputation reduces the variability of the data, because every filled-in value sits exactly at the column mean. It also ignores relationships between features, so it can weaken correlations and bias the model.
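The loss of variability can be checked with a small calculation. The sketch below uses only the standard library and an invented column in which two of six values are missing; the numbers are illustrative, not from a real dataset.

```python
import math
import statistics

# Hypothetical column: four observed values plus two gaps.
observed = [2.0, 4.0, 6.0, 8.0]
column = observed + [math.nan, math.nan]

# Mean of the observed values only.
mean = statistics.fmean(observed)

# Fill each gap with that mean.
imputed = [mean if math.isnan(x) else x for x in column]

# Sample variance before imputation (observed values) and after it.
var_before = statistics.variance(observed)
var_after = statistics.variance(imputed)

print(var_before, var_after)
```

The filled-in values add nothing to the sum of squared deviations but do increase the count, so the variance after imputation is smaller than the variance of the observed values.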
What does 'dropping missing values' mean?
A. Replacing missing values with zeros
B. Removing rows or columns that contain missing data
C. Ignoring missing values during training
D. Predicting missing values using a model
Which method fills missing numeric values with the average of the column?
A. Mean imputation
B. Median imputation
C. Mode imputation
D. Forward fill
Why might removing all rows with missing values be a bad idea?
A. It fills missing values automatically
B. It always improves model accuracy
C. It can cause loss of too much data
D. It changes the data types
Which of these is NOT a common way to fill missing values?
A. Using mean or median
B. Using mode for categorical data
C. Using random noise
D. Dropping the entire dataset
What is a risk of using mean imputation?
A. It can bias the model by reducing variability
B. It removes all missing values permanently
C. It increases data variability
D. It only works for categorical data
Explain why handling missing values is important before training a machine learning model.
Describe two common techniques to handle missing values and when you might use each.