Recall & Review
beginner
What are missing values in a dataset?
Missing values are data points that were not recorded or are otherwise absent from a dataset. They can arise from recording errors, skipped survey questions, or data loss.
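As a minimal sketch of what missing values look like in practice, the snippet below builds a small table with pandas and counts the gaps in each column. The table and its column names are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical survey data: np.nan and None mark answers that were not recorded.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 40],
    "city": ["Paris", "Lima", None, "Oslo"],
})

# isna() flags each missing cell; sum() counts the flags per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```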
beginner
Name two simple methods to handle missing values.
Two simple methods are: 1) removing rows or columns that contain missing values, and 2) filling missing values with a single summary value such as the column's mean, median, or mode.
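Both methods can be sketched with pandas, assuming a small hypothetical table of measurements. Here the fill value is each column's median; the mean or mode could be swapped in the same way.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one gap in each column.
df = pd.DataFrame({
    "height_cm": [170.0, np.nan, 182.0, 165.0],
    "weight_kg": [68.0, 75.0, np.nan, 59.0],
})

# Method 1: drop every row that has at least one missing value.
dropped = df.dropna()

# Method 2: fill each column's gaps with that column's median.
# df.median() skips missing entries when computing the median.
filled = df.fillna(df.median())

print(len(df), len(dropped), len(filled))
```

Note how dropping keeps only the rows that are complete, while filling keeps every row.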
intermediate
Why is it sometimes better to fill missing values instead of removing them?
Filling missing values keeps more rows available for training, which can improve model accuracy. Removing too many rows or columns can discard important information.
beginner
What is mean imputation?
Mean imputation replaces missing values in a numeric column with the average (mean) of the available values in that column.
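A minimal sketch of mean imputation on a single numeric column, using pandas. The series name and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric column with one missing entry.
scores = pd.Series([10.0, 20.0, np.nan, 30.0])

# mean() skips missing entries by default, so this averages 10, 20 and 30.
mean_value = scores.mean()

# Replace the missing entry with the mean of the available values.
imputed = scores.fillna(mean_value)
print(imputed.tolist())
```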
intermediate
What is a potential downside of using mean imputation?
Mean imputation reduces the variability of the data and may bias the model, because every missing entry receives the same value. This ignores the natural spread of the data and the relationships between features.
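The loss of variability can be checked directly. The sketch below, on a small made-up column, compares the sample variance of the observed values with the variance after the gaps are filled with the mean.

```python
import numpy as np
import pandas as pd

# Hypothetical column where two of six entries are missing.
values = pd.Series([2.0, 4.0, np.nan, np.nan, 6.0, 8.0])

# Sample variance of the observed values only (missing entries are skipped).
var_before = values.var()

# Fill both gaps with the mean, then measure the variance again.
imputed = values.fillna(values.mean())
var_after = imputed.var()

# The filled-in entries sit exactly at the mean, so they add no spread.
print(var_before, var_after)
```

The mean stays the same, but the variance falls because the imputed entries contribute nothing to the spread while still increasing the count.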
What does 'dropping missing values' mean?
Which method fills missing numeric values with the average of the column?
Why might removing all rows with missing values be a bad idea?
Which of these is NOT a common way to fill missing values?
What is a risk of using mean imputation?
Explain why handling missing values is important before training a machine learning model.
Describe two common techniques to handle missing values and when you might use each.