Recall & Review
beginner
What is missing data in a dataset?
Missing data refers to the absence of values in some parts of a dataset where data should be present. It can happen due to errors, non-response, or data corruption.
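As a minimal sketch of what "absent values" look like in practice, the snippet below builds a small hypothetical table with pandas and counts the gaps per column. The column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical survey data; np.nan and None both mark absent values.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47],
    "city": ["Lima", "Oslo", None, "Pune"],
})

# isna() flags each missing cell; sum() counts the flags per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

Counting missing cells per column is usually a reasonable first check before deciding how to handle them.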
beginner
Why is it important to handle missing data before analysis?
Missing data that is left untreated can bias results, reduce accuracy, and cause errors in calculations or models, so it needs to be handled before analysis.
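One way to see how an untreated gap affects a calculation is to average an array that contains a missing value. This is an illustrative sketch using NumPy with made-up numbers.

```python
import numpy as np

# Hypothetical measurements with one missing entry.
values = np.array([4.0, 6.0, np.nan, 10.0])

# A plain mean propagates the NaN, so the result itself is NaN.
plain_mean = np.mean(values)

# nanmean skips the missing entry and averages the remaining three values.
nan_aware_mean = np.nanmean(values)

print(plain_mean, nan_aware_mean)
```

Skipping the gap is itself a handling decision. It gives a usable number, but only from the values that were observed.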
intermediate
What can happen if missing data is ignored in a machine learning model?
Ignoring missing data can lead to wrong predictions, poor model performance, and unreliable insights because the model learns from incomplete information.
beginner
Name two common ways to handle missing data.
Two common ways are: 1) Removing rows or columns with missing values, 2) Filling missing values with a statistic like mean, median, or a fixed value.
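The two approaches can be sketched with pandas as follows. The table and its column names are hypothetical, and the fixed fill value of 0 is chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical measurements with one gap in each column.
df = pd.DataFrame({
    "height_cm": [170.0, np.nan, 180.0, 160.0],
    "weight_kg": [65.0, 72.0, np.nan, 55.0],
})

# 1) Remove every row that has at least one missing value.
dropped = df.dropna()

# 2) Fill gaps with a per-column statistic or a fixed value.
filled_mean = df.fillna(df.mean())      # column means, computed without the gaps
filled_median = df.fillna(df.median())  # column medians
filled_fixed = df.fillna(0)             # a fixed value (0 is an arbitrary choice)

print(len(df), len(dropped))
print(filled_mean)
```

Dropping is simple but discards whole rows. Filling keeps every row but replaces unknown values with estimates.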
intermediate
How does missing data affect the quality of data analysis?
Missing data can reduce the quality by causing biased estimates, reducing sample size, and making the analysis less reliable or valid.
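A small sketch can show both effects at once. It assumes a hypothetical scenario in which the largest values are the ones that go unreported, so the observed mean is too low and fewer samples remain.

```python
import pandas as pd

# Hypothetical complete data: the values 1 through 10.
income = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype="float64")

# Assume values above 7 were never reported; where() turns them into NaN.
observed = income.where(income <= 7)

true_mean = income.mean()        # mean of all ten values
observed_mean = observed.mean()  # mean of the reported values only
n_observed = observed.count()    # number of non-missing samples

print(true_mean, observed_mean, n_observed)
```

When gaps are unrelated to the values themselves, the bias may be much smaller, but the loss of sample size remains.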
What is a common consequence of ignoring missing data in analysis?
Ignoring missing data often leads to biased results because the analysis is based on incomplete information.
Which method is NOT a way to handle missing data?
Ignoring missing values completely is not a way to handle them; it can cause errors and bias, so some form of handling is necessary.
Why can missing data reduce the sample size?
Removing rows with missing data reduces the number of samples available for analysis.
Which of these is a simple way to fill missing data?
Replacing missing values with the mean is a common simple method to fill missing data.
What does missing data often indicate in real-life datasets?
Missing data usually indicates errors, non-response, or gaps in how data was collected.
Explain why handling missing data is important before doing any data analysis.
Think about how incomplete data affects the story the data tells.
Describe two common methods to handle missing data and when you might use them.
Consider simple ways to fix or avoid missing data problems.