Recall & Review
beginner
What is data preparation in machine learning?
Data preparation is the process of cleaning, organizing, and transforming raw data into a suitable format for training machine learning models.
beginner
Why does data preparation take most of the time in ML projects?
Real-world data is often messy, incomplete, and inconsistent, so it needs cleaning, missing-value handling, and reformatting before a model can use it effectively.
beginner
Name three common tasks involved in data preparation.
1. Cleaning data (removing errors and duplicates), 2. Handling missing values (filling them in or removing the affected records), 3. Transforming data (normalizing numeric values or encoding categorical ones).
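The three tasks named in this card can be sketched in a few lines of plain Python. This is only a minimal illustration: the records, the `prepare` helper, and the choice of mean imputation, min-max scaling, and one-hot encoding are assumptions made for the example, not a prescribed method.

```python
from statistics import mean

# Hypothetical raw records: one exact duplicate, one missing age, mixed-case labels.
raw = [
    {"age": 20, "city": "Paris"},
    {"age": 20, "city": "Paris"},    # exact duplicate
    {"age": None, "city": "paris"},  # missing age, inconsistent casing
    {"age": 40, "city": "Tokyo"},
]

def prepare(rows):
    # 1. Cleaning: normalize casing, then drop exact duplicates (order preserved).
    seen, cleaned = set(), []
    for row in rows:
        key = (row["age"], row["city"].strip().lower())
        if key not in seen:
            seen.add(key)
            cleaned.append({"age": key[0], "city": key[1]})

    # 2. Missing values: fill missing ages with the mean of the known ages.
    known = [r["age"] for r in cleaned if r["age"] is not None]
    fill = mean(known)
    for r in cleaned:
        if r["age"] is None:
            r["age"] = fill

    # 3. Transforming: min-max scale age to [0, 1] and one-hot encode city.
    lo = min(r["age"] for r in cleaned)
    hi = max(r["age"] for r in cleaned)
    cities = sorted({r["city"] for r in cleaned})
    features = []
    for r in cleaned:
        scaled = (r["age"] - lo) / (hi - lo) if hi > lo else 0.0
        features.append([scaled] + [1 if r["city"] == c else 0 for c in cities])
    return cities, features

cities, features = prepare(raw)
```

Each output row holds the scaled age followed by one indicator column per city, in the order given by `cities`.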
intermediate
How does poor data quality affect machine learning models?
Poor data quality can cause models to learn the wrong patterns, which leads to inaccurate predictions and poor performance on new data.
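To make the effect concrete, the sketch below fits a straight line by ordinary least squares to a tiny, made-up dataset, once with clean values and once with a single data-entry error. The data and the `fit_line` helper are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x

xs = [1, 2, 3, 4, 5]
clean = [2, 4, 6, 8, 10]   # follows y = 2x exactly
dirty = [2, 4, 6, 8, 100]  # hypothetical typo: 10 entered as 100

clean_slope, clean_intercept = fit_line(xs, clean)
dirty_slope, dirty_intercept = fit_line(xs, dirty)

# Predictions for an unseen input x = 6 under each fit.
clean_pred = clean_slope * 6 + clean_intercept
dirty_pred = dirty_slope * 6 + dirty_intercept
```

One bad value out of five is enough to pull the fitted slope far away from the true relationship, so the prediction for an unseen input is badly off.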
intermediate
What is the role of feature engineering in data preparation?
Feature engineering creates new input features from raw data so that models can learn more easily and predict more accurately.
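As a small illustration of deriving features from raw fields, the sketch below turns a timestamp string and two numeric columns into values a model can use more directly. The record layout, the field names, and the `engineer` helper are assumptions made for this example.

```python
from datetime import datetime

def engineer(record):
    """Derive model-friendly features from one raw record (hypothetical schema)."""
    ts = datetime.fromisoformat(record["timestamp"])
    return {
        "hour": ts.hour,                    # time of day as a number
        "is_weekend": ts.weekday() >= 5,    # Monday is 0, so 5 and 6 are the weekend
        "price_per_sqm": record["price"] / record["area_sqm"],  # ratio of two raw columns
    }

row = {"timestamp": "2024-03-09T14:30:00", "price": 300000, "area_sqm": 60}
engineered = engineer(row)
```

None of these derived values adds new information, but each exposes a pattern (time of day, weekday versus weekend, relative price) that a model would otherwise have to discover from the raw fields on its own.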
Which of the following is NOT a common data preparation task?
Why is data preparation often the longest step in ML projects?
What can happen if you skip data preparation?
Feature engineering is important because it:
Which of these is a sign of poor data quality?
Explain why data preparation usually takes the most time in machine learning projects.
Describe the main steps involved in data preparation and why each is important.