Recall & Review
beginner
What is the main goal of Exploratory Data Analysis (EDA)?
The main goal of EDA is to understand the data by summarizing its main characteristics, often using visual methods, to find patterns, spot anomalies, test hypotheses, and check assumptions.
beginner
Name three common steps in an EDA process.
1. Data cleaning (handling missing values and errors), 2. Data summarization (calculating statistics like mean, median), 3. Data visualization (plots like histograms, scatter plots).
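The three steps above can be sketched in a few lines of pandas. This is a minimal illustration, not a full workflow: the toy data and column names are made up, and filling gaps with the median is only one of several reasonable cleaning choices.

```python
import pandas as pd

# Hypothetical toy data; the column names are invented for illustration.
df = pd.DataFrame({
    "age": [23, 35, None, 41, 29],
    "income": [42000, 58000, 61000, None, 39000],
})

# Step 1: data cleaning. Here, fill each gap with that column's median
# (an assumed strategy; dropping rows is another option).
clean = df.fillna(df.median())

# Step 2: data summarization. Mean and median for every column.
summary = clean.agg(["mean", "median"])
print(summary)

# Step 3: data visualization. With Matplotlib installed, clean.hist()
# would draw one histogram per numeric column.
```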
beginner
Which Python libraries are commonly used for data visualization in EDA?
Matplotlib and Seaborn are commonly used Python libraries for data visualization during EDA.
intermediate
Why is checking for missing values important in EDA?
Missing values can bias analysis results and hurt model performance. Identifying them early lets you decide how to handle them, for example by filling (imputing) or removing them, so that data quality is preserved.
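As a rough sketch of what finding and handling missing values can look like in pandas: the data below is invented, and the fill values (column mean, the label "unknown") are assumptions chosen for illustration rather than recommendations.

```python
import pandas as pd

# Hypothetical data with one gap in each column.
df = pd.DataFrame({
    "city": ["Oslo", "Lima", None, "Pune"],
    "temp_c": [4.0, None, 21.5, 30.5],
})

# Count missing values per column.
missing_counts = df.isna().sum()
print(missing_counts)

# Option A: remove every row that contains a missing value.
dropped = df.dropna()

# Option B: fill the gaps instead (numeric column with its mean,
# text column with a placeholder label).
filled = df.fillna({"city": "unknown", "temp_c": df["temp_c"].mean()})
```

Dropping keeps only fully observed rows (two of four here), while filling keeps every row at the cost of introducing estimated values.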
beginner
What does a box plot show in EDA?
A box plot shows the distribution of data through its quartiles: the box spans the first to third quartile, a line marks the median, and points beyond the whiskers are flagged as potential outliers.
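The numbers behind a box plot can be computed directly. The sketch below assumes the common rule that points more than 1.5 times the interquartile range beyond the quartiles count as outliers; the card itself does not name a rule, and the sample values are made up.

```python
import pandas as pd

# Hypothetical sample with one unusually large value.
values = pd.Series([12, 14, 15, 15, 16, 17, 18, 19, 45])

# Quartiles: the edges of the box and the median line.
q1, median, q3 = values.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1

# Assumed 1.5 * IQR fences; points outside them are potential outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(q1, median, q3)
print(outliers.tolist())
```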
Which of the following is NOT typically part of EDA?
Model training is part of machine learning, not EDA. EDA focuses on understanding and preparing data.
What Python function shows the first few rows of a DataFrame?
df.head() returns the first rows of a DataFrame (five by default), which makes it a quick way to look at sample data.
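A short illustration of the call, using an invented ten-row DataFrame:

```python
import pandas as pd

# Hypothetical ten-row table.
df = pd.DataFrame({
    "id": range(1, 11),
    "score": [x * 10 for x in range(1, 11)],
})

first_five = df.head()    # no argument: the first 5 rows
first_three = df.head(3)  # explicit row count

print(first_three)
```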
Which plot is best to check the distribution of a single numeric variable?
Histograms show how data values are distributed across intervals.
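To make "distributed across intervals" concrete, the sketch below counts values per bin with NumPy and prints a text version of the bars. The ages, the range, and the choice of four bins are all assumptions for illustration; a plotting library would draw bars from the same kind of counts.

```python
import numpy as np

# Hypothetical ages.
ages = np.array([21, 22, 25, 31, 33, 34, 35, 38, 47, 58])

# Four equal-width intervals between 20 and 60.
counts, edges = np.histogram(ages, bins=4, range=(20, 60))

# One text bar per interval, its length equal to the count.
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    print(f"{left:.0f}-{right:.0f}: {'#' * int(count)}")
```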
What does df.describe() provide in pandas?
df.describe() returns summary statistics (count, mean, standard deviation, minimum, quartiles, and maximum) for numeric columns by default.
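A small example, with made-up data, showing that a text column is left out of the default summary:

```python
import pandas as pd

# Hypothetical table with one numeric and one text column.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "name": ["Ana", "Ben", "Cy", "Di"],
})

# With mixed column types, the default summary covers numeric columns only.
stats = df.describe()
print(stats)
```

Passing `include="all"` to `describe()` would add the non-numeric columns as well.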
Why do we check for outliers in EDA?
Outliers can distort summary statistics such as the mean and skew the models fitted to the data, so it is important to identify them early.
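One way to see the distortion is to compare the mean and the median before and after adding a single extreme value. The salary figures below are hypothetical.

```python
import statistics

# Hypothetical salaries in thousands.
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [400]

mean_before = statistics.mean(salaries)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(salaries)
median_after = statistics.median(with_outlier)

print(mean_before, mean_after)      # the mean more than doubles
print(median_before, median_after)  # the median barely moves
```

The median's stability here is why it is often reported alongside the mean when outliers are suspected.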
Describe the key steps you would follow to perform Exploratory Data Analysis on a new dataset.
Think about how you get to know your data before using it.
Explain why visualization is important in Exploratory Data Analysis and name two types of plots you might use.
Visuals help tell the story of the data.