Recall & Review
beginner
What does IQR stand for in data analysis?
IQR stands for Interquartile Range. It measures the spread of the middle 50% of the data: the distance between the 25th percentile (Q1) and the 75th percentile (Q3).
beginner
How do you calculate the IQR from a dataset?
IQR = Q3 - Q1, where Q1 is the 25th percentile and Q3 is the 75th percentile of the data.
intermediate
Why is IQR useful for detecting outliers?
The IQR sets two cutoffs: a lower bound at Q1 - 1.5*IQR and an upper bound at Q3 + 1.5*IQR. Data points below the lower bound or above the upper bound are flagged as outliers because they lie unusually far from the middle 50% of the data.
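As a minimal sketch of the rule above, the snippet below computes both cutoffs for a small made-up sample and flags the values outside them. The helper name `iqr_fences`, the parameter `k`, and the sample values are illustrative assumptions, not part of the original card.

```python
import pandas as pd

def iqr_fences(values, k=1.5):
    """Return (lower, upper) cutoffs: Q1 - k*IQR and Q3 + k*IQR.

    Hypothetical helper for illustration; k=1.5 is the multiplier
    used throughout these cards.
    """
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Made-up sample with one obviously extreme value (95).
sample = pd.Series([10, 12, 12, 13, 14, 15, 16, 18, 95])
lower, upper = iqr_fences(sample)

# Boolean mask: True where a value falls outside the cutoffs.
is_outlier = (sample < lower) | (sample > upper)
outliers = sample[is_outlier]
print(lower, upper, outliers.tolist())
```

For this sample, Q1 = 12 and Q3 = 16, so the cutoffs are 6 and 22, and only 95 is flagged.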
beginner
Show the pandas code to calculate Q1, Q3, and IQR for a DataFrame column named 'data'.
Q1 = df['data'].quantile(0.25)
Q3 = df['data'].quantile(0.75)
IQR = Q3 - Q1
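To try these three lines on their own, here is a self-contained sketch with a made-up DataFrame (the values 1 through 8 are an assumption chosen for easy checking). It relies on the default linear interpolation of `Series.quantile`, so the quartiles can fall between observed values.

```python
import pandas as pd

# Hypothetical toy data; any numeric column works the same way.
df = pd.DataFrame({"data": [1, 2, 3, 4, 5, 6, 7, 8]})

Q1 = df["data"].quantile(0.25)  # 25th percentile
Q3 = df["data"].quantile(0.75)  # 75th percentile
IQR = Q3 - Q1                   # spread of the middle 50%

print(Q1, Q3, IQR)
```

With linear interpolation this gives Q1 = 2.75, Q3 = 6.25, and IQR = 3.5. Other quartile conventions can give slightly different numbers on small samples.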
intermediate
How do you filter out outliers using IQR in pandas?
Use the condition: keep rows where values are >= Q1 - 1.5*IQR and <= Q3 + 1.5*IQR.
Example:
filtered_df = df[(df['data'] >= Q1 - 1.5*IQR) & (df['data'] <= Q3 + 1.5*IQR)]
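The filter can be run end to end as in the sketch below, which uses a made-up DataFrame. It also shows `Series.between` as a shorter way to write the same inclusive condition; that alternative is an addition here, not something the original card uses.

```python
import pandas as pd

# Hypothetical toy data with one extreme value (95).
df = pd.DataFrame({"data": [10, 12, 12, 13, 14, 15, 16, 18, 95]})

Q1 = df["data"].quantile(0.25)
Q3 = df["data"].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

# Explicit boolean mask, as in the card above.
filtered_df = df[(df["data"] >= lower) & (df["data"] <= upper)]

# Same condition via Series.between (both ends inclusive by default).
filtered_between = df[df["data"].between(lower, upper)]

print(len(df), len(filtered_df))
```

Both versions keep the original row index, so call `reset_index(drop=True)` afterwards if a clean 0..n-1 index is needed.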
What does the IQR measure in a dataset?
IQR measures the spread between the 25th percentile (Q1) and 75th percentile (Q3), which covers the middle 50% of the data.
Which formula identifies outliers using IQR?
Outliers are values that fall below Q1 minus 1.5 times the IQR or above Q3 plus 1.5 times the IQR.
In pandas, how do you get the 25th percentile of a column 'data'?
Calling df['data'].quantile(0.25) returns the 25th percentile (Q1) of the column.
What is the purpose of filtering data using IQR in pandas?
Filtering with IQR helps remove outliers that are unusually far from the central data range.
If Q1 = 10 and Q3 = 20, what is the IQR?
IQR = Q3 - Q1 = 20 - 10 = 10.
Explain how to detect outliers using the IQR method in pandas.
Think about the steps from calculating quartiles to filtering data.
Why is the IQR method preferred over using min and max values for outlier detection?
Consider how extreme values affect range versus IQR.
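One way to explore this hint is to compare how the range (max minus min) and the IQR react when a single extreme value is added. The sketch below uses made-up numbers and is only an illustration of the idea, not a full answer.

```python
import pandas as pd

def value_range(values):
    """Range = max - min, which depends entirely on the two extremes."""
    return values.max() - values.min()

def iqr(values):
    """IQR = Q3 - Q1, which depends only on the middle 50% of the data."""
    return values.quantile(0.75) - values.quantile(0.25)

# Hypothetical data: the second series adds one extreme value (1000).
clean = pd.Series([10, 12, 12, 13, 14, 15, 16, 18])
dirty = pd.Series([10, 12, 12, 13, 14, 15, 16, 18, 1000])

range_clean, range_dirty = value_range(clean), value_range(dirty)
iqr_clean, iqr_dirty = iqr(clean), iqr(dirty)

print("range:", range_clean, "->", range_dirty)
print("IQR:  ", iqr_clean, "->", iqr_dirty)
```

Here the range jumps from 8 to 990, while the IQR moves only from 3.25 to 4, so cutoffs built on the IQR stay close to the bulk of the data.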