
Outlier detection with IQR in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this IQR outlier detection code?
Given the DataFrame df below, what does the outliers variable contain after running the code?
Pandas
import pandas as pd

df = pd.DataFrame({'values': [10, 12, 14, 15, 18, 19, 20, 100]})
Q1 = df['values'].quantile(0.25)
Q3 = df['values'].quantile(0.75)
IQR = Q3 - Q1
outliers = df[(df['values'] < Q1 - 1.5 * IQR) | (df['values'] > Q3 + 1.5 * IQR)]
print(outliers)
A
   values
7     100
B
Empty DataFrame
Columns: [values]
Index: []
C
   values
0      10
7     100
D
   values
0      10
💡 Hint
Calculate Q1 and Q3, then find the IQR. Check which values fall outside the range Q1 - 1.5*IQR to Q3 + 1.5*IQR.
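The steps in this hint can be sketched as a small helper. This is a minimal illustration, not the challenge's reference solution: the function name `iqr_bounds`, the parameter `k`, and the sample data are all hypothetical and chosen so the quiz answer is not given away.

```python
import pandas as pd


def iqr_bounds(series, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1 = series.quantile(0.25)  # pandas interpolates linearly by default
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Hypothetical sample, not the DataFrame from the challenge above
sample = pd.Series([2, 4, 6, 8, 10, 12, 14, 16, 50])
lower, upper = iqr_bounds(sample)

# Keep only the values that fall outside the fences
outlier_values = sample[(sample < lower) | (sample > upper)]
print(lower, upper)             # -6.0 26.0
print(outlier_values.tolist())  # [50]
```

Here Q1 is 6 and Q3 is 14, so the IQR is 8 and the fences sit at 6 - 12 = -6 and 14 + 12 = 26. Only 50 lies outside them.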
Data Output
intermediate
How many outliers are detected by this IQR method?
Using the DataFrame df and the IQR method below, how many rows are identified as outliers?
Pandas
import pandas as pd

df = pd.DataFrame({'scores': [55, 60, 65, 70, 75, 80, 85, 90, 95, 200]})
Q1 = df['scores'].quantile(0.25)
Q3 = df['scores'].quantile(0.75)
IQR = Q3 - Q1
outliers = df[(df['scores'] < Q1 - 1.5 * IQR) | (df['scores'] > Q3 + 1.5 * IQR)]
print(len(outliers))
A
2
B
0
C
1
D
3
💡 Hint
Calculate the IQR and check which values fall outside the bounds.
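When only the count is needed, a boolean mask can be summed instead of building a filtered DataFrame. The sketch below assumes hypothetical data (the Series `s` is not the `scores` column from the challenge) and uses `Series.between`, which includes both endpoints by default.

```python
import pandas as pd

# Hypothetical readings, chosen only for illustration
s = pd.Series([1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 60, -40], name="reading")

# quantile() with a list returns a Series; unpacking yields its two values
q1, q3 = s.quantile([0.25, 0.75])
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# True where the value lies outside [lower, upper]
is_outlier = ~s.between(lower, upper)

# Summing a boolean Series counts the True entries
n_outliers = int(is_outlier.sum())
print(n_outliers)  # 2
```

With Q1 = 4.5 and Q3 = 15.5 the fences are -12 and 32, so -40 and 60 are both flagged.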
visualization
advanced
Which boxplot correctly shows the outliers detected by IQR?
You run this code to detect outliers and plot a boxplot. Which option shows the correct boxplot visualization?
Pandas
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'data': [5, 7, 8, 9, 10, 12, 15, 100]})
plt.boxplot(df['data'])
plt.show()
A
Boxplot with one point far below the lower whisker at 5
B
Boxplot with one point far above the upper whisker at 100
C
Boxplot with multiple points above whiskers at 15 and 100
D
Boxplot with no points outside whiskers
💡 Hint
Outliers appear as points outside the whiskers. The value 100 is much larger than others.
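To reason about a boxplot without drawing it, the whisker ends and the outlying points can be computed directly. This is a sketch assuming Matplotlib's default setting `whis=1.5`, under which each whisker reaches the most extreme data point still inside the 1.5 * IQR fences and anything beyond is drawn as an individual point. The data and variable names are hypothetical and differ from the challenge.

```python
import pandas as pd

# Hypothetical data, not the 'data' column from the challenge
s = pd.Series([3, 4, 5, 6, 7, 8, 9, 40])

q1 = s.quantile(0.25)
q3 = s.quantile(0.75)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers stop at the furthest observations inside the fences
inside = s[s.between(lower_fence, upper_fence)]
whisker_low = inside.min()
whisker_high = inside.max()

# Points beyond the fences would be plotted individually
fliers = s[~s.between(lower_fence, upper_fence)]

print(whisker_low, whisker_high)  # 3 9
print(fliers.tolist())            # [40]
```

Note that the whiskers end at actual data values (3 and 9), not at the fences themselves (-0.5 and 13.5).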
🔧 Debug
advanced
What error does this IQR outlier detection code raise?
Identify the error raised by this code snippet:
Pandas
import pandas as pd

df = pd.DataFrame({'vals': [1, 2, 3, 4, 5]})
Q1 = df['vals'].quantile(0.25)
Q3 = df['vals'].quantile(0.75)
IQR = Q3 - Q1
outliers = df[(df['vals'] < Q1 - 1.5 * IQR) | (df['vals'] > Q3 + 1.5 * IQR)]
print(outlier)
A
SyntaxError: invalid syntax
B
TypeError: unsupported operand type(s) for -: 'str' and 'float'
C
KeyError: 'vals'
D
NameError: name 'outlier' is not defined
💡 Hint
Check variable names carefully in the print statement.
🚀 Application
expert
Which option correctly filters out outliers using IQR in a DataFrame with multiple columns?
You have a DataFrame df with columns A and B. You want to remove rows where A or B have outliers based on IQR. Which code correctly does this?
Pandas
import pandas as pd

df = pd.DataFrame({'A': [10, 12, 14, 100, 15], 'B': [20, 22, 23, 24, 200]})
A
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
filtered = df[(df['A'] >= Q1['A'] - 1.5 * IQR['A']) & (df['A'] <= Q3['A'] + 1.5 * IQR['A']) & (df['B'] >= Q1['B'] - 1.5 * IQR['B']) & (df['B'] <= Q3['B'] + 1.5 * IQR['B'])]
B
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
filtered = df[(df >= Q1 - 1.5 * IQR) & (df <= Q3 + 1.5 * IQR)]
C
Q1_A = df['A'].quantile(0.25)
Q3_A = df['A'].quantile(0.75)
IQR_A = Q3_A - Q1_A
Q1_B = df['B'].quantile(0.25)
Q3_B = df['B'].quantile(0.75)
IQR_B = Q3_B - Q1_B
filtered = df[~((df['A'] < Q1_A - 1.5 * IQR_A) | (df['A'] > Q3_A + 1.5 * IQR_A) | (df['B'] < Q1_B - 1.5 * IQR_B) | (df['B'] > Q3_B + 1.5 * IQR_B))]
D
Q1 = df.quantile(0.25)
Q3 = df.quantile(0.75)
IQR = Q3 - Q1
filtered = df[(df['A'] >= Q1['A'] - 1.5 * IQR['A']) | (df['A'] <= Q3['A'] + 1.5 * IQR['A']) | (df['B'] >= Q1['B'] - 1.5 * IQR['B']) | (df['B'] <= Q3['B'] + 1.5 * IQR['B'])]
💡 Hint
Use logical AND to keep rows where both columns are within the IQR bounds.
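The "AND across columns" idea in this hint generalizes to any number of numeric columns. The sketch below is one possible approach, not one of the answer options: it builds a cell-wise boolean DataFrame and then reduces it with `.all(axis=1)` so that a row is kept only when every column is within its own fences. The frame `df_demo` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical frame, not the A/B DataFrame from the challenge
df_demo = pd.DataFrame({
    "x": [1, 2, 3, 4, 50],
    "y": [10, 11, 120, 12, 13],
})

# DataFrame.quantile returns one value per column (a Series)
q1 = df_demo.quantile(0.25)
q3 = df_demo.quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Comparing a DataFrame with a Series aligns on column labels,
# giving a boolean DataFrame with one flag per cell
within = (df_demo >= lower) & (df_demo <= upper)

# Reduce to one flag per row: True only if every column is in range
kept = df_demo[within.all(axis=1)]
print(kept.index.tolist())  # [0, 1, 3]
```

Indexing with the cell-wise mask directly (`df_demo[within]`) would not drop rows; it keeps the original shape and replaces out-of-range cells with NaN, which is why the row-wise reduction matters.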