Challenge - 5 Problems
Group Filter Master
❓ Predict Output
Difficulty: intermediate
Output of group filtering with filter()
What is the output DataFrame after applying filter() to keep groups whose mean value is greater than 3?
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [2, 4, 5, 6, 1, 2]
})
result = df.groupby('group').filter(lambda x: x['value'].mean() > 3)
print(result)
💡 Hint
Calculate the mean of 'value' for each group and keep groups where mean > 3.
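✅ Explanation
Group 'A' has mean (2+4)/2 = 3, which is not strictly greater than 3, and group 'C' has mean (1+2)/2 = 1.5. Group 'B' has mean (5+6)/2 = 5.5 > 3, so only the group 'B' rows (index 2 and 3) remain after filtering.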
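The reasoning above can be checked by computing the per-group means directly. This is a minimal sketch using the same data as the challenge; the names group_means and kept_groups are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [2, 4, 5, 6, 1, 2],
})

# Per-group means that the filter predicate compares against 3.
group_means = df.groupby('group')['value'].mean()

# Names of the groups whose mean is strictly greater than 3.
kept_groups = group_means[group_means > 3].index.tolist()

# filter() keeps every row of the qualifying groups and preserves
# the original index labels.
result = df.groupby('group').filter(lambda x: x['value'].mean() > 3)

print(group_means)
print(kept_groups)
print(result)
```

Group 'A' sits exactly on the threshold, which is why the strict comparison matters here.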
❓ Data Output
Difficulty: intermediate
Number of groups after filtering
After filtering groups with sum of 'score' >= 10, how many groups remain?
import pandas as pd

df = pd.DataFrame({
    'team': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'score': [4, 7, 3, 2, 6, 5]
})
filtered = df.groupby('team').filter(lambda g: g['score'].sum() >= 10)
num_groups = filtered['team'].nunique()
print(num_groups)
💡 Hint
Sum scores per team and count teams with sum >= 10.
✅ Explanation
Team X sums to 4+7 = 11, team Y to 3+2 = 5, and team Z to 6+5 = 11. Only X and Z reach the threshold of 10, so 2 groups remain and the code prints 2.
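The same count can be obtained without filter() by aggregating first and then counting the teams that pass the threshold. The sketch below uses the challenge data; team_sums and num_groups_from_sums are illustrative names, not part of the original exercise.

```python
import pandas as pd

df = pd.DataFrame({
    'team': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'score': [4, 7, 3, 2, 6, 5],
})

# One total per team.
team_sums = df.groupby('team')['score'].sum()

# Count the teams whose total is at least 10.
num_groups_from_sums = int((team_sums >= 10).sum())

# The filter() route from the challenge, for comparison.
filtered = df.groupby('team').filter(lambda g: g['score'].sum() >= 10)
num_groups = filtered['team'].nunique()

print(team_sums)
print(num_groups_from_sums, num_groups)
```

Aggregating first is often handy when only the group-level answer is needed and the surviving rows themselves are not.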
🔧 Debug
Difficulty: advanced
Identify the error in group filtering code
What error does this code raise when filtering groups with mean > 2?
df.groupby('cat').filter(lambda x: x['val'].mean > 2)
import pandas as pd

df = pd.DataFrame({'cat': ['a', 'a', 'b'], 'val': [1, 3, 4]})
result = df.groupby('cat').filter(lambda x: x['val'].mean > 2)
💡 Hint
Check if mean is called as a method or accessed as an attribute.
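✅ Explanation
mean is a method and must be called with parentheses. Without them, x['val'].mean evaluates to a bound method object, and comparing a method with an integer using > raises a TypeError ('>' not supported between instances of 'method' and 'int').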
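A short sketch that reproduces the failure and then applies the fix, using the data from the challenge. The variable names error_name and fixed are illustrative; the exact wording of the error message may differ between Python versions, so only the exception type is relied on here.

```python
import pandas as pd

df = pd.DataFrame({'cat': ['a', 'a', 'b'], 'val': [1, 3, 4]})

# Buggy version: mean is referenced but never called, so the lambda
# tries to compare a bound method with the integer 2.
try:
    df.groupby('cat').filter(lambda x: x['val'].mean > 2)
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
    print(exc)

# Fixed version: calling mean() yields a number that can be compared.
fixed = df.groupby('cat').filter(lambda x: x['val'].mean() > 2)
print(fixed)
```

With the fix in place, group 'a' (mean 2) is dropped because 2 is not strictly greater than 2, and only the single row of group 'b' (mean 4) is kept.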
🚀 Application
Difficulty: advanced
Filter groups with more than 2 rows and average value below 5
Which option correctly filters groups with more than 2 rows and average 'score' less than 5?
import pandas as pd

df = pd.DataFrame({
    'group': ['G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G3', 'G3'],
    'score': [4, 3, 2, 6, 7, 1, 2, 3]
})
💡 Hint
Use both conditions with 'and' and strict inequalities as stated.
✅ Explanation
Only groups with more than 2 rows and a mean score below 5 are kept. G1 has 3 rows and mean (4+3+2)/3 = 3, and G3 has 3 rows and mean (1+2+3)/3 = 2, so both are kept. G2 has only 2 rows (and a mean of 6.5), so it is excluded.
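The answer options are not reproduced here, so the lambda below is only one plausible way to express the two conditions described in the hint, not necessarily the wording of the intended option. It assumes that len(g) is used for the row count; summary is an illustrative helper table.

```python
import pandas as pd

df = pd.DataFrame({
    'group': ['G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G3', 'G3'],
    'score': [4, 3, 2, 6, 7, 1, 2, 3],
})

# Row count and mean per group, to see which groups should qualify.
summary = df.groupby('group')['score'].agg(['size', 'mean'])

# Both conditions are scalar booleans per group, so plain `and` works.
result = df.groupby('group').filter(
    lambda g: len(g) > 2 and g['score'].mean() < 5
)

print(summary)
print(result)
```

Plain `and` is appropriate because each side evaluates to a single boolean for the whole group; the element-wise `&` operator is only needed when combining boolean Series.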
🧠 Conceptual
Difficulty: expert
Understanding filter() behavior on groupby objects
Which statement about filter() on a pandas GroupBy object is TRUE?
💡 Hint
Think about how filter works on groups, not individual rows.
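✅ Explanation
filter() calls the function once per group and keeps every row of the groups for which it returns True. When called on a grouped DataFrame it returns a new, filtered DataFrame containing the original rows; it does not return one value per group, and it does not modify the original DataFrame in place.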
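The behaviour described above can be observed on a small example. The data and the names kept and via_mask below are hypothetical and chosen only for illustration; the second approach, a boolean mask built with transform(), is shown as an equivalent way to get the same rows.

```python
import pandas as pd

# Hypothetical data with interleaved groups to make row order visible.
df = pd.DataFrame({
    'key': ['a', 'b', 'a', 'b', 'c'],
    'val': [1, 10, 3, 20, 5],
})

# The predicate sees one whole group at a time and returns one boolean.
kept = df.groupby('key').filter(lambda g: g['val'].sum() > 4)

# Equivalent row selection: broadcast the group sum to every row,
# then index with the resulting boolean mask.
via_mask = df[df.groupby('key')['val'].transform('sum') > 4]

print(kept)
print(len(df))  # the original DataFrame still has all of its rows
```

Group 'a' sums to 4 and is dropped as a whole, while groups 'b' and 'c' are kept with their rows in the original order.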