filter() for group-level filtering in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of group filtering with filter()
What DataFrame is printed after filter() keeps only the groups whose 'Value' sum is greater than 10?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Value': [4, 8, 3, 2, 7, 6]
})
filtered = df.groupby('Category').filter(lambda x: x['Value'].sum() > 10)
print(filtered)
A.
Empty DataFrame
Columns: [Category, Value]
Index: []
B.
  Category  Value
2        B      3
3        B      2
C.
  Category  Value
0        A      4
1        A      8
2        B      3
3        B      2
4        C      7
5        C      6
D.
  Category  Value
0        A      4
1        A      8
4        C      7
5        C      6
💡 Hint
Sum the 'Value' column for each group and keep groups where sum > 10.
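The two steps in this hint can be sketched as follows. This is a minimal illustration rather than the only approach: it reuses the DataFrame from this problem, and the names `group_sums`, `row_sums`, `via_mask`, and `via_filter` are made up for the example. It places filter() next to a transform-based boolean mask that applies the same group-level test row by row.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "B", "B", "C", "C"],
    "Value": [4, 8, 3, 2, 7, 6],
})

# Step 1: compute one sum per group.
group_sums = df.groupby("Category")["Value"].sum()

# Step 2: broadcast each group's sum back onto its rows and build a mask.
row_sums = df.groupby("Category")["Value"].transform("sum")
via_mask = df[row_sums > 10]

# filter() applies the same test once per group and keeps or drops
# every row of that group together.
via_filter = df.groupby("Category").filter(lambda g: g["Value"].sum() > 10)

print(group_sums)
print(via_mask.index.tolist() == via_filter.index.tolist())
```

Printing `group_sums` first is a quick way to check which groups should survive before reading the filtered frame.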
Data Output
intermediate
Number of rows after group-level filter
After keeping only the groups whose mean 'Score' is at least 75, how many rows remain in the DataFrame?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Team': ['X', 'X', 'Y', 'Y', 'Z', 'Z', 'Z'],
    'Score': [80, 70, 60, 90, 75, 85, 65]
})
filtered = df.groupby('Team').filter(lambda g: g['Score'].mean() >= 75)
print(len(filtered))
A. 4
B. 5
C. 7
D. 3
💡 Hint
Calculate mean score per team and count rows of teams meeting the condition.
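One way to follow this hint without calling filter() is to build a small per-team summary and add up the row counts of the teams that pass. The sketch below assumes the DataFrame from this problem; `summary`, `passing`, and `rows_kept` are illustrative names only.

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["X", "X", "Y", "Y", "Z", "Z", "Z"],
    "Score": [80, 70, 60, 90, 75, 85, 65],
})

# One row per team: the mean score and the number of scores.
summary = df.groupby("Team")["Score"].agg(["mean", "count"])

# Keep the teams that meet the threshold, then total the rows they contribute.
passing = summary[summary["mean"] >= 75]
rows_kept = int(passing["count"].sum())

print(summary)
print(rows_kept)
```

The value of `rows_kept` should match `len(filtered)` from the problem's own code, since filter() keeps every row of each passing team.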
🔧 Debug
advanced
Identify the error in group filtering code
What happens when this code is run?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Group': ['G1', 'G1', 'G2', 'G2'],
    'Value': [10, 20, 5, 15]
})
filtered = df.groupby('Group').filter(lambda x: x['Value'].sum > 20)
print(filtered)
A. TypeError: '>' not supported between instances of 'method' and 'int'
B. KeyError: 'Value'
C. SyntaxError: invalid syntax
D. No error, prints filtered DataFrame
💡 Hint
Check if sum is called as a method or accessed as an attribute.
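The difference this hint points at can be seen on a plain Series. The snippet below is a small sketch with made-up values; `bound_method` and `total` are illustrative names.

```python
import pandas as pd

values = pd.Series([10, 20])

# Without parentheses, attribute access returns the method object itself.
bound_method = values.sum

# With parentheses, the method is called and returns a number.
total = values.sum()

print(type(bound_method).__name__)
print(callable(bound_method))
print(total)
```

Only `total` is a number that can be compared meaningfully with a threshold such as 20.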
🚀 Application
advanced
Filter groups with at least 3 rows and max value > 50
Which code keeps only the groups in DataFrame df that have at least 3 rows and a maximum 'Score' greater than 50?
Pandas
import pandas as pd

df = pd.DataFrame({
    'Category': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'Score': [40, 55, 60, 30, 45, 70, 65, 50, 40]
})
A. df.groupby('Category').filter(lambda g: len(g) > 3 and g['Score'].max() >= 50)
B. df.groupby('Category').filter(lambda g: len(g) >= 3 and g['Score'].max() > 50)
C. df.groupby('Category').filter(lambda g: g['Score'].max() > 50 or len(g) >= 3)
D. df.groupby('Category').filter(lambda g: len(g) >= 3 and g['Score'].max() > 50)
💡 Hint
Both conditions must be true: group size at least 3 and max score greater than 50.
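To check a candidate answer independently of filter(), the two conditions in this hint can be evaluated per group with an aggregation and then mapped back onto the rows. This is one possible cross-check under the problem's data, not the expected answer format; `stats`, `qualifies`, and `kept` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "C", "C", "C", "C"],
    "Score": [40, 55, 60, 30, 45, 70, 65, 50, 40],
})

# Per-group row count and maximum score.
stats = df.groupby("Category")["Score"].agg(["count", "max"])

# Both conditions must hold, so combine them with a logical AND.
qualifies = (stats["count"] >= 3) & (stats["max"] > 50)

# Map each row's category to its group's verdict and keep the True rows.
kept = df[df["Category"].map(qualifies)]

print(stats)
print(kept)
```

Comparing `kept` with the output of each option shows which lambda applies both conditions with the right operators.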
🧠 Conceptual
expert
Understanding filter() behavior on empty groups
If a group in a pandas groupby object is empty, what will be the behavior of filter() when the filtering function returns True for that group?
A. The empty group will be excluded from the result regardless of the filter function.
B. The entire DataFrame will be returned unfiltered.
C. filter() will raise a ValueError due to empty group.
D. The empty group will be included in the result as an empty DataFrame slice.
💡 Hint
Consider if empty groups have any rows to include.
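One way to explore this is with a categorical grouping key, where a declared category can have no rows. The sketch below is an assumption-laden experiment, not a statement of how every pandas version behaves: the data is hypothetical, and it relies on `observed=False` to keep the unobserved category "C" as a group.

```python
import pandas as pd

# Hypothetical data: category "C" is declared but has no rows.
df = pd.DataFrame({
    "Label": pd.Categorical(["A", "A", "B"], categories=["A", "B", "C"]),
    "Value": [1, 2, 3],
})

# observed=False keeps unobserved categories as (empty) groups.
grouped = df.groupby("Label", observed=False)
group_sizes = grouped.size()

# A predicate that accepts every group it is given.
kept = grouped.filter(lambda g: True)

print(group_sizes)
print(len(kept), len(df))
```

Because filter() returns rows of the original DataFrame, comparing `len(kept)` with `len(df)` shows how many rows a zero-row group can contribute.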