
filter() for group-level filtering in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of group filtering with filter()
What is the output DataFrame after applying filter() to keep groups with mean value greater than 3?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({
    'group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'value': [2, 4, 5, 6, 1, 2]
})
result = df.groupby('group').filter(lambda x: x['value'].mean() > 3)
print(result)
A
  group  value
0     A      2
1     A      4
2     B      5
3     B      6
B
  group  value
2     B      5
3     B      6
C
  group  value
0     A      2
1     A      4
2     B      5
3     B      6
4     C      1
5     C      2
D
Empty DataFrame
Columns: [group, value]
Index: []
💡 Hint
Calculate the mean of 'value' for each group and keep groups where mean > 3.
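The hint's two steps can be sketched by hand and compared with filter(). The DataFrame below is made up for illustration and is not the challenge data, so this is a sketch of the technique rather than the answer.

```python
import pandas as pd

# Hypothetical data, chosen only to illustrate the technique
df = pd.DataFrame({
    "group": ["P", "P", "Q", "Q"],
    "value": [1, 2, 8, 10],
})

# Step 1: compute the mean of 'value' for each group
group_means = df.groupby("group")["value"].mean()

# Step 2: collect the names of the groups whose mean exceeds the threshold
keep = group_means[group_means > 3].index

# Step 3: keep every row that belongs to one of those groups
manual = df[df["group"].isin(keep)]

# filter() does the same in one call: the lambda receives each group's
# sub-DataFrame and returns a single True/False for the whole group
via_filter = df.groupby("group").filter(lambda g: g["value"].mean() > 3)

print(via_filter)
```

Both approaches keep the original row order and index labels of the surviving rows.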
Data Output
intermediate
Number of groups after filtering
After keeping only the teams whose total 'score' is at least 10, how many teams remain?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({
    'team': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'score': [4, 7, 3, 2, 6, 5]
})
filtered = df.groupby('team').filter(lambda g: g['score'].sum() >= 10)
num_groups = filtered['team'].nunique()
print(num_groups)
A. 0
B. 1
C. 3
D. 2
💡 Hint
Sum scores per team and count teams with sum >= 10.
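One way to check a count like this is to aggregate first and count the groups that pass, then confirm that filter() agrees. The teams and scores below are hypothetical, not the ones in the challenge.

```python
import pandas as pd

# Hypothetical data, not the challenge's DataFrame
df = pd.DataFrame({
    "team": ["M", "M", "N", "N"],
    "score": [9, 3, 1, 2],
})

# Total score per team
sums = df.groupby("team")["score"].sum()

# Number of teams whose total meets the threshold
n_passing = int((sums >= 10).sum())

# The same teams survive a group-level filter
filtered = df.groupby("team").filter(lambda g: g["score"].sum() >= 10)

print(n_passing, filtered["team"].nunique())
```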
🔧 Debug
advanced
Identify the error in group filtering code
What error does this code raise when filtering groups with mean > 2?
df.groupby('cat').filter(lambda x: x['val'].mean > 2)
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'cat': ['a', 'a', 'b'], 'val': [1, 3, 4]})
result = df.groupby('cat').filter(lambda x: x['val'].mean > 2)
A. No error, outputs filtered DataFrame
B. AttributeError: 'DataFrameGroupBy' object has no attribute 'filter'
C. TypeError: '>' not supported between instances of 'method' and 'int'
D. SyntaxError: invalid syntax
💡 Hint
Check if mean is called as a method or accessed as an attribute.
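The difference between referencing a method and calling it can be seen on a plain Series, outside of any groupby. The Series below is a made-up example.

```python
import pandas as pd

# Hypothetical Series used only to show attribute access vs. a method call
s = pd.Series([1, 3, 4])

# Without parentheses, s.mean is the bound method itself, not a number,
# so comparing it with an int fails
try:
    s.mean > 2
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With parentheses, s.mean() returns a number that can be compared
is_above = s.mean() > 2

print(error_message)
print(is_above)
```

The same applies inside the lambda passed to filter(): the function must return a boolean for each group, which requires calling mean().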
🚀 Application
advanced
Filter groups with more than 2 rows and average value below 5
Which option correctly filters groups with more than 2 rows and average 'score' less than 5?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({
    'group': ['G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G3', 'G3'],
    'score': [4, 3, 2, 6, 7, 1, 2, 3]
})
A. df.groupby('group').filter(lambda g: len(g) > 2 and g['score'].mean() < 5)
B. df.groupby('group').filter(lambda g: len(g) >= 2 and g['score'].mean() <= 5)
C. df.groupby('group').filter(lambda g: len(g) > 3 and g['score'].mean() < 5)
D. df.groupby('group').filter(lambda g: len(g) > 2 or g['score'].mean() < 5)
💡 Hint
Use both conditions with 'and' and strict inequalities as stated.
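To see why the choice between 'and' and 'or' matters, the sketch below uses hypothetical groups in which each condition fails for a different group. The group names and scores are invented for illustration.

```python
import pandas as pd

# Hypothetical data:
#   H1 has 3 rows and mean 2  (passes both conditions)
#   H2 has 2 rows and mean 1  (fails the size condition only)
#   H3 has 3 rows and mean 9  (fails the mean condition only)
df = pd.DataFrame({
    "group": ["H1", "H1", "H1", "H2", "H2", "H3", "H3", "H3"],
    "score": [1, 2, 3, 1, 1, 9, 9, 9],
})

# 'and' keeps a group only when both conditions hold
both = df.groupby("group").filter(
    lambda g: len(g) > 2 and g["score"].mean() < 5
)

# 'or' keeps a group when at least one condition holds
either = df.groupby("group").filter(
    lambda g: len(g) > 2 or g["score"].mean() < 5
)

print(sorted(both["group"].unique()))
print(sorted(either["group"].unique()))
```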
🧠 Conceptual
expert
Understanding filter() behavior on groupby objects
Which statement about filter() on a pandas GroupBy object is TRUE?
A. filter() returns a DataFrame containing only the rows of groups for which the filter function returns True.
B. filter() modifies the original DataFrame in place by removing groups that do not meet the condition.
C. filter() applies the function to each row individually and returns rows where the function returns True.
D. filter() returns a Series with group names as index and boolean values indicating whether each group passed the filter.
💡 Hint
Think about how filter works on groups, not individual rows.
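A quick way to explore these statements is to inspect what filter() returns and what happens to the original DataFrame. The column names and values below are hypothetical.

```python
import pandas as pd

# Hypothetical data for inspecting the return value of filter()
df = pd.DataFrame({
    "k": ["a", "a", "b"],
    "v": [1, 2, 10],
})
rows_before = len(df)

# The function is called once per group and returns one boolean per group
result = df.groupby("k").filter(lambda g: g["v"].mean() > 3)

# A per-group boolean Series is what an aggregation plus a comparison gives,
# which is a different object from the result of filter()
per_group_flags = df.groupby("k")["v"].mean() > 3

print(type(result).__name__)   # type of the object returned by filter()
print(len(df) == rows_before)  # whether the original kept all its rows
print(per_group_flags)
```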