How to Use Multiple Aggregations with GroupBy in pandas
Use groupby() on your DataFrame and then apply agg() with a list or dictionary of aggregation functions to perform multiple aggregations on grouped data. You can specify different functions for different columns by passing a dictionary to agg().
Syntax
The basic syntax to perform multiple aggregations after grouping data in pandas is:
- df.groupby('column').agg(['func1', 'func2']): Applies every listed function to every non-grouping column.
- df.groupby('column').agg({'col1': ['func1', 'func2'], 'col2': 'func3'}): Applies different functions to specific columns.
Here, groupby() splits data by the given column, and agg() applies aggregation functions like sum, mean, or count.
python
df.groupby('group_column').agg(['sum', 'mean'])
df.groupby('group_column').agg({'col1': ['sum', 'mean'], 'col2': 'count'})
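To make the difference between the two forms concrete, here is a minimal runnable sketch. The sample values are hypothetical and chosen only for illustration; the column names simply mirror the placeholders used in the syntax above.

```python
import pandas as pd

# Hypothetical sample data, used only to illustrate the two call forms.
df = pd.DataFrame({
    "group_column": ["a", "a", "b"],
    "col1": [1, 2, 3],
    "col2": [10, 20, 30],
})

# List form: every function is applied to every non-grouping column.
all_cols = df.groupby("group_column").agg(["sum", "mean"])
print(all_cols.columns.tolist())

# Dictionary form: each column gets only the functions listed for it.
per_col = df.groupby("group_column").agg({"col1": ["sum", "mean"], "col2": "count"})
print(per_col.columns.tolist())
```

Printing the column labels is a quick way to see which (column, function) pairs each form produced.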
Example
This example shows how to group data by a column and apply multiple aggregation functions to different columns.
python
import pandas as pd

data = {
    'Team': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Points': [10, 15, 10, 20, 15, 10],
    'Assists': [5, 7, 8, 6, 7, 5]
}
df = pd.DataFrame(data)

# Different aggregations per column
result = df.groupby('Team').agg({'Points': ['sum', 'mean'], 'Assists': ['mean', 'max']})
print(result)
Output
     Points       Assists
        sum  mean    mean max
Team
A        25  12.5     6.0   7
B        30  15.0     7.0   8
C        25  12.5     6.0   7
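Because the result has two header levels, individual aggregates are addressed with (column, function) pairs. The sketch below rebuilds the same DataFrame and shows a few ways to read values back out; the variable names points, points_sum, and b_assists_max are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["A", "A", "B", "B", "C", "C"],
    "Points": [10, 15, 10, 20, 15, 10],
    "Assists": [5, 7, 8, 6, 7, 5],
})
result = df.groupby("Team").agg({"Points": ["sum", "mean"], "Assists": ["mean", "max"]})

# Selecting the top level returns a DataFrame whose columns are the inner level.
points = result["Points"]

# A (column, function) tuple selects a single aggregated Series.
points_sum = result[("Points", "sum")]

# .loc takes the group label as the row key and the tuple as the column key.
b_assists_max = result.loc["B", ("Assists", "max")]

print(points_sum.to_dict())  # {'A': 25, 'B': 30, 'C': 25}
print(b_assists_max)         # 8
```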
Common Pitfalls
Common mistakes when using multiple aggregations with groupby include:
- Passing a list of functions directly to agg() without specifying columns, which applies every function to every non-grouping column and may produce unwanted columns or fail on non-numeric data.
- Misspelling aggregation function names passed as strings (e.g., 'maen' instead of 'mean').
- Not resetting the index after aggregation if you want a flat DataFrame.
Always check the output structure because multiple aggregations create multi-level column headers.
python
import pandas as pd

data = {'Category': ['X', 'X', 'Y', 'Y'], 'Value': [1, 2, 3, 4]}
df = pd.DataFrame(data)

# Risky: a bare list applies every function to every non-grouping column.
# Here 'Value' is the only such column, so the result matches the version below.
wrong = df.groupby('Category').agg(['sum', 'mean'])
print(wrong)

# Right: specify columns with functions
right = df.groupby('Category').agg({'Value': ['sum', 'mean']})
print(right)
Output
         Value
           sum mean
Category
X            3  1.5
Y            7  3.5
         Value
           sum mean
Category
X            3  1.5
Y            7  3.5
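The two outputs above are identical only because the DataFrame has a single value column. The following sketch adds a hypothetical OrderId column (an assumption, not part of the original example) to show how the bare list form picks up columns you may not want, and two ways to restrict the aggregation.

```python
import pandas as pd

# Hypothetical data: 'OrderId' is an identifier that should not be aggregated.
df = pd.DataFrame({
    "Category": ["X", "X", "Y", "Y"],
    "OrderId": [101, 102, 103, 104],
    "Value": [1, 2, 3, 4],
})

# The bare list form aggregates OrderId as well, which is rarely meaningful.
everything = df.groupby("Category").agg(["sum", "mean"])
print(everything.columns.tolist())

# Option 1: select the columns of interest before aggregating.
selected = df.groupby("Category")[["Value"]].agg(["sum", "mean"])
print(selected.columns.tolist())

# Option 2: name the columns explicitly in a dictionary.
by_dict = df.groupby("Category").agg({"Value": ["sum", "mean"]})
print(by_dict.columns.tolist())
```

Either option keeps the result limited to the columns you asked for, regardless of what other columns the DataFrame gains later.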
Quick Reference
Summary tips for multiple aggregations with groupby:
- Use agg() with a list to apply multiple functions to all columns.
- Use a dictionary in agg() to apply different functions to specific columns.
- Aggregation functions can be strings like 'sum', 'mean', or custom functions.
- Result columns have multi-level headers; to flatten, use df.columns = df.columns.map('_'.join) and then reset_index(), in that order so the grouping column keeps its plain name.
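As a rough illustration of the last two tips, the sketch below mixes a string aggregation with a custom function and then flattens the result. The helper value_range is a hypothetical name introduced for this example; it is not a pandas built-in.

```python
import pandas as pd


def value_range(series):
    """Hypothetical custom aggregation: spread between the max and the min."""
    return series.max() - series.min()


df = pd.DataFrame({
    "Team": ["A", "A", "B", "B", "C", "C"],
    "Points": [10, 15, 10, 20, 15, 10],
    "Assists": [5, 7, 8, 6, 7, 5],
})

# Strings and callables can be mixed; a named function contributes its
# own name as the inner column label.
result = df.groupby("Team").agg({"Points": ["sum", value_range], "Assists": ["mean"]})

# Join the two header levels first, while every column label is a 2-tuple.
result.columns = result.columns.map("_".join)

# Then move 'Team' from the index into a regular column.
flat = result.reset_index()
print(flat.columns.tolist())
```

Flattening before reset_index() matters: if the index is reset first, the grouping column becomes the tuple ('Team', '') and the join would rename it to 'Team_'.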
Key Takeaways
Use df.groupby().agg() with lists or dictionaries to apply multiple aggregation functions.
Specify functions per column with a dictionary for precise control.
Multiple aggregations create multi-level column headers; flatten if needed.
Common errors include applying functions to all columns unintentionally.
Check function names carefully to avoid typos in aggregation.