How to Use Named Aggregation in pandas for GroupBy

To use named aggregation in pandas, pass keyword arguments to agg() on a groupby object: each keyword is the name of an output column, and each value is a (column, function) tuple giving the column to aggregate and the aggregation to apply. This lets you compute several aggregations in a single groupby call and get flat, clearly named output columns.

Syntax

The syntax for named aggregation in pandas groupby is:

df.groupby('group_column').agg(
    new_col_name1 = ('original_col1', 'agg_func1'),
    new_col_name2 = ('original_col2', 'agg_func2'),
    # further new_col_name = ('original_col', 'agg_func') pairs as needed
)

Here, each new_col_name is the name you want in the output, original_col is the column to aggregate, and agg_func is the aggregation function like 'sum', 'mean', or a custom function.
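Because the output names are keyword arguments, they have to be valid Python identifiers. The sketch below shows two variations on the same syntax, using a small made-up DataFrame: pd.NamedAgg, which is pandas' named-tuple spelling of the (column, function) pair, and dict unpacking with **, which is one way to get output names that contain spaces. The column names and data are illustrative assumptions, not part of the syntax.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 20, 30],
    'Assists': [5, 7, 8, 6, 9],
})

# pd.NamedAgg is a namedtuple with fields `column` and `aggfunc`;
# it is interchangeable with the plain ('Points', 'sum') tuple.
explicit = df.groupby('Team').agg(
    Total_Points=pd.NamedAgg(column='Points', aggfunc='sum'),
    Average_Assists=pd.NamedAgg(column='Assists', aggfunc='mean'),
)

# Output names that are not valid identifiers (they contain spaces here)
# can be supplied by unpacking a dict into keyword arguments.
spaced = df.groupby('Team').agg(**{
    'Total Points': ('Points', 'sum'),
    'Average Assists': ('Assists', 'mean'),
})

print(explicit)
print(spaced)
```

The plain tuple form is usually enough; pd.NamedAgg mainly helps when you want the roles of the two elements spelled out.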

Example

This example groups data by the 'Team' column and calculates the total 'Points' and average 'Assists' for each team with custom output column names.

python

import pandas as pd

data = {
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 20, 30],
    'Assists': [5, 7, 8, 6, 9]
}
df = pd.DataFrame(data)

result = df.groupby('Team').agg(
    Total_Points = ('Points', 'sum'),
    Average_Assists = ('Assists', 'mean')
)

print(result)

Output

      Total_Points  Average_Assists
Team
A               25              6.0
B               30              7.0
C               30              9.0
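In the result above, Team is the index of the output rather than a regular column. If you would rather have it as a column (for example before writing to CSV), one option is to pass as_index=False to groupby, as in this minimal sketch that reuses the same sample data:

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 20, 30],
    'Assists': [5, 7, 8, 6, 9],
})

# as_index=False keeps the grouping key as an ordinary column
flat = df.groupby('Team', as_index=False).agg(
    Total_Points=('Points', 'sum'),
    Average_Assists=('Assists', 'mean'),
)

print(flat)
```

Calling .reset_index() on the original result should give an equivalent table.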

Common Pitfalls

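Common mistakes include:

Passing a dict of lists, such as {'Points': ['sum', 'mean']}, instead of named aggregations. The result has multi-level (MultiIndex) column names that are harder to select and export.
Passing a bare function name without the tuple, such as Total_Points='sum', on a DataFrame groupby. pandas cannot tell which column to aggregate and raises a TypeError; the bare form only works after selecting a single column.
Mixing a positional function with named aggregations in the same agg() call. Named aggregation is applied only when no positional function is given.

Always use the tuple format (column, function), with the new column name as the keyword.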

python

import pandas as pd

data = {'Team': ['A', 'A', 'B'], 'Points': [10, 15, 10]}
df = pd.DataFrame(data)

# Wrong: no named aggregation, unclear columns
wrong = df.groupby('Team').agg({'Points': ['sum', 'mean']})

# Right: named aggregation with clear column names
right = df.groupby('Team').agg(
    Total_Points = ('Points', 'sum'),
    Average_Points = ('Points', 'mean')
)

print('Wrong aggregation output:')
print(wrong)
print('\nRight aggregation output:')
print(right)

Output

Wrong aggregation output:
     Points
        sum  mean
Team
A        25  12.5
B        10  10.0

Right aggregation output:
      Total_Points  Average_Points
Team
A               25            12.5
B               10            10.0
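The tuple requirement can be seen directly. The following sketch, which uses the same small DataFrame, shows that a bare function name is rejected on a DataFrame groupby but accepted once a single column has been selected, because the column is then unambiguous. The exact error message may differ between pandas versions, so the code only relies on the exception type.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B'], 'Points': [10, 15, 10]})

# On a DataFrame groupby, a bare function name is not a valid
# named aggregation: pandas does not know which column to use.
try:
    df.groupby('Team').agg(Total_Points='sum')
    error_type = None
except TypeError as exc:
    error_type = type(exc).__name__
    print('Rejected:', exc)

# After selecting one column, the column is implied, so each
# keyword can map straight to a function.
per_column = df.groupby('Team')['Points'].agg(
    Total_Points='sum',
    Average_Points='mean',
)

print(per_column)
```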

Quick Reference

Tips for using named aggregation:

Use agg() with keyword arguments where keys are new column names.
Each value is a tuple: (column_to_aggregate, aggregation_function).
Aggregation functions can be strings like 'sum', 'mean', or custom functions.
Named aggregation requires pandas 0.25.0 or later.
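As a sketch of the custom-function case, the example below mixes string aggregations with a user-defined function in one call. The helper point_range and the sample data are hypothetical, chosen only for illustration; any callable that takes a Series and returns a single value should work the same way.

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'C'],
    'Points': [10, 15, 10, 20, 30],
    'Assists': [5, 7, 8, 6, 9],
})

def point_range(points):
    """Hypothetical helper: spread between a group's highest and lowest value."""
    return points.max() - points.min()

summary = df.groupby('Team').agg(
    Total_Points=('Points', 'sum'),          # built-in aggregation by name
    Points_Range=('Points', point_range),    # custom function on the same column
    Max_Assists=('Assists', 'max'),
)

print(summary)
```

The same column can appear in several tuples, so one groupby call can produce as many summaries of it as you need.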

Key Takeaways

Named aggregation lets you assign custom output column names in groupby aggregations.

Use the syntax: new_name = ('column', 'agg_func') inside agg() for clarity.

Avoid unnamed or multi-level columns by always naming your aggregations.

Works with pandas 0.25.0+ for clean, readable grouped summaries.