How to Use Transform in pandas GroupBy for Data Transformation
Use transform with groupby in pandas to apply a function to each group and return a result aligned with the original data's index. The result has the same length as the input, unlike agg, which returns one row per group. This makes transform useful for adding group-level calculations back to the original DataFrame.
Syntax
The basic syntax for using transform with groupby is:
df.groupby('column')['value_column'].transform(function)
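Here, df is your DataFrame, 'column' is the column to group by, 'value_column' is the column whose values are transformed, and function is the operation applied to each group, given either as a callable or as a string name such as 'mean'.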
The transform function returns a Series or DataFrame with the same index as the original, so it can be added as a new column.
Example
This example shows how to calculate the mean of a column within groups and add it as a new column to the original DataFrame using transform.
    import pandas as pd

    data = {'Team': ['A', 'A', 'B', 'B', 'C', 'C'],
            'Points': [10, 15, 10, 20, 10, 30]}
    df = pd.DataFrame(data)

    # Calculate mean points per team and add as new column
    df['MeanPoints'] = df.groupby('Team')['Points'].transform('mean')
    print(df)
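Because the transformed result lines up row by row with the original data, it can also be combined directly with other columns. As a minimal sketch using the same toy data, the following computes each row's share of its team's total; the column name ShareOfTeam is only an illustrative choice.

```python
import pandas as pd

data = {'Team': ['A', 'A', 'B', 'B', 'C', 'C'],
        'Points': [10, 15, 10, 20, 10, 30]}
df = pd.DataFrame(data)

# Team total repeated on every row of that team
team_total = df.groupby('Team')['Points'].transform('sum')

# Row-wise division works because both Series share the same index
df['ShareOfTeam'] = df['Points'] / team_total
print(df)
```

Within each team the shares add up to 1, which is a quick way to check that the group totals were aligned with the right rows.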
Common Pitfalls
One common mistake is using agg instead of transform when you want to keep the original DataFrame shape. agg reduces the result to one row per group, indexed by the group key, so assigning it directly as a new column does not line up with the original rows.
Another pitfall is passing a function whose result has a different length than its group. The function given to transform should return either a single value, which is broadcast to every row of the group, or a result with the same length as the group.
    import pandas as pd

    data = {'Team': ['A', 'A', 'B', 'B'], 'Points': [10, 15, 10, 20]}
    df = pd.DataFrame(data)

    # Wrong: agg returns one row per team, indexed by team name
    mean_points_agg = df.groupby('Team')['Points'].agg('mean')

    # Direct assignment aligns on the index, not on the Team column.
    # The labels 'A' and 'B' do not match the row labels 0..3, so this
    # does not raise an error but fills the column with NaN:
    # df['MeanPoints'] = mean_points_agg

    # Right: transform returns one value per original row
    df['MeanPoints'] = df.groupby('Team')['Points'].transform('mean')
    print(df)
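If a per-group result from agg already exists, it can still be attached to the original rows by mapping it through the group key. The sketch below assumes the same toy Team and Points data; the column names Direct, ViaMap and ViaTransform are hypothetical. It shows that direct assignment leaves only NaN values, while Series.map on the Team column gives the same values as transform.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'],
                   'Points': [10, 15, 10, 20]})

# One row per team, indexed by team name
mean_points_agg = df.groupby('Team')['Points'].agg('mean')

# Direct assignment aligns on index labels ('A', 'B' vs 0..3): all NaN
df['Direct'] = mean_points_agg

# Looking up each row's team in the aggregated Series restores alignment
df['ViaMap'] = df['Team'].map(mean_points_agg)

# transform reaches the same result in one step
df['ViaTransform'] = df.groupby('Team')['Points'].transform('mean')
print(df)
```

Using transform is usually the shorter route; the map approach is mainly useful when the aggregated Series is needed for other purposes as well.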
Quick Reference
Use this quick guide when working with transform in pandas groupby:
| Action | Description | Returns |
|---|---|---|
| groupby('col').transform(func) | Apply func to each group, return aligned with original index | Same shape as original |
| groupby('col').agg(func) | Aggregate func per group, reduce size | One row per group |
| transform with 'mean', 'sum', 'max', etc. | Calculate group-level stats for each row | Same shape, values repeated per group |
| transform with custom function | Apply any function returning same shape per group | Same shape as original |
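The custom-function row in the table above can be sketched as follows. This is an illustrative example on toy data, not the only way to write it: one lambda returns a result of the same length as its group, and the other returns a single value that pandas broadcasts to every row of the group. The column names Deviation and Range are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Team': ['A', 'A', 'B', 'B'],
                   'Points': [10, 15, 10, 20]})
grouped = df.groupby('Team')['Points']

# Same-length result: each row's deviation from its team mean
df['Deviation'] = grouped.transform(lambda s: s - s.mean())

# Scalar result: the team's point range, repeated on every row of the team
df['Range'] = grouped.transform(lambda s: s.max() - s.min())
print(df)
```

For common statistics, passing the string name (for example 'mean') is generally preferable to a lambda, since pandas can then use its built-in implementation.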