The transform() method lets you compute values within groups while keeping the original shape of the data. You can use it to attach group-level information to every row or to adjust each value based on its group.
transform() for group-level operations in Data Analysis Python
DataFrame.groupby('column')['target_column'].transform(function)
transform() applies a function to each group and returns a result with the same length and index as the original data.
This means you can add or replace columns without changing the number of rows.
df.groupby('Team')['Score'].transform('mean')
df.groupby('Category')['Value'].transform(lambda x: x - x.min())
df.groupby('Group')['Sales'].transform('max')
This code creates a table of players with their scores and teams. It then adds the average score for each team to every player in that team. Finally, it shows how much each player's score differs from their team's average.
import pandas as pd

data = {'Team': ['A', 'A', 'B', 'B', 'B'],
        'Player': ['John', 'Mike', 'Anna', 'Tom', 'Sara'],
        'Score': [10, 15, 10, 20, 30]}
df = pd.DataFrame(data)

# Calculate average score per team and add as new column
df['Team_Avg'] = df.groupby('Team')['Score'].transform('mean')

# Calculate score difference from team average
df['Diff_from_Avg'] = df['Score'] - df['Team_Avg']

print(df)
transform() keeps the original number of rows, unlike agg(), which returns one row per group.
You can pass the name of a built-in function as a string, such as 'mean' or 'max', or your own custom function, for example a lambda.
It is useful when you want to add group-level info back to the original data.
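To make the difference between transform() and agg() concrete, here is a minimal sketch that runs both on the same small table. The data and the variable names team_means and row_means are made up for illustration.

```python
import pandas as pd

# Illustrative data only (an assumption, not from a real dataset)
df = pd.DataFrame({
    'Team': ['A', 'A', 'B', 'B', 'B'],
    'Score': [10, 15, 10, 20, 30],
})

# agg() collapses each group to a single value, indexed by Team
team_means = df.groupby('Team')['Score'].agg('mean')

# transform() repeats each group's mean on every row of that group,
# so the result lines up with the original index
row_means = df.groupby('Team')['Score'].transform('mean')

print(len(team_means))  # 2, one row per team
print(len(row_means))   # 5, one row per original row
```

Because row_means shares the index of df, it can be assigned straight back as a new column, which is not possible with the shorter agg() result without a merge.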
transform() applies a function to groups and returns a result matching the original data size.
It helps add or adjust columns based on group calculations without changing row count.
Use it to compare individual values to group stats or fill missing data within groups.
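Filling missing data within groups could look like the sketch below, which replaces each missing score with the mean of the other scores in the same team. The data and the column name Score_Filled are hypothetical, and the group mean is only one possible fill value.

```python
import numpy as np
import pandas as pd

# Illustrative data with two missing scores (an assumption for this example)
df = pd.DataFrame({
    'Team': ['A', 'A', 'A', 'B', 'B'],
    'Score': [10.0, np.nan, 20.0, 30.0, np.nan],
})

# For each team, fill missing scores with that team's mean.
# mean() skips NaN by default, so only the known scores are averaged.
df['Score_Filled'] = df.groupby('Team')['Score'].transform(
    lambda s: s.fillna(s.mean())
)

print(df)
```

The lambda receives one team's scores at a time and returns a Series of the same length, which is what lets transform() align the result with the original rows.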