Pandas | data | ~30 mins

transform() for group-level operations in Pandas - Mini Project: Build & Apply

Using transform() for Group-Level Operations in pandas
📖 Scenario: You work in a retail company analyzing sales data. You want to understand how each product's sales compare to the average sales of its category.
🎯 Goal: Build a pandas DataFrame with sales data, then use transform() to calculate the average sales per category and add it as a new column.
📋 What You'll Learn
Create a pandas DataFrame named sales_data with columns 'Category' and 'Sales' using the exact data provided.
Create a variable named group_column and set it to the string 'Category'.
Use groupby() on sales_data with group_column, select the 'Sales' column, and apply transform('mean') to calculate average sales per category, storing the result in a new column 'Average_Sales'.
Print the sales_data DataFrame to display the final result.
💡 Why This Matters
🌍 Real World
Retail analysts often compare individual product sales to category averages to identify strong or weak performers.
💼 Career
Data scientists and analysts use group-level operations like transform() to create features and insights for reports and machine learning.
1
Create the sales data DataFrame
Create a pandas DataFrame called sales_data with two columns: 'Category' and 'Sales'. Use these exact entries: 'Category': ['Electronics', 'Electronics', 'Clothing', 'Clothing', 'Clothing', 'Furniture'], 'Sales': [200, 150, 100, 120, 90, 300].
Need a hint?

Use pd.DataFrame with a dictionary containing the two columns and their exact lists.
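
A minimal sketch of this step, assuming pandas is installed and imported under the usual alias pd. The shape check at the end is only a sanity check and is not part of the exercise.

```python
import pandas as pd

# Build the DataFrame from a dictionary of column name -> list of values,
# using the exact entries given in this step.
sales_data = pd.DataFrame({
    'Category': ['Electronics', 'Electronics', 'Clothing',
                 'Clothing', 'Clothing', 'Furniture'],
    'Sales': [200, 150, 100, 120, 90, 300],
})

# Sanity check: six rows, two columns.
print(sales_data.shape)
```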

2
Set the group column variable
Create a variable called group_column and set it to the string 'Category'.
Need a hint?

Just assign the string 'Category' to the variable group_column.
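
The sketch below shows the assignment and one reason it can be useful: groupby() accepts a column label, so keeping the label in a variable makes it easy to change the grouping key in one place. The DataFrame is rebuilt here only so the snippet runs on its own, and the ngroups check is an illustrative extra, not part of the exercise.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Category': ['Electronics', 'Electronics', 'Clothing',
                 'Clothing', 'Clothing', 'Furniture'],
    'Sales': [200, 150, 100, 120, 90, 300],
})

# Store the name of the column to group by.
group_column = 'Category'

# Illustrative check: grouping by this column yields one group per category.
grouped = sales_data.groupby(group_column)
print(grouped.ngroups)
```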

3
Calculate average sales per category using transform()
Use groupby() on sales_data with group_column and apply transform('mean') on the 'Sales' column. Assign the result to a new column in sales_data called 'Average_Sales'.
Need a hint?

Use sales_data.groupby(group_column)['Sales'].transform('mean') to get average sales per category and assign it to sales_data['Average_Sales'].
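
A sketch of this step follows. It also contrasts transform('mean') with a plain mean() aggregation: the aggregation returns one value per category, while transform() returns one value per original row, aligned to the original index, which is why the result can be assigned directly as a new column. The variable per_category is a hypothetical name used only for that comparison.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Category': ['Electronics', 'Electronics', 'Clothing',
                 'Clothing', 'Clothing', 'Furniture'],
    'Sales': [200, 150, 100, 120, 90, 300],
})
group_column = 'Category'

# Aggregation: one row per category (3 rows), so it cannot be assigned
# back to the 6-row DataFrame without a merge or map.
per_category = sales_data.groupby(group_column)['Sales'].mean()

# transform: the category mean is broadcast back to every row (6 rows).
sales_data['Average_Sales'] = (
    sales_data.groupby(group_column)['Sales'].transform('mean')
)

print(len(per_category), len(sales_data['Average_Sales']))
```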

4
Print the final DataFrame
Print the sales_data DataFrame to display the original sales and the new 'Average_Sales' column.
Need a hint?

Use print(sales_data) to show the DataFrame with the new column.
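
Putting the four steps together, a complete run might look like the sketch below. The extra column 'Diff_From_Average' is a hypothetical addition, not required by the exercise. It is included only to illustrate the comparison described in the scenario: how far each product's sales sit above or below its category average.

```python
import pandas as pd

# Step 1: create the sales data.
sales_data = pd.DataFrame({
    'Category': ['Electronics', 'Electronics', 'Clothing',
                 'Clothing', 'Clothing', 'Furniture'],
    'Sales': [200, 150, 100, 120, 90, 300],
})

# Step 2: name the grouping column.
group_column = 'Category'

# Step 3: add the per-category average to every row.
sales_data['Average_Sales'] = (
    sales_data.groupby(group_column)['Sales'].transform('mean')
)

# Step 4: print the final DataFrame.
print(sales_data)

# Optional (hypothetical column name): positive values are above the
# category average, negative values are below it.
sales_data['Diff_From_Average'] = sales_data['Sales'] - sales_data['Average_Sales']
print(sales_data)
```

A single-row category such as Furniture always has a difference of zero, since its only sale is its own average.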