Data Analysis Python · data · ~30 mins

transform() for group-level operations in Data Analysis Python - Mini Project: Build & Apply

Using transform() for Group-Level Operations in Data Science
📖 Scenario: Imagine you work for a retail company. You have sales data for different stores and want to analyze how each store's daily sales compare to the average sales of that store.
🎯 Goal: You will create a small sales dataset, then use transform() to compute each store's average sales and add it as a new column, so that each day's sales can be compared with that store's average.
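To see why transform() suits this goal, the sketch below contrasts it with a plain group mean, using the Store and Sales values from this project. The variable names grouped, per_store, and per_row are illustrative only and are not part of the exercise.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B', 'Store C', 'Store C'],
    'Sales': [200, 220, 150, 180, 300, 310],
})

grouped = sales_data.groupby('Store')['Sales']

# A plain aggregation returns one value per store, indexed by store name.
per_store = grouped.mean()

# transform() returns one value per original row, on the original index,
# so the result can be assigned straight back as a new column.
per_row = grouped.transform('mean')

print(len(per_store), len(per_row))  # 3 6
```

Because the transformed result keeps the same length and index as sales_data, no merge or join is needed to line the averages up with the daily rows.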
📋 What You'll Learn
Create a pandas DataFrame with sales data for multiple stores
Create a variable to hold the name of the column to group by
Use transform() to calculate the average sales per store and add it as a new column
Print the final DataFrame showing daily sales and average sales per store
💡 Why This Matters
🌍 Real World
Retail companies often analyze sales data by store to understand performance and identify trends.
💼 Career
Data analysts and data scientists use group-level operations like transform() to prepare data for reports and decision-making.
1
Create the sales data DataFrame
Create a pandas DataFrame called sales_data with these exact columns and values:
'Store': ['Store A', 'Store A', 'Store B', 'Store B', 'Store C', 'Store C']
'Day': ['Monday', 'Tuesday', 'Monday', 'Tuesday', 'Monday', 'Tuesday']
'Sales': [200, 220, 150, 180, 300, 310]
Hint

Pass pd.DataFrame a dictionary whose keys are the exact column names and whose values are the lists shown above.
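One possible way to write this step is sketched below. It assumes pandas is installed and imported under the usual alias pd; the shape check at the end is an optional extra, not something the exercise asks for.

```python
import pandas as pd

# Each dictionary key becomes a column; each list supplies that column's values.
sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B', 'Store C', 'Store C'],
    'Day': ['Monday', 'Tuesday', 'Monday', 'Tuesday', 'Monday', 'Tuesday'],
    'Sales': [200, 220, 150, 180, 300, 310],
})

# Optional check: six rows (two days for each of three stores), three columns.
print(sales_data.shape)  # (6, 3)
```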

2
Create the group variable
Create a variable called group_column and set it to the string 'Store' to specify the column to group by.
Hint

Just assign the string 'Store' to the variable group_column.
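A minimal sketch of this step follows. The DataFrame is rebuilt so the snippet runs on its own, and the nunique() check is an optional addition that confirms the column name refers to three distinct stores.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B', 'Store C', 'Store C'],
    'Day': ['Monday', 'Tuesday', 'Monday', 'Tuesday', 'Monday', 'Tuesday'],
    'Sales': [200, 220, 150, 180, 300, 310],
})

# The variable holds the column NAME, not the store values themselves.
group_column = 'Store'

# Optional check: the column exists and contains three distinct stores.
print(sales_data[group_column].nunique())  # 3
```

Keeping the column name in a variable means the grouping key can be changed in one place later.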

3
Calculate average sales per store using transform()
Use transform() on sales_data.groupby(group_column)['Sales'] to calculate the average sales per store. Add the result as a new column called 'Average_Sales' in sales_data.
Hint

Use groupby() on group_column, select 'Sales', then apply transform('mean').
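The step can be sketched as follows, again rebuilding the data so the snippet is self-contained. The expected averages in the comment follow from the values given in step 1: (200 + 220) / 2, (150 + 180) / 2, and (300 + 310) / 2.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B', 'Store C', 'Store C'],
    'Day': ['Monday', 'Tuesday', 'Monday', 'Tuesday', 'Monday', 'Tuesday'],
    'Sales': [200, 220, 150, 180, 300, 310],
})
group_column = 'Store'

# Compute each store's mean and broadcast it back to every row of that store.
sales_data['Average_Sales'] = (
    sales_data.groupby(group_column)['Sales'].transform('mean')
)

print(sales_data['Average_Sales'].tolist())
# [210.0, 210.0, 165.0, 165.0, 305.0, 305.0]
```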

4
Print the final DataFrame
Write a print() statement to display the sales_data DataFrame with the new 'Average_Sales' column.
Hint

Use print(sales_data) to show the DataFrame with the new column.
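Putting the four steps together, a complete run might look like the sketch below. The last part goes one step beyond the exercise: the column name 'Diff_From_Average' is a hypothetical addition that shows how each day's sales compare with the store's average, as described in the scenario.

```python
import pandas as pd

# Step 1: build the dataset.
sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store A', 'Store B', 'Store B', 'Store C', 'Store C'],
    'Day': ['Monday', 'Tuesday', 'Monday', 'Tuesday', 'Monday', 'Tuesday'],
    'Sales': [200, 220, 150, 180, 300, 310],
})

# Step 2: name of the column to group by.
group_column = 'Store'

# Step 3: per-store average, aligned to the original rows.
sales_data['Average_Sales'] = (
    sales_data.groupby(group_column)['Sales'].transform('mean')
)

# Step 4: display the result.
print(sales_data)

# Optional extra (hypothetical column name): how far each day is from the
# store's average. Negative means below average, positive means above.
sales_data['Diff_From_Average'] = sales_data['Sales'] - sales_data['Average_Sales']
print(sales_data[['Store', 'Day', 'Diff_From_Average']])
```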