Pandas | data | ~30 mins

GroupBy with transform for normalization in Pandas - Mini Project: Build & Apply

GroupBy with transform for normalization
📖 Scenario: You work at a company that collects sales data from different stores. Each store has sales numbers for several products. You want to compare sales within each store by normalizing the sales numbers, which here means subtracting each store's mean sales from its individual sales figures.
🎯 Goal: You will create a pandas DataFrame with sales data, then use groupby and transform to normalize sales within each store. Finally, you will print the normalized sales.
📋 What You'll Learn
Create a pandas DataFrame named sales_data with columns 'Store' and 'Sales' using exact values
Create a variable mean_sales that stores the mean sales per store using groupby and transform
Create a new column 'Normalized_Sales' in sales_data by subtracting mean_sales from 'Sales'
Print the sales_data DataFrame to show the normalized sales
💡 Why This Matters
🌍 Real World
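Normalizing sales data within groups removes differences in each store's overall sales level, so every figure shows how far it sits above or below what is typical for its own store. This makes performance easier to compare fairly across stores.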
💼 Career
Data analysts and data scientists often use groupby and transform in pandas to prepare and normalize data for analysis and reporting.
Step 1: Create the sales data DataFrame
Create a pandas DataFrame called sales_data with two columns: 'Store' and 'Sales'. Use these exact values: 'Store': ['A', 'A', 'B', 'B', 'B', 'C'], 'Sales': [100, 150, 200, 210, 190, 300].
Hint: Use pd.DataFrame with a dictionary containing the two columns and their values.
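A minimal sketch of this step, assuming pandas is installed and imported under the usual alias pd (the import is not shown in the original step):

```python
import pandas as pd

# Build the DataFrame from a dictionary: keys become column names,
# lists become column values (exact values from the step above).
sales_data = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'B', 'C'],
    'Sales': [100, 150, 200, 210, 190, 300],
})
```

Both lists must have the same length (six entries here), otherwise pd.DataFrame raises a ValueError.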

Step 2: Calculate mean sales per store using groupby and transform
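Create a variable called mean_sales that holds, for every row, the mean sales of that row's store. Use sales_data.groupby('Store')['Sales'].transform('mean') to calculate it. The result is a Series with the same index and length as sales_data, so it lines up row by row with the 'Sales' column.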
Hint: Use groupby('Store')['Sales'].transform('mean') to get the mean sales for each row's store.
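The sketch below shows what this step might look like, and contrasts transform with a plain groupby mean. The name per_store_mean is a hypothetical helper added only for the comparison; it is not part of the exercise.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'B', 'C'],
    'Sales': [100, 150, 200, 210, 190, 300],
})

# transform('mean') broadcasts each store's mean back onto every row,
# so the result has one value per original row (6 values here).
mean_sales = sales_data.groupby('Store')['Sales'].transform('mean')

# For contrast (hypothetical helper, not required by the exercise):
# a plain .mean() collapses to one value per store (3 values here).
per_store_mean = sales_data.groupby('Store')['Sales'].mean()

print(len(mean_sales), len(per_store_mean))
```

Because mean_sales keeps the original index, it can be combined directly with the 'Sales' column in the next step without any merge.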

Step 3: Create normalized sales column
Create a new column in sales_data called 'Normalized_Sales' by subtracting mean_sales from the 'Sales' column.
Hint: Subtract the mean_sales Series from the 'Sales' column and assign it to a new column.
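One possible way to write this step, repeating the earlier setup so the snippet runs on its own:

```python
import pandas as pd

sales_data = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'B', 'C'],
    'Sales': [100, 150, 200, 210, 190, 300],
})
mean_sales = sales_data.groupby('Store')['Sales'].transform('mean')

# Element-wise subtraction: each sale minus the mean of its own store.
sales_data['Normalized_Sales'] = sales_data['Sales'] - mean_sales
```

With the values above, store A (mean 125) gives -25 and 25, store B (mean 200) gives 0, 10 and -10, and store C, which has a single row, gives 0.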

Step 4: Print the normalized sales DataFrame
Print the sales_data DataFrame to display the original sales and the normalized sales values.
Hint: Use print(sales_data) to show the DataFrame with normalized sales.
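Putting the four steps together, the whole exercise could be sketched as the short script below (again assuming pandas is imported as pd):

```python
import pandas as pd

# Step 1: create the sales data.
sales_data = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'B', 'C'],
    'Sales': [100, 150, 200, 210, 190, 300],
})

# Step 2: per-row mean of the row's store.
mean_sales = sales_data.groupby('Store')['Sales'].transform('mean')

# Step 3: subtract the store mean from each sale.
sales_data['Normalized_Sales'] = sales_data['Sales'] - mean_sales

# Step 4: display the original and normalized values side by side.
print(sales_data)
```

The printed table should have three columns (Store, Sales, Normalized_Sales) and six rows, with the normalized values in each store averaging to zero.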