Pandas | data | ~30 mins

GroupBy performance considerations in Pandas - Mini Project: Build & Apply

GroupBy Performance Considerations
📖 Scenario: You work as a data analyst for a retail company. You have sales data for different products and stores. You want to find the total sales per store, and you want to learn how to do this efficiently with pandas groupby.
🎯 Goal: Build a small pandas DataFrame with sales data, define a filter threshold, use groupby to calculate total sales per store efficiently, and print the result.
📋 What You'll Learn
Create a pandas DataFrame with exact sales data
Create a filter threshold variable
Use groupby on the DataFrame with the exact column names
Print the grouped result showing total sales per store
💡 Why This Matters
🌍 Real World
Retail companies often analyze sales data by store to understand performance and make decisions.
💼 Career
Data analysts and data scientists use pandas groupby to summarize and analyze data efficiently.
1
Create the sales DataFrame
Create a pandas DataFrame called sales_data with these exact columns and rows:
Store: ['Store A', 'Store B', 'Store A', 'Store C', 'Store B']
Product: ['Apples', 'Bananas', 'Oranges', 'Apples', 'Oranges']
Sales: [100, 150, 200, 130, 170]
Need a hint?

Use pd.DataFrame with a dictionary where keys are column names and values are lists of data.
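As a minimal sketch of this step, the dictionary-of-lists form could look like the following. The column names and values are the ones listed above; the `pd` alias is just the usual import convention.

```python
import pandas as pd

# Each dictionary key becomes a column; each list supplies that column's rows.
sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store B', 'Store A', 'Store C', 'Store B'],
    'Product': ['Apples', 'Bananas', 'Oranges', 'Apples', 'Oranges'],
    'Sales': [100, 150, 200, 130, 170],
})

# Quick sanity check: five rows and three columns are expected.
print(sales_data.shape)
```

All three lists must have the same length, otherwise pd.DataFrame raises a ValueError.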

2
Set a sales filter threshold
Create a variable called min_sales and set it to 150. It will be used later to keep only stores whose total sales are at or above this value.
Need a hint?

Just assign the number 150 to the variable min_sales.

3
Group by Store and calculate total sales
Group sales_data by the 'Store' column with groupby, then sum the 'Sales' column for each store. Store the result in a variable called total_sales_per_store. Then filter total_sales_per_store to keep only stores whose total sales are greater than or equal to min_sales, assigning the filtered result back to total_sales_per_store.
Need a hint?

Use sales_data.groupby('Store')['Sales'].sum() to get total sales per store. Then filter with total_sales_per_store = total_sales_per_store[total_sales_per_store >= min_sales].
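The grouping and filtering described in this step can be sketched as follows. The DataFrame and threshold are repeated from steps 1 and 2 only so that the snippet runs on its own.

```python
import pandas as pd

# Data and threshold from steps 1 and 2, repeated so this snippet is self-contained.
sales_data = pd.DataFrame({
    'Store': ['Store A', 'Store B', 'Store A', 'Store C', 'Store B'],
    'Product': ['Apples', 'Bananas', 'Oranges', 'Apples', 'Oranges'],
    'Sales': [100, 150, 200, 130, 170],
})
min_sales = 150

# Select 'Sales' before aggregating so only that column is summed.
# Totals: Store A = 100 + 200, Store B = 150 + 170, Store C = 130.
total_sales_per_store = sales_data.groupby('Store')['Sales'].sum()

# The comparison builds a boolean mask; indexing with it keeps the stores
# at or above the threshold. Reassigning stores the filtered result.
total_sales_per_store = total_sales_per_store[total_sales_per_store >= min_sales]
```

Selecting the 'Sales' column before calling sum() means the 'Product' column is never aggregated, which is the main efficiency point in this small example. Store C totals 130, below the threshold, so it is dropped.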

4
Print the total sales per store
Print the total_sales_per_store variable to see the total sales for each store that meets the sales threshold.
Need a hint?

Use print(total_sales_per_store) to display the result.
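A small sketch of the printing step follows. To keep it runnable on its own, it rebuilds the filtered result as a stand-in Series (an assumption for illustration, using the totals that the data above produces) instead of repeating steps 1 to 3. The loop is an optional extra, not part of the exercise.

```python
import pandas as pd

# Stand-in for the filtered result of steps 1-3 (assumed values: the totals
# for Store A and Store B from the data above).
total_sales_per_store = pd.Series(
    [300, 320],
    index=pd.Index(['Store A', 'Store B'], name='Store'),
    name='Sales',
)

# Printing a Series shows the index name, one row per store, and the dtype.
print(total_sales_per_store)

# Optional: one line per store for a more report-like output.
for store, total in total_sales_per_store.items():
    print(f"{store}: {total}")
```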