Grouping Data by Multiple Columns with pandas
📖 Scenario: You work for a small store that sells different types of fruit in various cities. You have a list of sales records showing the city, fruit type, and quantity sold. You want to find the total quantity sold for each fruit type in each city.
🎯 Goal: Build a pandas DataFrame with sales data, then group the data by City and Fruit columns to find the total quantity sold for each group.
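The goal above can be sketched in a few lines. This is a minimal sketch, not the exercise's reference solution: the sales records below are invented for illustration (the exercise's actual data is not reproduced here), and the call to reset_index is one assumed way to get the result back as a regular DataFrame. The variable names sales_data, group_columns, and grouped_sales follow the ones the exercise asks for.

```python
import pandas as pd

# Hypothetical sample records; the exercise's actual data is not shown here.
sales_data = pd.DataFrame({
    "City": ["Paris", "Paris", "Lyon", "Lyon", "Paris"],
    "Fruit": ["Apple", "Banana", "Apple", "Apple", "Apple"],
    "Quantity": [10, 5, 7, 3, 4],
})

# Columns that together define each group.
group_columns = ["City", "Fruit"]

# Sum Quantity within each (City, Fruit) pair.
# reset_index() turns the group keys back into ordinary columns,
# so the result is a DataFrame rather than a Series with a MultiIndex.
grouped_sales = sales_data.groupby(group_columns)["Quantity"].sum().reset_index()

print(grouped_sales)
```

With this sample data, the two Paris/Apple rows (10 and 4) collapse into a single row with a Quantity of 14. Passing a list to groupby is what makes pandas treat each distinct combination of City and Fruit as its own group.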
📋 What You'll Learn
Create a pandas DataFrame named sales_data with columns City, Fruit, and Quantity using the exact data provided.
Create a variable named group_columns that holds a list of the column names 'City' and 'Fruit'.
Use the groupby method on sales_data with group_columns and sum the Quantity for each group, saving the result in a variable named grouped_sales.
Print the grouped_sales DataFrame to display the total quantity sold for each fruit in each city.
💡 Why This Matters
🌍 Real World
Grouping data by multiple columns is common in sales analysis, customer segmentation, and many business reports where you want to summarize data by categories.
💼 Career
Data analysts and data scientists often use grouping to aggregate data and find insights from large datasets in fields like marketing, finance, and operations.