Grouping by multiple columns in Pandas - Mini Project: Build & Apply

Grouping Data by Multiple Columns with pandas
📖 Scenario: You work for a small fruit retailer with stores in several cities. You have a list of sales records, each giving the city, the fruit type, and the quantity sold. You want to find the total quantity sold of each fruit type in each city.
🎯 Goal: Build a pandas DataFrame with sales data, then group the data by City and Fruit columns to find the total quantity sold for each group.
📋 What You'll Learn
Create a pandas DataFrame named sales_data with columns City, Fruit, and Quantity using the exact data provided.
Create a variable named group_columns that holds a list of the column names 'City' and 'Fruit'.
Use the groupby method on sales_data with group_columns and sum the Quantity for each group, saving the result in a variable named grouped_sales.
Print the grouped_sales result to display the total quantity sold for each fruit in each city.
💡 Why This Matters
🌍 Real World
Grouping data by multiple columns is common in sales analysis, customer segmentation, and many business reports where you want to summarize data by categories.
💼 Career
Data analysts and data scientists often use grouping to aggregate data and find insights from large datasets in fields like marketing, finance, and operations.
1
Create the sales data DataFrame
Create a pandas DataFrame called sales_data with the columns City, Fruit, and Quantity, containing exactly these rows in this order: ('New York', 'Apple', 10), ('New York', 'Banana', 5), ('Los Angeles', 'Apple', 7), ('Los Angeles', 'Banana', 3), ('New York', 'Apple', 4). Note that ('New York', 'Apple') appears twice, so its quantities will be added together when you group.
Need a hint?

Import pandas as pd, then use pd.DataFrame with a dictionary whose keys are the column names and whose values are lists of column values.
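One possible way to follow this hint is sketched below. It uses only the column names and rows given in the step, and it assumes pandas is installed and imported under the usual alias pd.

```python
import pandas as pd

# Each dictionary key becomes a column name; each list holds that
# column's values in row order, so position i across the three lists
# describes one sales record.
sales_data = pd.DataFrame({
    'City': ['New York', 'New York', 'Los Angeles', 'Los Angeles', 'New York'],
    'Fruit': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple'],
    'Quantity': [10, 5, 7, 3, 4],
})

print(sales_data)
```

The three lists must have the same length, otherwise pd.DataFrame raises a ValueError.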

2
Create the grouping columns list
Create a variable called group_columns and set it to a list containing the strings 'City' and 'Fruit'.
Need a hint?

Use square brackets to create a list with the two column names as strings.
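A minimal sketch of this step follows. The sales_data frame is rebuilt so the snippet runs on its own, and the membership check with the name missing is an illustrative extra, not something the step asks for.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'City': ['New York', 'New York', 'Los Angeles', 'Los Angeles', 'New York'],
    'Fruit': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple'],
    'Quantity': [10, 5, 7, 3, 4],
})

# The list of column names to group by; the strings must match the
# DataFrame's column names exactly, including capitalization.
group_columns = ['City', 'Fruit']

# Illustrative sanity check (not required by the step): collect any
# name in the list that is not a column of sales_data.
missing = [col for col in group_columns if col not in sales_data.columns]
print(missing)  # an empty list means both names match
```

Keeping the names in a variable means the same list can be reused later without retyping the column names.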

3
Group the data and sum quantities
Use the groupby method on sales_data with group_columns and sum the Quantity column for each group. Save the result in a variable called grouped_sales.
Need a hint?

Use sales_data.groupby(group_columns)['Quantity'].sum() to group and sum. The result is a Series whose index has two levels, City and Fruit.
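The grouping step might look like the sketch below, which repeats the earlier setup so it can be run by itself. Selecting ['Quantity'] before calling sum() restricts the aggregation to that one column.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'City': ['New York', 'New York', 'Los Angeles', 'Los Angeles', 'New York'],
    'Fruit': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple'],
    'Quantity': [10, 5, 7, 3, 4],
})
group_columns = ['City', 'Fruit']

# Rows sharing the same (City, Fruit) pair form one group; sum() adds
# up the Quantity values inside each group.
grouped_sales = sales_data.groupby(group_columns)['Quantity'].sum()

# Individual totals can be looked up with a (City, Fruit) tuple.
# The two New York Apple rows (10 and 4) are combined into one total.
new_york_apples = grouped_sales.loc[('New York', 'Apple')]
print(new_york_apples)
```

Because the grouping uses two columns, the resulting index is a MultiIndex with the levels City and Fruit.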

4
Print the grouped sales result
Print the grouped_sales variable to display the total quantity sold for each fruit in each city.
Need a hint?

Use print(grouped_sales) to display the grouped sums.
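Putting the four steps together gives something like the sketch below. The last two lines are an optional extra that the project does not require: reset_index() converts the grouped result back into a flat table, and the name summary_table is chosen here purely for illustration.

```python
import pandas as pd

sales_data = pd.DataFrame({
    'City': ['New York', 'New York', 'Los Angeles', 'Los Angeles', 'New York'],
    'Fruit': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple'],
    'Quantity': [10, 5, 7, 3, 4],
})
group_columns = ['City', 'Fruit']
grouped_sales = sales_data.groupby(group_columns)['Quantity'].sum()

# Step 4: display one total per (City, Fruit) pair. By default groupby
# sorts the group keys, so Los Angeles is listed before New York.
print(grouped_sales)

# Optional (not part of the project): turn the City and Fruit index
# levels back into ordinary columns for a flat, table-like view.
summary_table = grouped_sales.reset_index()
print(summary_table)
```

The totals should be 7 and 3 for Los Angeles (Apple, Banana) and 14 and 5 for New York (Apple, Banana).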