Grouping by multiple columns splits your data into groups, one for each combination of values in those columns. This makes it easier to compare patterns and compute summaries for each group.
Grouping by multiple columns in Pandas
df.groupby(['column1', 'column2', ...])['target_column'].aggregation_function()
You list the columns to group by inside a list.
After grouping, you choose which column to aggregate and what function to use, like sum(), mean(), or count().
df.groupby(['City', 'Month'])['Sales'].sum()
df.groupby(['Class', 'Subject'])['Score'].mean()
df.groupby(['Store', 'Product'])['Quantity'].count()
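The one-line patterns above assume a DataFrame that already has those columns. As a minimal sketch, here is the Class/Subject pattern on a small made-up table. The data and the names `scores`, `avg_scores`, and `n_scores` are invented for illustration.

```python
import pandas as pd

# Hypothetical data, invented for this illustration
scores = pd.DataFrame({
    'Class': ['X', 'X', 'X', 'Y', 'Y', 'Y'],
    'Subject': ['Math', 'Math', 'Art', 'Math', 'Art', 'Art'],
    'Score': [80, 90, 70, 60, 75, 85],
})

# Average score for each (Class, Subject) pair
avg_scores = scores.groupby(['Class', 'Subject'])['Score'].mean()

# Number of non-missing scores in each (Class, Subject) pair
n_scores = scores.groupby(['Class', 'Subject'])['Score'].count()

print(avg_scores)
print(n_scores)
```

Note that count() reports how many non-missing values each group has in the selected column. It does not add the values up.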
The example below groups sales data by both store and product, then adds up the sales for each group. The result shows total sales for each product in each store.
import pandas as pd

data = {
    'Store': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Product': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple', 'Banana'],
    'Sales': [10, 15, 5, 20, 7, 8]
}
df = pd.DataFrame(data)

# Group by Store and Product, then sum the Sales
result = df.groupby(['Store', 'Product'])['Sales'].sum()
print(result)
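To make the result concrete, the sketch below rebuilds the same table and looks up individual totals. Selecting a group with a (store, product) tuple through `.loc` is standard pandas behavior. The name `store_a` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Product': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple', 'Banana'],
    'Sales': [10, 15, 5, 20, 7, 8],
})

result = df.groupby(['Store', 'Product'])['Sales'].sum()

# Store A has two Apple rows (10 and 7), so that group's total is 17
print(result.loc[('A', 'Apple')])

# Store B has two Banana rows (20 and 8), so that group's total is 28
print(result.loc[('B', 'Banana')])

# Selecting only the first level returns every product total for store A
store_a = result.loc['A']
print(store_a)
```

Each combination of Store and Product appears once in the result, so six input rows collapse into four group totals.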
Grouping by multiple columns creates a multi-level index (a MultiIndex) in the result, with one level per grouping column.
You can reset the index with reset_index() if you want a flat table.
Aggregation functions can be sum(), mean(), count(), max(), min(), and more.
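The notes above can be sketched as follows, using the same Store/Product data as the main example. `reset_index()` turns the grouped result into a flat table, and `agg()` with a list of function names applies several aggregations in one call. The names `grouped`, `flat`, and `summary` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Product': ['Apple', 'Banana', 'Apple', 'Banana', 'Apple', 'Banana'],
    'Sales': [10, 15, 5, 20, 7, 8],
})

grouped = df.groupby(['Store', 'Product'])['Sales']

# Flat table: Store and Product become ordinary columns again
flat = grouped.sum().reset_index()
print(flat)

# Several aggregations at once: one output column per function
summary = grouped.agg(['sum', 'mean', 'max', 'min', 'count'])
print(summary)
```

Passing as_index=False to groupby() is another way to get a flat table directly, without calling reset_index() afterwards.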
Grouping by multiple columns helps analyze data by several categories at once.
Use a list of column names inside groupby() to group by more than one column.
Apply aggregation functions to get summaries like sums or averages for each group.