Grouping helps us organize data by categories to see patterns or summaries easily.
Single and multiple column grouping in Data Analysis Python
Introduction
You want to find the total sales per product category.
You need to count how many students are in each class and gender.
You want to calculate the average temperature for each city and month.
You want to summarize expenses by department and month.
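As a rough sketch of the second scenario above, the snippet below counts students per class and gender. The `students` table and its column names are invented for illustration; `size()` simply counts the rows in each group.

```python
import pandas as pd

# Hypothetical roster; the names and columns are illustrative only
students = pd.DataFrame({
    'Class': ['A', 'A', 'A', 'B', 'B'],
    'Gender': ['F', 'M', 'F', 'M', 'M'],
    'Name': ['Ana', 'Ben', 'Cara', 'Dev', 'Eli'],
})

# One entry per (Class, Gender) pair that occurs, holding the number of students
counts = students.groupby(['Class', 'Gender']).size()

print(counts.loc[('A', 'F')])  # 2
print(counts.loc[('B', 'M')])  # 2
```

Only combinations that actually occur in the data appear in the result, so there is no entry for class B with gender F here.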
Syntax
df.groupby('column_name').agg({'column_to_aggregate': 'aggregation_function'})
df.groupby(['col1', 'col2']).agg({'column_to_aggregate': 'aggregation_function'})
Use a single column name as a string for grouping by one column.
Use a list of column names to group by multiple columns.
Examples
Groups data by 'Category' and sums all numeric columns.
df.groupby('Category').sum(numeric_only=True)
Groups data by 'Category' and 'Region', then calculates the average of numeric columns.
df.groupby(['Category', 'Region']).mean(numeric_only=True)
Groups by 'Category' and applies different aggregation functions to columns.
df.groupby('Category').agg({'Sales': 'sum', 'Quantity': 'mean'})
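To make the per-column aggregation concrete, here is a minimal sketch on a tiny made-up table (the data values are assumptions for illustration). It also shows pandas' named-aggregation form, which performs the same calculation but lets you choose the output column names; `total_sales` and `avg_quantity` are arbitrary names picked for this example.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    'Category': ['Fruit', 'Fruit', 'Vegetable'],
    'Sales': [100, 150, 200],
    'Quantity': [10, 20, 30],
})

# Dictionary form: one aggregation per column, original column names are kept
by_dict = df.groupby('Category').agg({'Sales': 'sum', 'Quantity': 'mean'})

# Named aggregation: same calculation, with explicit output column names
by_name = df.groupby('Category').agg(
    total_sales=('Sales', 'sum'),
    avg_quantity=('Quantity', 'mean'),
)

print(by_dict.loc['Fruit', 'Sales'])         # 250
print(by_name.loc['Fruit', 'avg_quantity'])  # 15.0
```

The named form can be easier to read later on, because the result column says which function produced it.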
Sample Program
This program creates a small sales dataset. It first groups data by 'Category' and sums numeric columns. Then it groups by both 'Category' and 'Region' and calculates the average quantity.
import pandas as pd

# Create sample data
sales_data = pd.DataFrame({
    'Category': ['Fruit', 'Fruit', 'Vegetable', 'Vegetable', 'Fruit'],
    'Region': ['North', 'South', 'North', 'South', 'North'],
    'Sales': [100, 150, 200, 130, 120],
    'Quantity': [10, 15, 20, 13, 12]
})

# Group by a single column and sum the numeric columns.
# numeric_only=True keeps the text column 'Region' out of the sum.
single_group = sales_data.groupby('Category').sum(numeric_only=True)

# Group by multiple columns and calculate mean quantity
multi_group = sales_data.groupby(['Category', 'Region']).agg({'Quantity': 'mean'})

print('Single column grouping (sum):\n', single_group)
print('\nMultiple column grouping (mean):\n', multi_group)
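The results of the sample program can be checked by hand from the five rows of data. The sketch below rebuilds the same table and looks up individual cells instead of printing whole tables; it passes `numeric_only=True` to `sum()` so that the text column 'Region' is left out of the totals.

```python
import pandas as pd

# Same sample data as in the program above
sales_data = pd.DataFrame({
    'Category': ['Fruit', 'Fruit', 'Vegetable', 'Vegetable', 'Fruit'],
    'Region': ['North', 'South', 'North', 'South', 'North'],
    'Sales': [100, 150, 200, 130, 120],
    'Quantity': [10, 15, 20, 13, 12]
})

single_group = sales_data.groupby('Category').sum(numeric_only=True)
multi_group = sales_data.groupby(['Category', 'Region']).agg({'Quantity': 'mean'})

# Fruit sales: 100 + 150 + 120
print(single_group.loc['Fruit', 'Sales'])         # 370
# Vegetable quantity: 20 + 13
print(single_group.loc['Vegetable', 'Quantity'])  # 33

# Fruit in the North: (10 + 12) / 2
print(multi_group.loc[('Fruit', 'North'), 'Quantity'])      # 11.0
# Vegetable in the South has a single row
print(multi_group.loc[('Vegetable', 'South'), 'Quantity'])  # 13.0
```

Rows of a multi-column grouping are addressed with a tuple of keys, one per grouping column.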
Important Notes
Aggregating a grouped DataFrame returns a new table whose index holds the group keys; grouping by several columns produces a multi-level index (MultiIndex).
You can use many aggregation functions like sum, mean, count, min, max.
To turn the group keys back into regular columns, call reset_index() on the aggregated result.
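The notes above can be sketched on a small made-up table (the data is an assumption for illustration). The example applies several aggregation functions at once, inspects the resulting index, and then flattens it. It also shows `as_index=False`, a `groupby()` option that returns the keys as regular columns directly.

```python
import pandas as pd

# Hypothetical data, for illustration only
df = pd.DataFrame({
    'Category': ['Fruit', 'Fruit', 'Vegetable'],
    'Region': ['North', 'South', 'North'],
    'Sales': [100, 150, 200],
})

# Several functions on one column; the result is indexed by (Category, Region)
stats = df.groupby(['Category', 'Region'])['Sales'].agg(['sum', 'min', 'max'])
print(list(stats.index.names))  # ['Category', 'Region']

# reset_index() turns the group keys back into ordinary columns
flat = stats.reset_index()
print(list(flat.columns))  # ['Category', 'Region', 'sum', 'min', 'max']

# as_index=False skips the index step and returns flat columns directly
direct = df.groupby('Category', as_index=False)['Sales'].sum()
print(list(direct.columns))  # ['Category', 'Sales']
```

Flat columns are usually more convenient when the summary is going to be merged with another table or written to a file.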
Summary
Grouping helps summarize data by one or more columns.
Use groupby() with a column name or list of columns.
Apply aggregation functions like sum or mean to get useful summaries.