The groupby() function helps you split data into groups based on some criteria. Then you can analyze each group separately.
groupby() basics in Data Analysis Python
Introduction
You want to find the average sales per store from a big sales list.
You need to count how many students are in each class from a school dataset.
You want to sum expenses by category from your budget data.
You want to see the maximum temperature recorded each day from weather data.
Syntax
Data Analysis Python
df.groupby('column_name')
This creates one group for each unique value in the specified column. On its own it returns a GroupBy object rather than a result.
You usually follow it with an aggregation like .sum() or .mean().
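The two-step pattern can be sketched as follows. This is a minimal illustration, assuming a small made-up DataFrame; the column names 'Category' and 'Amount' and all values are invented for the example.

```python
import pandas as pd

# Hypothetical budget data, invented for illustration
df = pd.DataFrame({
    'Category': ['Food', 'Rent', 'Food', 'Travel'],
    'Amount': [20, 500, 35, 120],
})

# groupby() alone does not compute a result; it returns a GroupBy object
grouped = df.groupby('Category')
print(type(grouped).__name__)  # DataFrameGroupBy

# Applying an aggregation produces one value per group
totals = grouped['Amount'].sum()
print(totals)
```

Here totals is indexed by the unique values of 'Category', with one summed amount per category.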
Examples
Groups data by unique values in the 'Category' column.
Data Analysis Python
df.groupby('Category')
Groups by 'Store' and sums the 'Sales' for each store.
Data Analysis Python
df.groupby('Store')['Sales'].sum()
Groups by both 'City' and 'Year', then finds the average temperature for each group.
Data Analysis Python
df.groupby(['City', 'Year'])['Temperature'].mean()
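To make the multi-column case concrete, here is a short sketch that assumes a tiny, made-up weather table (the cities, years, and temperatures are invented). It shows that each unique (City, Year) pair becomes its own group and that the result is indexed by both keys.

```python
import pandas as pd

# Hypothetical weather readings, invented for illustration
weather = pd.DataFrame({
    'City': ['Oslo', 'Oslo', 'Oslo', 'Cairo', 'Cairo'],
    'Year': [2022, 2022, 2023, 2022, 2023],
    'Temperature': [4.0, 6.0, 7.0, 28.0, 30.0],
})

# One group per unique (City, Year) pair; the result has a two-level index
avg_temp = weather.groupby(['City', 'Year'])['Temperature'].mean()
print(avg_temp)

# Look up a single group by its (City, Year) key
print(avg_temp.loc[('Oslo', 2022)])  # 5.0
```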
Sample Program
This code groups sales data by store and sums the sales for each store.
Data Analysis Python
import pandas as pd

# Create sample data
sales_data = pd.DataFrame({
    'Store': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Sales': [100, 150, 200, 250, 300, 350]
})

# Group by 'Store' and sum sales
sales_sum = sales_data.groupby('Store')['Sales'].sum()
print(sales_sum)
Output
Store
A    250
B    450
C    650
Name: Sales, dtype: int64
Important Notes
groupby() on its own only defines the groups. To get results, follow it with an aggregation function like sum(), mean(), or count().
You can group by multiple columns by passing a list, like groupby(['col1', 'col2']).
The result of groupby() with aggregation is a new Series when you select a single column first, and a new DataFrame otherwise. In both cases the group values become the index.
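The notes about result types can be checked with a short sketch. It assumes a small made-up sales table; the 'Returns' column is a hypothetical addition for the example. The reset_index() call at the end is one common way to turn the group keys back into a regular column.

```python
import pandas as pd

# Hypothetical sales records, invented for illustration
sales = pd.DataFrame({
    'Store': ['A', 'A', 'B'],
    'Sales': [100, 150, 200],
    'Returns': [5, 0, 10],
})

# Selecting one column before aggregating gives a Series indexed by 'Store'
per_store = sales.groupby('Store')['Sales'].sum()
print(type(per_store).__name__)  # Series

# Aggregating several columns gives a DataFrame indexed by 'Store'
per_store_all = sales.groupby('Store')[['Sales', 'Returns']].sum()
print(type(per_store_all).__name__)  # DataFrame

# reset_index() moves the group keys from the index into an ordinary column
flat = per_store_all.reset_index()
print(flat)
```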
Summary
groupby() splits data into groups based on column values.
Use aggregation functions after grouping to analyze each group.
It helps summarize and understand data by categories.
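As a closing illustration of analyzing each group, the sketch below counts students per class and then computes several statistics in one step with agg(). The student data is made up for the example, and the 'Score' column is an assumption.

```python
import pandas as pd

# Hypothetical student records, invented for illustration
students = pd.DataFrame({
    'Class': ['5A', '5A', '5B', '5B', '5B'],
    'Score': [80, 90, 70, 75, 95],
})

# count() returns the number of non-missing values in each group
class_sizes = students.groupby('Class')['Score'].count()
print(class_sizes)

# agg() applies several aggregations at once, one result column per function
summary = students.groupby('Class')['Score'].agg(['count', 'mean', 'max'])
print(summary)
```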