
How to Group by Single Column in pandas: Simple Guide

Use the groupby() method on a pandas DataFrame and pass the column name as a string to group by a single column. Then apply aggregation functions like sum(), mean(), or count() to get grouped results.

📐

Syntax

The basic syntax to group by a single column in pandas is:

df.groupby('column_name'): Groups the DataFrame df by the values in column_name.
After grouping, you can apply aggregation functions like sum(), mean(), or count() to summarize the groups.

python

df.groupby('column_name').aggregation_function()
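As a minimal sketch of this pattern, the snippet below groups a small made-up DataFrame once and applies three different aggregations to the same grouping. The `team` and `score` column names and the data are illustrative assumptions, not part of any real dataset.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "team": ["red", "blue", "red", "blue"],
    "score": [1, 2, 3, 4],
})

# Group once, then reuse the grouped object for several aggregations
by_team = df.groupby("team")["score"]

total = by_team.sum()       # sum of score per team
average = by_team.mean()    # mean of score per team
n_values = by_team.count()  # number of non-null scores per team

print(total)
print(average)
print(n_values)
```

Each call returns a Series indexed by the distinct values of `team`, so the three results line up on the same index.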

💻

Example

This example shows how to group a DataFrame by a single column and calculate the sum of another column for each group.

python

import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C', 'A'],
        'Values': [10, 20, 30, 40, 50, 60]}
df = pd.DataFrame(data)

# Group by 'Category' and sum the 'Values'
grouped = df.groupby('Category')['Values'].sum()
print(grouped)

Output

Category
A    100
B     60
C     50
Name: Values, dtype: int64
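To make it clearer what the grouped sum computes, the same totals can be reproduced by hand. This is only a sketch for understanding, using the example's data; the `manual` dictionary and the loop are illustrative and not how pandas implements groupby internally.

```python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C', 'A'],
        'Values': [10, 20, 30, 40, 50, 60]}
df = pd.DataFrame(data)

# Manual equivalent of df.groupby('Category')['Values'].sum():
# for each distinct category, keep its rows and add up 'Values'
manual = {}
for category in df['Category'].unique():
    rows = df[df['Category'] == category]
    manual[category] = int(rows['Values'].sum())

print(manual)  # {'A': 100, 'B': 60, 'C': 50}

# The groupby result holds the same totals
grouped = df.groupby('Category')['Values'].sum()
print(grouped.to_dict() == manual)  # True
```

The groupby() version does the same split, sum, and combine steps in one call.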

⚠️

Common Pitfalls

Common mistakes when grouping by a single column include:

Passing a list instead of a string for a single column, e.g., df.groupby(['Category']) works but is unnecessary for one column.
Forgetting to select the column to aggregate after grouping, so the aggregation runs on every non-grouping column, including ones you did not mean to summarize.
Not applying an aggregation function, which returns a DataFrameGroupBy object instead of summarized data.

python

import pandas as pd

data = {'Category': ['A', 'B', 'A'], 'Values': [1, 2, 3]}
df = pd.DataFrame(data)

# Wrong: grouping without aggregation
grouped_wrong = df.groupby('Category')
print(type(grouped_wrong))  # Shows DataFrameGroupBy object

# Right: grouping with aggregation
grouped_right = df.groupby('Category')['Values'].sum()
print(grouped_right)

Output

<class 'pandas.core.groupby.generic.DataFrameGroupBy'>
Category
A    4
B    2
Name: Values, dtype: int64
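The first two pitfalls in the list above can be sketched in the same way. The `Id` column below is a hypothetical addition for illustration: it stands in for any column that makes no sense to sum.

```python
import pandas as pd

# Hypothetical data; 'Id' is an extra identifier column added for illustration
df = pd.DataFrame({
    'Category': ['A', 'B', 'A'],
    'Id': [101, 102, 103],
    'Values': [1, 2, 3],
})

# String vs. one-element list: the aggregated result is the same
by_string = df.groupby('Category')['Values'].sum()
by_list = df.groupby(['Category'])['Values'].sum()
print(by_string.equals(by_list))

# No column selected: sum() runs on every non-grouping column,
# so the meaningless 'Id' totals appear next to 'Values'
all_columns = df.groupby('Category').sum()
print(list(all_columns.columns))

# Column selected: only 'Values' is aggregated
only_values = df.groupby('Category')['Values'].sum()
print(only_values)
```

Selecting `['Values']` before aggregating keeps the result limited to the column you actually want summarized.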

📊

Quick Reference

Operation	Example	Description
Group by single column	df.groupby('col')	Groups data by values in 'col'
Sum values	df.groupby('col')['val'].sum()	Sum of 'val' for each group
Mean values	df.groupby('col')['val'].mean()	Average of 'val' for each group
Count rows	df.groupby('col').size()	Count of rows in each group
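The operations in the table can be tried together on a small DataFrame. This is a sketch with made-up data; `col` and `val` mirror the placeholder names in the table. One missing value is included to show that size() counts rows, while count() on a selected column counts only non-null entries.

```python
import pandas as pd

# Hypothetical data; one 'val' entry is missing on purpose
df = pd.DataFrame({
    'col': ['a', 'a', 'b', 'b', 'b'],
    'val': [1.0, 3.0, 2.0, float('nan'), 4.0],
})

groups = df.groupby('col')

sums = groups['val'].sum()      # missing values are skipped
means = groups['val'].mean()    # average of the non-missing values
sizes = groups.size()           # rows per group, missing values included
counts = groups['val'].count()  # non-null 'val' entries per group

print(sums)
print(means)
print(sizes)
print(counts)
```

For group `b`, size() reports 3 rows while count() reports 2, because one `val` is missing.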

✅

Key Takeaways

Use df.groupby('column_name') to group data by one column in pandas.

Always apply an aggregation function like sum(), mean(), or count() after grouping.

Selecting the column to aggregate after grouping keeps the aggregation from running on every other column.

Passing a string for a single column is simpler than using a list with one element.

The groupby() method on a DataFrame returns a DataFrameGroupBy object; summarized values appear only after you apply an aggregation.