Pandas · How-To · Beginner · 3 min read

How to Iterate Over Groups in pandas: Simple Guide

Use DataFrame.groupby() to split a DataFrame into groups, then iterate over the resulting GroupBy object with a for loop. Each iteration yields a tuple of the group name and that group's rows as a DataFrame.
📐

Syntax

The basic syntax to iterate over groups in pandas is:

  • grouped = df.groupby('column_name') creates groups based on unique values in the specified column.
  • for group_name, group_data in grouped: loops over each group.
  • group_name is the value in 'column_name' shared by every row in the group.
  • group_data is a DataFrame containing rows of that group.
python
grouped = df.groupby('column_name')
for group_name, group_data in grouped:
    print(group_name)
    print(group_data)
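
The same loop works when grouping by more than one column; in that case the group name is a tuple with one value per grouping column. The sketch below is a minimal illustration, and the 'Region' column and its values are made up for the example:

```python
import pandas as pd

# Hypothetical data: 'Region' is an illustrative column, not part of the article's example
df = pd.DataFrame({
    "Category": ["A", "A", "B", "B"],
    "Region": ["East", "West", "East", "East"],
    "Value": [10, 15, 20, 25],
})

group_keys = []
# Grouping by a list of columns yields a tuple as the group name,
# which can be unpacked directly in the for statement
for (category, region), group_data in df.groupby(["Category", "Region"]):
    group_keys.append((category, region))
    print(category, region, len(group_data))
```

Unpacking the tuple in the loop header keeps the body readable when there are several grouping columns.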
💻

Example

This example shows how to group a DataFrame by the 'Category' column and iterate over each group to print the group name and its rows.

python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B', 'C'],
        'Value': [10, 20, 15, 25, 30]}
df = pd.DataFrame(data)

grouped = df.groupby('Category')

for group_name, group_data in grouped:
    print(f"Group: {group_name}")
    print(group_data)
    print('---')
Output
Group: A
  Category  Value
0        A     10
2        A     15
---
Group: B
  Category  Value
1        B     20
3        B     25
---
Group: C
  Category  Value
4        C     30
---
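
Because each group is an ordinary DataFrame, the loop body can compute something per group instead of just printing. The following sketch reuses the example data and collects the sum of 'Value' for each category; the totals dictionary is a name chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "B", "A", "B", "C"],
    "Value": [10, 20, 15, 25, 30],
})

totals = {}
for group_name, group_data in df.groupby("Category"):
    # group_data is a DataFrame, so normal column operations apply
    totals[group_name] = int(group_data["Value"].sum())

print(totals)  # {'A': 25, 'B': 45, 'C': 30}
```

For a simple aggregate like this, df.groupby('Category')['Value'].sum() gives the same numbers without a loop; the explicit loop is mainly useful when each group needs custom handling.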
⚠️

Common Pitfalls

Common mistakes when iterating over groups include:

  • Forgetting that group_data is a DataFrame, so you can use all DataFrame operations on it.
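  • Assigning to group_data inside the loop and expecting the original DataFrame to change. Each group is a separate DataFrame object, so write results back with transform, apply, or a similar method.
  • Assuming groups come out in the order they first appear in the data. By default (sort=True) they are iterated in sorted order of the group keys; pass sort=False to keep the order of first appearance.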
python
import pandas as pd

data = {'Category': ['A', 'B', 'A', 'B'], 'Value': [1, 2, 3, 4]}
df = pd.DataFrame(data)

grouped = df.groupby('Category')

# Wrong: trying to modify original df inside loop
for name, group in grouped:
    group['Value'] = group['Value'] * 2  # This does NOT change df

print(df)

# Right: use transform to modify original df

df['Value'] = grouped['Value'].transform(lambda x: x * 2)
print(df)
Output
  Category  Value
0        A      1
1        B      2
2        A      3
3        B      4
  Category  Value
0        A      2
1        B      4
2        A      6
3        B      8
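
The point about group order can be checked with a small experiment. This sketch uses deliberately unsorted sample data (made up for the illustration) and compares the default ordering with sort=False:

```python
import pandas as pd

# Sample data whose categories first appear in the order C, A, B
df = pd.DataFrame({"Category": ["C", "A", "B", "A"], "Value": [1, 2, 3, 4]})

# Default behaviour (sort=True): groups come out in sorted key order
sorted_order = [name for name, _ in df.groupby("Category")]

# sort=False: groups come out in order of first appearance
appearance_order = [name for name, _ in df.groupby("Category", sort=False)]

print(sorted_order)      # ['A', 'B', 'C']
print(appearance_order)  # ['C', 'A', 'B']
```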
📊

Quick Reference

Summary tips for iterating over groups in pandas:

  • Use df.groupby('col') to create groups.
  • Loop with for name, group in grouped:.
  • group is a DataFrame for that group.
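  • Use transform or apply to write per-group results back to the original DataFrame.
  • Groups are iterated in sorted key order by default; pass sort=False to keep first-appearance order.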
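
When only one group is needed, or the groups should be kept around for later, a full loop is not always necessary. As a rough sketch using the example data from earlier, get_group() fetches a single group by name, and a dict comprehension over the same iteration stores every group (the variable names here are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "Category": ["A", "B", "A", "B", "C"],
    "Value": [10, 20, 15, 25, 30],
})

grouped = df.groupby("Category")

# Fetch one group directly by its name
a_rows = grouped.get_group("A")
print(a_rows)

# Keep every group in a dict keyed by group name
groups = {name: group for name, group in grouped}
print(list(groups))  # ['A', 'B', 'C']
```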

Key Takeaways

Use DataFrame.groupby() to split data into groups by column values.
Iterate over groups with a for loop that unpacks each group name and group DataFrame.
Group data is a DataFrame; you can apply all DataFrame operations on it.
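Assigning to a group inside the loop does not change the original DataFrame; use transform or apply to write results back.
Groups are iterated in sorted key order by default (sort=True); pass sort=False to keep first-appearance order.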