How to Iterate Over Rows in pandas DataFrame
To iterate over rows in a pandas DataFrame, use iterrows() to get each row as an index and Series, or itertuples() for faster access as named tuples. For applying a function to each row, use apply() with axis=1.
Syntax
Here are common ways to iterate over rows in a pandas DataFrame:
- `for index, row in df.iterrows():` loops over rows; each row is a Series.
- `for row in df.itertuples():` loops over rows as named tuples; faster than iterrows().
- `df.apply(function, axis=1)` applies a function to each row.
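```python
import pandas as pd

df = pd.DataFrame({'column': [1, 2, 3]})  # placeholder data

for index, row in df.iterrows():
    pass  # use index and row (a Series)

for row in df.itertuples():
    pass  # use row as a namedtuple

def my_func(row):
    # process the row and return one value per row
    return row['column'] * 2

result = df.apply(my_func, axis=1)
```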
Example
This example shows how to iterate over rows using iterrows() and itertuples(), printing values from each row.
```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)

print('Using iterrows():')
for index, row in df.iterrows():
    print(f"Index: {index}, Name: {row['Name']}, Age: {row['Age']}")

print('\nUsing itertuples():')
for row in df.itertuples():
    print(f"Index: {row.Index}, Name: {row.Name}, Age: {row.Age}")
```
Output
Using iterrows():
Index: 0, Name: Alice, Age: 25
Index: 1, Name: Bob, Age: 30
Index: 2, Name: Charlie, Age: 35
Using itertuples():
Index: 0, Name: Alice, Age: 25
Index: 1, Name: Bob, Age: 30
Index: 2, Name: Charlie, Age: 35
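The example above covers the two iterators but not apply(). As a minimal sketch on the same data, the snippet below passes each row to a function with axis=1. The helper name describe is made up for illustration; it is not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

def describe(row):
    # Hypothetical helper: row arrives as a Series, so columns are read by label
    return f"{row['Name']} is {row['Age']} years old"

# axis=1 calls describe once per row; the result is a Series aligned with df.index
descriptions = df.apply(describe, axis=1)
print(descriptions.tolist())
```

Because the result is a Series with the same index as df, it can be assigned straight back as a new column if needed.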
Common Pitfalls
Common mistakes when iterating rows in pandas include:
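- Assigning to row inside an iterrows() loop and expecting df to change. Depending on the data types, iterrows() yields a copy of each row rather than a view, so writes to row are not guaranteed to reach df.
- Trying to assign new values to a row from itertuples(). It is faster, but it returns named tuples, which are immutable.
- Iterating row by row over large DataFrames. Row iteration is slow; vectorized operations are usually much faster. apply() with axis=1 is often tidier than an explicit loop, but it still calls the function once per row.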
```python
import pandas as pd

data = {'A': [1, 2, 3]}
df = pd.DataFrame(data)

# Wrong: trying to modify df through the row yielded by iterrows()
for index, row in df.iterrows():
    row['A'] = row['A'] * 2  # row may be a copy, so this is not a reliable way to change df
print(df)  # df is unchanged when row is a copy

# Right: use loc to write to df itself
for index, row in df.iterrows():
    df.loc[index, 'A'] = row['A'] * 2
print(df)  # df updated correctly
```
Output
A
0 1
1 2
2 3
A
0 2
1 4
2 6
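For a simple update like the doubling above, no loop is needed at all. The following is a small sketch of the vectorized alternative, using the same column name A as the example:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Vectorized: the multiplication is applied to the whole column at once
df['A'] = df['A'] * 2
print(df)
```

This avoids both the copy problem and the per-row overhead, since pandas operates on the entire column in one step.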
Quick Reference
| Method | Description | Return Type | Performance |
|---|---|---|---|
| iterrows() | Iterate rows as (index, Series) | Tuple (index, Series) | Slower |
| itertuples() | Iterate rows as named tuples | Namedtuple | Faster |
| apply(func, axis=1) | Apply function to each row | Depends on function | Moderate |
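The return types in the table can be checked directly. The sketch below inspects what each method yields on a small, made-up DataFrame; it also shows the index and name parameters of itertuples(), which control whether the index is included and whether plain tuples are returned.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# iterrows() yields (index, Series) pairs
index, row = next(df.iterrows())
print(type(row).__name__)

# itertuples() yields namedtuples; the index is the first field
tup = next(df.itertuples())
print(tup._fields)

# index=False drops the index field; name=None yields plain tuples
plain = list(df.itertuples(index=False, name=None))
print(plain)

# apply(axis=1) with a function that returns a scalar yields a Series
ages_next_year = df.apply(lambda r: r['Age'] + 1, axis=1)
print(type(ages_next_year).__name__)
```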
Key Takeaways
Use iterrows() to loop over rows as index and Series pairs.
Use itertuples() for faster iteration with named tuples.
Assigning to row inside iterrows() is not guaranteed to change the original DataFrame; use df.loc to update it.
For better performance, prefer vectorized operations; apply() with axis=1 is tidier than an explicit loop but still processes one row at a time.
Iterating rows is simple but slow for large data; avoid if possible.
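To see the performance difference on your own machine, a rough timing sketch like the one below can help. The function names and the 5,000-row size are arbitrary choices for illustration, and the absolute numbers will vary with hardware and pandas version, so no expected timings are shown.

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.arange(5_000)})

def with_iterrows():
    return [row['A'] * 2 for _, row in df.iterrows()]

def with_itertuples():
    return [row.A * 2 for row in df.itertuples()]

def with_apply():
    return df.apply(lambda row: row['A'] * 2, axis=1).tolist()

def vectorized():
    return (df['A'] * 2).tolist()

# Time each approach a few times and print the total in seconds
for fn in (with_iterrows, with_itertuples, with_apply, vectorized):
    seconds = timeit.timeit(fn, number=3)
    print(f"{fn.__name__:<16} {seconds:.4f} s")
```

All four functions compute the same values, so any difference in the printed times comes from the iteration strategy alone.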