
How to Use ffill and bfill in pandas for Missing Data

In pandas, ffill (forward fill) replaces each missing value with the last valid value before it, while bfill (backward fill) replaces each missing value with the next valid value after it. Call DataFrame.ffill() or DataFrame.bfill() directly. The older DataFrame.fillna(method='ffill') and DataFrame.fillna(method='bfill') forms do the same thing but have been deprecated since pandas 2.1.

Syntax

ffill and bfill are available as methods on both DataFrame and Series. They were traditionally written as fillna(method=...), a form that is deprecated since pandas 2.1.

  • df.ffill(): fills each missing value with the last valid observation before it.
  • df.bfill(): fills each missing value with the next valid observation after it.

The axis parameter sets the fill direction: axis=0 (the default) fills down each column, and axis=1 fills across each row.

python
df.ffill(axis=0)  # fill down each column (the default)
df.bfill(axis=1)  # fill across each row, taking the next valid value to the right
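The difference between the two axis values is easiest to see side by side. The following is a minimal sketch, assuming a small frame whose column names (t0, t1, t2) and index labels are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the labels are illustrative only.
df = pd.DataFrame(
    {"t0": [1.0, np.nan], "t1": [np.nan, 5.0], "t2": [3.0, np.nan]},
    index=["a", "b"],
)

down = df.ffill(axis=0)    # each column is filled top to bottom
across = df.ffill(axis=1)  # each row is filled left to right

print(down)
print(across)
```

With axis=0 the NaN in row b of column t0 takes the value above it, while with axis=1 that same cell stays NaN because nothing lies to its left.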

Example

This example shows how to use ffill and bfill to fill missing values in a DataFrame.

python
import pandas as pd
import numpy as np

data = {'A': [1, np.nan, np.nan, 4], 'B': [np.nan, 2, np.nan, 4], 'C': [1, 2, 3, np.nan]}
df = pd.DataFrame(data)

# Forward fill: same result as the deprecated df.fillna(method='ffill')
ffill_df = df.ffill()

# Backward fill: same result as the deprecated df.fillna(method='bfill')
bfill_df = df.bfill()

print('Original DataFrame:')
print(df)
print('\nAfter forward fill (ffill):')
print(ffill_df)
print('\nAfter backward fill (bfill):')
print(bfill_df)
Output
Original DataFrame:
     A    B    C
0  1.0  NaN  1.0
1  NaN  2.0  2.0
2  NaN  NaN  3.0
3  4.0  4.0  NaN

After forward fill (ffill):
     A    B    C
0  1.0  NaN  1.0
1  1.0  2.0  2.0
2  1.0  2.0  3.0
3  4.0  4.0  3.0

After backward fill (bfill):
     A    B    C
0  1.0  2.0  1.0
1  4.0  2.0  2.0
2  4.0  4.0  3.0
3  4.0  4.0  NaN
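Note that each result still contains one NaN: B has no value above row 0 for ffill to copy, and C has no value below row 3 for bfill to copy. One way to check this without reading the printed frame is to count the remaining NaNs per column. This is a small sketch that reuses the same data, and the variable names are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, np.nan, np.nan, 4],
                   'B': [np.nan, 2, np.nan, 4],
                   'C': [1, 2, 3, np.nan]})

# Count the NaNs left in each column after each fill.
left_after_ffill = df.ffill().isna().sum()
left_after_bfill = df.bfill().isna().sum()

print(left_after_ffill)
print(left_after_bfill)
```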

Common Pitfalls

Common mistakes when using ffill and bfill include:

  • Passing the wrong axis: axis=0 fills down each column and axis=1 fills across each row, so the wrong choice fills in an unexpected direction.
  • Expecting every gap to be filled: ffill needs a valid value earlier in the column (or row) and bfill needs one later, so leading NaNs survive ffill and trailing NaNs survive bfill.
  • Filling a column or row that has no valid values at all: there is nothing to propagate, so every NaN stays.

The example below shows ffill leaving leading NaNs untouched and bfill filling them:

python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [np.nan, np.nan, 3]})

# Wrong: forward fill has no earlier valid value, so the leading NaNs remain
wrong_fill = df.ffill()

# Correct: backward fill pulls the later value 3 up into the leading NaNs
correct_fill = df.bfill()

print('Wrong fill (ffill):')
print(wrong_fill)
print('\nCorrect fill (bfill):')
print(correct_fill)
Output
Wrong fill (ffill):
     A
0  NaN
1  NaN
2  3.0

Correct fill (bfill):
     A
0  3.0
1  3.0
2  3.0
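When a column has gaps at both ends, neither method is enough on its own. A common pattern is to chain them, so that ffill handles the interior and trailing gaps and bfill then picks up whatever is left at the start. The following is a rough sketch on a made-up Series, not a recommendation for every dataset, since back-filling can leak later values into earlier rows:

```python
import numpy as np
import pandas as pd

# Illustrative data with a leading gap, an interior gap, and a trailing gap.
s = pd.Series([np.nan, 2.0, np.nan, np.nan, 5.0, np.nan])

# ffill fixes the interior and trailing NaNs; bfill then fills the leading one.
filled = s.ffill().bfill()
print(filled)

# A Series with no valid values at all stays entirely NaN.
all_nan = pd.Series([np.nan, np.nan])
still_empty = all_nan.ffill().bfill()
print(still_empty)
```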

Quick Reference

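Method or argument   Description                                               Default
df.ffill()           Fill missing values forward using the last valid value    axis=0 (down each column)
df.bfill()           Fill missing values backward using the next valid value   axis=0 (down each column)
axis=0               Fill down each column                                     Default
axis=1               Fill across each row                                      Optional

df.fillna(method='ffill') and df.fillna(method='bfill') are older, deprecated spellings of the same two operations.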

Key Takeaways

Use df.ffill() to fill missing values by carrying the last valid value forward.
Use df.bfill() to fill missing values by pulling the next valid value backward.
df.fillna(method='ffill') and df.fillna(method='bfill') give the same results but are deprecated since pandas 2.1.
Use axis=0 (the default) to fill down columns or axis=1 to fill across rows.
ffill cannot fill leading NaNs and bfill cannot fill trailing NaNs; with no valid value in the fill direction, missing values stay missing.