
How to Use fillna for Missing Values in Python

Use the pandas fillna() method to replace missing values (NaN) with a value you specify; the companion methods ffill() and bfill() propagate neighbouring values instead. Filling gaps lets you keep rows and columns that you would otherwise have to drop before training a machine learning model.

Syntax

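The fillna() method is called on a pandas DataFrame or Series and returns a copy with the missing values filled. You pass either a replacement value or, in older pandas versions, a fill method.

  • value: The value to replace missing entries with. It can be a scalar (number or string), or a dict or Series that maps each column to its own fill value.
  • method: 'ffill' (forward fill) or 'bfill' (backward fill) to propagate non-missing values. This parameter is deprecated since pandas 2.1; prefer DataFrame.ffill() and DataFrame.bfill().
  • inplace: If True, modifies the original object and returns None instead of a new object.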
python
DataFrame.fillna(value=None, method=None, inplace=False)
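Because value accepts a dict, each column can get its own replacement. The following is a minimal sketch; the column names and fill values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one numeric and one text column
df = pd.DataFrame({
    'age': [25.0, np.nan, 31.0],
    'city': ['Oslo', None, 'Lima'],
})

# Map each column name to the value that should fill its gaps
filled = df.fillna({'age': 0, 'city': 'unknown'})
print(filled)
```

Columns that are not listed in the dict are left untouched, and df itself is unchanged because inplace defaults to False.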

Example

This example shows how to fill missing values in a DataFrame with a specific number and how to use forward fill to propagate values.

python
import pandas as pd
import numpy as np

# Create a sample DataFrame with missing values
data = {'A': [1, np.nan, 3, np.nan, 5],
        'B': [np.nan, 2, np.nan, 4, 5]}
df = pd.DataFrame(data)

# Fill missing values with 0
df_filled_zero = df.fillna(0)

# Fill missing values using forward fill
# (df.fillna(method='ffill') is deprecated since pandas 2.1)
df_filled_ffill = df.ffill()

print('Original DataFrame:')
print(df)
print('\nFilled with 0:')
print(df_filled_zero)
print('\nForward fill:')
print(df_filled_ffill)
Output
Original DataFrame:
     A    B
0  1.0  NaN
1  NaN  2.0
2  3.0  NaN
3  NaN  4.0
4  5.0  5.0

Filled with 0:
     A    B
0  1.0  0.0
1  0.0  2.0
2  3.0  0.0
3  0.0  4.0
4  5.0  5.0

Forward fill:
     A    B
0  1.0  NaN
1  1.0  2.0
2  3.0  2.0
3  3.0  4.0
4  5.0  5.0
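Note that forward fill leaves the first value of column B as NaN, because there is no earlier value to copy. Backward fill covers that case, and the limit argument caps how many consecutive gaps are filled. A small sketch on a made-up Series:

```python
import numpy as np
import pandas as pd

# Illustrative Series with a leading gap and a two-value gap
s = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0])

# Backward fill: each gap takes the next valid value
back = s.bfill()

# Forward fill, but fill at most one consecutive missing value
capped = s.ffill(limit=1)

print(back)
print(capped)
```

Here back has no missing values left, while capped still has NaN at positions 0 and 3: position 0 has nothing before it, and position 3 is beyond the one-step limit.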

Common Pitfalls

Common mistakes when using fillna() include:

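  • Assigning the result of fillna(..., inplace=True). With inplace=True the method returns None, so the variable ends up holding None rather than a DataFrame.
  • Filling with incompatible types (e.g., putting strings into a numeric column), which can change the column's dtype.
  • Filling a categorical column with a value that is not one of its categories, which raises an error.
  • Assuming fillna() modifies the original data. Without inplace=True it returns a new object and leaves the original unchanged.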
python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, 3]})

# Wrong: the result is discarded, so df is unchanged
df.fillna(0)
print('Without reassignment:')
print(df)

# Right: Reassign or use inplace
ndf = df.fillna(0)
print('\nWith reassignment:')
print(ndf)

# Or inplace
# df.fillna(0, inplace=True)
# print('\nWith inplace:')
# print(df)
Output
Without reassignment:
     A
0  1.0
1  NaN
2  3.0

With reassignment:
     A
0  1.0
1  0.0
2  3.0
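For the categorical pitfall, one possible approach is to register the placeholder as a category before filling. This is a sketch, not the only option; the label 'Unknown' is an arbitrary choice for illustration.

```python
import pandas as pd

# Categorical column with one missing entry
s = pd.Series(['a', None, 'b'], dtype='category')

# Add the placeholder to the allowed categories first, then fill with it.
# 'Unknown' is a hypothetical label chosen for this example.
filled = s.cat.add_categories('Unknown').fillna('Unknown')

print(filled)
```

The result keeps the category dtype, with 'Unknown' added to its list of categories.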

Quick Reference

Summary tips for using fillna():

  • Use value to fill with a constant.
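  • Use ffill() or bfill() to propagate values (method='ffill' or 'bfill' in pandas older than 2.1).
  • Use inplace=True to modify data directly; the call then returns None.
  • Check column dtypes (for example with df.dtypes) before filling so the fill value matches.
  • For machine learning, fill missing values before training, since many estimators raise errors on NaN input.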
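In a machine learning setting, a common pattern is to fill each numeric column with a statistic such as its mean, computed on the training data and reused for the test data. The sketch below assumes two small made-up frames named train and test; it is one way to do this, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical training and test splits with gaps
train = pd.DataFrame({'x1': [1.0, np.nan, 3.0], 'x2': [10.0, 20.0, np.nan]})
test = pd.DataFrame({'x1': [np.nan, 2.0], 'x2': [np.nan, 5.0]})

# Compute per-column means from the training split only
fill_values = train.mean()

# Passing a Series to fillna fills each column with its matching entry
train_filled = train.fillna(fill_values)
test_filled = test.fillna(fill_values)

print(train_filled)
print(test_filled)
```

Reusing the training statistics for the test split keeps information from the test data out of the preprocessing step.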

Key Takeaways

Use pandas fillna() to replace missing values with a constant or with per-column values.
Remember to assign the result or use inplace=True to keep changes.
Choose fill values compatible with your data types.
Forward fill (ffill) and backward fill (bfill) propagate existing values into gaps.
Fill or otherwise handle missing data before training models that cannot accept NaN.