
How to Replace NaN with Mean in pandas DataFrame

Use the fillna() method with the column's mean to replace NaN values in a pandas DataFrame. Calculate the mean with df['column'].mean(), then assign the filled result back with df['column'] = df['column'].fillna(mean_value) to update the data.

Syntax

The basic syntax to replace NaN values with the mean in a pandas DataFrame column is:

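  • df['column'].mean(): Calculates the mean of the specified column, ignoring NaN values.
  • df['column'].fillna(value): Returns a copy of the column with NaN values replaced by the given value. Assign the result back to df['column'] to update the DataFrame. Calling fillna(value, inplace=True) on a selected column triggers a chained-assignment warning in recent pandas versions and does not modify the DataFrame when Copy-on-Write is enabled.
python
mean_value = df['column'].mean()
df['column'] = df['column'].fillna(mean_value)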
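
To see both behaviors described above in isolation, here is a minimal sketch on a small, made-up Series (the values and the name `s` are illustrative assumptions, not part of the original example). It shows that mean() skips NaN and that fillna() returns a new object by default, leaving the original untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
s = pd.Series([10.0, np.nan, 20.0], name="column")

mean_value = s.mean()          # NaN is skipped: (10 + 20) / 2
filled = s.fillna(mean_value)  # returns a new Series by default

print(mean_value)              # 15.0
print(int(s.isna().sum()))     # 1, the original still holds its NaN
print(filled.tolist())         # [10.0, 15.0, 20.0]
```

Because the original is unchanged, the filled result has to be stored somewhere, usually back in the same DataFrame column.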

Example

This example shows how to replace NaN values in the 'Age' column with the mean age in a pandas DataFrame.

python
import pandas as pd
import numpy as np

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, np.nan, 30, np.nan]}
df = pd.DataFrame(data)

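# Mean of the non-missing ages: (25 + 30) / 2 = 27.5
mean_age = df['Age'].mean()
# Assign the filled column back so the DataFrame is updated
df['Age'] = df['Age'].fillna(mean_age)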

print(df)
Output
      Name   Age
0    Alice  25.0
1      Bob  27.5
2  Charlie  30.0
3    David  27.5

Common Pitfalls

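Common mistakes when replacing NaN with the mean include:

  • Calling fillna() without a fill value. fillna() does not compute the mean itself, so calculate the mean first and pass it in; otherwise the call raises an error.
  • Discarding the result. By default fillna() returns a new Series, so the DataFrame is not updated unless you assign the result back to the column.
  • Applying mean replacement to non-numeric columns. mean() fails on a column of strings, so there is no value to fill with.

Always confirm the column is numeric, calculate the mean first, and assign the filled column back.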

python
import pandas as pd
import numpy as np

data = {'Score': [10, np.nan, 20]}
df = pd.DataFrame(data)

# Wrong: the inner fillna() is called without a fill value
# df['Score'].fillna(df['Score'].fillna(), inplace=True)  # raises an error: no fill value given

# Correct way: compute the mean, then assign the filled column back
mean_score = df['Score'].mean()
df['Score'] = df['Score'].fillna(mean_score)

print(df)
Output
   Score
0   10.0
1   15.0
2   20.0
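
The non-numeric pitfall listed above can be caught before it causes trouble. The following is a rough sketch, assuming a hypothetical helper named `fill_with_mean` (not part of pandas) that checks the column's dtype with pandas.api.types.is_numeric_dtype before filling.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({
    "Name": ["Alice", "Bob", None],  # non-numeric column
    "Score": [10, np.nan, 20],       # numeric column
})

def fill_with_mean(frame, column):
    """Hypothetical helper: fill NaN with the column mean, numeric columns only."""
    if not is_numeric_dtype(frame[column]):
        raise TypeError(f"Column {column!r} is not numeric; cannot fill with its mean")
    # Assign the filled column back so the DataFrame is updated
    frame[column] = frame[column].fillna(frame[column].mean())
    return frame

df = fill_with_mean(df, "Score")
print(df["Score"].tolist())  # [10.0, 15.0, 20.0]
```

Calling fill_with_mean(df, "Name") would raise the TypeError defined in the helper instead of failing somewhere inside mean().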

Quick Reference

Summary tips for replacing NaN with mean in pandas:

  • Use df['col'].mean() to get the mean (NaN values are skipped by default).
  • Use df['col'] = df['col'].fillna(mean_value) to replace NaN.
  • Check that the column's data type is numeric before applying.
  • Assign the result back to the column instead of relying on inplace=True on a column selection.
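
When several numeric columns need the same treatment, the tips above can be applied in one step. This is one possible sketch, using made-up data: DataFrame.mean(numeric_only=True) returns a Series of per-column means, and DataFrame.fillna() accepts that Series and matches its index against the column names.

```python
import numpy as np
import pandas as pd

# Illustrative data, not taken from the examples above
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, np.nan, 30],
    "Score": [10, 20, np.nan],
})

# Per-column means of the numeric columns only (Name is skipped)
column_means = df.mean(numeric_only=True)

# Each numeric column is filled with its own mean
df = df.fillna(column_means)
print(df)
```

Non-numeric columns such as Name are left as they are, since they have no entry in column_means.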

Key Takeaways

Calculate the mean of the column before replacing NaN values.
Assign the result of fillna() back to the column to update the DataFrame.
Only apply mean replacement on numeric columns to avoid errors.
Always verify the DataFrame after replacement to confirm changes.
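
One simple way to do that verification, sketched here with the Age values from the example, is to count the missing values before and after the replacement using isna().sum().

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Age": [25, np.nan, 30, np.nan]})

missing_before = int(df["Age"].isna().sum())
df["Age"] = df["Age"].fillna(df["Age"].mean())
missing_after = int(df["Age"].isna().sum())

print(missing_before, missing_after)  # 2 0
```

A count of zero after the replacement confirms that no NaN values remain in the column.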