
How to Fill Missing Values in Python: Simple Methods Explained

In Python, you can fill missing values using the fillna() method from the pandas library. This method lets you replace missing data with a specific value, the mean, median, or other calculated values to keep your data clean and ready for analysis.

Syntax

The fillna() method is available on pandas Series and DataFrame objects and replaces missing values (NaN). By default it returns a new object and leaves the original unchanged.

  • value: A scalar, dict, Series, or DataFrame to use for filling missing values.
  • method: 'ffill' to forward fill or 'bfill' to backward fill. This parameter is deprecated since pandas 2.1; use the ffill() and bfill() methods instead.
  • inplace: If True, modifies the original object instead of returning a new one.
python
# Simplified signature; method is deprecated in pandas 2.1 and later
DataFrame.fillna(value=None, method=None, inplace=False)
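
Forward and backward filling can also be done with the dedicated ffill() and bfill() methods, which recent pandas versions recommend over the method argument. The following is a minimal sketch; the Series values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical readings with gaps, for illustration only
s = pd.Series([1.0, np.nan, np.nan, 4.0])

forward = s.ffill()   # carry the last valid value forward
backward = s.bfill()  # pull the next valid value backward

print(forward.tolist())   # [1.0, 1.0, 1.0, 4.0]
print(backward.tolist())  # [1.0, 4.0, 4.0, 4.0]
```

Note that a forward fill cannot fill a gap at the very start of the data, and a backward fill cannot fill one at the very end, because there is no neighboring value to copy.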

Example

This example shows how to fill missing values in a pandas DataFrame with a fixed value and with the column mean.

python
import pandas as pd
import numpy as np

data = {'A': [1, 2, np.nan, 4], 'B': [np.nan, 2, 3, 4]}
df = pd.DataFrame(data)

# Fill missing values with 0
filled_zero = df.fillna(0)

# Fill missing values with column mean
filled_mean = df.fillna(df.mean())

print("Original DataFrame:\n", df)
print("\nFill missing with 0:\n", filled_zero)
print("\nFill missing with mean:\n", filled_mean)
Output
Original DataFrame:
      A    B
0  1.0  NaN
1  2.0  2.0
2  NaN  3.0
3  4.0  4.0

Fill missing with 0:
      A    B
0  1.0  0.0
1  2.0  2.0
2  0.0  3.0
3  4.0  4.0

Fill missing with mean:
           A    B
0  1.000000  3.0
1  2.000000  2.0
2  2.333333  3.0
3  4.000000  4.0
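
Because value also accepts a dict, each column can get its own fill value in a single call. The sketch below reuses the example data; the choice of 0 for column A and the median for column B is arbitrary and only meant to show the mechanics.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan, 4], "B": [np.nan, 2, 3, 4]})

# Keys are column names; each column gets its own fill value
per_column = df.fillna({"A": 0, "B": df["B"].median()})

print(per_column)
```

Columns that are not listed in the dict are left untouched, so any NaN values in them remain.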

Common Pitfalls

The most common mistake is calling fillna() without assigning the result or passing inplace=True. Because fillna() returns a new object by default, the original DataFrame keeps its missing values and the filled copy is lost. Forgetting to import pandas or to create the DataFrame first will also raise an error before any filling happens.

Filling with an inappropriate value (such as zero for every column) can also distort your analysis, because the filled values feed into statistics like the mean.

python
import pandas as pd
import numpy as np

data = {'A': [1, np.nan, 3]}
df = pd.DataFrame(data)

# Wrong: the result is discarded, so df is unchanged
df.fillna(0)
print("Original DataFrame after fillna without assignment:\n", df)

# Right: assign the result to a variable (or back to df)
filled_right = df.fillna(0)
print("\nFilled DataFrame with assignment:\n", filled_right)

# Or inplace
# df.fillna(0, inplace=True)
# print(df)
Output
Original DataFrame after fillna without assignment:
      A
0  1.0
1  NaN
2  3.0

Filled DataFrame with assignment:
      A
0  1.0
1  0.0
2  3.0
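
The other pitfall, distortion from a poorly chosen fill value, is easy to see with a small calculation. This sketch uses made-up temperature readings and compares the mean after filling with zero and after filling with the median.

```python
import numpy as np
import pandas as pd

# Hypothetical temperatures in degrees Celsius with one missing reading
temps = pd.Series([20.0, 22.0, np.nan, 24.0])

mean_skipping_nan = temps.mean()                      # NaN is skipped: 22.0
mean_zero_fill = temps.fillna(0).mean()               # zero drags the mean down: 16.5
mean_median_fill = temps.fillna(temps.median()).mean()  # median fill keeps it at 22.0

print(mean_skipping_nan, mean_zero_fill, mean_median_fill)
```

Zero is a plausible reading for some columns (counts, for example) but not for others, so the fill value is worth choosing per column rather than globally.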

Quick Reference

Here is a quick summary of common ways to fill missing values in pandas:

Method                 Description                        Example
fillna(value)          Fill missing with a fixed value    df.fillna(0)
ffill()                Forward fill missing values        df.ffill()
bfill()                Backward fill missing values       df.bfill()
fillna(df.mean())      Fill missing with column mean      df.fillna(df.mean())
fillna(df.median())    Fill missing with column median    df.fillna(df.median())

The older spellings df.fillna(method='ffill') and df.fillna(method='bfill') are deprecated since pandas 2.1.

Key Takeaways

  • Use pandas' fillna() method to replace missing values in DataFrames or Series.
  • Always assign the result of fillna() back or use inplace=True to save changes.
  • You can fill missing values with fixed values, forward/backward fill, or statistics like mean or median.
  • Filling missing values improperly can distort your data analysis results.
  • Check your data after filling to ensure the replacements make sense.
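
One way to run that final check is to count the remaining NaN values and compare summary statistics before and after filling. This is a minimal sketch on the example data, not a complete validation routine.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan, 4], "B": [np.nan, 2, 3, 4]})
filled = df.fillna(df.mean())

# Count remaining missing values per column; all zeros means nothing was missed
remaining = filled.isna().sum()
print(remaining)

# Compare summary statistics before and after to spot unexpected shifts
print(df.describe())
print(filled.describe())
```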