How to Fill Missing Values in Python: Simple Methods Explained
In Python, you can fill missing values using the fillna() method from the pandas library. This method lets you replace missing data with a specific value, such as a constant or a column's mean or median, so your data is clean and ready for analysis.
Syntax
The fillna() method is used on pandas Series or DataFrame objects to replace missing values (NaN). It accepts a value or method to fill the gaps.
value: A scalar, dict, Series, or DataFrame to use for filling missing values.
method: Use 'ffill' to forward fill or 'bfill' to backward fill missing values. This argument is deprecated in pandas 2.1 and later in favor of the ffill() and bfill() methods.
inplace: If True, modifies the original object instead of returning a new one.
```python
# Simplified signature; the real method accepts further optional parameters
DataFrame.fillna(value=None, method=None, inplace=False)
```
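Forward and backward filling are easiest to understand on a small Series. The sketch below uses the dedicated ffill() and bfill() methods rather than the method argument; the `readings` data is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor readings with gaps (illustrative data only)
readings = pd.Series([10.0, np.nan, np.nan, 13.0, np.nan])

# Forward fill: each gap takes the last valid value before it
forward = readings.ffill()

# Backward fill: each gap takes the next valid value after it.
# The trailing NaN stays because no valid value follows it.
backward = readings.bfill()

print(forward.tolist())
print(backward.tolist())
```

Note that a gap at the start of the data survives a forward fill, and a gap at the end survives a backward fill, so it can be worth checking for leftover NaN values afterwards.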
Example
This example shows how to fill missing values in a pandas DataFrame with a fixed value and with the column mean.
```python
import pandas as pd
import numpy as np

data = {'A': [1, 2, np.nan, 4], 'B': [np.nan, 2, 3, 4]}
df = pd.DataFrame(data)

# Fill missing values with 0
filled_zero = df.fillna(0)

# Fill missing values with column mean
filled_mean = df.fillna(df.mean())

print("Original DataFrame:\n", df)
print("\nFill missing with 0:\n", filled_zero)
print("\nFill missing with mean:\n", filled_mean)
```
Output
```
Original DataFrame:
      A    B
0  1.0  NaN
1  2.0  2.0
2  NaN  3.0
3  4.0  4.0

Fill missing with 0:
      A    B
0  1.0  0.0
1  2.0  2.0
2  0.0  3.0
3  4.0  4.0

Fill missing with mean:
           A    B
0  1.000000  3.0
1  2.000000  2.0
2  2.333333  3.0
3  4.000000  4.0
```
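The example above works because every column is numeric. When a DataFrame also contains text columns, one possible approach is to compute the means with numeric_only=True and let fillna() match them to columns by name. The `city`, `temp`, and `rain` columns below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type data: one text column, two numeric columns
df = pd.DataFrame({
    "city": ["Oslo", None, "Lima"],
    "temp": [4.0, np.nan, 20.0],
    "rain": [np.nan, 2.0, 4.0],
})

# numeric_only=True restricts the mean to the numeric columns,
# so the text column is skipped
column_means = df.mean(numeric_only=True)

# fillna() aligns the Series of means on column names;
# "city" has no entry in column_means and is left untouched
filled = df.fillna(column_means)
print(filled)
```

The missing city is still missing after this step, which is usually what you want: a mean has no meaning for a text column, so it needs its own fill rule.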
Common Pitfalls
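A common mistake is calling fillna() without assigning the result or passing inplace=True. By default, fillna() returns a new object and leaves the original unchanged, so the filled values are lost unless you keep them. It also helps to confirm that pandas is imported and that your data is actually a DataFrame or Series before calling the method.
Filling with an inappropriate value can also distort your analysis. Replacing every gap with zero, for example, pulls down the mean of a column whose real values are all well above zero.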
```python
import pandas as pd
import numpy as np

data = {'A': [1, np.nan, 3]}
df = pd.DataFrame(data)

# Wrong: the filled copy is discarded, so df is unchanged
df.fillna(0)
print("Original DataFrame after fillna without assignment:\n", df)

# Right: assign the result to a variable (or back to df)
filled_right = df.fillna(0)
print("\nFilled DataFrame with assignment:\n", filled_right)

# Or modify df in place
# df.fillna(0, inplace=True)
# print(df)
```
Output
```
Original DataFrame after fillna without assignment:
      A
0  1.0
1  NaN
2  3.0

Filled DataFrame with assignment:
      A
0  1.0
1  0.0
2  3.0
```
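To avoid the blanket-zero pitfall described above, fillna() also accepts a dict that maps each column to its own fill value. The sketch below is one way to do this; the survey-style columns and the choice of mean, median, and a placeholder string are assumptions for illustration, not a recommendation for every dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical survey data (illustrative only)
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "income": [30000.0, 32000.0, np.nan, 250000.0],
    "segment": ["a", None, "b", "a"],
})

# One fill value per column instead of a single value for everything
fill_values = {
    "age": df["age"].mean(),
    "income": df["income"].median(),  # median is less affected by the 250000 outlier
    "segment": "unknown",             # placeholder label for missing categories
}

filled = df.fillna(value=fill_values)
print(filled)
```

Columns that are not listed in the dict are left as they are, so you can fill only the columns you have a sensible rule for.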
Quick Reference
Here is a quick summary of common ways to fill missing values in pandas:
| Method | Description | Example |
|---|---|---|
| fillna(value) | Fill missing with a fixed value | df.fillna(0) |
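| fillna(method='ffill') or ffill() | Forward fill missing values (the method argument is deprecated in pandas 2.1 and later) | df.ffill() |
| fillna(method='bfill') or bfill() | Backward fill missing values (the method argument is deprecated in pandas 2.1 and later) | df.bfill() |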
| fillna(df.mean()) | Fill missing with column mean | df.fillna(df.mean()) |
| fillna(df.median()) | Fill missing with column median | df.fillna(df.median()) |
Key Takeaways
Use pandas' fillna() method to replace missing values in DataFrames or Series.
Always assign the result of fillna() back or use inplace=True to save changes.
You can fill missing values with fixed values, forward/backward fill, or statistics like mean or median.
Filling missing values improperly can distort your data analysis results.
Check your data after filling to ensure the replacements make sense.
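One simple way to check the result, sketched here on the DataFrame from the earlier example, is to count missing values per column before and after filling and then look at the summary statistics of the filled data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, np.nan, 4], "B": [np.nan, 2, 3, 4]})

# Count missing values per column before filling
missing_before = df.isna().sum()

filled = df.fillna(df.mean())

# Count again afterwards; every column should now report 0
missing_after = filled.isna().sum()

print(missing_before.to_dict())
print(missing_after.to_dict())

# Summary statistics help spot fills that shifted the distribution
print(filled.describe())
```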