How to Use fillna for Missing Values in Python
Use the fillna() method from pandas to replace missing values in your data with a specified value or method. This helps prepare your data for machine learning models by filling gaps instead of dropping rows or columns.
Syntax
The fillna() method is called on a pandas DataFrame or Series to fill missing values. You specify the value or method to replace the missing data.
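- value: The value to replace missing entries with (a scalar such as a number or string, or a dict, Series, or DataFrame that gives a value per column).
- method: Use 'ffill' (forward fill) or 'bfill' (backward fill) to propagate non-missing values. This parameter is deprecated as of pandas 2.1; newer code calls ffill() or bfill() directly.
- inplace: If True, modifies the original data and returns None instead of returning a new object.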
python
DataFrame.fillna(value=None, method=None, inplace=False)
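Because value accepts a dict, each column can get its own fill value. The following is a minimal sketch; the column names and fill values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are for illustration only
df = pd.DataFrame({
    'age': [25, np.nan, 40],
    'city': ['Oslo', None, 'Lima'],
})

# A dict maps each column name to its own fill value
filled = df.fillna(value={'age': 0, 'city': 'unknown'})
print(filled)
```

Columns that are not listed in the dict are left untouched, and df itself is not modified because inplace defaults to False.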
Example
This example shows how to fill missing values in a DataFrame with a specific number and how to use forward fill to propagate values.
python
import pandas as pd
import numpy as np

# Create a sample DataFrame with missing values
data = {'A': [1, np.nan, 3, np.nan, 5], 'B': [np.nan, 2, np.nan, 4, 5]}
df = pd.DataFrame(data)

# Fill missing values with 0
df_filled_zero = df.fillna(0)

# Fill missing values using forward fill
# (df.fillna(method='ffill') is the older spelling, deprecated in pandas 2.1+)
df_filled_ffill = df.ffill()

print('Original DataFrame:')
print(df)
print('\nFilled with 0:')
print(df_filled_zero)
print('\nForward fill:')
print(df_filled_ffill)
Output
Original DataFrame:
     A    B
0  1.0  NaN
1  NaN  2.0
2  3.0  NaN
3  NaN  4.0
4  5.0  5.0

Filled with 0:
     A    B
0  1.0  0.0
1  0.0  2.0
2  3.0  0.0
3  0.0  4.0
4  5.0  5.0

Forward fill:
     A    B
0  1.0  NaN
1  1.0  2.0
2  3.0  2.0
3  3.0  4.0
4  5.0  5.0
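Note that forward fill leaves the first row of B as NaN, because there is no earlier value to propagate. Two other common choices are backward fill and filling each column with its own mean. The sketch below reuses the sample data from the example above; the variable names df_mean and df_bfill are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, np.nan, 3, np.nan, 5],
                   'B': [np.nan, 2, np.nan, 4, 5]})

# df.mean() returns one mean per column (NaN is skipped by default);
# passing that Series to fillna fills each column with its own mean
df_mean = df.fillna(df.mean())

# Backward fill: each gap takes the next non-missing value below it
df_bfill = df.bfill()

print('Filled with column means:')
print(df_mean)
print('\nBackward fill:')
print(df_bfill)
```

Whether a mean, a constant, or a propagated value is appropriate depends on the data; this only shows the mechanics.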
Common Pitfalls
Common mistakes when using fillna() include:
- Assigning the result of a call that uses inplace=True. Such a call returns None, so the assigned variable no longer holds your data.
- Trying to fill with incompatible types (e.g., filling numeric columns with strings).
- Not handling missing values in categorical columns properly. The fill value must already be one of the column's categories.
- Assuming fillna() modifies the original data without inplace=True.
python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, np.nan, 3]})

# Wrong: the result is discarded, so df is unchanged
df.fillna(0)
print('Without reassignment:')
print(df)

# Right: assign the result to a variable
ndf = df.fillna(0)
print('\nWith reassignment:')
print(ndf)

# Or fill in place (returns None, so do not assign the result)
# df.fillna(0, inplace=True)
# print('\nWith inplace:')
# print(df)
Output
Without reassignment:
     A
0  1.0
1  NaN
2  3.0

With reassignment:
     A
0  1.0
1  0.0
2  3.0
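The categorical pitfall listed above can be sketched as follows. This is an illustration under assumptions: the Series name colors and the placeholder 'unknown' are made up, and the exact exception type raised by the direct fill has varied between pandas versions, so the sketch catches both TypeError and ValueError.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical column with one missing entry
colors = pd.Series(['red', np.nan, 'blue'], dtype='category')

# 'unknown' is not yet a category, so filling with it directly fails
try:
    colors.fillna('unknown')
except (TypeError, ValueError) as exc:
    print('Direct fill failed:', type(exc).__name__)

# Register the new category first, then fill
filled = colors.cat.add_categories('unknown').fillna('unknown')
print(filled.tolist())
```

Adding the category before filling keeps the column categorical instead of forcing a conversion to plain strings.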
Quick Reference
Summary tips for using fillna():
- Use value to fill with a constant.
- Use method='ffill' or 'bfill' to propagate values.
- Use inplace=True to modify data directly.
- Always check data types before filling.
- For machine learning, filling missing values helps avoid errors during model training.
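One way to act on the "check data types" tip is to inspect dtypes and missing counts first, then fill numeric and non-numeric columns separately. The sketch below is one possible approach, not a fixed recipe; the column names and the placeholder values 0 and 'missing' are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type data
df = pd.DataFrame({
    'price': [9.5, np.nan, 12.0],
    'label': ['a', None, 'b'],
})

# Inspect column types and the number of missing values per column
print(df.dtypes)
print(df.isna().sum())

# Split columns by type so each group gets a compatible fill value
numeric_cols = df.select_dtypes(include='number').columns
other_cols = df.columns.difference(numeric_cols)

filled = df.copy()
filled[numeric_cols] = filled[numeric_cols].fillna(0)
filled[other_cols] = filled[other_cols].fillna('missing')

print(filled)
```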
Key Takeaways
Use pandas fillna() to replace missing values with a constant or method.
Remember to assign the result or use inplace=True to keep changes.
Choose fill values compatible with your data types.
Forward fill and backward fill help propagate existing values.
Many machine learning models cannot handle NaN values, so fill or otherwise handle missing data before training.