How to Drop Missing Values in pandas DataFrames
Use the dropna() method in pandas to remove rows or columns that contain missing values (NaN or None). You can choose whether to drop rows or columns, and whether a row or column is dropped when any of its values is missing or only when all of them are.
Syntax
The dropna() method removes missing values from a DataFrame or Series. Key parameters include:
- axis=0: Drop rows with missing values (default).
- axis=1: Drop columns with missing values.
- how='any': Drop if any value is missing (default).
- how='all': Drop only if all values are missing.
- subset=[columns]: Specify columns to check for missing values.
- inplace=False: Return a new object by default; set to True to modify in place.
```python
DataFrame.dropna(axis=0, how='any', subset=None, inplace=False)
```
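The difference between how='any' and how='all' is easiest to see on a small table. The following is a minimal sketch on made-up data (the column names A and B are purely illustrative), not a definitive recipe:

```python
import pandas as pd

# Hypothetical data: row 1 is entirely missing, row 2 is only partly missing
df = pd.DataFrame({
    'A': [1.0, None, 3.0],
    'B': [4.0, None, None],
})

# how='any' (the default): a single missing value is enough to drop the row
any_rows = df.dropna(how='any')

# how='all': a row is dropped only when every value in it is missing
all_rows = df.dropna(how='all')

print(any_rows.index.tolist())  # [0]
print(all_rows.index.tolist())  # [0, 2]
```

With how='all', the partly filled row 2 survives, which is often preferable when the goal is only to discard completely empty records.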
Example
This example shows how to drop rows with any missing values and how to drop columns with missing values.
```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', None, 'David'],
        'Age': [25, None, 30, 22],
        'City': ['NY', 'LA', 'SF', None]}
df = pd.DataFrame(data)

# Drop rows with any missing values
rows_dropped = df.dropna()

# Drop columns with any missing values
cols_dropped = df.dropna(axis=1)

print('Original DataFrame:')
print(df)
print('\nAfter dropping rows with missing values:')
print(rows_dropped)
print('\nAfter dropping columns with missing values:')
print(cols_dropped)
```
Output
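```
Original DataFrame:
    Name   Age  City
0  Alice  25.0    NY
1    Bob   NaN    LA
2   None  30.0    SF
3  David  22.0  None

After dropping rows with missing values:
    Name   Age City
0  Alice  25.0   NY

After dropping columns with missing values:
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3]
```
Every column in this example contains at least one missing value, so dropping columns with how='any' leaves an empty DataFrame that keeps only the original index.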
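Before dropping columns, it can help to check how many missing values each column holds. The sketch below reuses the example data and relies on isna(), a standard pandas method. The variable names missing_per_column and cols_with_missing are illustrative only.

```python
import pandas as pd

# Same data as in the example above
data = {'Name': ['Alice', 'Bob', None, 'David'],
        'Age': [25, None, 30, 22],
        'City': ['NY', 'LA', 'SF', None]}
df = pd.DataFrame(data)

# Count missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Columns with a nonzero count are the ones dropna(axis=1) removes
cols_with_missing = missing_per_column[missing_per_column > 0].index.tolist()
print(cols_with_missing)
```

Running a check like this first makes it clear how much data a column-wise drop would discard.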
Common Pitfalls
Common mistakes include:
- Forgetting that dropna() returns a new DataFrame unless inplace=True is set.
- Not specifying axis correctly, which can lead to dropping rows instead of columns or vice versa.
- Using dropna() without considering which columns to check, which might remove more data than intended.
```python
import pandas as pd

data = {'A': [1, None, 3], 'B': [None, 2, 3]}
df = pd.DataFrame(data)

# Wrong: expecting the original df to change, but dropna() returns a new object
wrong = df.dropna()
print('Wrong (df unchanged):')
print(df)

# Right: inplace=True modifies df and returns None, so don't assign the result
df.dropna(inplace=True)
print('\nRight (df changed):')
print(df)
```
Output
```
Wrong (df unchanged):
     A    B
0  1.0  NaN
1  NaN  2.0
2  3.0  3.0

Right (df changed):
     A    B
2  3.0  3.0
```
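The third pitfall, removing more data than intended, can be illustrated with subset. This is a minimal sketch on hypothetical data (the id, score, and notes columns are invented for illustration). It also reassigns the result rather than using inplace=True, which is a common alternative.

```python
import pandas as pd

# Hypothetical data: 'notes' is an optional column that is often empty
df = pd.DataFrame({
    'id': [1, 2, 3],
    'score': [9.5, None, 7.0],
    'notes': [None, None, 'ok'],
})

# Checking every column keeps only the fully populated row
strict = df.dropna()

# Checking only 'score' keeps rows whose 'notes' value is missing;
# the result is reassigned to df instead of using inplace=True
df = df.dropna(subset=['score'])

print(strict['id'].tolist())  # [3]
print(df['id'].tolist())      # [1, 3]
```

Restricting the check to the columns that actually matter preserves rows that would otherwise be lost to gaps in optional fields.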
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| axis | 0 to drop rows, 1 to drop columns | 0 |
| how | 'any' drops if any missing, 'all' drops if all missing | 'any' |
| subset | Labels to check for missing values (column names when dropping rows) | None |
| inplace | Modify original DataFrame if True | False |
Key Takeaways
Use df.dropna() to remove rows or columns with missing values in pandas.
Set axis=1 to drop columns instead of rows.
Remember dropna() returns a new DataFrame unless inplace=True is set.
Use how='all' to drop only if all values are missing in the row/column.
Specify subset to limit which columns to check for missing values.