How to Drop Missing Values in Pandas: Simple Guide
Use the dropna() method on a pandas DataFrame or Series to remove rows or columns containing missing values. You can specify axis=0 to drop rows or axis=1 to drop columns with missing data.
Syntax
The basic syntax to drop missing values in pandas is:
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Explanation:
- axis=0: Drop rows containing missing values (default).
- axis=1: Drop columns containing missing values.
- how='any': Drop if any missing values are present.
- how='all': Drop only if all values are missing.
- thresh=n: Require at least n non-missing values to keep the row/column. It cannot be combined with how in the same call.
- subset: Specify columns to check for missing values (when dropping rows).
- inplace=False: Return a new DataFrame by default; set to True to modify in place.
python
# Recent pandas versions raise a TypeError if how and thresh are passed together
df.dropna(axis=0, how='any', subset=None, inplace=False)
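To make the difference between how='any', how='all', and thresh concrete, here is a minimal sketch. The DataFrame and its column names are hypothetical sample data chosen for illustration, not part of the original guide.

```python
import pandas as pd

# Hypothetical sample data:
# row 0 is complete, row 1 has one value, row 2 is empty, row 3 has two values
df = pd.DataFrame({
    'A': [1.0, None, None, 4.0],
    'B': [5.0, 6.0, None, None],
    'C': [9.0, None, None, 12.0],
})

rows_any = df.dropna(how='any')     # keep only rows with no missing values
rows_all = df.dropna(how='all')     # drop only rows where every value is missing
rows_thresh = df.dropna(thresh=2)   # keep rows with at least 2 non-missing values

print(list(rows_any.index))     # [0]
print(list(rows_all.index))     # [0, 1, 3]
print(list(rows_thresh.index))  # [0, 3]
```

Each call passes either how or thresh, never both, since the two options are alternatives.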
Example
This example shows how to drop rows with any missing values from a DataFrame.
python
import pandas as pd

# Create a sample DataFrame with missing values
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, None, 30, 22],
        'City': ['New York', 'Los Angeles', None, 'Chicago']}
df = pd.DataFrame(data)
print('Original DataFrame:')
print(df)

# Drop rows with any missing values
clean_df = df.dropna()
print('\nDataFrame after dropping rows with missing values:')
print(clean_df)
Output
Original DataFrame:
      Name   Age         City
0    Alice  25.0     New York
1      Bob   NaN  Los Angeles
2  Charlie  30.0         None
3    David  22.0      Chicago

DataFrame after dropping rows with missing values:
    Name   Age      City
0  Alice  25.0  New York
3  David  22.0   Chicago
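The same sample data can also be cleaned column-wise. As a short sketch, passing axis=1 drops every column that contains at least one missing value, which here leaves only the Name column. The variable name cols_clean is chosen just for this illustration.

```python
import pandas as pd

# Same sample data as in the example above
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, None, 30, 22],
        'City': ['New York', 'Los Angeles', None, 'Chicago']}
df = pd.DataFrame(data)

# axis=1 drops columns (not rows) that contain any missing value
cols_clean = df.dropna(axis=1)

print(list(cols_clean.columns))  # ['Name']
print(len(cols_clean))           # 4, all rows are kept
```

Dropping columns keeps every row, so it tends to suit cases where a few columns are sparsely filled and not needed.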
Common Pitfalls
Common mistakes when dropping missing values include:
- Forgetting to assign the result back or use inplace=True, so the original DataFrame remains unchanged.
- Using dropna() without specifying axis when you want to drop columns instead of rows.
- Not using subset when you want to check missing values only in specific columns.
python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# Wrong: This does not change df
df.dropna()
print('After dropna without assignment:')
print(df)

# Right: Assign back or use inplace
clean_df = df.dropna()
print('\nAfter dropna with assignment:')
print(clean_df)
Output
After dropna without assignment:
     A    B
0  1.0  4.0
1  NaN  5.0
2  3.0  NaN

After dropna with assignment:
     A    B
0  1.0  4.0
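The subset pitfall and the inplace=True alternative can be sketched with the same small DataFrame. This is an illustrative example only; the variable name only_a is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# subset: only column 'A' decides which rows are dropped,
# so row 2 survives even though its 'B' value is missing
only_a = df.dropna(subset=['A'])
print(list(only_a.index))  # [0, 2]

# inplace=True modifies df directly, so no assignment is needed
df.dropna(inplace=True)
print(list(df.index))      # [0]
```

Without subset, the first call would have dropped row 2 as well, because dropna() checks every column by default.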
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| axis | 0 to drop rows, 1 to drop columns | 0 |
| how | 'any' drops if any missing, 'all' if all missing | 'any' |
| thresh | Require minimum non-NA values to keep | None |
| subset | Columns to check for missing values | None |
| inplace | Modify original DataFrame if True | False |
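The parameters in the table can be combined. As a rough sketch, using axis=1 together with thresh keeps only the columns that have enough non-missing values. The column names and the threshold of 3 are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical data: 'full' has 4 values, 'mostly' has 3, 'sparse' has 1
df = pd.DataFrame({
    'full': [1, 2, 3, 4],
    'mostly': [1.0, None, 3.0, 4.0],
    'sparse': [None, None, None, 4.0],
})

# Keep columns that have at least 3 non-missing values
kept = df.dropna(axis=1, thresh=3)

print(list(kept.columns))  # ['full', 'mostly']
```

This can be a gentler alternative to plain axis=1, which would also remove the 'mostly' column because of its single missing value.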
Key Takeaways
Use df.dropna() to remove rows or columns with missing values in pandas.
Set axis=0 to drop rows, axis=1 to drop columns containing missing data.
Remember to assign the result back or use inplace=True to update the DataFrame.
Use the subset parameter to focus on specific columns when dropping missing values.
Use how='all' to drop only if all values are missing in a row or column.