How to Drop Missing Values in Python: Simple Guide
To drop missing values in Python, use the dropna() method from the pandas library on a DataFrame or Series. This method removes rows or columns that contain missing (NaN) values, which makes it a quick way to clean your data.
Syntax
The dropna() method is called on a pandas DataFrame or Series to remove missing values. On a DataFrame, the axis parameter specifies whether rows or columns are dropped. The how parameter controls whether a row or column is dropped when any of its values are missing or only when all of them are.
df.dropna(axis=0, how='any'): Drops rows with any missing values.
df.dropna(axis=1, how='all'): Drops columns where all values are missing.
df.dropna(axis=0, how='any')
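The four combinations of axis and how are easiest to compare on a small frame. The following is a minimal sketch; the DataFrame and its column names A, B, and C are made up for illustration and are not part of this guide's main example.

```python
import numpy as np
import pandas as pd

# Illustrative data: row 1 is missing in every column, column C is entirely missing
df = pd.DataFrame({
    "A": [1.0, np.nan, 3.0],
    "B": [4.0, np.nan, np.nan],
    "C": [np.nan, np.nan, np.nan],
})

rows_any = df.dropna(axis=0, how="any")  # every row has at least one NaN, so none survive
rows_all = df.dropna(axis=0, how="all")  # only row 1 is entirely NaN
cols_any = df.dropna(axis=1, how="any")  # every column has at least one NaN, so none survive
cols_all = df.dropna(axis=1, how="all")  # only column C is entirely NaN

print(rows_any.shape)  # (0, 3)
print(rows_all.shape)  # (2, 3)
print(cols_any.shape)  # (3, 0)
print(cols_all.shape)  # (3, 2)
```

With how='any', a single NaN is enough to remove a row or column, which is why both of those calls return an empty result on this data.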
Example
This example shows how to create a DataFrame with missing values and then drop rows that contain any missing values using dropna(). It demonstrates cleaning data by removing incomplete rows.
import pandas as pd

# Create a DataFrame with missing values
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, None, 30, 22],
        'City': ['New York', 'Los Angeles', None, 'Chicago']}
df = pd.DataFrame(data)

print("Original DataFrame:")
print(df)

# Drop rows with any missing values
clean_df = df.dropna()

print("\nDataFrame after dropping rows with missing values:")
print(clean_df)
Common Pitfalls
One common mistake is forgetting that dropna() returns a new DataFrame and leaves the original unchanged unless you pass inplace=True. Another is leaving axis at its default of 0 when you want to drop columns instead of rows. Also, be careful with the how parameter: 'any' drops a row or column if any value is missing, while 'all' drops it only if every value is missing.
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [None, None, 6]})

# Pitfall: dropna() returns a new DataFrame, so df itself is unchanged
wrong = df.dropna()
print("Original DataFrame after dropna without inplace:")
print(df)

# Right: modify the original DataFrame in place
df.dropna(inplace=True)
print("\nDataFrame after dropna with inplace=True:")
print(df)
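An alternative to inplace=True, sketched here with the same small frame, is to assign the returned DataFrame back to the same name. The reset_index(drop=True) step is an optional addition, not something the pitfall above requires: dropna() keeps the original index labels, so the surviving rows are not renumbered unless you ask for it.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [None, None, 6]})

# Reassign the returned DataFrame instead of passing inplace=True
df = df.dropna()

# Optional: renumber the rows from 0, since dropna() keeps the old index labels
df = df.reset_index(drop=True)

print(df)
```

Either approach leaves df holding only the complete rows; which one to use is largely a matter of style.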
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| axis | 0 to drop rows, 1 to drop columns | 0 |
| how | 'any' drops if any missing, 'all' drops if all missing | 'any' |
| thresh | Require minimum non-NA values to keep row/column | None |
| subset | Specify columns to check for missing values | None |
| inplace | Modify original DataFrame if True | False |
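The thresh and subset parameters in the table are not used elsewhere in this guide, so here is a minimal sketch of both, reusing the Name/Age/City data from the Example section. The variable names by_age and full_rows are illustrative. Recent pandas releases do not allow how and thresh to be passed together, so the sketch uses each parameter on its own.

```python
import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, None, 30, 22],
        'City': ['New York', 'Los Angeles', None, 'Chicago']}
df = pd.DataFrame(data)

# subset: only the Age column is checked, so Charlie's missing City is ignored
by_age = df.dropna(subset=['Age'])
print(by_age['Name'].tolist())     # ['Alice', 'Charlie', 'David']

# thresh: keep only rows with at least 3 non-missing values
full_rows = df.dropna(thresh=3)
print(full_rows['Name'].tolist())  # ['Alice', 'David']
```

Because this frame has three columns, thresh=3 gives the same rows as the default how='any'; a lower threshold such as thresh=2 would keep all four rows here.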
Key Takeaways
Use dropna() to remove missing values from DataFrames or Series.
dropna() returns a new object unless inplace=True is set.
Use axis=0 to drop rows and axis=1 to drop columns with missing data.
Use how='any' to drop if any value is missing, or how='all' to drop only if all values are missing.