How to Drop Missing Values in Pandas: Simple Guide

Use the dropna() method on a pandas DataFrame or Series to remove entries that contain missing values. For a DataFrame, axis=0 (the default) drops rows and axis=1 drops columns with missing data.

Syntax

The basic syntax to drop missing values in pandas is:

  • DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

Explanation:

  • axis=0: Drop rows containing missing values (default).
  • axis=1: Drop columns containing missing values.
  • how='any': Drop if any missing values are present.
  • how='all': Drop only if all values are missing.
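  • thresh=n: Require at least n non-missing values to keep the row/column. In recent pandas versions, thresh cannot be combined with how.
  • subset: Labels along the other axis to check for missing values (column names when dropping rows).
  • inplace=False: Return a new DataFrame by default; set to True to modify the original in place, in which case the method returns None.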
python
# Pass either how or thresh, not both: recent pandas versions raise a TypeError if both are given
df.dropna(axis=0, how='any', subset=None, inplace=False)
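
The parameters above are easier to compare side by side. The following is a minimal sketch, assuming a small made-up DataFrame (the column names A, B, C and the variable names are illustrative only), that shows how how='all', axis=1, thresh, and subset each select different rows or columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column C is entirely missing, row 1 is entirely missing
df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, 4.0],
    "B": [5.0, np.nan, 7.0, 8.0],
    "C": [np.nan, np.nan, np.nan, np.nan],
})

# Default (axis=0, how='any'): every row has a missing value in C, so nothing is left
default_result = df.dropna()

# how='all': drop only rows where every value is missing (row 1)
rows_not_empty = df.dropna(how="all")

# axis=1 with how='all': drop only columns that are entirely missing (column C)
cols_not_empty = df.dropna(axis=1, how="all")

# thresh=2: keep rows with at least 2 non-missing values (rows 0 and 3)
at_least_two = df.dropna(thresh=2)

# subset=['B']: check for missing values only in column B (drops row 1)
b_present = df.dropna(subset=["B"])

print(len(default_result))              # 0
print(rows_not_empty.index.tolist())    # [0, 2, 3]
print(cols_not_empty.columns.tolist())  # ['A', 'B']
print(at_least_two.index.tolist())      # [0, 3]
print(b_present.index.tolist())         # [0, 2, 3]
```

Note that how and thresh are used in separate calls here, since they express two different rules for deciding what to keep.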

Example

This example shows how to drop rows with any missing values from a DataFrame.

python
import pandas as pd

# Create a sample DataFrame with missing values
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, None, 30, 22],
        'City': ['New York', 'Los Angeles', None, 'Chicago']}

df = pd.DataFrame(data)

print('Original DataFrame:')
print(df)

# Drop rows with any missing values
clean_df = df.dropna()

print('\nDataFrame after dropping rows with missing values:')
print(clean_df)
Output
Original DataFrame:
      Name   Age         City
0    Alice  25.0     New York
1      Bob   NaN  Los Angeles
2  Charlie  30.0         None
3    David  22.0      Chicago

DataFrame after dropping rows with missing values:
    Name   Age      City
0  Alice  25.0  New York
3  David  22.0   Chicago
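
In the output above, the surviving rows keep their original labels (0 and 3), because dropna() does not renumber the index. If consecutive labels are wanted, one option is to chain reset_index(drop=True). This is a small sketch that reuses the example data; the variable name renumbered is only illustrative.

```python
import pandas as pd

# Same sample data as in the example above
df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, None, 30, 22],
    "City": ["New York", "Los Angeles", None, "Chicago"],
})

# dropna() keeps the original row labels, so the result is indexed 0 and 3
clean_df = df.dropna()

# reset_index(drop=True) renumbers the rows from 0 and discards the old labels
renumbered = clean_df.reset_index(drop=True)

print(clean_df.index.tolist())    # [0, 3]
print(renumbered.index.tolist())  # [0, 1]
```

Whether to renumber depends on the use case: keeping the original labels makes it possible to trace rows back to the source data.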

Common Pitfalls

Common mistakes when dropping missing values include:

  • Forgetting to assign the result back or use inplace=True, so the original DataFrame remains unchanged.
  • Using dropna() without specifying axis when you want to drop columns instead of rows.
  • Not using subset when you want to check missing values only in specific columns.
python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [4, 5, None]})

# Wrong: This does not change df
df.dropna()
print('After dropna without assignment:')
print(df)

# Right: Assign back or use inplace
clean_df = df.dropna()
print('\nAfter dropna with assignment:')
print(clean_df)
Output
After dropna without assignment:
     A    B
0  1.0  4.0
1  NaN  5.0
2  3.0  NaN

After dropna with assignment:
     A    B
0  1.0  4.0
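
The other pitfalls listed above (axis, subset, and inplace=True) can be illustrated the same way. This is a hedged sketch with made-up columns (id, score, note); it is not taken from the original example.

```python
import pandas as pd

# Hypothetical data: "note" is entirely missing, "score" has one missing value
df = pd.DataFrame({
    "id": [1, 2, 3],
    "score": [0.5, None, 0.9],
    "note": [None, None, None],
})

# Pitfall: the default call checks every column, so the empty "note" column removes all rows
all_rows_gone = df.dropna()

# subset limits the check to "score", so only row 1 is dropped
scored = df.dropna(subset=["score"])

# axis=1 drops columns instead of rows: "score" and "note" both contain missing values
complete_cols = df.dropna(axis=1)

# inplace=True modifies df directly and returns None, so the result is not assigned
df.dropna(subset=["score"], inplace=True)

print(len(all_rows_gone))              # 0
print(scored.index.tolist())           # [0, 2]
print(complete_cols.columns.tolist())  # ['id']
print(df.index.tolist())               # [0, 2]
```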

Quick Reference

Parameter | Description | Default
axis | 0 to drop rows, 1 to drop columns | 0
how | 'any' drops if any missing, 'all' if all missing | 'any'
thresh | Require minimum non-NA values to keep | None
subset | Columns to check for missing values | None
inplace | Modify original DataFrame if True | False

Key Takeaways

  • Use df.dropna() to remove rows or columns with missing values in pandas.
  • Set axis=0 to drop rows or axis=1 to drop columns containing missing data.
  • Remember to assign the result back or use inplace=True to update the DataFrame.
  • Use the subset parameter to focus on specific columns when dropping missing values.
  • Use how='all' to drop a row or column only if all of its values are missing.