
How to Drop Missing Values in pandas DataFrames

Use the dropna() method in pandas to remove rows or columns that contain missing values (NaN, None, or NaT). You can choose whether to drop rows or columns, and whether a single missing value is enough to trigger the drop or all values must be missing.

Syntax

The dropna() method removes missing values from a DataFrame or Series. Key parameters include:

  • axis=0: Drop rows with missing values (default).
  • axis=1: Drop columns with missing values.
  • how='any': Drop if any value is missing (default).
  • how='all': Drop only if all values are missing.
  • subset=[columns]: Specify columns to check for missing values.
  • inplace=False: Return a new object by default; set to True to modify in place.
python
DataFrame.dropna(axis=0, how='any', subset=None, inplace=False)
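The difference between how='any' and how='all' is easiest to see on a small frame. The sketch below uses made-up sample data (the column names A and B are illustrative only), where one row is entirely missing and another is only partly missing.

```python
import pandas as pd

# Hypothetical sample data: row 1 is entirely missing, row 2 is partly missing.
df = pd.DataFrame({
    "A": [1.0, None, 3.0],
    "B": [4.0, None, None],
})

# how="any" drops a row as soon as one value in it is missing.
any_rows = df.dropna(how="any")

# how="all" drops a row only when every value in it is missing.
all_rows = df.dropna(how="all")

print(any_rows.index.tolist())  # [0]
print(all_rows.index.tolist())  # [0, 2]
```

Row 2 survives how='all' because column A still holds a value there.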

Example

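This example shows how to drop rows with any missing values and how to drop columns with any missing values. Because every column in the sample data contains at least one missing value, dropping columns leaves a DataFrame with no columns.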

python
import pandas as pd

data = {'Name': ['Alice', 'Bob', None, 'David'],
        'Age': [25, None, 30, 22],
        'City': ['NY', 'LA', 'SF', None]}

df = pd.DataFrame(data)

# Drop rows with any missing values
rows_dropped = df.dropna()

# Drop columns with any missing values
cols_dropped = df.dropna(axis=1)

print('Original DataFrame:')
print(df)
print('\nAfter dropping rows with missing values:')
print(rows_dropped)
print('\nAfter dropping columns with missing values:')
print(cols_dropped)
Output
Original DataFrame:
    Name   Age  City
0  Alice  25.0    NY
1    Bob   NaN    LA
2   None  30.0    SF
3  David  22.0  None

After dropping rows with missing values:
    Name   Age City
0  Alice  25.0   NY

After dropping columns with missing values:
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3]
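To see in advance how much a drop would remove, it can help to count missing values first. The following is a minimal sketch that reuses the example data and relies on isna(), which is not used elsewhere in this article. The variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", None, "David"],
    "Age": [25, None, 30, 22],
    "City": ["NY", "LA", "SF", None],
})

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Number of rows that contain at least one missing value.
rows_with_missing = int(df.isna().any(axis=1).sum())

print(missing_per_column)
print(rows_with_missing)  # 3
```

Every column has one missing value, which is why df.dropna(axis=1) returns an empty DataFrame, and three of the four rows have a gap, which is why df.dropna() keeps only row 0.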

Common Pitfalls

Common mistakes include:

  • Forgetting that dropna() returns a new DataFrame unless inplace=True is set.
  • Not specifying axis correctly, which can lead to dropping rows instead of columns or vice versa.
  • Using dropna() without considering which columns to check, which might remove more data than intended.
python
import pandas as pd

data = {'A': [1, None, 3], 'B': [None, 2, 3]}
df = pd.DataFrame(data)

# Wrong: expecting original df to change but it doesn't
wrong = df.dropna()
print('Wrong (df unchanged):')
print(df)

# Right: use inplace=True to modify the original df.
# With inplace=True, dropna() returns None, so do not assign the result.
df.dropna(inplace=True)
print('\nRight (df changed):')
print(df)
Output
Wrong (df unchanged):
     A    B
0  1.0  NaN
1  NaN  2.0
2  3.0  3.0

Right (df changed):
     A    B
2  3.0  3.0
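The third pitfall, dropping more data than intended, can be avoided with subset. The sketch below reuses the same small frame and also shows reassignment as an alternative to inplace=True. The variable name cleaned is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, None, 3],
    "B": [None, 2, 3],
})

# Only column "A" is checked, so row 0 is kept even though "B" is missing there.
cleaned = df.dropna(subset=["A"])
print(cleaned.index.tolist())  # [0, 2]

# Reassigning the result is an alternative to inplace=True.
df = df.dropna()
print(df.index.tolist())  # [2]
```

With subset, two rows survive instead of one, because a missing value in B no longer triggers a drop.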

Quick Reference

Parameter | Description | Default
--------- | ----------- | -------
axis | 0 to drop rows, 1 to drop columns | 0
how | 'any' drops if any value is missing, 'all' drops only if all values are missing | 'any'
subset | List of columns to check for missing values | None
inplace | Modify the original DataFrame if True | False

Key Takeaways

  • Use df.dropna() to remove rows or columns with missing values in pandas.
  • Set axis=1 to drop columns instead of rows.
  • Remember that dropna() returns a new DataFrame unless inplace=True is set.
  • Use how='all' to drop a row or column only if all of its values are missing.
  • Specify subset to limit which columns are checked for missing values.