
How to Check the Percentage of Missing Values in a pandas DataFrame

To check the percentage of missing values in a pandas DataFrame, use df.isnull().mean() * 100. This calculates the fraction of missing values per column and converts it to a percentage.

Syntax

The main syntax to find the percentage of missing values in pandas is:

python
df.isnull().mean() * 100

  • df.isnull(): Returns a boolean DataFrame of the same shape, with True where a value is missing (such as NaN or None).
  • .mean(): Averages each column, treating True as 1 and False as 0, which gives the fraction of missing values per column.
  • Multiplying by 100 converts the fraction to a percentage.
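
To make the three steps of this expression visible, the sketch below runs them one at a time. The DataFrame and the variable names (mask, fraction, percentage) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical two-column DataFrame used only for illustration
df = pd.DataFrame({"x": [1.0, None, 3.0, None],
                   "y": ["a", "b", None, "d"]})

mask = df.isnull()           # boolean DataFrame, True where a value is missing
fraction = mask.mean()       # per-column mean of the booleans (True = 1, False = 0)
percentage = fraction * 100  # convert the fractions to percentages

print(mask)
print(fraction)
print(percentage)
```

Here column x has 2 missing values out of 4 and column y has 1 out of 4, so the percentages come out as 50.0 and 25.0.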

Example

This example shows how to create a DataFrame with missing values and calculate the percentage of missing data per column.

python
import pandas as pd

data = {'Name': ['Alice', 'Bob', None, 'David'],
        'Age': [25, None, 30, 22],
        'City': ['New York', 'Los Angeles', 'Chicago', None]}

df = pd.DataFrame(data)

missing_percentage = df.isnull().mean() * 100
print(missing_percentage)
Output
Name    25.0
Age     25.0
City    25.0
dtype: float64

Common Pitfalls

Common mistakes when checking the percentage of missing values include:

  • Using df.isnull().sum() alone, which gives counts, not percentages.
  • Forgetting to multiply by 100, which leaves a fraction between 0 and 1 instead of a percentage.
  • Reporting only per-column percentages when you need a figure for each row (use mean(axis=1)) or a single figure for the whole dataset.

Use mean() to get the fraction and multiply by 100 to get a percentage.

python
import pandas as pd

data = {'A': [1, None, 3], 'B': [None, None, 6]}
df = pd.DataFrame(data)

# Wrong: gives counts, not percentage
print(df.isnull().sum())

# Right: gives percentage
print(df.isnull().mean() * 100)
Output
A    1
B    2
dtype: int64
A    33.333333
B    66.666667
dtype: float64
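
The third pitfall concerns rows and the dataset as a whole. The following is a minimal sketch, reusing the DataFrame from the example above, of one way to get a per-row percentage and a single overall percentage. The variable names overall_pct and row_pct are illustrative, and converting the mask with to_numpy() is just one of several ways to average over every cell.

```python
import pandas as pd

data = {'A': [1, None, 3], 'B': [None, None, 6]}
df = pd.DataFrame(data)

# Percentage of missing cells across the whole DataFrame
overall_pct = df.isnull().to_numpy().mean() * 100

# Percentage of missing values in each row (axis=1 averages across columns)
row_pct = df.isnull().mean(axis=1) * 100

print(overall_pct)
print(row_pct)
```

With 3 missing cells out of 6, the overall figure is 50.0. The rows come out as 50.0, 100.0, and 0.0.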

Quick Reference

| Method | Description | Output Type |
|---|---|---|
| df.isnull() | Detects missing values, returns boolean DataFrame | DataFrame of bool |
| df.isnull().sum() | Counts missing values per column | Series of int |
| df.isnull().mean() | Fraction of missing values per column | Series of float |
| df.isnull().mean() * 100 | Percentage of missing values per column | Series of float |
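
The count and percentage methods in this table can be combined into one summary. This is a rough sketch, not a standard pandas helper: the column names missing_count and missing_pct are made up for illustration, and rounding to two decimals is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, None, 3], 'B': [None, None, 6]})

mask = df.isnull()  # compute the boolean mask once and reuse it

# One row per original column, with the count and the percentage side by side
summary = pd.DataFrame({
    'missing_count': mask.sum(),
    'missing_pct': (mask.mean() * 100).round(2),
})

print(summary)
```

Computing the mask once avoids scanning the DataFrame twice, which can matter on large datasets.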

Key Takeaways

  • Use df.isnull().mean() * 100 to get the percentage of missing values per column.
  • Multiplying by 100 converts the fraction to a readable percentage.
  • df.isnull().sum() only gives counts, not percentages.
  • Check missing values per column to understand data quality.
  • Check these percentages before analysis so you can decide how to handle the missing data.