
How to Calculate Percentage in pandas: Simple Guide

To calculate percentage in pandas, divide the part by the whole and multiply by 100 using vectorized operations. For example, use df['percentage'] = (df['part'] / df['whole']) * 100 to add a percentage column.

Syntax

The basic syntax to calculate a percentage in pandas is:

python
df['percentage'] = (df['part'] / df['whole']) * 100

Here, df is your DataFrame, 'part' is the column with the numerator values, and 'whole' is the column with the denominator values. The division is applied element-wise, row by row, and multiplying by 100 converts the resulting fraction to a percentage.
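
As a minimal sketch of this syntax, assume a hypothetical DataFrame of quiz results. The column names score and max_score are chosen only for illustration and stand in for 'part' and 'whole':

```python
import pandas as pd

# Hypothetical data: points earned and points available per quiz
df = pd.DataFrame({
    'score': [18, 45, 7],
    'max_score': [20, 50, 10],
})

# Element-wise division, scaled to a percentage and rounded for display
df['percentage'] = ((df['score'] / df['max_score']) * 100).round(1)
print(df)
```

Rounding is optional; it only keeps floating-point noise out of the printed column.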

Example

This example calculates each product's share of total sales as a percentage.

python
import pandas as pd

data = {'product': ['A', 'B', 'C'], 'sales': [50, 30, 20]}
df = pd.DataFrame(data)
total_sales = df['sales'].sum()
df['percentage'] = (df['sales'] / total_sales) * 100
print(df)
Output
  product  sales  percentage
0       A     50   50.000000
1       B     30   30.000000
2       C     20   20.000000
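
The same idea extends to a percentage within each group rather than of the grand total. The sketch below assumes a hypothetical 'region' column that is not part of the example above, and uses groupby with transform('sum') to get each row's group total:

```python
import pandas as pd

# Hypothetical data: the 'region' column is an assumption added for illustration
df = pd.DataFrame({
    'region': ['East', 'East', 'West', 'West'],
    'product': ['A', 'B', 'A', 'B'],
    'sales': [50, 150, 40, 40],
})

# transform('sum') returns each row's group total, aligned to the original index
region_total = df.groupby('region')['sales'].transform('sum')
df['pct_of_region'] = (df['sales'] / region_total) * 100
print(df)
```

Because transform keeps the original row order and length, the division stays a plain vectorized operation and the percentages in each region sum to 100.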

Common Pitfalls

Common mistakes when calculating percentages in pandas include:

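  • Dividing by zero or by missing values. pandas does not raise an error here: a nonzero value divided by zero gives inf, while 0 / 0 and division by a missing value give NaN.
  • Forgetting to multiply by 100, resulting in fractions instead of percentages.
  • Using floor division (//) instead of true division (/), which discards the fractional part. In Python 3, / on numeric pandas columns returns floats.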

Always check your denominator for zeros or missing data before dividing.
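
A short sketch of what these pitfalls look like in practice, using hypothetical values chosen to trigger each case:

```python
import numpy as np
import pandas as pd

# Hypothetical data: zero, missing, and ordinary denominators
df = pd.DataFrame({'part': [10, 0, 5, 1], 'whole': [0, 0, None, 3]})

# Unguarded division: nonzero/0 gives inf, 0/0 gives NaN, x/missing gives NaN
raw = df['part'] / df['whole']
print(raw.tolist())

# Without * 100 the last row is a fraction, not a percentage
print(raw.iloc[3])

# Floor division drops the fractional part entirely
floored = df['part'] // df['whole']
print(floored.iloc[3])
```

None of these cases raises an exception, which is why they are easy to miss until the results are inspected.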

python
import numpy as np
import pandas as pd

data = {'part': [10, 0, 5], 'whole': [20, 0, None]}
df = pd.DataFrame(data)

# Risky: 0 / 0 and division by a missing value both give NaN here,
# and a nonzero part divided by 0 would give inf
# df['percentage'] = (df['part'] / df['whole']) * 100

# Safer: compute the percentage only where the denominator is positive, else use 0
df['percentage'] = np.where(df['whole'] > 0, (df['part'] / df['whole']) * 100, 0)
print(df)
print(df)
Output
   part  whole  percentage
0    10   20.0        50.0
1     0    0.0         0.0
2     5    NaN         0.0
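
Filling with 0 makes an undefined percentage look like a real 0%. One possible alternative, sketched here under the assumption that NaN is an acceptable marker for "undefined" in your data, is to treat zero denominators as missing and fill a default only where one is needed:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'part': [10, 0, 5], 'whole': [20, 0, None]})

# Treat a zero denominator as missing so the result is NaN rather than inf or 0
safe_whole = df['whole'].replace(0, np.nan)
df['percentage'] = (df['part'] / safe_whole) * 100

# Optional: substitute a default only where a number is required downstream
df['percentage_filled'] = df['percentage'].fillna(0)
print(df)
```

Keeping the NaN column alongside the filled one lets later steps tell a genuine 0% apart from a value that could not be computed.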

Quick Reference

| Step | Description | Example |
| --- | --- | --- |
| 1 | Identify numerator column (part) | df['part'] |
| 2 | Identify denominator column (whole) | df['whole'] |
| 3 | Divide part by whole | df['part'] / df['whole'] |
| 4 | Multiply by 100 to get percentage | (df['part'] / df['whole']) * 100 |
| 5 | Handle zero or missing values to avoid errors | Use np.where or fillna |

Key Takeaways

  • Calculate a percentage in pandas by dividing the part by the whole and multiplying by 100.
  • Always check for zeros or missing values in the denominator, since they produce inf or NaN rather than an error.
  • Use vectorized operations for fast and efficient calculations.
  • Remember to multiply by 100 to convert fractions to percentages.
  • Use np.where or similar methods to handle edge cases safely.