How to Calculate Percentage in pandas: Simple Guide
To calculate percentage in pandas, divide the part by the whole and multiply by 100 using vectorized operations. For example, use df['percentage'] = (df['part'] / df['whole']) * 100 to add a percentage column.
Syntax
The basic syntax to calculate percentage in pandas is:
df['percentage'] = (df['part'] / df['whole']) * 100
Here, df is your DataFrame, 'part' is the column with the numerator values, and 'whole' is the column with the denominator values. Multiplying by 100 converts the fraction to a percentage.
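As a minimal runnable sketch of the syntax above, the snippet below builds a small DataFrame and adds the percentage column. The numbers are made up for illustration; only the column names 'part' and 'whole' come from the syntax line.

```python
import pandas as pd

# Hypothetical data: the values are illustrative only.
df = pd.DataFrame({'part': [5, 12, 9], 'whole': [20, 48, 12]})

# Element-wise division: each row's part is divided by that row's whole,
# then scaled by 100 to turn the fraction into a percentage.
df['percentage'] = (df['part'] / df['whole']) * 100

print(df)
```

Because the operation is vectorized, no explicit loop over rows is needed.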
Example
This example shows how to calculate the percentage of sales per product compared to total sales.
python
import pandas as pd

data = {'product': ['A', 'B', 'C'], 'sales': [50, 30, 20]}
df = pd.DataFrame(data)

total_sales = df['sales'].sum()
df['percentage'] = (df['sales'] / total_sales) * 100
print(df)
Output
product sales percentage
0 A 50 50.000000
1 B 30 30.000000
2 C 20 20.000000
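When each row is a share of one total, a quick sanity check is that the shares add up to 100. The sketch below repeats the sales example and adds that check. The rounded display column is an assumption added for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'product': ['A', 'B', 'C'], 'sales': [50, 30, 20]})
df['percentage'] = (df['sales'] / df['sales'].sum()) * 100

# Shares of a total should add up to 100; isclose tolerates float rounding.
total = df['percentage'].sum()
print(np.isclose(total, 100.0))

# Optional: round for display only (assumption: one decimal place is enough).
df['percentage_display'] = df['percentage'].round(1)
print(df)
```

Keeping the unrounded column for further calculations and rounding only for display avoids accumulating rounding error.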
Common Pitfalls
Common mistakes when calculating percentages in pandas include:
- Dividing by zero or by missing values, which in pandas typically produces inf or NaN instead of raising an error.
- Forgetting to multiply by 100, resulting in fractions instead of percentages.
- Using floor division (//) instead of true division (/), or integer division in Python 2 (rare now), which truncates decimals.
Always check your denominator for zeros or missing data before dividing.
python
import numpy as np
import pandas as pd

data = {'part': [10, 0, 5], 'whole': [20, 0, None]}
df = pd.DataFrame(data)

# Wrong: dividing by zero or by a missing value yields NaN or inf
# df['percentage'] = (df['part'] / df['whole']) * 100

# Right: keep the percentage only where the denominator is positive, else use 0
df['percentage'] = np.where(df['whole'] > 0, (df['part'] / df['whole']) * 100, 0)
print(df)
Output
part whole percentage
0 10 20.0 50.0
1 0 0.0 0.0
2 5 NaN 0.0
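To see why the guard matters, it can help to look at what unguarded division produces and to count the rows that need special handling. The sketch below uses hypothetical data with one nonzero-over-zero row, one zero-over-zero row, and one missing denominator.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a zero denominator, a 0/0 row, and a missing denominator.
df = pd.DataFrame({'part': [10, 5, 0, 5], 'whole': [20, 0, 0, None]})

# Unguarded division: pandas does not raise, it returns inf or NaN.
raw = (df['part'] / df['whole']) * 100

# Count the rows that would need special handling.
n_zero = int((df['whole'] == 0).sum())
n_missing = int(df['whole'].isna().sum())

print(n_zero, n_missing)
print(raw.tolist())
```

Here 5 / 0 gives inf, while 0 / 0 and 5 / NaN both give NaN, so checking only for NaN would miss the inf row.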
Quick Reference
| Step | Description | Example |
|---|---|---|
| 1 | Identify numerator column (part) | df['part'] |
| 2 | Identify denominator column (whole) | df['whole'] |
| 3 | Divide part by whole | df['part'] / df['whole'] |
| 4 | Multiply by 100 to get percentage | (df['part'] / df['whole']) * 100 |
| 5 | Handle zero or missing values to avoid errors | Use np.where or fillna |
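Step 5 mentions fillna as an alternative to np.where. One possible sketch of that route is below: zero denominators are first replaced with NaN, then undefined results are filled. Filling with 0 is an assumption here; leaving NaN in place may be more honest when "undefined" and "0%" mean different things in your data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'part': [10, 0, 5], 'whole': [20, 0, None]})

# Turn zero denominators into NaN so the division yields NaN instead of inf.
safe_whole = df['whole'].replace(0, np.nan)

# Assumption: undefined percentages should be reported as 0.
df['percentage'] = ((df['part'] / safe_whole) * 100).fillna(0)

print(df)
```

On this data the result matches the np.where version: 50.0, 0.0, and 0.0.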
Key Takeaways
Calculate percentage by dividing the part by the whole and multiplying by 100 in pandas.
Always check for zeros or missing values in the denominator to avoid inf or NaN results.
Use vectorized operations for fast and efficient calculations.
Remember to multiply by 100 to convert fractions to percentages.
Use np.where or similar methods to handle edge cases safely.