How to Use pct_change in pandas for Percentage Change
Use pct_change() in pandas to calculate the percentage change between the current element and a prior element in a Series or DataFrame. It returns the relative change as a decimal fraction (0.2 means a 20% increase), so multiply by 100 if you need a percentage. Use the periods parameter to set how many rows back to compare.
Syntax
The basic syntax of pct_change() is:
```python
DataFrame.pct_change(periods=1, fill_method='pad', limit=None, freq=None, axis=0)
```
- periods: Number of periods to shift for calculating change (default is 1).
- fill_method: Method to fill missing values before calculation (default is 'pad'). Deprecated since pandas 2.1; fill gaps explicitly, for example with ffill(), before calling pct_change().
- limit: Maximum number of consecutive NaNs to fill. Deprecated since pandas 2.1 together with fill_method.
- freq: Frequency to use for time series data.
- axis: Axis along which to calculate change (0 for index, 1 for columns).
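To illustrate the axis argument, here is a minimal sketch that computes the change across columns instead of down rows. The DataFrame, its labels, and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical yearly figures per product; all names are illustrative only
df = pd.DataFrame(
    {"2021": [100.0, 200.0], "2022": [110.0, 250.0], "2023": [121.0, 300.0]},
    index=["widget", "gadget"],
)

# axis=1 compares each column with the column to its left,
# so the first column has nothing to compare against and is NaN
by_column = df.pct_change(axis=1)
print(by_column)
```

With the default axis=0, the same call would instead compare "gadget" with "widget" within each column.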
Example
This example shows how to calculate the percentage change of values in a pandas Series and DataFrame.
```python
import pandas as pd

# Series example
s = pd.Series([100, 120, 150, 180])
print('Series percentage change:')
print(s.pct_change())

# DataFrame example
df = pd.DataFrame({
    'A': [100, 110, 121, 133],
    'B': [200, 220, 242, 266]
})
print('\nDataFrame percentage change:')
print(df.pct_change())
```
Output
```
Series percentage change:
0         NaN
1    0.200000
2    0.250000
3    0.200000
dtype: float64

DataFrame percentage change:
          A         B
0       NaN       NaN
1  0.100000  0.100000
2  0.100000  0.100000
3  0.099174  0.099174
```
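These numbers can be reproduced by hand: with periods=n, pct_change() matches dividing each value by the value n rows earlier and subtracting 1. The sketch below reuses the Series from the example; the variable names manual, two_back, and percent are illustrative.

```python
import pandas as pd

s = pd.Series([100, 120, 150, 180])

# Default lag of one row, written out by hand with shift()
manual = s / s.shift(1) - 1
print(manual)

# Compare each value with the value two rows back instead of one;
# the first two rows have no such value and come back as NaN
two_back = s.pct_change(periods=2)
print(two_back)

# Scale the decimal fraction to a percentage
percent = s.pct_change() * 100
print(percent.round(1))
```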
Common Pitfalls
Common mistakes when using pct_change() include:
- Not handling the first row, which always returns NaN because there is no prior data to compare.
- Using periods incorrectly, which can lead to unexpected results if you want to compare with a different lag.
- Ignoring missing values that can affect the calculation if not handled properly.
```python
import pandas as pd

# Wrong: expecting no NaN in first row
s = pd.Series([100, 120, 150])
print(s.pct_change())  # First value is NaN

# Right: fill NaN if needed
print(s.pct_change().fillna(0))  # Replace NaN with 0 for first row
```
Output
```
0         NaN
1    0.200000
2    0.250000
dtype: float64
0    0.00
1    0.20
2    0.25
dtype: float64
```
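For the missing-value pitfall, one option is to make the fill explicit rather than rely on fill_method. The sketch below assumes a Series with a gap in the middle. Passing fill_method=None turns off automatic filling, and ffill() is applied by hand where forward-filling is actually wanted. The names raw and filled are illustrative.

```python
import pandas as pd

# A Series with a missing observation in the middle
s = pd.Series([100.0, None, 121.0])

# No filling: every comparison that touches the gap stays NaN
raw = s.pct_change(fill_method=None)
print(raw)

# Forward-fill first, then compare; the gap row now reports 0.0 change
# and the last row is compared with the carried-forward 100.0
filled = s.ffill().pct_change(fill_method=None)
print(filled)
```

Whether forward-filling is appropriate depends on the data; for prices it treats a missing day as "no change", which may or may not be what you want.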
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| periods | Number of periods to shift for calculating change | 1 |
| fill_method | Method to fill missing values before calculation | 'pad' |
| limit | Maximum number of consecutive NaNs to fill | None |
| freq | Frequency for time series data | None |
| axis | Axis to calculate change (0=index, 1=columns) | 0 |
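As a rough sketch of the freq parameter, the example below assumes a Series indexed by dates with one day missing. With freq="D", each value is compared with the value exactly one calendar day earlier, so a row with no observation on the previous day yields NaN instead of being compared with the previous row. The dates, values, and variable names are invented.

```python
import pandas as pd

# Invented daily prices; 2024-01-03 is left out on purpose
idx = pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"])
prices = pd.Series([100.0, 110.0, 121.0], index=idx)

# Row-based: each value against the previous row, whatever its date
by_row = prices.pct_change(fill_method=None)

# Time-based: each value against the value one calendar day earlier
by_day = prices.pct_change(freq="D", fill_method=None)

print(by_row)
print(by_day)
```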
Key Takeaways
Use pct_change() to find the percentage change between rows in a Series or DataFrame.
The first row always returns NaN because there is no previous data to compare.
Adjust the periods parameter to compare with different lag intervals.
Handle NaN values after pct_change() if you want to avoid missing data in results.
pct_change() works along the specified axis and accepts a freq argument for time series data.