
How to Use pct_change in pandas for Percentage Change

Use pct_change() in pandas to calculate the percentage change between the current element and a prior element in a Series or DataFrame. It returns the relative change as a decimal fraction (0.20 means a 20% increase), so multiply the result by 100 if you need percent values. Use the periods parameter to choose how many rows back the comparison goes.

Syntax

The basic syntax of pct_change() is:

python
DataFrame.pct_change(periods=1, fill_method='pad', limit=None, freq=None, **kwargs)

  • periods: Number of periods to shift when calculating the change (default is 1).
  • fill_method: How to fill missing values before the calculation. The default is 'pad' (forward fill) in older pandas; since pandas 2.1 every option except None is deprecated.
  • limit: Maximum number of consecutive NaNs to fill (also deprecated since pandas 2.1).
  • freq: Time increment to shift by for time series data, for example 'D' for one day.
  • axis: Axis along which to calculate the change (0 for index, 1 for columns; default is 0). It is passed as a keyword argument.
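To make the periods and axis parameters more concrete, here is a minimal sketch. The column names and values are made up for illustration, and axis is passed as a keyword argument.

```python
import pandas as pd

# Illustrative data (hypothetical values): two columns, three rows
df = pd.DataFrame({
    'q1': [100.0, 200.0, 400.0],
    'q2': [110.0, 250.0, 300.0],
})

# periods=2 compares each row with the row two positions earlier,
# so the first two rows have no reference value and come back as NaN
two_back = df.pct_change(periods=2)

# axis=1 compares each column with the column to its left,
# so the first column is all NaN
across = df.pct_change(axis=1)

print(two_back)
print(across)
```

With periods=2 only the last row has a value, because it is the only row with a row two positions before it. With axis=1 the q2 column holds the change relative to q1 in the same row.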

Example

This example shows how to calculate the percentage change of values in a pandas Series and DataFrame.

python
import pandas as pd

# Series example
s = pd.Series([100, 120, 150, 180])
print('Series percentage change:')
print(s.pct_change())

# DataFrame example
df = pd.DataFrame({
    'A': [100, 110, 121, 133],
    'B': [200, 220, 242, 266]
})
print('\nDataFrame percentage change:')
print(df.pct_change())
Output
Series percentage change:
0     NaN
1    0.20
2    0.25
3    0.20
dtype: float64

DataFrame percentage change:
          A         B
0       NaN       NaN
1  0.100000  0.100000
2  0.100000  0.100000
3  0.099174  0.099174
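For data without missing values, pct_change() with the default periods=1 is the same as dividing each value by the previous one and subtracting 1. The sketch below reuses the Series from the example to check that equivalence; the variable names manual and builtin are only illustrative.

```python
import pandas as pd

s = pd.Series([100, 120, 150, 180])

# Manual version: divide each value by the previous one and subtract 1
manual = s / s.shift(1) - 1

# Built-in version for comparison
builtin = s.pct_change()

# Both have NaN in the first row and the same values everywhere else
print(manual.equals(builtin))

# Multiply by 100 to express the change in percent instead of a fraction
print((builtin * 100).round(2))
```

Writing the calculation out by hand is mainly useful for understanding what the method does; in practice pct_change() is shorter and also handles the periods and freq options.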

Common Pitfalls

Common mistakes when using pct_change() include:

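  • Forgetting that the first row is always NaN because there is no prior value to compare against.
  • Leaving periods at its default of 1 when you want a different lag. periods=n compares each value with the one n rows earlier, and the first n rows are NaN.
  • Ignoring missing values in the input. Depending on the pandas version, they are either forward-filled before the calculation or left in place so that NaN propagates into the result.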
python
import pandas as pd

# Wrong: expecting no NaN in first row
s = pd.Series([100, 120, 150])
print(s.pct_change())  # First value is NaN

# Right: fill NaN if needed
print(s.pct_change().fillna(0))  # Replace NaN with 0 for first row
Output
0     NaN
1    0.20
2    0.25
dtype: float64
0    0.00
1    0.20
2    0.25
dtype: float64
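The missing-value pitfall is easier to see with a gap in the input. The following is a sketch with a hypothetical three-value Series; it passes fill_method=None so that no implicit filling happens, and calls ffill() explicitly where forward filling is wanted.

```python
import numpy as np
import pandas as pd

# Hypothetical series with a gap in the middle
s = pd.Series([100.0, np.nan, 121.0])

# fill_method=None: no filling, so the gap propagates to every
# change that touches it
raw = s.pct_change(fill_method=None)

# Explicit forward fill first: the gap row shows 0 change and the
# next row is measured against the last known value (100)
filled = s.ffill().pct_change(fill_method=None)

print(raw)
print(filled)
```

Filling explicitly keeps the choice visible in the code, so the result does not depend on which default a given pandas version applies.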

Quick Reference

| Parameter | Description | Default |
| --- | --- | --- |
| periods | Number of periods to shift for calculating change | 1 |
| fill_method | Method to fill missing values before calculation (deprecated since pandas 2.1) | 'pad' |
| limit | Maximum number of consecutive NaNs to fill (deprecated since pandas 2.1) | None |
| freq | Frequency for time series data | None |
| axis | Axis to calculate change (0=index, 1=columns) | 0 |
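The freq parameter is the least obvious entry in the table, so here is a small sketch of how it differs from the default row-based comparison. The dates and prices are invented for illustration, and the Series deliberately skips one day.

```python
import pandas as pd

# Hypothetical daily prices with 2024-01-03 missing from the index
idx = pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-04'])
prices = pd.Series([100.0, 110.0, 121.0], index=idx)

# Default: compare with the previous row, whatever its date
by_row = prices.pct_change()

# freq='D': compare with the value exactly one calendar day earlier;
# 2024-01-04 has no value for 2024-01-03, so its change is NaN
by_day = prices.pct_change(freq='D')

print(by_row)
print(by_day)
```

The row-based version reports a change for 2024-01-04 even though it spans two days, while the freq-based version leaves that entry as NaN because no value exists one day earlier.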

Key Takeaways

  • Use pct_change() to find the percentage change between rows in a Series or DataFrame.
  • The first row is always NaN because there is no previous value to compare against.
  • Adjust the periods parameter to compare with a different lag.
  • Handle NaN values after pct_change() if you need a result without missing data.
  • pct_change() works along the axis you specify and accepts a freq for time series data.