How to Decompose Time Series in Python with sklearn

To decompose a time series in Python, use the seasonal_decompose function from the statsmodels library. scikit-learn does not provide a seasonal decomposition function of its own, but the output of seasonal_decompose fits well into sklearn workflows. The function splits the series into trend, seasonal, and residual components for analysis and modeling.

Syntax

The seasonal_decompose function has this basic syntax:

  • series: The time series data, typically a pandas Series, passed as the first argument (named x in statsmodels).
  • model: Type of decomposition, either 'additive' or 'multiplicative'.
  • period: The number of observations per cycle (e.g., 12 for monthly data with yearly seasonality).

The function returns a DecomposeResult object whose observed, trend, seasonal, and resid attributes hold the original series and its three components.

python
from statsmodels.tsa.seasonal import seasonal_decompose

decomposition = seasonal_decompose(series, model='additive', period=period)
trend = decomposition.trend
seasonal = decomposition.seasonal
residual = decomposition.resid

Example

This example shows how to decompose a monthly time series of airline passengers into trend, seasonal, and residual parts using seasonal_decompose.

python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Load sample data
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
data = pd.read_csv(url, parse_dates=['Month'], index_col='Month')

# Decompose the time series
result = seasonal_decompose(data['Passengers'], model='multiplicative', period=12)

# Plot the components
result.plot()
plt.show()
Output
A plot window showing four graphs: observed data, trend, seasonal, and residual components.

Common Pitfalls

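  • Setting period to a value that does not match the real cycle length (for example, 4 instead of 12 for monthly data with yearly seasonality) produces a misleading seasonal component.
  • Using an additive model on data whose seasonal swings grow with the level of the series leaves seasonal structure in the residuals.
  • Missing values raise a ValueError, and data without a datetime index requires an explicit period. A multiplicative model also requires strictly positive values.

Always check your data frequency and handle missing values before decomposition.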

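python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Reuses `data` from the example above.
# Note: with the DatetimeIndex loaded above, statsmodels can usually infer the
# monthly frequency, so omitting period on data['Passengers'] itself may not fail.
# Wrong: a plain NumPy array has no date index, so the period cannot be inferred
try:
    seasonal_decompose(data['Passengers'].to_numpy(), model='multiplicative')
except ValueError as e:
    print(f'Error: {e}')

# Right: specify period=12 for monthly data
result = seasonal_decompose(data['Passengers'], model='multiplicative', period=12)
print('Decomposition successful with correct period.')
Output (the exact error text varies by statsmodels version)
Error: You must specify a period or x must be a pandas object with a DatetimeIndex with a freq.
Decomposition successful with correct period.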

Quick Reference

Remember these tips for time series decomposition:

  • Use seasonal_decompose from statsmodels.
  • Set period to your data's seasonal cycle length.
  • Choose model='additive' if the seasonal swings stay roughly the same size over time, or 'multiplicative' if they grow or shrink in proportion to the level of the series.
  • Ensure your data is a pandas Series with a proper datetime index.
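
One rough way to choose between the two models is to check whether the seasonal swing grows over time. The sketch below is an informal heuristic, not a statsmodels feature: the helper yearly_swing and both synthetic series are hypothetical names and data introduced here for illustration. It compares the within-year range of the last year to that of the first year.

```python
import numpy as np
import pandas as pd

def yearly_swing(series):
    """Return max minus min of the series within each calendar year."""
    by_year = series.groupby(series.index.year)
    return by_year.max() - by_year.min()

index = pd.date_range("2000-01-01", periods=60, freq="MS")
t = np.arange(60)
wave = np.sin(2 * np.pi * t / 12)

# Two synthetic series (assumed): constant swing vs. swing proportional to level
candidates = {
    "additive-like": pd.Series(100 + 2 * t + 20 * wave, index=index),
    "multiplicative-like": pd.Series((100 + 2 * t) * (1 + 0.2 * wave), index=index),
}

ratios = {}
for name, s in candidates.items():
    swing = yearly_swing(s)
    ratios[name] = swing.iloc[-1] / swing.iloc[0]
    print(f"{name}: last-year swing / first-year swing = {ratios[name]:.2f}")
```

A ratio near 1 suggests an additive model; a ratio well above 1 suggests a multiplicative one. Real data is noisy, so treat this as a first look alongside a plot of the series.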

Key Takeaways

Use statsmodels' seasonal_decompose to split time series into trend, seasonal, and residual parts.
Always specify the correct period matching your data's seasonality.
Choose additive or multiplicative model based on how seasonal effects behave.
Ensure your data is clean and indexed by datetime for accurate decomposition.
Decomposition helps understand patterns before applying sklearn models.