How to Decompose Time Series in Python with sklearn
scikit-learn does not include a seasonal decomposition routine, so to decompose a time series in Python, use the seasonal_decompose function from the statsmodels library, whose output fits easily into sklearn workflows. This function splits the series into trend, seasonal, and residual parts for better analysis and modeling.
Syntax
The seasonal_decompose function has this basic syntax:
- series: The time series data as a pandas Series.
- model: Type of decomposition, either 'additive' or 'multiplicative'.
- period: The number of observations per cycle (e.g., 12 for monthly data with yearly seasonality).
The function returns an object with trend, seasonal, and resid components.
python
from statsmodels.tsa.seasonal import seasonal_decompose

decomposition = seasonal_decompose(series, model='additive', period=period)
trend = decomposition.trend
seasonal = decomposition.seasonal
residual = decomposition.resid
Example
This example shows how to decompose a monthly time series of airline passengers into trend, seasonal, and residual parts using seasonal_decompose.
python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Load sample data
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
data = pd.read_csv(url, parse_dates=['Month'], index_col='Month')

# Decompose the time series
result = seasonal_decompose(data['Passengers'], model='multiplicative', period=12)

# Plot the components
result.plot()
plt.show()
Output
A plot window showing four graphs: observed data, trend, seasonal, and residual components.
Common Pitfalls
- Not specifying the correct period can lead to a wrong seasonal decomposition.
- Using the additive model on data with multiplicative seasonality causes poor results.
- Passing data without a datetime index (and without an explicit period), or data with missing values, raises an error.
Always check your data frequency and clean missing values before decomposition.
python
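import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Load the same airline data used in the example above
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
data = pd.read_csv(url, parse_dates=['Month'], index_col='Month')

# Wrong: no period specified, and no datetime index to infer one from
try:
    seasonal_decompose(data['Passengers'].to_numpy(), model='multiplicative')
except ValueError as e:
    print(f'Error: {e}')

# Right: specify period=12 for monthly data
result = seasonal_decompose(data['Passengers'], model='multiplicative', period=12)
print('Decomposition successful with correct period.')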
Output
Error: You must specify a period or x must be a pandas object with a DatetimeIndex with a freq.
Decomposition successful with correct period.
Quick Reference
Remember these tips for time series decomposition:
- Use seasonal_decompose from statsmodels.
- Set period to your data's seasonal cycle length.
- Choose model='additive' if the seasonal swings stay roughly constant in size, or 'multiplicative' if they grow or shrink in proportion to the level of the series.
- Ensure your data is a pandas Series with a proper datetime index.
Key Takeaways
Use statsmodels' seasonal_decompose to split time series into trend, seasonal, and residual parts.
Always specify the correct period matching your data's seasonality.
Choose additive or multiplicative model based on how seasonal effects behave.
Ensure your data is clean and indexed by datetime for accurate decomposition.
Decomposition helps understand patterns before applying sklearn models.