How to Use SARIMA in Python for Time Series Forecasting
Use the SARIMAX class from the statsmodels library to build SARIMA models in Python. Define the seasonal and non-seasonal orders, fit the model on your time series data, and then use it to forecast future values.
Syntax
The SARIMA model in Python is implemented via the SARIMAX class from statsmodels.tsa.statespace.sarimax. You specify the non-seasonal order (p, d, q) and seasonal order (P, D, Q, s), where s is the seasonal period.
Key parts:
- order=(p, d, q): non-seasonal ARIMA parameters
- seasonal_order=(P, D, Q, s): seasonal ARIMA parameters
- fit(): fits the model to data
- forecast(steps): predicts future values
```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# data, p, d, q, P, D, Q, and s are placeholders for your series and chosen orders
model = SARIMAX(data, order=(p, d, q), seasonal_order=(P, D, Q, s))
model_fit = model.fit()
forecast = model_fit.forecast(steps=10)
```
Example
This example shows how to fit a SARIMA model on monthly airline passenger data and forecast the next 12 months.
```python
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Load sample data
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/airline-passengers.csv'
data = pd.read_csv(url, parse_dates=['Month'], index_col='Month')
# Declare the monthly (month-start) frequency so statsmodels does not have to infer it
data = data.asfreq('MS')

# Define SARIMA model parameters
order = (1, 1, 1)               # non-seasonal
seasonal_order = (1, 1, 1, 12)  # seasonal with yearly period

# Fit SARIMA model
model = SARIMAX(data['Passengers'], order=order, seasonal_order=seasonal_order)
model_fit = model.fit(disp=False)

# Forecast next 12 months
forecast = model_fit.forecast(steps=12)

# Plot
plt.plot(data.index, data['Passengers'], label='Observed')
plt.plot(forecast.index, forecast, label='Forecast', color='red')
plt.legend()
plt.title('SARIMA Forecast of Airline Passengers')
plt.show()
```
Output
A plot showing observed monthly airline passengers and a red line forecasting the next 12 months with SARIMA.
Common Pitfalls
- Not differencing enough: If your data is not stationary, the model may fail or give poor forecasts. Use d and D to difference the data.
- Wrong seasonal period: Make sure s matches your data's seasonality (e.g., 12 for monthly, 4 for quarterly).
- Ignoring model diagnostics: Always check the residuals and the model summary to confirm a good fit.
- Using sklearn instead of statsmodels: SARIMA is not in sklearn; use statsmodels instead.
```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Wrong: Using sklearn for SARIMA (no SARIMA in sklearn)
# from sklearn.sarima import SARIMA  # This will cause ImportError

# Right: Use statsmodels SARIMAX
model = SARIMAX(data, order=(1, 1, 1), seasonal_order=(1, 1, 1, 12))
model_fit = model.fit()
```
Quick Reference
Remember these key SARIMA parameters:
| Parameter | Description | Example |
|---|---|---|
| p | Order of non-seasonal AR term | 1 |
| d | Order of non-seasonal differencing | 1 |
| q | Order of non-seasonal MA term | 1 |
| P | Order of seasonal AR term | 1 |
| D | Order of seasonal differencing | 1 |
| Q | Order of seasonal MA term | 1 |
| s | Length of seasonal cycle | 12 (monthly data) |
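To make d, D, and s concrete, the following sketch reproduces the differencing they describe using plain pandas. It uses a noise-free synthetic series (an assumption for illustration) so the effect is easy to see: seasonal differencing with s=12 removes the repeating yearly pattern, and a further first difference removes what is left of the trend.

```python
import numpy as np
import pandas as pd

# Noise-free synthetic series (assumption): linear trend + repeating yearly pattern
index = pd.date_range("2015-01-01", periods=48, freq="MS")
t = np.arange(48)
series = pd.Series(10 + 2.0 * t + 5 * np.sin(2 * np.pi * t / 12), index=index)

# D=1 with s=12: subtract the value from the same month one year earlier.
# The yearly pattern cancels; the trend contributes 2.0 * 12 = 24 per year.
seasonal_diff = series.diff(12)

# d=1: subtract the previous value, removing the remaining constant level
both_diff = seasonal_diff.diff()

print(seasonal_diff.dropna().head())
print(both_diff.dropna().abs().max())
```

SARIMAX handles this differencing internally when you pass d and D, so there is no need to difference the data by hand before fitting; the snippet only illustrates what the parameters mean.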
Key Takeaways
- Use the statsmodels SARIMAX class to build SARIMA models in Python.
- Set both the non-seasonal and seasonal orders correctly for your data.
- Fit the model with .fit() and predict future points with .forecast().
- Check data stationarity and the seasonal period before modeling.
- SARIMA is not available in sklearn; use statsmodels instead.