How to Use ARIMA in Python for Time Series Forecasting
To use ARIMA in Python, import it from statsmodels.tsa.arima.model, create a model from your time series data and an order (p, d, q), then fit it and predict. The fitted model forecasts future values from the patterns in past observations.
Syntax
The basic syntax to use ARIMA in Python is:
- ARIMA(data, order=(p, d, q)): Creates the ARIMA model, where p is the number of autoregressive lags, d is the degree of differencing, and q is the number of lagged forecast errors in the moving-average part.
- model.fit(): Fits the ARIMA model to your data and returns a results object.
- model_fit.predict(start, end): Predicts values from index start to index end, inclusive.
```python
from statsmodels.tsa.arima.model import ARIMA

# data is your time series; p, d, q, start, and end are placeholders
model = ARIMA(data, order=(p, d, q))
model_fit = model.fit()
predictions = model_fit.predict(start, end)
```
Example
This example shows how to fit an ARIMA(2,1,2) model on sample data and forecast future points.
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Create sample time series data
np.random.seed(0)
data = pd.Series(np.random.randn(100).cumsum())

# Define and fit ARIMA model
model = ARIMA(data, order=(2, 1, 2))
model_fit = model.fit()

# Forecast next 5 points
forecast = model_fit.predict(start=len(data), end=len(data)+4)
print(forecast)
```
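Output
The forecast is a Series indexed 100 to 104. Exact values depend on the statsmodels version and on how the optimizer converges, so treat the numbers below as illustrative.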
100 0.850574
101 0.847560
102 0.844547
103 0.841533
104 0.838520
dtype: float64
Common Pitfalls
- Not differencing data when needed (d=0 vs d>0) can cause poor model fit.
- Choosing wrong order (p, d, q) without testing can lead to bad forecasts.
- Importing ARIMA from sklearn does not work, because scikit-learn has no ARIMA class; use statsmodels instead.
- Not checking stationarity before modeling can lead to a poorly specified model and unreliable forecasts.
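```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Sample non-stationary data (a random walk)
np.random.seed(0)
data = pd.Series(np.random.randn(100).cumsum())

# Wrong: no differencing on non-stationary data
model_wrong = ARIMA(data, order=(2, 0, 2))
model_wrong_fit = model_wrong.fit()

# Right: differencing applied
model_right = ARIMA(data, order=(2, 1, 2))
model_right_fit = model_right.fit()
```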
Quick Reference
| Parameter | Description |
|---|---|
| p | Number of lag observations (AR term) |
| d | Degree of differencing to make series stationary |
| q | Number of lagged forecast errors (MA term) |
| fit() | Fits the ARIMA model to data and returns the results object |
| predict(start, end) | Predicts values from index start to index end, inclusive |
Key Takeaways
- Use ARIMA from statsmodels.tsa.arima.model, not sklearn.
- Set order=(p, d, q) carefully based on your data's stationarity and autocorrelation.
- Fit the model with model.fit() before making predictions.
- Check and difference your data if it is not stationary.
- Test different orders to find the best forecasting model.