PyTorch · How-To · Beginner · 4 min read

How to Do Time Series Forecasting with PyTorch: Simple Guide

To do time series forecasting in PyTorch, you create a model like an RNN or LSTM that learns from past data sequences and predicts future values. You prepare your data as sequences, define the model, train it with a loss function like Mean Squared Error, and then use it to forecast future points.
📐

Syntax

Here is the basic syntax pattern for time series forecasting with PyTorch:

  • Dataset: Prepare your time series data as input-output pairs of sequences.
  • Model: Define an RNN/LSTM model class inheriting from torch.nn.Module.
  • Training Loop: Use optimizer and loss function to train the model on sequences.
  • Prediction: Use the trained model to predict future time steps.
python
import torch
import torch.nn as nn

class LSTMForecast(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)
        out = self.linear(out[:, -1, :])
        return out
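A quick way to verify this pattern is to instantiate the class with placeholder sizes (the numbers below are arbitrary, not prescribed) and feed a random batch through it:

```python
import torch
import torch.nn as nn

class LSTMForecast(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)
        # Use only the last time step's hidden output for the prediction
        return self.linear(out[:, -1, :])

model = LSTMForecast(input_size=1, hidden_size=32, num_layers=1, output_size=1)
x = torch.randn(8, 20, 1)   # (batch, seq_len, features)
print(model(x).shape)       # one predicted value per batch element: torch.Size([8, 1])
```

Confirming the output shape before training catches most wiring mistakes early.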
💻

Example

This example shows how to forecast a simple sine wave using an LSTM model in PyTorch. It trains the model on past sine values and predicts the next value.

python
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt

# Generate sine wave data
np.random.seed(0)
time_steps = np.linspace(0, 100, 1000)
data = np.sin(time_steps) + 0.1 * np.random.randn(len(time_steps))

# Prepare sequences
sequence_length = 20
def create_sequences(data, seq_length):
    xs, ys = [], []
    for i in range(len(data) - seq_length):
        x = data[i:i+seq_length]
        y = data[i+seq_length]
        xs.append(x)
        ys.append(y)
    return np.array(xs), np.array(ys)

X, y = create_sequences(data, sequence_length)

# Convert to tensors
X = torch.tensor(X, dtype=torch.float32).unsqueeze(-1)  # shape: (samples, seq_len, 1)
y = torch.tensor(y, dtype=torch.float32).unsqueeze(-1)  # shape: (samples, 1)

# Define model
class LSTMForecast(nn.Module):
    def __init__(self, input_size=1, hidden_size=50, num_layers=1, output_size=1):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)
        out = self.linear(out[:, -1, :])
        return out

model = LSTMForecast()

# Training setup
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Train model
num_epochs = 100
for epoch in range(num_epochs):
    model.train()
    optimizer.zero_grad()
    output = model(X)
    loss = criterion(output, y)
    loss.backward()
    optimizer.step()
    if (epoch+1) % 20 == 0:
        print(f'Epoch {epoch+1}/{num_epochs}, Loss: {loss.item():.4f}')

# Forecast future values
model.eval()
with torch.no_grad():
    test_seq = X[-1].unsqueeze(0)  # last sequence
    predictions = []
    for _ in range(50):
        pred = model(test_seq)
        predictions.append(pred.item())
        # pred has shape (1, 1); unsqueeze to (1, 1, 1) before appending to the window
        test_seq = torch.cat((test_seq[:, 1:, :], pred.unsqueeze(1)), dim=1)

# Plot results
plt.figure(figsize=(10,5))
plt.plot(time_steps, data, label='Original Data')
# Forecasts continue past the end of the data, so plot them on future time steps
step = time_steps[1] - time_steps[0]
future_steps = time_steps[-1] + step * np.arange(1, 51)
plt.plot(future_steps, predictions, label='Forecasted', linestyle='--')
plt.legend()
plt.show()
Output
Epoch 20/100, Loss: 0.0223
Epoch 40/100, Loss: 0.0091
Epoch 60/100, Loss: 0.0053
Epoch 80/100, Loss: 0.0037
Epoch 100/100, Loss: 0.0028
⚠️

Common Pitfalls

1. Not normalizing data: Time series data should be scaled or normalized for better training.
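A minimal standardization sketch for this: fit the statistics on the training split only (the 80/20 split point here is an assumption), then invert the scaling after forecasting.

```python
import numpy as np

# Toy series standing in for your data
data = np.sin(np.linspace(0, 100, 1000))
train = data[:800]                       # fit statistics on the training split only
mean, std = train.mean(), train.std()
scaled = (data - mean) / std             # standardize the whole series
# After forecasting in scaled space, invert with: preds * std + mean
print(scaled[:800].mean(), scaled[:800].std())
```

The training portion of the scaled series now has mean ≈ 0 and standard deviation ≈ 1, which keeps gradients well-conditioned.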

2. Wrong input shape: With batch_first=True, PyTorch's LSTM expects input shaped (batch, sequence_length, features); the default is (sequence_length, batch, features).

3. Forgetting to detach hidden states: When carrying hidden state across batches (stateful LSTMs), detach it between batches so backpropagation does not traverse, and memory does not retain, the graphs of all previous batches.
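A sketch of carrying hidden state across batches while detaching it (a stateful setup, not used in the example above; the sizes are placeholders):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=1, hidden_size=16, batch_first=True)
hidden = None
for step in range(3):                   # pretend stream of mini-batches
    x = torch.randn(4, 20, 1)
    out, hidden = lstm(x, hidden)       # carry state into the next batch
    loss = out.mean()
    loss.backward()
    # Detach so the next backward() doesn't traverse this batch's graph
    hidden = tuple(h.detach() for h in hidden)
print(hidden[0].requires_grad)          # False after detach
```

Without the detach, the second `loss.backward()` would try to backpropagate through the previous batch's already-freed graph and raise a runtime error.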

4. Using inappropriate loss: For regression forecasting, use Mean Squared Error loss, not classification losses.

python
# Wrong input shape example: with batch_first=True, a (sequence_length, batch, features)
# tensor is silently read as (batch, sequence_length, features), so training goes wrong

# Wrong
x_wrong = torch.randn(20, 1, 1)  # sequence first

# Right
x_right = torch.randn(1, 20, 1)  # batch first

print(f'Wrong shape: {x_wrong.shape}, Right shape: {x_right.shape}')
Output
Wrong shape: torch.Size([20, 1, 1]), Right shape: torch.Size([1, 20, 1])
📊

Quick Reference

  • Data shape: (batch_size, sequence_length, features)
  • Model: Use nn.LSTM followed by nn.Linear for output
  • Loss: Use nn.MSELoss() for regression
  • Optimizer: Adam or SGD
  • Training: Loop over epochs, zero gradients, forward pass, compute loss, backward pass, optimizer step
  • Prediction: Use model.eval() and no_grad() context
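The bullets above map onto a minimal loop skeleton. To keep the sketch self-contained, the model here is a stand-in flatten-plus-linear layer rather than the LSTM; the shapes and sizes are placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(20, 1))  # stand-in for the LSTM model
X = torch.randn(16, 20, 1)                             # (batch, seq_len, features)
y = torch.randn(16, 1)

criterion = nn.MSELoss()                               # regression loss
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for epoch in range(5):
    optimizer.zero_grad()               # zero gradients
    loss = criterion(model(X), y)       # forward pass + loss
    loss.backward()                     # backward pass
    optimizer.step()                    # optimizer step

model.eval()
with torch.no_grad():                   # prediction without gradient tracking
    preds = model(X)
print(preds.shape)
```

Swapping the stand-in for the LSTMForecast class from the example leaves the loop unchanged.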

Key Takeaways

  • Prepare your time series data as sequences with correct shape (batch, seq_len, features).
  • Use an LSTM model in PyTorch with a linear layer to predict future values.
  • Train with Mean Squared Error loss and an optimizer like Adam.
  • Normalize your data for better model performance.
  • Always check input shapes and use model.eval() during prediction.