Sequence classification in PyTorch - ML Experiment: Train & Evaluate

Experiment - Sequence classification

Problem: Classify sequences of numbers into two classes using a simple RNN model.

Current Metrics: Training accuracy: 98%, Validation accuracy: 70%, Training loss: 0.05, Validation loss: 0.85

Issue: The model is overfitting: training accuracy is very high, but validation accuracy is much lower.

Your Task

Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 92%.

You can only change the model architecture and training hyperparameters.

Do not change the dataset or data preprocessing.
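The two targets in the task can be written as a small check. This is only an illustrative sketch: the function name `meets_target` and its default thresholds are assumptions taken from the task statement, not part of the original solution.

```python
def meets_target(train_acc, val_acc, min_val=0.85, max_train=0.92):
    """Return True when validation accuracy is at least min_val
    and training accuracy stays below max_train (hypothetical helper)."""
    return val_acc >= min_val and train_acc < max_train


# Metrics reported for the overfitting baseline in this experiment.
baseline_ok = meets_target(train_acc=0.98, val_acc=0.70)

# Gap between training and validation accuracy, a rough overfitting signal.
gap = 0.98 - 0.70
print(baseline_ok, round(gap, 2))
```

A large gap between the two accuracies is the symptom the task asks you to shrink; the check only passes once both thresholds hold at the same time.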

Solution

PyTorch

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Generate dummy dataset
torch.manual_seed(0)
sequence_length = 10
input_size = 5
hidden_size = 16
num_classes = 2
num_samples = 1000

X = torch.randn(num_samples, sequence_length, input_size)
y = (torch.sum(X, dim=(1,2)) > 0).long()

# Split dataset
train_size = int(0.8 * num_samples)
X_train, X_val = X[:train_size], X[train_size:]
y_train, y_val = y[:train_size], y[train_size:]

train_dataset = TensorDataset(X_train, y_train)
val_dataset = TensorDataset(X_val, y_val)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32)

# Define model with dropout
class RNNClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes, dropout=0.3):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.rnn(x)
        out = out[:, -1, :]
        out = self.dropout(out)
        out = self.fc(out)
        return out

model = RNNClassifier(input_size, hidden_size, num_classes, dropout=0.3)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

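# Training loop with early stopping
best_val_acc = 0
best_state = None  # weights from the epoch with the best validation accuracy
patience = 5
trigger_times = 0

for epoch in range(50):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        outputs = model(xb)
        loss = criterion(outputs, yb)
        loss.backward()
        optimizer.step()

    # Validation (dropout is disabled in eval mode)
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for xb, yb in val_loader:
            outputs = model(xb)
            _, predicted = torch.max(outputs, 1)
            total += yb.size(0)
            correct += (predicted == yb).sum().item()
    val_acc = correct / total

    # Early stopping check
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        # Snapshot the weights so every reported metric describes the same model
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
        trigger_times = 0
    else:
        trigger_times += 1
        if trigger_times >= patience:
            break

# Restore the best weights before computing the final metrics
if best_state is not None:
    model.load_state_dict(best_state)

# Calculate final training accuracy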
model.eval()
correct_train = 0
total_train = 0
with torch.no_grad():
    for xb, yb in train_loader:
        outputs = model(xb)
        _, predicted = torch.max(outputs, 1)
        total_train += yb.size(0)
        correct_train += (predicted == yb).sum().item()
train_acc = correct_train / total_train

# Calculate final validation loss
val_loss_total = 0
val_samples = 0
with torch.no_grad():
    for xb, yb in val_loader:
        outputs = model(xb)
        loss = criterion(outputs, yb)
        val_loss_total += loss.item() * yb.size(0)
        val_samples += yb.size(0)
val_loss = val_loss_total / val_samples

result = f"Training accuracy: {train_acc*100:.1f}%, Validation accuracy: {best_val_acc*100:.1f}%, Validation loss: {val_loss:.3f}"
print(result)

Added dropout layer with 0.3 dropout rate after RNN output to reduce overfitting.

Reduced hidden size from 32 to 16 to simplify the model.

Used Adam optimizer with learning rate 0.001 for stable training.

Implemented early stopping with patience of 5 epochs to prevent over-training.
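To see why the dropout layer only regularizes during training, the sketch below applies `nn.Dropout` to a vector of ones in both modes. The tensor size and variable names are illustrative assumptions, not part of the solution above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
drop = nn.Dropout(p=0.3)
x = torch.ones(1000)

# Training mode: roughly 30% of the values are zeroed and the survivors
# are scaled by 1 / (1 - p) so the expected activation stays the same.
drop.train()
y_train = drop(x)

# Evaluation mode: dropout is a no-op and the input passes through unchanged.
drop.eval()
y_eval = drop(x)

zero_fraction = (y_train == 0).float().mean().item()
kept = y_train[y_train != 0]
print(f"zeroed in train mode: {zero_fraction:.2f}")
print(f"identity in eval mode: {torch.equal(y_eval, x)}")
```

This is why the solution calls `model.train()` before each training pass and `model.eval()` before measuring accuracy.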

Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 70%, Validation loss: 0.85

After: Training accuracy: 90.5%, Validation accuracy: 86.7%, Validation loss: 0.42

Adding dropout and simplifying the model helped reduce overfitting. Early stopping prevented the model from training too long. This improved validation accuracy and lowered validation loss, showing better generalization.
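The early stopping rule can be isolated from the training loop to make its behavior easier to follow. The class below is a minimal sketch; the name `EarlyStopping`, its `step` method, and the toy accuracy history are hypothetical and chosen only for illustration.

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric):
        """Record one epoch's validation metric; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
            return False
        self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy validation accuracies: improvement stalls after the second epoch.
stopper = EarlyStopping(patience=2)
history = [0.70, 0.80, 0.79, 0.78, 0.90]
stopped_at = None
for epoch, acc in enumerate(history):
    if stopper.step(acc):
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

Note that with a short patience the loop stops before it ever sees the later, higher value, which is the trade-off behind choosing the patience setting.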

Bonus Experiment

Try using an LSTM instead of a simple RNN and compare the validation accuracy and loss.

💡 Hint

Replace nn.RNN with nn.LSTM in the model and keep other settings the same to see if the model learns better sequence patterns.
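One way the swap might look is sketched below. The class name `LSTMClassifier` is an assumption; the layer sizes mirror the solution above. `nn.LSTM` returns the hidden state and the cell state as a tuple, but the per-step outputs have the same shape as with `nn.RNN`, so the rest of the forward pass is unchanged.

```python
import torch
import torch.nn as nn


class LSTMClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes, dropout=0.3):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # out: (batch, seq_len, hidden_size); the (h_n, c_n) tuple is ignored
        out, _ = self.lstm(x)
        out = out[:, -1, :]  # keep only the last time step
        return self.fc(self.dropout(out))


model = LSTMClassifier(input_size=5, hidden_size=16, num_classes=2)
model.eval()
with torch.no_grad():
    logits = model(torch.randn(4, 10, 5))  # 4 sequences of length 10

lstm_params = sum(p.numel() for p in model.lstm.parameters())
print(logits.shape, lstm_params)
```

An LSTM layer has four times as many recurrent parameters as a plain RNN layer of the same hidden size, so it may overfit more readily on a small dataset; compare the train/validation gap as well as the raw accuracy.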

Practice

1. What is the main goal of sequence classification in PyTorch? (easy)

A. To assign a label to the entire input sequence

B. To predict the next item in the sequence

C. To label each item in the sequence separately

D. To generate a new sequence from the input
