PyTorch · ML · ~20 mins

Sequence classification in PyTorch - ML Experiment: Train & Evaluate

Experiment - Sequence classification
Problem: Classify sequences of numbers into two classes using a simple RNN model.
Current Metrics: Training accuracy: 98%, Validation accuracy: 70%, Training loss: 0.05, Validation loss: 0.85
Issue: The model is overfitting: training accuracy is very high, but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 92%.
You can only change the model architecture and training hyperparameters.
Do not change the dataset or data preprocessing.
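To judge whether a change actually reduces the train/validation gap, it helps to compute loss and accuracy with one reusable function. The sketch below shows one way to do that; `evaluate` is a hypothetical helper name and the tiny model is a stand-in, not part of the experiment code.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def evaluate(model, loader, criterion):
    """Return (mean loss, accuracy) for a model over a DataLoader."""
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for xb, yb in loader:
            out = model(xb)
            total_loss += criterion(out, yb).item() * yb.size(0)
            correct += (out.argmax(dim=1) == yb).sum().item()
            total += yb.size(0)
    return total_loss / total, correct / total

# Smoke test on random data shaped like this experiment's sequences
torch.manual_seed(0)
X = torch.randn(64, 10, 5)
y = (X.sum(dim=(1, 2)) > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=16)
model = nn.Sequential(nn.Flatten(), nn.Linear(50, 2))  # stand-in classifier
loss, acc = evaluate(model, loader, nn.CrossEntropyLoss())
```

Calling `evaluate` on both the train and validation loaders after each epoch gives the two numbers whose gap measures overfitting.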
Solution
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Generate dummy dataset
torch.manual_seed(0)
sequence_length = 10
input_size = 5
hidden_size = 16
num_classes = 2
num_samples = 1000

X = torch.randn(num_samples, sequence_length, input_size)
y = (torch.sum(X, dim=(1,2)) > 0).long()

# Split dataset
train_size = int(0.8 * num_samples)
X_train, X_val = X[:train_size], X[train_size:]
y_train, y_val = y[:train_size], y[train_size:]

train_dataset = TensorDataset(X_train, y_train)
val_dataset = TensorDataset(X_val, y_val)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32)

# Define model with dropout
class RNNClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes, dropout=0.3):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.rnn(x)
        out = out[:, -1, :]
        out = self.dropout(out)
        out = self.fc(out)
        return out

model = RNNClassifier(input_size, hidden_size, num_classes, dropout=0.3)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop with early stopping
best_val_acc = 0.0
best_state = None  # snapshot of the best-performing weights
patience = 5
trigger_times = 0

for epoch in range(50):
    model.train()
    for xb, yb in train_loader:
        optimizer.zero_grad()
        outputs = model(xb)
        loss = criterion(outputs, yb)
        loss.backward()
        optimizer.step()

    # Validation
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for xb, yb in val_loader:
            outputs = model(xb)
            _, predicted = torch.max(outputs, 1)
            total += yb.size(0)
            correct += (predicted == yb).sum().item()
    val_acc = correct / total

    # Early stopping check: keep a copy of the best weights
    if val_acc > best_val_acc:
        best_val_acc = val_acc
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
        trigger_times = 0
    else:
        trigger_times += 1
        if trigger_times >= patience:
            break

# Restore the best weights so the final metrics match best_val_acc
if best_state is not None:
    model.load_state_dict(best_state)

# Calculate final training accuracy
model.eval()
correct_train = 0
total_train = 0
with torch.no_grad():
    for xb, yb in train_loader:
        outputs = model(xb)
        _, predicted = torch.max(outputs, 1)
        total_train += yb.size(0)
        correct_train += (predicted == yb).sum().item()
train_acc = correct_train / total_train

# Calculate final validation loss
val_loss_total = 0
val_samples = 0
with torch.no_grad():
    for xb, yb in val_loader:
        outputs = model(xb)
        loss = criterion(outputs, yb)
        val_loss_total += loss.item() * yb.size(0)
        val_samples += yb.size(0)
val_loss = val_loss_total / val_samples

result = f"Training accuracy: {train_acc*100:.1f}%, Validation accuracy: {best_val_acc*100:.1f}%, Validation loss: {val_loss:.3f}"
print(result)
Added dropout layer with 0.3 dropout rate after RNN output to reduce overfitting.
Reduced hidden size from 32 to 16 to simplify the model.
Used Adam optimizer with learning rate 0.001 for stable training.
Implemented early stopping with patience of 5 epochs to prevent over-training.
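The changes above all act as regularizers. Another common option, not used in this solution, is L2 weight decay, which in PyTorch is just an optimizer argument. A minimal sketch (the value 1e-4 is an illustrative starting point, not tuned for this task, and the linear model is a stand-in for the RNN classifier):

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(16, 2)  # stand-in for RNNClassifier
# weight_decay adds an L2 penalty on all parameters at each update,
# discouraging large weights and thus reducing overfitting
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=1e-4)
```

It can be combined freely with dropout and early stopping; if validation accuracy is still low, increasing `weight_decay` is one of the cheaper knobs to try.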
Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 70%, Validation loss: 0.85

After: Training accuracy: 90.5%, Validation accuracy: 86.7%, Validation loss: 0.42

Adding dropout and simplifying the model helped reduce overfitting. Early stopping prevented the model from training too long. This improved validation accuracy and lowered validation loss, showing better generalization.
Bonus Experiment
Try using an LSTM instead of a simple RNN and compare the validation accuracy and loss.
💡 Hint
Replace nn.RNN with nn.LSTM in the model and keep other settings the same to see if the model learns better sequence patterns.
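A minimal sketch of that swap, keeping the same interface and hyperparameters as the `RNNClassifier` in the solution above:

```python
import torch
import torch.nn as nn

class LSTMClassifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes, dropout=0.3):
        super().__init__()
        # Only change vs. RNNClassifier: nn.RNN -> nn.LSTM
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        out, _ = self.lstm(x)              # out: (batch, seq, hidden)
        out = self.dropout(out[:, -1, :])  # keep only the last time step
        return self.fc(out)

model = LSTMClassifier(input_size=5, hidden_size=16, num_classes=2)
logits = model(torch.randn(4, 10, 5))  # shape: (4, 2)
```

The rest of the training loop can stay unchanged, since the classifier's input and output shapes are identical to the RNN version.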