PyTorch · ML · ~20 mins

Training and validation loss tracking in PyTorch - ML Experiment: Train & Evaluate

Experiment - Training and validation loss tracking
Problem: Train a simple neural network on the MNIST dataset. Currently only training loss is tracked; validation loss is not, so we cannot tell if the model is overfitting.
Current Metrics: Training loss after 5 epochs: 0.15; Validation loss: not tracked
Issue: Without validation loss tracking, we cannot detect overfitting or underfitting during training.
Your Task
Add validation loss tracking during training and plot both training and validation loss after each epoch to monitor model performance.
Use PyTorch for model and training
Do not change the model architecture
Keep training for 5 epochs
Solution
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader, random_split
import matplotlib.pyplot as plt

# Define simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear = nn.Linear(28*28, 10)
    def forward(self, x):
        x = self.flatten(x)
        return self.linear(x)

# Prepare dataset and dataloaders
transform = transforms.ToTensor()
full_dataset = datasets.MNIST(root='.', train=True, download=True, transform=transform)
train_size = int(0.8 * len(full_dataset))
val_size = len(full_dataset) - train_size
train_dataset, val_dataset = random_split(full_dataset, [train_size, val_size])
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64)

# Initialize model, loss, optimizer
model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

train_losses = []
val_losses = []

for epoch in range(5):
    model.train()
    running_train_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_train_loss += loss.item() * images.size(0)
    avg_train_loss = running_train_loss / len(train_loader.dataset)
    train_losses.append(avg_train_loss)

    model.eval()
    running_val_loss = 0.0
    with torch.no_grad():
        for images, labels in val_loader:
            outputs = model(images)
            loss = criterion(outputs, labels)
            running_val_loss += loss.item() * images.size(0)
    avg_val_loss = running_val_loss / len(val_loader.dataset)
    val_losses.append(avg_val_loss)

    print(f"Epoch {epoch+1}: Train Loss = {avg_train_loss:.4f}, Val Loss = {avg_val_loss:.4f}")

# Plot losses
plt.plot(range(1,6), train_losses, label='Training Loss')
plt.plot(range(1,6), val_losses, label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and Validation Loss over Epochs')
plt.legend()
plt.show()
What changed:
- Added a validation dataset split from the original training data
- Created a validation data loader
- Calculated validation loss after each training epoch without updating model weights (using model.eval() and torch.no_grad())
- Stored training and validation losses in lists
- Plotted both losses to visualize training progress
Results Interpretation

Before: Only training loss tracked (0.15 after 5 epochs). Validation loss not tracked, so no insight on overfitting.

After: Both training (~0.15) and validation (~0.18) losses tracked and plotted. This helps detect if validation loss starts increasing, indicating overfitting.

Tracking validation loss alongside training loss is essential to understand if the model generalizes well or is overfitting the training data.
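Once both loss lists are recorded, a simple heuristic can flag a likely overfitting pattern. The sketch below uses hypothetical loss values for illustration; in practice you would reuse the `train_losses` and `val_losses` lists from the solution above.

```python
# Hypothetical per-epoch losses (placeholders, not real results)
train_losses = [0.45, 0.30, 0.22, 0.18, 0.15]
val_losses = [0.48, 0.34, 0.27, 0.26, 0.28]

# Heuristic: training loss is still falling, but validation loss has
# moved back above its best value -- a classic overfitting signature.
if val_losses[-1] > min(val_losses) and train_losses[-1] < train_losses[-2]:
    print("Possible overfitting: validation loss is no longer improving.")
```

This is only a rough check on the final epoch; plotting the full curves, as in the solution, gives a clearer picture of where the two losses diverge.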
Bonus Experiment
Try adding early stopping based on validation loss to stop training when validation loss stops improving.
💡 Hint
Monitor validation loss each epoch and stop training if it does not decrease for 2 consecutive epochs.
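The hint above can be sketched as a small patience counter wrapped around the per-epoch validation loss. The loss values below are hypothetical placeholders; in the real training loop you would compute `val_loss` each epoch exactly as in the solution code.

```python
# Early stopping sketch: stop when validation loss has not improved
# for `patience` consecutive epochs (patience=2, per the hint).
best_val_loss = float('inf')
epochs_without_improvement = 0
patience = 2

# Placeholder per-epoch validation losses for illustration
val_losses = [0.30, 0.25, 0.22, 0.23, 0.24, 0.21]

for epoch, val_loss in enumerate(val_losses, start=1):
    if val_loss < best_val_loss:
        # New best: reset the patience counter
        best_val_loss = val_loss
        epochs_without_improvement = 0
    else:
        epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            print(f"Early stopping at epoch {epoch}")
            break
```

In a full implementation you would also save the model weights at each new best (e.g. with `torch.save(model.state_dict(), ...)`) and restore them after stopping, so the final model is the one with the lowest validation loss.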