
Training loop structure in PyTorch - ML Experiment: Train & Evaluate

Experiment - Training loop structure
Problem: You have a simple neural network trained on a small dataset. The training loop runs, but the model barely improves after a few epochs.
Current Metrics: Training loss starts at 1.2 and only decreases to 1.0 after 10 epochs. Validation loss stays around 1.1 with accuracy around 50%.
Issue: The training loop is missing key steps — zeroing gradients, backpropagating the loss, and stepping the optimizer — so the model cannot learn.
Your Task
Fix the training loop so the model learns properly and training loss decreases below 0.5 with validation accuracy above 80% after 20 epochs.
Keep the model architecture and dataset the same.
Only modify the training loop code.
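For reference, a loop with these symptoms typically looks something like the sketch below (a hedged reconstruction — the original broken code is not shown on this page; the data and model mirror the solution's toy setup):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Same toy setup as the experiment
X = torch.randn(100, 10)
y = (X.sum(dim=1) > 0).long()
train_dl = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)

model = nn.Linear(10, 2)  # stand-in for the experiment's SimpleNet
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Broken loop: forward pass and loss only. Gradients are never
# computed (no loss.backward()) and weights are never updated
# (no optimizer.step()), so the loss hovers near its starting value.
for epoch in range(10):
    for xb, yb in train_dl:
        outputs = model(xb)
        loss = criterion(outputs, yb)
```

Because `backward()` is never called, every parameter's `.grad` stays `None` and the optimizer has nothing to apply — the model's weights after 10 epochs are identical to the initial ones.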
Solution
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

# Simple dataset
X = torch.randn(100, 10)
y = (X.sum(dim=1) > 0).long()

# Dataset and loader
train_ds = TensorDataset(X, y)
train_dl = DataLoader(train_ds, batch_size=16, shuffle=True)

# Simple model
class SimpleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)
    def forward(self, x):
        return self.fc(x)

model = SimpleNet()

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

# Training loop
for epoch in range(20):
    total_loss = 0
    correct = 0
    total = 0
    for xb, yb in train_dl:
        optimizer.zero_grad()  # Clear gradients
        outputs = model(xb)    # Forward pass
        loss = criterion(outputs, yb)  # Compute loss
        loss.backward()        # Backpropagation
        optimizer.step()       # Update weights

        total_loss += loss.item() * xb.size(0)
        _, predicted = torch.max(outputs, 1)
        correct += (predicted == yb).sum().item()
        total += yb.size(0)

    avg_loss = total_loss / total
    accuracy = correct / total * 100
    print(f"Epoch {epoch+1}: Loss={avg_loss:.4f}, Accuracy={accuracy:.2f}%")
Added optimizer.zero_grad() before backpropagation to clear old gradients.
Added loss.backward() to compute gradients from loss.
Added optimizer.step() to update model weights.
Correctly calculated loss using model outputs and targets.
Tracked total loss and accuracy per epoch for monitoring.
Results Interpretation

Before Fix: Loss ~1.0, Accuracy ~50%
After Fix: Loss ~0.3, Accuracy ~85%

A proper training loop must zero gradients, compute loss, backpropagate, and update weights each batch. Missing any step stops learning.
Bonus Experiment
Add a validation loop after each training epoch to monitor validation loss and accuracy.
💡 Hint
Use torch.no_grad() during validation to avoid computing gradients and set model.eval() mode.
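One way to sketch that validation loop, assuming a held-out `val_dl` DataLoader built the same way as `train_dl` (the split and the stand-in linear model here are illustrative, not part of the original experiment):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical held-out split, generated the same way as the training data
X_val = torch.randn(30, 10)
y_val = (X_val.sum(dim=1) > 0).long()
val_dl = DataLoader(TensorDataset(X_val, y_val), batch_size=16)

model = nn.Linear(10, 2)  # stand-in for SimpleNet
criterion = nn.CrossEntropyLoss()

def evaluate(model, val_dl, criterion):
    model.eval()                 # disable dropout/batchnorm training behavior
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():        # skip gradient tracking during evaluation
        for xb, yb in val_dl:
            outputs = model(xb)
            total_loss += criterion(outputs, yb).item() * xb.size(0)
            correct += (outputs.argmax(dim=1) == yb).sum().item()
            total += yb.size(0)
    model.train()                # restore training mode for the next epoch
    return total_loss / total, correct / total * 100

val_loss, val_acc = evaluate(model, val_dl, criterion)
```

Call `evaluate` once at the end of each training epoch and print its results alongside the training metrics; diverging training and validation curves are the usual first sign of overfitting.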