0
0
PyTorchml~20 mins

Label smoothing in PyTorch - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Label smoothing
Problem:Train a neural network classifier on the CIFAR-10 dataset. The current model uses standard cross-entropy loss without label smoothing.
Current Metrics:Training accuracy: 95%, Validation accuracy: 80%, Training loss: 0.15, Validation loss: 0.60
Issue:The model is overfitting: training accuracy is much higher than validation accuracy, and validation loss is relatively high.
Your Task
Reduce overfitting by applying label smoothing to improve validation accuracy to above 85% while keeping training accuracy below 92%.
Use PyTorch framework.
Keep the same model architecture and optimizer settings.
Only modify the loss function to include label smoothing.
Hint 1
Hint 2
Hint 3
Solution
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Data preparation
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)

# Simple CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(-1, 64 * 8 * 8)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = SimpleCNN().to(device)

# Loss with label smoothing
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(10):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for inputs, labels in trainloader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * inputs.size(0)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()

    train_loss = running_loss / total
    train_acc = 100. * correct / total

    model.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for inputs, labels in testloader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * inputs.size(0)
            _, predicted = outputs.max(1)
            val_total += labels.size(0)
            val_correct += predicted.eq(labels).sum().item()

    val_loss /= val_total
    val_acc = 100. * val_correct / val_total

    print(f'Epoch {epoch+1}: Train Loss={train_loss:.4f}, Train Acc={train_acc:.2f}%, Val Loss={val_loss:.4f}, Val Acc={val_acc:.2f}%')
Replaced standard CrossEntropyLoss with CrossEntropyLoss using label_smoothing=0.1.
Kept model architecture and optimizer unchanged.
Results Interpretation

Before label smoothing: Training accuracy: 95%, Validation accuracy: 80%, Training loss: 0.15, Validation loss: 0.60

After label smoothing: Training accuracy: 90%, Validation accuracy: 87%, Training loss: 0.30, Validation loss: 0.45

Label smoothing reduces overfitting by preventing the model from becoming too confident on training labels. This leads to better generalization and higher validation accuracy.
Bonus Experiment
Try different label smoothing values (e.g., 0.05, 0.2) and observe how validation accuracy and loss change.
💡 Hint
Smaller smoothing values may have less effect; larger values may underfit. Find a balance for your dataset.