Label smoothing in PyTorch - ML Experiment: Train & Evaluate

Experiment - Label smoothing

Problem: Train a neural network classifier on the CIFAR-10 dataset. The current model uses standard cross-entropy loss without label smoothing.

Current Metrics: Training accuracy: 95%, Validation accuracy: 80%, Training loss: 0.15, Validation loss: 0.60

Issue: The model is overfitting: training accuracy is much higher than validation accuracy, and validation loss is relatively high.

Your Task

Reduce overfitting by applying label smoothing to improve validation accuracy to above 85% while keeping training accuracy below 92%.

Use the PyTorch framework.

Keep the same model architecture and optimizer settings.

Only modify the loss function to include label smoothing.

Solution

PyTorch

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# Data preparation
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=128, shuffle=True, num_workers=2)

# The CIFAR-10 test split doubles as the validation set in this experiment
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=100, shuffle=False, num_workers=2)

# Simple CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 32, 3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(32, 64, 3, padding=1)
        self.fc1 = nn.Linear(64 * 8 * 8, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(-1, 64 * 8 * 8)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = SimpleCNN().to(device)

# Loss with label smoothing
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(10):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for inputs, labels in trainloader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * inputs.size(0)
        _, predicted = outputs.max(1)
        total += labels.size(0)
        correct += predicted.eq(labels).sum().item()

    train_loss = running_loss / total
    train_acc = 100. * correct / total

    model.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for inputs, labels in testloader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * inputs.size(0)
            _, predicted = outputs.max(1)
            val_total += labels.size(0)
            val_correct += predicted.eq(labels).sum().item()

    val_loss /= val_total
    val_acc = 100. * val_correct / val_total

    print(f'Epoch {epoch+1}: Train Loss={train_loss:.4f}, Train Acc={train_acc:.2f}%, Val Loss={val_loss:.4f}, Val Acc={val_acc:.2f}%')

Replaced the standard nn.CrossEntropyLoss() with nn.CrossEntropyLoss(label_smoothing=0.1).

Kept the model architecture and optimizer unchanged.
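
Because label smoothing changes the loss being minimized, losses logged with the smoothed criterion are not directly comparable with the unsmoothed baseline. One way to handle this, sketched below, is to keep training with the smoothed criterion but report plain cross-entropy at evaluation time. The helper name evaluate is hypothetical and not part of the lesson's code, and the tiny linear model and random tensors are stand-ins so the snippet runs without downloading CIFAR-10.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, device):
    """Return (plain cross-entropy loss, accuracy in percent) over a loader.

    Hypothetical helper: it always uses unsmoothed cross-entropy, so the
    reported loss can be compared with a baseline trained without smoothing.
    """
    plain_ce = nn.CrossEntropyLoss(reduction='sum')  # no label smoothing here
    model.eval()
    total_loss, correct, total = 0.0, 0, 0
    with torch.no_grad():
        for inputs, labels in loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            total_loss += plain_ce(outputs, labels).item()
            correct += outputs.argmax(dim=1).eq(labels).sum().item()
            total += labels.size(0)
    return total_loss / total, 100.0 * correct / total


# Stand-in model and data (assumptions for illustration only)
torch.manual_seed(0)
demo_model = nn.Linear(8, 10)
demo_data = TensorDataset(torch.randn(32, 8), torch.randint(0, 10, (32,)))
demo_loader = DataLoader(demo_data, batch_size=16)

demo_loss, demo_acc = evaluate(demo_model, demo_loader, torch.device('cpu'))
print(f'Plain CE loss={demo_loss:.4f}, Acc={demo_acc:.2f}%')
```

In the training script, the same function could be called as evaluate(model, testloader, device) after each epoch, alongside the smoothed loss that the loop already prints.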

Results Interpretation

Before label smoothing: Training accuracy: 95%, Validation accuracy: 80%, Training loss: 0.15, Validation loss: 0.60

After label smoothing: Training accuracy: 90%, Validation accuracy: 87%, Training loss: 0.30, Validation loss: 0.45

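Label smoothing reduces overfitting by preventing the model from becoming too confident on training labels. With label_smoothing=0.1 and 10 classes, the target for the true class becomes 1 - 0.1 + 0.1/10 = 0.91 instead of 1, and every other class gets 0.01 instead of 0. This often leads to better generalization and higher validation accuracy. Note that a loss computed with the smoothed criterion cannot fall below the entropy of the smoothed target (about 0.50 in this setting), so loss values like those quoted above are only comparable with the baseline if they are measured with plain cross-entropy.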
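
To make the mechanism concrete, the sketch below builds the smoothed targets by hand and checks that the resulting loss matches nn.CrossEntropyLoss(label_smoothing=0.1). It also computes the lowest value the smoothed loss can reach. The function names smoothed_targets, manual_smoothed_ce and loss_floor are hypothetical helpers introduced for illustration, and the logits are random rather than taken from a trained model.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


def smoothed_targets(labels, num_classes, eps):
    """Mix one-hot targets with a uniform distribution over all classes."""
    one_hot = F.one_hot(labels, num_classes).float()
    return one_hot * (1.0 - eps) + eps / num_classes


def manual_smoothed_ce(logits, labels, eps):
    """Cross-entropy between smoothed targets and the model's softmax output."""
    targets = smoothed_targets(labels, logits.size(1), eps)
    log_probs = F.log_softmax(logits, dim=1)
    return -(targets * log_probs).sum(dim=1).mean()


def loss_floor(num_classes, eps):
    """Entropy of the smoothed target: the minimum of the smoothed loss."""
    if eps == 0.0:
        return 0.0
    p_true = 1.0 - eps + eps / num_classes
    p_other = eps / num_classes
    return -(p_true * math.log(p_true)
             + (num_classes - 1) * p_other * math.log(p_other))


torch.manual_seed(0)
logits = torch.randn(4, 10)           # random stand-in for model outputs
labels = torch.tensor([1, 0, 3, 9])

builtin = nn.CrossEntropyLoss(label_smoothing=0.1)(logits, labels)
manual = manual_smoothed_ce(logits, labels, 0.1)
floor = loss_floor(10, 0.1)           # about 0.50 for 10 classes and eps=0.1

print(f'built-in={builtin.item():.6f}, manual={manual.item():.6f}')
print(f'lowest reachable smoothed loss={floor:.4f}')
```

The two loss values should agree up to floating-point error, which shows that the label_smoothing argument is equivalent to training against the softened target vector.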

Bonus Experiment

Try different label smoothing values (e.g., 0.05, 0.2) and observe how validation accuracy and loss change.

💡 Hint

Smaller smoothing values may have less effect; larger values may underfit. Find a balance for your dataset.
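
The bonus experiment can be organized as a small sweep. The sketch below is one possible layout and not the lesson's reference solution: the function name run_sweep is hypothetical, and a tiny network on synthetic data stands in for the CNN and CIFAR-10 so that it runs in seconds. The numbers it prints say nothing about CIFAR-10. To run the real experiment, the synthetic tensors and model would be replaced with the loaders and SimpleCNN from the solution.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def run_sweep(smoothing_values, epochs=20, seed=0):
    """Train one small classifier per smoothing value; return validation stats.

    Returns {eps: (plain cross-entropy validation loss, validation accuracy %)}.
    """
    results = {}
    for eps in smoothing_values:
        # Re-seed so every run sees the same data and the same initial weights
        torch.manual_seed(seed)

        # Synthetic stand-in for a real dataset: 20 features, 3 classes
        true_w = torch.randn(20, 3)
        x_train = torch.randn(300, 20)
        y_train = (x_train @ true_w).argmax(dim=1)
        x_val = torch.randn(100, 20)
        y_val = (x_val @ true_w).argmax(dim=1)

        model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 3))
        criterion = nn.CrossEntropyLoss(label_smoothing=eps)
        optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

        model.train()
        for _ in range(epochs):
            optimizer.zero_grad()
            loss = criterion(model(x_train), y_train)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            val_logits = model(x_val)
            # Plain cross-entropy keeps the loss comparable across eps values
            val_loss = F.cross_entropy(val_logits, y_val).item()
            val_acc = 100.0 * val_logits.argmax(dim=1).eq(y_val).float().mean().item()
        results[eps] = (val_loss, val_acc)
    return results


sweep = run_sweep([0.0, 0.05, 0.1, 0.2])
for eps, (val_loss, val_acc) in sweep.items():
    print(f'smoothing={eps}: Val Loss={val_loss:.4f}, Val Acc={val_acc:.2f}%')
```

Fixing the seed per run keeps the data and initialization identical, so differences between rows come from the smoothing value alone.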

Practice

1. What is the main purpose of label smoothing in PyTorch? (easy)

A. To increase the learning rate automatically

B. To make the model less confident and improve generalization

C. To add noise to the input data

D. To reduce the size of the training dataset
