Label smoothing keeps a classifier from becoming overconfident in its predictions. It can make training more stable and often improves how well the model generalizes to new data.
Label smoothing in PyTorch
    import torch.nn as nn

    loss = nn.CrossEntropyLoss(label_smoothing=0.1)
The label_smoothing parameter takes a value between 0 and 1.
A value of 0 means no smoothing (normal labels), and higher values soften the labels more.
    loss = nn.CrossEntropyLoss(label_smoothing=0.0)
    loss = nn.CrossEntropyLoss(label_smoothing=0.1)
    loss = nn.CrossEntropyLoss(label_smoothing=0.2)

This code trains a simple model on 3 samples with 3 classes using label smoothing of 0.1. It prints the loss each epoch and shows the predicted classes after training.
    import torch
    import torch.nn as nn
    import torch.optim as optim

    # Simple dataset: 3 samples, 3 classes
    inputs = torch.tensor([[1.0, 2.0, 3.0],
                           [1.0, 0.0, 0.0],
                           [0.0, 1.0, 0.0]])
    labels = torch.tensor([2, 0, 1])  # correct classes

    # Simple linear model
    model = nn.Linear(3, 3)

    # Loss with label smoothing
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
    optimizer = optim.SGD(model.parameters(), lr=0.1)

    # Training loop for 5 epochs
    for epoch in range(5):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")

    # Predictions after training
    with torch.no_grad():
        outputs = model(inputs)
        predicted = torch.argmax(outputs, dim=1)

    print("Predicted classes:", predicted.tolist())
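Label smoothing changes the target labels from hard 0 or 1 to softer values, for example 0.9 for the true class and 0.05 for each of two other classes. The exact values depend on the smoothing amount, the number of classes, and how the smoothed mass is spread. With label_smoothing=0.1 and 3 classes, PyTorch's CrossEntropyLoss mixes the one-hot target with a uniform distribution over all classes, giving about 0.933 for the true class and about 0.033 for each other class.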
This helps the model avoid becoming too confident and can improve accuracy on new data.
Too much smoothing can make training harder, so choose a small value like 0.1 or 0.2.
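The overconfidence point in the notes above can be checked with a few lines of arithmetic. The following is a minimal sketch in plain Python (no PyTorch required), assuming the uniform-mixture form of label smoothing that the PyTorch documentation describes for CrossEntropyLoss. The helper names log_softmax and smoothed_cross_entropy are illustrative and are not library functions.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax for a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def smoothed_cross_entropy(logits, target, eps):
    """Cross-entropy against a smoothed target (assumed uniform-mixture form)."""
    k = len(logits)
    log_probs = log_softmax(logits)
    # Every class gets eps / k; the true class gets an additional (1 - eps).
    q = [eps / k + (1.0 - eps if i == target else 0.0) for i in range(k)]
    return -sum(qi * lp for qi, lp in zip(q, log_probs))

# Raise the logit of the correct class (index 0) and watch the loss.
for z in [0.0, 2.0, 4.0, 8.0, 16.0]:
    logits = [z, 0.0, 0.0]
    plain = smoothed_cross_entropy(logits, target=0, eps=0.0)
    smooth = smoothed_cross_entropy(logits, target=0, eps=0.1)
    print(f"logit={z:5.1f}  no smoothing: {plain:.4f}  eps=0.1: {smooth:.4f}")
```

Without smoothing, the loss keeps shrinking as the correct logit grows, so the model is rewarded for ever larger logits. With a smoothing value of 0.1, the loss bottoms out and then rises again, which is what discourages extreme confidence.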
Label smoothing makes labels less strict to help the model learn better.
It is easy to add in PyTorch using CrossEntropyLoss(label_smoothing=...).
Use it when you want your model to be less overconfident and to generalize better.
Practice
Solution
Step 1: Understand label smoothing concept
Label smoothing softens the target labels, making the model less confident about the exact class.
Step 2: Connect to model behavior
This helps the model generalize better by not being too sure, reducing overfitting.
Final Answer:
To make the model less confident and improve generalization -> Option B
Quick Check:
Label smoothing = less confident model [OK]
- Thinking it changes learning rate
- Confusing with data augmentation
- Assuming it reduces dataset size
Solution
Step 1: Recall PyTorch CrossEntropyLoss parameters
The correct parameter name for label smoothing is exactly 'label_smoothing'.
Step 2: Match correct syntax
Only loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1) uses the exact parameter name and value format.
Final Answer:
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1) -> Option A
Quick Check:
Parameter name is 'label_smoothing' [OK]
- Using incorrect parameter names like 'smooth_labels'
- Misspelling 'label_smoothing'
- Passing label smoothing outside loss function
    import torch

    loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.2)
    logits = torch.tensor([[2.0, 0.5, 0.3]])
    target = torch.tensor([0])
    loss = loss_fn(logits, target)
    print(round(loss.item(), 3))
Solution
Step 1: Understand effect of label smoothing on loss
Label smoothing softens the target, so the loss does not become zero even if the prediction is perfect.
Step 2: Compare loss values
Here the model already favors the correct class, so without smoothing the loss is low (about 0.341). With smoothing, part of the target mass sits on the other classes, so the loss is higher (about 0.554).
Final Answer:
Loss will be higher than without label smoothing -> Option D
Quick Check:
Label smoothing increases loss value slightly [OK]
- Expecting loss to be zero with smoothing
- Thinking smoothing lowers loss always
- Confusing loss sign (negative)
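The comparison in the solution above can be reproduced by hand. This is a small plain-Python sketch that recomputes the loss for the logits in the question, assuming the uniform-mixture smoothing that the PyTorch documentation describes for CrossEntropyLoss. The variable names are illustrative.

```python
import math

logits = [2.0, 0.5, 0.3]   # same logits as in the question
target = 0                 # index of the correct class
eps = 0.2                  # label_smoothing value from the question

# Log-probabilities via log-softmax.
log_sum_exp = math.log(sum(math.exp(z) for z in logits))
log_probs = [z - log_sum_exp for z in logits]

# Hard-label loss: negative log-probability of the correct class.
plain_loss = -log_probs[target]

# Smoothed loss: (1 - eps) on the hard label plus eps spread uniformly over all classes.
uniform_loss = -sum(log_probs) / len(log_probs)
smoothed_loss = (1 - eps) * plain_loss + eps * uniform_loss

print(round(plain_loss, 3))     # 0.341
print(round(smoothed_loss, 3))  # 0.554
```

The smoothed loss is higher here because the prediction already favors the correct class. For a confidently wrong prediction the comparison can go the other way, so "higher" is a property of this example and not a general rule.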
    import torch

    loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
    logits = torch.tensor([[1.0, 2.0, 3.0]])
    target = torch.tensor([[2]])
    loss = loss_fn(logits, target)
    print(loss.item())
Solution
Step 1: Check target tensor shape
For logits of shape (N, C), CrossEntropyLoss expects class-index targets as a 1D tensor of shape (N,). The target here is 2D with shape (1, 1), so the call raises an error. Writing target = torch.tensor([2]) fixes it.
Step 2: Confirm label smoothing usage
The label_smoothing parameter is correctly passed as a float, and the logits shape is correct: batch size 1 with 3 classes.
Final Answer:
Target tensor shape should be 1D, not 2D -> Option A
Quick Check:
Target shape must be 1D for CrossEntropyLoss [OK]
- Passing target as 2D tensor
- Using integer for label_smoothing
- Misunderstanding CrossEntropyLoss support
Solution
Step 1: Recall label smoothing formula
This solution uses the convention in which the true class keeps 1 - ε and the remaining ε is split evenly among the other K - 1 classes. With ε = 0.1 and K = 5, the true class gets 1 - ε = 0.9 and each of the other 4 classes gets ε / (K - 1) = 0.1 / 4 = 0.025.
Step 2: Construct target for true class index 1
The vector is [0.025, 0.9, 0.025, 0.025, 0.025].
Final Answer:
[0.025, 0.9, 0.025, 0.025, 0.025] -> Option C
Quick Check:
Smoothed target sums to 1 with 0.1 smoothing [OK]
- Using one-hot vector without smoothing
- Assigning smoothing incorrectly to true class
- Making all classes equal probability
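There are two common ways to spread the smoothed mass, and they give slightly different vectors. The sketch below, in plain Python, builds both for the case in this question: the split-among-others convention used in the solution, and the uniform-mixture convention that the PyTorch documentation describes for CrossEntropyLoss. Both function names are hypothetical helpers written for this illustration.

```python
def smooth_split_among_others(true_idx, num_classes, eps):
    """True class keeps 1 - eps; eps is shared by the other K - 1 classes."""
    other = eps / (num_classes - 1)
    return [1.0 - eps if i == true_idx else other for i in range(num_classes)]

def smooth_uniform_mixture(true_idx, num_classes, eps):
    """One-hot target mixed with a uniform distribution over all K classes."""
    share = eps / num_classes
    return [share + (1.0 - eps if i == true_idx else 0.0) for i in range(num_classes)]

for name, fn in [("split among others", smooth_split_among_others),
                 ("uniform mixture", smooth_uniform_mixture)]:
    target = fn(true_idx=1, num_classes=5, eps=0.1)
    print(name, [round(p, 3) for p in target], "sum =", round(sum(target), 6))
```

The first convention reproduces the answer above, [0.025, 0.9, 0.025, 0.025, 0.025]. The second gives [0.02, 0.92, 0.02, 0.02, 0.02]. Both sum to 1, and the difference shrinks as the number of classes grows.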
