Label smoothing keeps a model from becoming overconfident in its predictions. It softens the one-hot target labels, which acts as a regularizer and often improves generalization. The key metrics to watch are Cross-Entropy Loss and Accuracy. Cross-Entropy Loss shows how closely the predicted probabilities match the smoothed labels, and Accuracy shows how often the model predicts the correct class. Because the targets are softened, the training loss no longer approaches zero and training accuracy may be slightly lower, while accuracy on unseen data is often the same or better.
Label smoothing in PyTorch - Model Metrics & Evaluation
Actual \ Predicted | Class A | Class B | Class C
---------------------------------------------
Class A | 45 | 3 | 2
Class B | 4 | 43 | 3
Class C | 1 | 5 | 44
Total samples = 150
From this matrix, we calculate metrics like precision and recall for each class. Label smoothing reduces overconfidence, so wrong predictions are less likely to come with very high confidence.
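The per-class calculation can be sketched in plain Python using the matrix above. The helper name `per_class_metrics` is made up for this illustration; precision divides the diagonal entry by its column total (everything predicted as that class), and recall divides it by its row total (everything that actually is that class).

```python
# Confusion matrix from the text: rows are actual classes, columns are predicted.
labels = ["Class A", "Class B", "Class C"]
matrix = [
    [45, 3, 2],
    [4, 43, 3],
    [1, 5, 44],
]

def per_class_metrics(matrix):
    """Return a list of (precision, recall) pairs, one per class."""
    n = len(matrix)
    results = []
    for k in range(n):
        true_pos = matrix[k][k]
        predicted_k = sum(matrix[row][k] for row in range(n))  # column total
        actual_k = sum(matrix[k])                               # row total
        results.append((true_pos / predicted_k, true_pos / actual_k))
    return results

metrics = per_class_metrics(matrix)
correct = sum(matrix[k][k] for k in range(len(matrix)))
accuracy = correct / sum(sum(row) for row in matrix)

for name, (precision, recall) in zip(labels, metrics):
    print(f"{name}: precision={precision:.3f} recall={recall:.3f}")
print(f"accuracy={accuracy:.3f}")
```

For this matrix, Class A comes out at 0.900 for both precision and recall, and overall accuracy is 132 / 150 = 0.880.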
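Label smoothing does not change precision and recall directly. It softens the targets, so the model assigns less extreme probabilities to any single class. When predictions are thresholded on probability, this can shift the balance between false positives (which affect precision) and false negatives (which affect recall); the direction of the shift depends on the data and the threshold.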
For example, in a spam filter, label smoothing can help the model avoid marking too many good emails as spam (false positives), improving precision. But it might also miss some spam emails (false negatives), lowering recall a bit.
So, label smoothing can help balance precision and recall by preventing the model from being too confident, which is most useful when the labels are noisy or uncertain.
Good: Cross-Entropy Loss steadily decreases during training, and accuracy improves without sudden jumps. Precision and recall are balanced, showing the model is confident but not overconfident.
Bad: Very low loss but accuracy does not improve, or accuracy is high but the model fails on new data (overfitting). Precision or recall is very low, meaning the model is either too cautious or too confident on wrong classes.
- Accuracy paradox: Accuracy might be lower with label smoothing but the model is actually better at generalizing.
- Misinterpreting loss: Cross-Entropy Loss computed against smoothed labels has a nonzero minimum, so comparing it directly with loss computed without smoothing can be misleading.
- Overfitting signs: If training loss keeps decreasing but validation accuracy drops, the model is likely memorizing the training data instead of learning patterns that transfer.
- Ignoring class imbalance: Label smoothing does not fix class imbalance, so metrics like precision and recall per class are important.
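To make the point about misinterpreting loss concrete, here is a minimal sketch of the lowest loss a model can reach against a smoothed target. It assumes the mixture form of smoothing, (1 - eps) * one_hot + eps * uniform, which is what the PyTorch documentation describes for `label_smoothing`; the helper names `smoothed_target` and `loss_floor` are invented for this example. The floor is the entropy of the smoothed target, reached only when the prediction equals that target exactly.

```python
import math

def smoothed_target(true_class, num_classes, eps):
    """Mix a one-hot target with a uniform distribution (weight eps)."""
    uniform = eps / num_classes
    return [(1.0 - eps if k == true_class else 0.0) + uniform
            for k in range(num_classes)]

def loss_floor(num_classes, eps):
    """Entropy of the smoothed target: the lowest reachable cross-entropy."""
    q = smoothed_target(0, num_classes, eps)
    return sum(p * -math.log(p) for p in q if p > 0.0)

floor_plain = loss_floor(num_classes=3, eps=0.0)
floor_smooth = loss_floor(num_classes=3, eps=0.1)
print(f"floor without smoothing: {floor_plain:.3f}")
print(f"floor with eps=0.1:      {floor_smooth:.3f}")
```

With three classes and eps=0.1 the floor is roughly 0.29, so a smoothed training loss of 0.30 is close to the best possible value, while the same number without smoothing would leave plenty of room to improve.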
Your model uses label smoothing and has 98% accuracy but only 12% recall on the fraud class. Is it good for production?
Answer: No, it is not good. Even with high accuracy, the very low recall means the model misses most fraud cases. For fraud detection, recall is critical because missing fraud is costly. Label smoothing helps generalize but does not fix low recall. You need to improve recall before production.
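The gap between accuracy and recall is easy to reproduce with a small calculation. The counts below are hypothetical, chosen only so that they match the 98% accuracy and 12% recall in the question; they do not come from a real dataset.

```python
# Hypothetical counts that reproduce the figures in the question.
true_pos = 24      # fraud correctly flagged
false_neg = 176    # fraud missed
false_pos = 24     # legitimate transactions flagged as fraud
true_neg = 9776    # legitimate transactions passed

total = true_pos + false_neg + false_pos + true_neg
accuracy = (true_pos + true_neg) / total
recall = true_pos / (true_pos + false_neg)
precision = true_pos / (true_pos + false_pos)

# Baseline that never predicts fraud: it is right on every legitimate sample.
always_legit_accuracy = (true_neg + false_pos) / total

print(f"accuracy={accuracy:.2f} recall={recall:.2f} precision={precision:.2f}")
print(f"never-flag baseline accuracy={always_legit_accuracy:.2f}")
```

With these counts, a baseline that never flags fraud reaches the same 98% accuracy, which is why accuracy alone says little on an imbalanced problem.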
Practice
Solution
Step 1: Understand label smoothing concept
Label smoothing softens the target labels, making the model less confident about the exact class.
Step 2: Connect to model behavior
This helps the model generalize better by not being too sure, reducing overfitting.
Final Answer:
To make the model less confident and improve generalization -> Option B
Quick Check:
Label smoothing = less confident model [OK]
- Thinking it changes learning rate
- Confusing with data augmentation
- Assuming it reduces dataset size
Solution
Step 1: Recall PyTorch CrossEntropyLoss parameters
The correct parameter name for label smoothing is exactly 'label_smoothing'.
Step 2: Match correct syntax
Only loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1) uses the exact parameter name and value format.
Final Answer:
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1) -> Option A
Quick Check:
Parameter name is 'label_smoothing' [OK]
- Using incorrect parameter names like 'smooth_labels'
- Misspelling 'label_smoothing'
- Passing label smoothing outside loss function
import torch
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.2)
logits = torch.tensor([[2.0, 0.5, 0.3]])
target = torch.tensor([0])
loss = loss_fn(logits, target)
print(round(loss.item(), 3))
Solution
Step 1: Understand effect of label smoothing on loss
Label smoothing softens the target, so the loss does not become zero even if prediction is perfect.
Step 2: Compare loss values
Without smoothing, loss can be very low; with smoothing, loss is higher because targets are less certain.
Final Answer:
Loss will be higher than without label smoothing -> Option D
Quick Check:
Label smoothing increases loss value slightly [OK]
- Expecting loss to be zero with smoothing
- Thinking smoothing lowers loss always
- Confusing loss sign (negative)
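The comparison in this solution can be checked by hand without PyTorch. The sketch below assumes the mixture form of smoothing that the PyTorch documentation describes, which works out to (1 - eps) times the usual negative log-likelihood plus eps times the average negative log-probability over all classes. The functions `log_softmax` and `cross_entropy` are written here for illustration and are not the library's own implementation.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax for a single sample."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]

def cross_entropy(logits, target, label_smoothing=0.0):
    """Cross-entropy against a one-hot target mixed with a uniform distribution."""
    log_probs = log_softmax(logits)
    nll = -log_probs[target]                      # standard loss term
    uniform_term = -sum(log_probs) / len(logits)  # average over all classes
    return (1.0 - label_smoothing) * nll + label_smoothing * uniform_term

logits = [2.0, 0.5, 0.3]
plain = cross_entropy(logits, target=0)
smoothed = cross_entropy(logits, target=0, label_smoothing=0.2)
print(round(plain, 3), round(smoothed, 3))  # prints 0.341 0.554
```

For these logits the smoothed loss (about 0.554) is higher than the unsmoothed one (about 0.341), which matches Option D. The increase holds here because the prediction is correct and confident; if the true class were given a low probability instead, the same formula can produce a smaller value than the unsmoothed loss.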
import torch
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
logits = torch.tensor([[1.0, 2.0, 3.0]])
target = torch.tensor([[2]])
loss = loss_fn(logits, target)
print(loss.item())
Solution
Step 1: Check target tensor shape
For logits of shape (N, C), CrossEntropyLoss expects class-index targets as a 1D tensor of shape (N), but target here is 2D with shape (1, 1). Writing target = torch.tensor([2]) fixes the error.
Step 2: Confirm label smoothing usage
The label smoothing parameter is correctly passed as a float, and the logits shape is correct: batch size 1 with 3 classes.
Final Answer:
Target tensor shape should be 1D, not 2D -> Option A
Quick Check:
Target shape must be 1D for CrossEntropyLoss [OK]
- Passing target as 2D tensor
- Using integer for label_smoothing
- Misunderstanding CrossEntropyLoss support
Solution
Step 1: Recall label smoothing formula
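With smoothing ε=0.1 and K=5 classes, the convention used in this question gives the true class 1 - ε = 0.9 and each of the other K-1=4 classes ε / (K-1) = 0.1 / 4 = 0.025. Note that PyTorch's CrossEntropyLoss uses a slightly different convention: it mixes the one-hot target with a uniform distribution over all K classes, so the true class gets 1 - ε + ε/K = 0.92 and every other class gets ε/K = 0.02.
Step 2: Construct target for true class index 1
The vector is [0.025, 0.9, 0.025, 0.025, 0.025].
Final Answer:
[0.025, 0.9, 0.025, 0.025, 0.025] -> Option C
Quick Check: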
Smoothed target sums to 1 with 0.1 smoothing [OK]
- Using one-hot vector without smoothing
- Assigning smoothing incorrectly to true class
- Making all classes equal probability
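The smoothed vector from the solution can be built in a few lines, alongside the uniform-mixture variant that the PyTorch documentation describes for `label_smoothing`. Both function names below are invented for this sketch; the two conventions differ only in whether the smoothing mass ε is shared among the K-1 wrong classes or among all K classes.

```python
def smooth_others_only(true_class, num_classes, eps):
    """Convention in the solution: eps is split among the K-1 wrong classes."""
    share = eps / (num_classes - 1)
    return [1.0 - eps if k == true_class else share
            for k in range(num_classes)]

def smooth_uniform_mix(true_class, num_classes, eps):
    """Mixture convention: (1 - eps) * one_hot + eps * uniform over all K classes."""
    share = eps / num_classes
    return [(1.0 - eps) + share if k == true_class else share
            for k in range(num_classes)]

others_only = smooth_others_only(true_class=1, num_classes=5, eps=0.1)
uniform_mix = smooth_uniform_mix(true_class=1, num_classes=5, eps=0.1)
print([round(v, 3) for v in others_only])  # [0.025, 0.9, 0.025, 0.025, 0.025]
print([round(v, 3) for v in uniform_mix])  # [0.02, 0.92, 0.02, 0.02, 0.02]
```

Both vectors sum to 1, so either is a valid target distribution. When comparing hand-computed targets against a library's loss value, it helps to check which convention the library follows.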
