How to Use nn.Softmax in PyTorch: Syntax and Example
In PyTorch, nn.Softmax applies the softmax function to an input tensor along a specified dimension, converting raw scores into probabilities. You create it by specifying the dimension with dim, then call it on your tensor to get normalized outputs.
Syntax
The nn.Softmax class requires one main argument: dim, the dimension along which softmax will be computed. The values along that axis are normalized so they sum to 1.
Example: softmax = nn.Softmax(dim=1) creates a softmax module that normalizes across the entries of each row (dimension 1) of a 2D tensor, so each row sums to 1.
```python
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=1)  # Apply softmax along dimension 1
```
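To confirm the normalization behavior described above, a quick sketch (using an illustrative 2x3 tensor) checks that the outputs sum to 1 along the chosen dim:

```python
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=1)
x = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])
probs = softmax(x)

# With dim=1, each row is normalized, so each row sums to 1
print(probs.sum(dim=1))  # each entry is 1.0 (up to rounding)
```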
Example
This example shows how to apply nn.Softmax to a 2D tensor representing scores for 3 classes across 2 samples. The output is the probability distribution for each sample.
```python
import torch
import torch.nn as nn

# Create a tensor with raw scores for 2 samples and 3 classes
scores = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])

# Initialize Softmax to apply along the class dimension (dim=1)
softmax = nn.Softmax(dim=1)

# Apply softmax to get probabilities
probabilities = softmax(scores)
print(probabilities)
```
Output
tensor([[0.6590, 0.2424, 0.0986],
        [0.1131, 0.8360, 0.0508]])
Common Pitfalls
- Wrong dimension: Applying softmax along the wrong dimension gives incorrect results. For example, using dim=0 instead of dim=1 when your classes are in dimension 1.
- Using softmax twice: Avoid applying softmax multiple times to the same data, as it distorts the probabilities.
- Using nn.Softmax vs F.softmax: nn.Softmax is a module that you instantiate and then call, while torch.nn.functional.softmax is a function you can call directly.
```python
import torch
import torch.nn as nn

scores = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])

# Wrong dimension example: normalizes down each column (dim=0)
softmax_wrong = nn.Softmax(dim=0)
prob_wrong = softmax_wrong(scores)

# Correct dimension example: normalizes across each row of class scores (dim=1)
softmax_correct = nn.Softmax(dim=1)
prob_correct = softmax_correct(scores)

print('Wrong dimension softmax:\n', prob_wrong)
print('Correct dimension softmax:\n', prob_correct)
```
Output
Wrong dimension softmax:
tensor([[0.7311, 0.1192, 0.4750],
        [0.2689, 0.8808, 0.5250]])
Correct dimension softmax:
tensor([[0.6590, 0.2424, 0.0986],
        [0.1131, 0.8360, 0.0508]])
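The other two pitfalls can be demonstrated the same way. The sketch below (using the same example scores) shows that applying softmax a second time flattens the distribution, and that torch.nn.functional.softmax produces the same result as the nn.Softmax module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

scores = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])

# Pitfall: applying softmax twice distorts the probabilities
once = nn.Softmax(dim=1)(scores)
twice = nn.Softmax(dim=1)(once)
print(once[0])   # a peaked distribution over the 3 classes
print(twice[0])  # noticeably flatter; no longer the intended probabilities

# F.softmax is the functional equivalent of the nn.Softmax module
print(torch.allclose(once, F.softmax(scores, dim=1)))  # True
```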
Quick Reference
- Purpose: Convert raw scores to probabilities.
- Input: Tensor of any shape.
- dim: Dimension along which to apply softmax.
- Output: Tensor of the same shape, with values between 0 and 1 that sum to 1 along dim.
- Use case: Final layer in classification models.
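As a sketch of the final-layer use case, here is a hypothetical classifier (the class name, feature size, and class count are illustrative, not from the original) that ends with nn.Softmax to turn logits into per-sample probabilities:

```python
import torch
import torch.nn as nn

# Hypothetical model for illustration; sizes are arbitrary
class TinyClassifier(nn.Module):
    def __init__(self, in_features=4, num_classes=3):
        super().__init__()
        self.linear = nn.Linear(in_features, num_classes)
        self.softmax = nn.Softmax(dim=1)  # final layer: scores -> probabilities

    def forward(self, x):
        logits = self.linear(x)
        return self.softmax(logits)

model = TinyClassifier()
batch = torch.randn(2, 4)   # 2 samples, 4 features each
probs = model(batch)
print(probs.shape)          # torch.Size([2, 3])
print(probs.sum(dim=1))     # each row sums to 1
```

Note that when training with nn.CrossEntropyLoss, you would pass the raw logits instead, since that loss applies log-softmax internally; the explicit softmax layer is typically used at inference time to read off probabilities.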
Key Takeaways
- Use nn.Softmax(dim) to apply softmax along the correct tensor dimension.
- Softmax converts raw scores into probabilities that sum to 1 along the chosen dimension.
- Avoid applying softmax multiple times to the same data.
- Check the tensor shape and dimension carefully to get correct probability outputs.
- nn.Softmax is a module; call it on tensors to get the softmax result.