How to Use nn.Softmax in PyTorch: Syntax and Example
In PyTorch, nn.Softmax applies the softmax function to an input tensor along a specified dimension, converting raw scores into probabilities. You create it by specifying the dimension with dim, then call it on your tensor to get normalized outputs.
Syntax
The nn.Softmax class requires one main argument: dim, the dimension along which softmax will be computed. The values along that axis are normalized so they sum to 1.
Example: softmax = nn.Softmax(dim=1) creates a softmax module that normalizes across the entries of each row (dimension 1) of a 2D tensor, so each row sums to 1.
```python
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=1)  # Apply softmax along dimension 1
```
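To confirm the normalization behavior described above, a quick sketch (using an illustrative 2x3 tensor) checks that the outputs sum to 1 along the chosen dim:

```python
import torch
import torch.nn as nn

softmax = nn.Softmax(dim=1)
x = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])
probs = softmax(x)

# With dim=1, each row is normalized, so each row sums to 1
print(probs.sum(dim=1))  # each entry is 1.0 (up to rounding)
```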
Example
This example shows how to apply nn.Softmax to a 2D tensor representing scores for 3 classes across 2 samples. The output is the probability distribution for each sample.
```python
import torch
import torch.nn as nn

# Create a tensor with raw scores for 2 samples and 3 classes
scores = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])

# Initialize Softmax to apply along the class dimension (dim=1)
softmax = nn.Softmax(dim=1)

# Apply softmax to get probabilities
probabilities = softmax(scores)
print(probabilities)
```
Output
tensor([[0.6590, 0.2424, 0.0986],
        [0.1131, 0.8360, 0.0508]])
Common Pitfalls
- Wrong dimension: Applying softmax along the wrong dimension gives incorrect results. For example, using dim=0 instead of dim=1 when your classes are in dimension 1.
- Using softmax twice: Avoid applying softmax multiple times to the same data, as it distorts the probabilities.
- Using nn.Softmax vs F.softmax: nn.Softmax is a module that you instantiate and then call, while torch.nn.functional.softmax is a function you can call directly.
```python
import torch
import torch.nn as nn

scores = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])

# Wrong dimension example: normalizes down each column (dim=0)
softmax_wrong = nn.Softmax(dim=0)
prob_wrong = softmax_wrong(scores)

# Correct dimension example: normalizes across each row of class scores (dim=1)
softmax_correct = nn.Softmax(dim=1)
prob_correct = softmax_correct(scores)

print('Wrong dimension softmax:\n', prob_wrong)
print('Correct dimension softmax:\n', prob_correct)
```
Output
Wrong dimension softmax:
tensor([[0.7311, 0.1192, 0.4750],
        [0.2689, 0.8808, 0.5250]])
Correct dimension softmax:
tensor([[0.6590, 0.2424, 0.0986],
        [0.1131, 0.8360, 0.0508]])
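The other two pitfalls can be demonstrated the same way. The sketch below (using the same example scores) shows that applying softmax a second time flattens the distribution, and that torch.nn.functional.softmax produces the same result as the nn.Softmax module:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

scores = torch.tensor([[2.0, 1.0, 0.1], [1.0, 3.0, 0.2]])

# Pitfall: applying softmax twice distorts the probabilities
once = nn.Softmax(dim=1)(scores)
twice = nn.Softmax(dim=1)(once)
print(once[0])   # a peaked distribution over the 3 classes
print(twice[0])  # noticeably flatter; no longer the intended probabilities

# F.softmax is the functional equivalent of the nn.Softmax module
print(torch.allclose(once, F.softmax(scores, dim=1)))  # True
```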
Quick Reference
- Purpose: Convert raw scores to probabilities.
- Input: Tensor of any shape.
- dim: Dimension along which to apply softmax.
- Output: Tensor of the same shape, with values between 0 and 1 that sum to 1 along dim.
- Use case: Final layer in classification models.
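As a sketch of the final-layer use case, here is a hypothetical classifier (the class name, feature size, and class count are illustrative, not from the original) that ends with nn.Softmax to turn logits into per-sample probabilities:

```python
import torch
import torch.nn as nn

# Hypothetical model for illustration; sizes are arbitrary
class TinyClassifier(nn.Module):
    def __init__(self, in_features=4, num_classes=3):
        super().__init__()
        self.linear = nn.Linear(in_features, num_classes)
        self.softmax = nn.Softmax(dim=1)  # final layer: scores -> probabilities

    def forward(self, x):
        logits = self.linear(x)
        return self.softmax(logits)

model = TinyClassifier()
batch = torch.randn(2, 4)   # 2 samples, 4 features each
probs = model(batch)
print(probs.shape)          # torch.Size([2, 3])
print(probs.sum(dim=1))     # each row sums to 1
```

Note that when training with nn.CrossEntropyLoss, you would pass the raw logits instead, since that loss applies log-softmax internally; the explicit softmax layer is typically used at inference time to read off probabilities.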
Key Takeaways
- Use nn.Softmax(dim) to apply softmax along the correct tensor dimension.
- Softmax converts raw scores into probabilities that sum to 1 along the chosen dimension.
- Avoid applying softmax multiple times to the same data.
- Check the tensor shape and dimension carefully to get correct probability outputs.
- nn.Softmax is a module; call it on tensors to get the softmax result.