How to Use nn.functional in PyTorch: Syntax and Examples
In PyTorch, nn.functional provides stateless functions, such as activation and loss functions, that you can call directly without creating layer objects. Use it by importing torch.nn.functional and calling functions such as F.relu() or F.cross_entropy() with tensors as inputs.
Syntax
The nn.functional module contains functions that perform operations like activation, loss calculation, and convolution without storing parameters. You call these functions directly with input tensors and optional arguments.
- Import: `import torch.nn.functional as F`
- Function call: `F.function_name(input, other_args)`
- Example functions: `relu`, `softmax`, `cross_entropy`, `mse_loss`
```python
import torch
import torch.nn.functional as F

# Example syntax for ReLU activation
input_tensor = torch.tensor([-1.0, 0.0, 1.0, 2.0])
output = F.relu(input_tensor)
print(output)
```
Output
tensor([0., 0., 1., 2.])
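The same calling pattern applies to the other functions listed above. As a minimal sketch, here is how `F.softmax` and `F.mse_loss` are called with tensors (the input values are illustrative):

```python
import torch
import torch.nn.functional as F

# Softmax converts logits to probabilities that sum to 1 along the given dim
logits = torch.tensor([[1.0, 2.0, 3.0]])
probs = F.softmax(logits, dim=1)

# MSE loss averages the squared element-wise differences
pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 2.0, 5.0])
loss = F.mse_loss(pred, target)  # ((0)^2 + (0)^2 + (2)^2) / 3, approximately 1.3333
```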
Example
This example shows how to use nn.functional to apply ReLU activation and compute cross-entropy loss for a simple classification task.
```python
import torch
import torch.nn.functional as F

# Input tensor (logits) for 3 samples and 4 classes
logits = torch.tensor([[1.0, 2.0, 0.5, 0.0],
                       [0.1, 0.2, 0.3, 0.4],
                       [2.0, 1.0, 0.1, 0.0]], requires_grad=True)

# Target class indices
targets = torch.tensor([1, 3, 0])

# Apply ReLU activation
activated = F.relu(logits)

# Compute cross-entropy loss (combines log_softmax + negative log-likelihood)
loss = F.cross_entropy(logits, targets)

print("Activated output:", activated)
print("Cross-entropy loss:", round(loss.item(), 4))
```
Output
Activated output: tensor([[1.0000, 2.0000, 0.5000, 0.0000],
        [0.1000, 0.2000, 0.3000, 0.4000],
        [2.0000, 1.0000, 0.1000, 0.0000]], grad_fn=<ReluBackward0>)
Cross-entropy loss: 0.7637
Common Pitfalls
Common mistakes when using nn.functional include:
- Mixing up which functions expect logits and which expect probabilities: `F.cross_entropy` takes raw logits (it applies log_softmax internally), while `F.nll_loss` takes log-probabilities (apply `F.log_softmax` first).
- Applying `F.softmax` before `F.cross_entropy`; this double-applies softmax and silently distorts the loss.
- Forgetting to set `requires_grad=True` on leaf tensors when you want gradients with respect to them.
- Confusing `nn.Module` layers with `nn.functional` functions; the latter do not store parameters.
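The relationship between the logit-based and log-probability-based loss functions can be made concrete: `F.cross_entropy` is equivalent to `F.log_softmax` followed by `F.nll_loss`. A quick sketch:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 0.5]])
target = torch.tensor([1])

# F.cross_entropy applies log_softmax internally, so it takes raw logits
loss_ce = F.cross_entropy(logits, target)

# F.nll_loss expects log-probabilities, so log_softmax must come first
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_ce, loss_nll))  # True
```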
Wrong vs Right example:
```python
import torch
import torch.nn.functional as F

# Wrong: applying softmax before cross_entropy
# (cross_entropy expects raw logits and applies softmax internally)
logits = torch.tensor([[1.0, 2.0, 0.5]], requires_grad=True)
probs = F.softmax(logits, dim=1)
loss_wrong = F.cross_entropy(probs, torch.tensor([1]))  # runs, but double-applies softmax
print("Silently wrong loss:", round(loss_wrong.item(), 4))

# Right: pass raw logits directly
loss_right = F.cross_entropy(logits, torch.tensor([1]))
print("Loss computed correctly:", round(loss_right.item(), 4))
```
Output
Silently wrong loss: 0.8267
Loss computed correctly: 0.4644
Note that no exception is raised: the wrong version simply produces a different, incorrect loss, which is what makes this mistake easy to miss.
Quick Reference
Here are some commonly used nn.functional functions and their purposes:
| Function | Purpose |
|---|---|
| F.relu(input) | Applies ReLU activation (sets negatives to zero) |
| F.softmax(input, dim) | Converts logits to probabilities along dimension dim |
| F.cross_entropy(input, target) | Computes cross-entropy loss from raw logits and class indices |
| F.mse_loss(input, target) | Computes mean squared error loss |
| F.dropout(input, p) | Randomly zeroes some elements with probability p during training |
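One detail worth noting about the last row: unlike the `nn.Dropout` module, the functional form does not react to `model.eval()`; you control it with its `training` argument (which defaults to True). A short sketch:

```python
import torch
import torch.nn.functional as F

x = torch.ones(1000)

# training=True (the default): elements are zeroed with probability p,
# and surviving elements are scaled by 1 / (1 - p)
dropped = F.dropout(x, p=0.5, training=True)

# training=False: dropout is a no-op, output equals input
passed = F.dropout(x, p=0.5, training=False)

print(torch.equal(passed, x))  # True
```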
Key Takeaways
- Use nn.functional functions by importing torch.nn.functional as F and calling F.function_name with tensors.
- Functions in nn.functional are stateless and do not store parameters like nn.Module layers do.
- Pass raw logits to loss functions like F.cross_entropy, not probabilities.
- Set requires_grad=True on tensors if you want to compute gradients.
- Common functions include relu, softmax, cross_entropy, mse_loss, and dropout.
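The "stateless" point above can be made concrete: an `nn.Module` layer such as `nn.Linear` stores its weight and bias internally, while the functional equivalent requires you to pass the parameters yourself. A minimal sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 4)

# nn.Linear stores its weight and bias as internal parameters
layer = nn.Linear(4, 3)
out_module = layer(x)

# F.linear is stateless: the parameters must be passed explicitly
out_functional = F.linear(x, layer.weight, layer.bias)

print(torch.equal(out_module, out_functional))  # True
```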