How to Use nn.functional in PyTorch: Syntax and Examples
In PyTorch, nn.functional provides stateless functions, such as activation and loss functions, that you can call directly without creating layer objects. Use it by importing torch.nn.functional and calling functions such as F.relu() or F.cross_entropy() with tensors as inputs.
Syntax
The nn.functional module contains functions that perform operations like activation, loss calculation, and convolution without storing parameters. You call these functions directly with input tensors and optional arguments.
- Import: `import torch.nn.functional as F`
- Function call: `F.function_name(input, other_args)`
- Example functions: `relu`, `softmax`, `cross_entropy`, `mse_loss`
```python
import torch
import torch.nn.functional as F

# Example syntax for ReLU activation
input_tensor = torch.tensor([-1.0, 0.0, 1.0, 2.0])
output = F.relu(input_tensor)
print(output)
```
Output
tensor([0., 0., 1., 2.])
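The same calling pattern applies to the other functions listed above. As a minimal sketch, here is how `F.softmax` and `F.mse_loss` are called with tensors (the input values are illustrative):

```python
import torch
import torch.nn.functional as F

# Softmax converts logits to probabilities that sum to 1 along the given dim
logits = torch.tensor([[1.0, 2.0, 3.0]])
probs = F.softmax(logits, dim=1)

# MSE loss averages the squared element-wise differences
pred = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.0, 2.0, 5.0])
loss = F.mse_loss(pred, target)  # ((0)^2 + (0)^2 + (2)^2) / 3, approximately 1.3333
```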
Example
This example shows how to use nn.functional to apply ReLU activation and compute cross-entropy loss for a simple classification task.
```python
import torch
import torch.nn.functional as F

# Input tensor (logits) for 3 samples and 4 classes
logits = torch.tensor([[1.0, 2.0, 0.5, 0.0],
                       [0.1, 0.2, 0.3, 0.4],
                       [2.0, 1.0, 0.1, 0.0]], requires_grad=True)

# Target class indices
targets = torch.tensor([1, 3, 0])

# Apply ReLU activation
activated = F.relu(logits)

# Compute cross-entropy loss (combines log_softmax + negative log-likelihood)
loss = F.cross_entropy(logits, targets)

print("Activated output:", activated)
print("Cross-entropy loss:", round(loss.item(), 4))
```
Output
Activated output: tensor([[1.0000, 2.0000, 0.5000, 0.0000],
        [0.1000, 0.2000, 0.3000, 0.4000],
        [2.0000, 1.0000, 0.1000, 0.0000]], grad_fn=<ReluBackward0>)
Cross-entropy loss: 0.7637
Common Pitfalls
Common mistakes when using nn.functional include:
- Mixing up which functions expect logits and which expect probabilities: `F.cross_entropy` takes raw logits (it applies log_softmax internally), while `F.nll_loss` takes log-probabilities (apply `F.log_softmax` first).
- Applying `F.softmax` before `F.cross_entropy`; this double-applies softmax and silently distorts the loss.
- Forgetting to set `requires_grad=True` on leaf tensors when you want gradients with respect to them.
- Confusing `nn.Module` layers with `nn.functional` functions; the latter do not store parameters.
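The relationship between the logit-based and log-probability-based loss functions can be made concrete: `F.cross_entropy` is equivalent to `F.log_softmax` followed by `F.nll_loss`. A quick sketch:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[1.0, 2.0, 0.5]])
target = torch.tensor([1])

# F.cross_entropy applies log_softmax internally, so it takes raw logits
loss_ce = F.cross_entropy(logits, target)

# F.nll_loss expects log-probabilities, so log_softmax must come first
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target)

print(torch.allclose(loss_ce, loss_nll))  # True
```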
Wrong vs Right example:
```python
import torch
import torch.nn.functional as F

# Wrong: applying softmax before cross_entropy
# (cross_entropy expects raw logits and applies softmax internally)
logits = torch.tensor([[1.0, 2.0, 0.5]], requires_grad=True)
probs = F.softmax(logits, dim=1)
loss_wrong = F.cross_entropy(probs, torch.tensor([1]))  # runs, but double-applies softmax
print("Silently wrong loss:", round(loss_wrong.item(), 4))

# Right: pass raw logits directly
loss_right = F.cross_entropy(logits, torch.tensor([1]))
print("Loss computed correctly:", round(loss_right.item(), 4))
```
Output
Silently wrong loss: 0.8267
Loss computed correctly: 0.4644
Note that no exception is raised: the wrong version simply produces a different, incorrect loss, which is what makes this mistake easy to miss.
Quick Reference
Here are some commonly used nn.functional functions and their purposes:
| Function | Purpose |
|---|---|
| F.relu(input) | Applies ReLU activation (sets negatives to zero) |
| F.softmax(input, dim) | Converts logits to probabilities along dimension dim |
| F.cross_entropy(input, target) | Computes cross-entropy loss from raw logits and class indices |
| F.mse_loss(input, target) | Computes mean squared error loss |
| F.dropout(input, p) | Randomly zeroes some elements with probability p during training |
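One detail worth noting about the last row: unlike the `nn.Dropout` module, the functional form does not react to `model.eval()`; you control it with its `training` argument (which defaults to True). A short sketch:

```python
import torch
import torch.nn.functional as F

x = torch.ones(1000)

# training=True (the default): elements are zeroed with probability p,
# and surviving elements are scaled by 1 / (1 - p)
dropped = F.dropout(x, p=0.5, training=True)

# training=False: dropout is a no-op, output equals input
passed = F.dropout(x, p=0.5, training=False)

print(torch.equal(passed, x))  # True
```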
Key Takeaways
- Use nn.functional functions by importing torch.nn.functional as F and calling F.function_name with tensors.
- Functions in nn.functional are stateless and do not store parameters like nn.Module layers do.
- Pass raw logits to loss functions like F.cross_entropy, not probabilities.
- Set requires_grad=True on tensors if you want to compute gradients.
- Common functions include relu, softmax, cross_entropy, mse_loss, and dropout.
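The "stateless" point above can be made concrete: an `nn.Module` layer such as `nn.Linear` stores its weight and bias internally, while the functional equivalent requires you to pass the parameters yourself. A minimal sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(2, 4)

# nn.Linear stores its weight and bias as internal parameters
layer = nn.Linear(4, 3)
out_module = layer(x)

# F.linear is stateless: the parameters must be passed explicitly
out_functional = F.linear(x, layer.weight, layer.bias)

print(torch.equal(out_module, out_functional))  # True
```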