How Autograd Works in PyTorch: Simple Explanation and Example
PyTorch's autograd automatically tracks operations on tensors with requires_grad=True and computes gradients during backpropagation via backward(). It builds a dynamic computation graph on the fly, enabling efficient gradient calculation for training models.
Syntax
To use autograd, create tensors with requires_grad=True to track operations. Call backward() on a scalar output to compute gradients. Access gradients via the .grad attribute of tensors.
- tensor = torch.tensor(data, requires_grad=True): creates a tensor that tracks operations.
- output.backward(): computes gradients of output with respect to its inputs.
- tensor.grad: holds the gradient after the backward() call.
python
import torch

# Create tensor with gradient tracking
x = torch.tensor([2.0, 3.0], requires_grad=True)

# Perform operations
y = x * x + 3 * x + 1

# Compute gradients of the sum of y
y.sum().backward()

# Access gradients
print(x.grad)
Output
tensor([7., 9.])
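To see where [7., 9.] comes from, note that d/dx (x² + 3x + 1) = 2x + 3, which evaluates to 7 and 9 at x = 2 and x = 3. A quick sketch checking autograd against this analytic derivative, and peeking at the graph node recorded on y:

```python
import torch

x = torch.tensor([2.0, 3.0], requires_grad=True)
y = x * x + 3 * x + 1

# Each intermediate tensor records the op that produced it,
# forming the dynamic computation graph
print(y.grad_fn)  # a backward node, e.g. AddBackward0

y.sum().backward()

# Analytic check: d/dx (x^2 + 3x + 1) = 2x + 3 -> [7., 9.]
print(torch.allclose(x.grad, 2 * x.detach() + 3))
```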
Example
This example shows how autograd tracks operations and computes gradients automatically. We define a simple function, compute its output, and call backward() to get gradients.
python
import torch

# Create input tensor with gradient tracking
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Define a function of x
y = 2 * x + 3
z = y.pow(2).sum()  # scalar output

# Compute gradients
z.backward()

# Print gradients of x
print('x:', x)
print('Gradient of z w.r.t x:', x.grad)
Output
x: tensor([1., 2., 3.], requires_grad=True)
Gradient of z w.r.t x: tensor([20., 28., 36.])
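These values follow from the chain rule: with z = Σ(2x + 3)², each component of the gradient is dz/dx = 2(2x + 3) · 2 = 4(2x + 3), giving 20, 28, and 36 at x = 1, 2, 3. A minimal sketch verifying this against autograd:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
z = (2 * x + 3).pow(2).sum()
z.backward()

# Chain rule: dz/dx = 2 * (2x + 3) * 2 = 4 * (2x + 3)
expected = 4 * (2 * x.detach() + 3)
print(torch.allclose(x.grad, expected))
print(expected)  # tensor([20., 28., 36.])
```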
Common Pitfalls
Common mistakes when using autograd include:
- Not setting requires_grad=True on input tensors, so gradients are not tracked.
- Calling backward() on a non-scalar tensor without passing a gradient argument.
- Reusing tensors without detaching or zeroing gradients, causing incorrect gradient accumulation.
- Modifying tensors in-place, which can break the computation graph.
python
import torch

# Wrong: requires_grad not set
x = torch.tensor([1.0, 2.0, 3.0])
y = x * 2
try:
    y.backward()
except RuntimeError as e:
    print('Error:', e)

# Right: requires_grad=True
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2
# y is not scalar, so specify a gradient argument
y.backward(torch.tensor([1.0, 1.0, 1.0]))
print('Gradients:', x.grad)
Output
Error: element 0 of tensors does not require grad and does not have a grad_fn
Gradients: tensor([2., 2., 2.])
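The in-place pitfall from the list above can also be demonstrated with a minimal sketch. Operations like sigmoid save their output for the backward pass, so modifying that output in place invalidates the graph and backward() raises an error (the exact message varies by PyTorch version):

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x.sigmoid()  # sigmoid saves its output for the backward pass
y.mul_(2.0)      # in-place edit of a tensor the graph still needs

try:
    y.sum().backward()
except RuntimeError as e:
    # autograd detects the version mismatch and refuses to compute gradients
    print('Error:', e)
```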
Quick Reference
Summary tips for using PyTorch autograd:
- Always set requires_grad=True on tensors you want gradients for.
- Call backward() on scalar outputs to compute gradients.
- Access gradients via the .grad attribute after backward().
- Use zero_grad() on optimizers to clear gradients before new backward calls.
- Detach tensors to stop tracking when needed with .detach().
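A minimal sketch of the accumulation and detach tips (plain tensors here rather than an optimizer, so gradients are cleared with grad.zero_()):

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# Gradients accumulate across backward() calls
(x * x).sum().backward()  # d/dx x^2 = 2x -> [2., 4.]
(x * x).sum().backward()  # adds another [2., 4.]
print(x.grad)             # tensor([4., 8.])

# Clear before the next backward pass
x.grad.zero_()

# .detach() returns a tensor that no longer tracks operations
d = x.detach()
print(d.requires_grad)    # False
```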
Key Takeaways
PyTorch autograd tracks tensor operations dynamically to compute gradients automatically.
Set requires_grad=True on tensors to enable gradient tracking.
Call backward() on scalar outputs to compute gradients for all dependent tensors.
Access computed gradients via the .grad attribute of tensors.
Avoid common mistakes like missing requires_grad or calling backward() on non-scalar outputs without a gradient argument.