What is .grad in PyTorch: Explanation and Usage
.grad in PyTorch is an attribute of a tensor that stores the gradient (derivative) of a scalar output with respect to that tensor. During backpropagation it holds the computed gradients needed for updating model parameters.
How It Works
Imagine you are trying to find out how changing one ingredient in a recipe affects the final taste. In machine learning, this is like finding how changing a number (a tensor) affects the final result (loss). The .grad attribute in PyTorch holds this information, called the gradient.
When you run backpropagation, PyTorch calculates gradients automatically and stores them in the .grad attribute of each tensor that requires gradients. This is like writing down how much each ingredient should be adjusted to improve the recipe.
These gradients are then used by optimization algorithms to update the model’s parameters, helping the model learn from data.
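To make this concrete, here is a minimal sketch of a single gradient-descent step done by hand, using the gradient stored in .grad. The parameter value, input, target, and learning rate are illustrative choices, not values from the article:

```python
import torch

# A tiny "model": w is the single parameter we want to learn
w = torch.tensor(1.0, requires_grad=True)
x = torch.tensor(3.0)       # input (illustrative value)
target = torch.tensor(6.0)  # desired output (illustrative value)

lr = 0.1  # learning rate (an assumed value)

loss = (w * x - target) ** 2  # squared error
loss.backward()               # fills w.grad with d(loss)/dw = 2*(w*x - target)*x = -18

with torch.no_grad():
    w -= lr * w.grad          # gradient-descent update using the stored gradient
w.grad.zero_()                # clear the gradient before the next step

print(w)  # tensor(2.8000, ...): 1.0 - 0.1 * (-18) = 2.8
```

Real training loops delegate this update to an optimizer such as torch.optim.SGD, but the optimizer reads the same .grad attribute shown here.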
Example
This example shows how to create a tensor, perform a simple operation, run backpropagation, and access the .grad attribute to see the gradient.
import torch

# Create a tensor with gradient tracking enabled
x = torch.tensor(2.0, requires_grad=True)

# Define a simple function y = x^2
y = x ** 2

# Compute gradients (dy/dx)
y.backward()

# Print the gradient stored in x.grad
print(x.grad)  # tensor(4.), since dy/dx = 2x = 4 at x = 2
When to Use
You use .grad when you want to know how a change in a tensor affects a result, especially during training machine learning models. It is essential for updating model weights to minimize errors.
For example, in neural networks, after calculating the loss, you call backward() to compute gradients, then access .grad to see how each parameter should change. This guides the optimizer to improve the model.
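The loop described above can be sketched with a tiny network. The layer sizes, seed, learning rate, and data here are illustrative assumptions, not details from the article:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducibility (seed is arbitrary)

# A minimal one-layer model; sizes are illustrative
model = nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

inputs = torch.randn(4, 3)
targets = torch.randn(4, 1)

loss = nn.functional.mse_loss(model(inputs), targets)
loss.backward()  # fills .grad for every parameter of the model

# Inspect how each parameter "should change" before the optimizer uses it
print(model.weight.grad)  # one gradient per weight, shape (1, 3)
print(model.bias.grad)    # shape (1,)

optimizer.step()       # updates parameters using the stored .grad values
optimizer.zero_grad()  # clear gradients so they don't accumulate next step
```

Note the zero_grad() call at the end: PyTorch accumulates gradients into .grad across backward passes, so training loops reset them once per step.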
It is also useful in custom gradient calculations or when debugging your model’s learning process.
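For debugging, a common pattern is to walk over the model's parameters after backward() and print each gradient's norm, which helps spot vanishing or exploding gradients. The model architecture and input below are hypothetical:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for reproducibility
model = nn.Sequential(nn.Linear(2, 4), nn.ReLU(), nn.Linear(4, 1))

loss = model(torch.randn(8, 2)).mean()
loss.backward()

# Print per-parameter gradient norms; near-zero or huge values signal trouble
for name, param in model.named_parameters():
    print(f"{name}: grad norm = {param.grad.norm():.4f}")
```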
Key Points
- .grad stores the gradient of a tensor after backpropagation.
- Gradients are used to update model parameters during training.
- Only tensors with requires_grad=True will have .grad populated.
- You must call backward() on a scalar output to compute gradients.
- .grad is None before backpropagation.
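Two of these points, that .grad starts out as None and that gradients accumulate across backward passes, can be verified in a few lines (the values here are illustrative):

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
print(x.grad)  # None: no backward pass has run yet

y = x ** 2
y.backward()
print(x.grad)  # tensor(6.), since dy/dx = 2x = 6 at x = 3

# Gradients accumulate: a second backward pass adds to the stored value
(x ** 2).backward()
print(x.grad)  # tensor(12.): 6 + 6
```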
Key Takeaways
- .grad holds the gradient of a tensor after backpropagation.
- Only tensors with requires_grad=True have gradients computed.
- Call backward() on a scalar to compute gradients.
- .grad is None until gradients are calculated.