How to Use CUDA in PyTorch for GPU Acceleration
To use CUDA in PyTorch, first check whether a GPU is available with torch.cuda.is_available(). Then move your model and tensors to the GPU using .to('cuda') or .cuda(). Running operations on the GPU rather than the CPU can speed up computation significantly.
Syntax
Here is the basic syntax to use CUDA in PyTorch:
- torch.cuda.is_available(): Checks whether a CUDA-enabled GPU is available.
- tensor.to('cuda') or tensor.cuda(): Moves a tensor to the GPU.
- model.to('cuda') or model.cuda(): Moves the model to the GPU.
Use these to run your computations on GPU instead of CPU.
```python
import torch

# Check if CUDA is available
cuda_available = torch.cuda.is_available()

# Create a tensor (on the CPU by default)
x = torch.tensor([1.0, 2.0, 3.0])

if cuda_available:
    # Move the tensor to the GPU
    x = x.to('cuda')

# Print the device of the tensor
print(x.device)
```
Output
cuda:0
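Beyond the availability check, torch.cuda also exposes queries for how many GPUs are visible and which one is active. As a small sketch (the exact output depends on your hardware, so none is shown here):

```python
import torch

# Guard every CUDA query behind the availability check
if torch.cuda.is_available():
    print(torch.cuda.device_count())      # number of visible GPUs
    print(torch.cuda.get_device_name(0))  # name of the first GPU
    print(torch.cuda.current_device())    # index of the active GPU
else:
    print("CUDA is not available; running on CPU")
```

These queries are useful when selecting a specific GPU on multi-GPU machines, e.g. tensor.to('cuda:1').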
Example
This example shows how to move a simple model and data to CUDA, perform a forward pass, and print the output device.
```python
import torch
import torch.nn as nn

# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 1)

    def forward(self, x):
        return self.linear(x)

# Check CUDA availability and pick the device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Create the model and move it to the device
model = SimpleModel().to(device)

# Create the input tensor on the same device
input_tensor = torch.tensor([[1.0, 2.0, 3.0]], device=device)

# Forward pass
output = model(input_tensor)
print(f"Output: {output}")
print(f"Output device: {output.device}")
Output
Output: tensor([[value]], device='cuda:0', grad_fn=<AddmmBackward0>)
Output device: cuda:0
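After a forward pass on the GPU, results often need to come back to the CPU, for example to convert them to NumPy arrays, which only work with CPU tensors. A minimal sketch, assuming the same kind of linear model as above:

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = nn.Linear(3, 1).to(device)
x = torch.tensor([[1.0, 2.0, 3.0]], device=device)
out = model(x)

# NumPy cannot read GPU tensors: detach from the graph,
# move to the CPU, then convert
result = out.detach().cpu().numpy()
print(result.shape)  # (1, 1)
```

Calling .numpy() directly on a CUDA tensor raises an error, so the detach-then-cpu step is the standard pattern.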
Common Pitfalls
Common mistakes when using CUDA in PyTorch include:
- Trying to perform operations on tensors located on different devices (CPU vs GPU), which causes errors.
- Not moving both the model and input tensors to the same device.
- Assuming CUDA is always available without checking torch.cuda.is_available().
Always ensure all tensors and models are on the same device before computation.
```python
import torch

# Wrong: model on CPU, input on GPU
model = torch.nn.Linear(3, 1)  # on CPU by default
input_tensor = torch.tensor([[1.0, 2.0, 3.0]]).to('cuda')

try:
    output = model(input_tensor)  # This raises a RuntimeError
except RuntimeError as e:
    print(f"Error: {e}")

# Right: move the model to the GPU as well
model = model.to('cuda')
output = model(input_tensor)
print(f"Output device: {output.device}")
```
Output
Error: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Output device: cuda:0
Quick Reference
| Command | Description |
|---|---|
| torch.cuda.is_available() | Check if CUDA GPU is available |
| tensor.to('cuda') | Move tensor to GPU |
| model.to('cuda') | Move model to GPU |
| tensor.to('cpu') | Move tensor back to CPU |
| model.to('cpu') | Move model back to CPU |
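The commands in the table combine naturally into the device-agnostic pattern PyTorch code commonly uses: pick the device once, then reuse it everywhere. A minimal sketch:

```python
import torch
import torch.nn as nn

# Pick the device once; the rest of the code never
# mentions 'cuda' directly
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(3, 1).to(device)
batch = torch.randn(4, 3).to(device)
preds = model(batch)

# Bring predictions back to the CPU, e.g. for logging
preds_cpu = preds.detach().to('cpu')
print(preds_cpu.shape)  # torch.Size([4, 1])
```

Because everything references the single device variable, the same script runs unchanged on machines with or without a GPU.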
Key Takeaways
- Always check CUDA availability with torch.cuda.is_available() before using the GPU.
- Move both your model and tensors to the same device (CPU or GPU) to avoid errors.
- Use .to('cuda') or .cuda() to move data and models to the GPU for faster computation.
- Operations on tensors must happen on the same device to prevent runtime errors.
- You can switch back to the CPU anytime with .to('cpu') if needed.