How to Use CUDA in PyTorch for GPU Acceleration
To use CUDA in PyTorch, first check whether a GPU is available with torch.cuda.is_available(). Then move your model and tensors to the GPU using .to('cuda') or .cuda(). Running operations on the GPU rather than the CPU can speed up computation significantly.
Syntax
Here is the basic syntax to use CUDA in PyTorch:
- torch.cuda.is_available(): Checks whether a CUDA-enabled GPU is available.
- tensor.to('cuda') or tensor.cuda(): Moves a tensor to the GPU.
- model.to('cuda') or model.cuda(): Moves the model to the GPU.
Use these to run your computations on GPU instead of CPU.
```python
import torch

# Check if CUDA is available
cuda_available = torch.cuda.is_available()

# Create a tensor (on the CPU by default)
x = torch.tensor([1.0, 2.0, 3.0])

if cuda_available:
    # Move the tensor to the GPU
    x = x.to('cuda')

# Print the device of the tensor
print(x.device)
```
Output
cuda:0
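Beyond the availability check, torch.cuda also exposes queries for how many GPUs are visible and which one is active. As a small sketch (the exact output depends on your hardware, so none is shown here):

```python
import torch

# Guard every CUDA query behind the availability check
if torch.cuda.is_available():
    print(torch.cuda.device_count())      # number of visible GPUs
    print(torch.cuda.get_device_name(0))  # name of the first GPU
    print(torch.cuda.current_device())    # index of the active GPU
else:
    print("CUDA is not available; running on CPU")
```

These queries are useful when selecting a specific GPU on multi-GPU machines, e.g. tensor.to('cuda:1').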
Example
This example shows how to move a simple model and data to CUDA, perform a forward pass, and print the output device.
```python
import torch
import torch.nn as nn

# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(3, 1)

    def forward(self, x):
        return self.linear(x)

# Check CUDA availability and pick the device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Create the model and move it to the device
model = SimpleModel().to(device)

# Create the input tensor on the same device
input_tensor = torch.tensor([[1.0, 2.0, 3.0]], device=device)

# Forward pass
output = model(input_tensor)
print(f"Output: {output}")
print(f"Output device: {output.device}")
Output
Output: tensor([[value]], device='cuda:0', grad_fn=<AddmmBackward0>)
Output device: cuda:0
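After a forward pass on the GPU, results often need to come back to the CPU, for example to convert them to NumPy arrays, which only work with CPU tensors. A minimal sketch, assuming the same kind of linear model as above:

```python
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = nn.Linear(3, 1).to(device)
x = torch.tensor([[1.0, 2.0, 3.0]], device=device)
out = model(x)

# NumPy cannot read GPU tensors: detach from the graph,
# move to the CPU, then convert
result = out.detach().cpu().numpy()
print(result.shape)  # (1, 1)
```

Calling .numpy() directly on a CUDA tensor raises an error, so the detach-then-cpu step is the standard pattern.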
Common Pitfalls
Common mistakes when using CUDA in PyTorch include:
- Trying to perform operations on tensors located on different devices (CPU vs GPU), which causes errors.
- Not moving both the model and input tensors to the same device.
- Assuming CUDA is always available without checking torch.cuda.is_available().
Always ensure all tensors and models are on the same device before computation.
```python
import torch

# Wrong: model on CPU, input on GPU
model = torch.nn.Linear(3, 1)  # on CPU by default
input_tensor = torch.tensor([[1.0, 2.0, 3.0]]).to('cuda')

try:
    output = model(input_tensor)  # This raises a RuntimeError
except RuntimeError as e:
    print(f"Error: {e}")

# Right: move the model to the GPU as well
model = model.to('cuda')
output = model(input_tensor)
print(f"Output device: {output.device}")
```
Output
Error: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Output device: cuda:0
Quick Reference
| Command | Description |
|---|---|
| torch.cuda.is_available() | Check if CUDA GPU is available |
| tensor.to('cuda') | Move tensor to GPU |
| model.to('cuda') | Move model to GPU |
| tensor.to('cpu') | Move tensor back to CPU |
| model.to('cpu') | Move model back to CPU |
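The commands in the table combine naturally into the device-agnostic pattern PyTorch code commonly uses: pick the device once, then reuse it everywhere. A minimal sketch:

```python
import torch
import torch.nn as nn

# Pick the device once; the rest of the code never
# mentions 'cuda' directly
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

model = nn.Linear(3, 1).to(device)
batch = torch.randn(4, 3).to(device)
preds = model(batch)

# Bring predictions back to the CPU, e.g. for logging
preds_cpu = preds.detach().to('cpu')
print(preds_cpu.shape)  # torch.Size([4, 1])
```

Because everything references the single device variable, the same script runs unchanged on machines with or without a GPU.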
Key Takeaways
- Always check CUDA availability with torch.cuda.is_available() before using the GPU.
- Move both your model and tensors to the same device (CPU or GPU) to avoid errors.
- Use .to('cuda') or .cuda() to move data and models to the GPU for faster computation.
- Operations on tensors must happen on the same device to prevent runtime errors.
- You can switch back to the CPU anytime with .to('cpu') if needed.