PyTorch · ~20 mins

GPU tensors (to, cuda) in PyTorch - ML Experiment: Train & Evaluate

Experiment - GPU tensors (to, cuda)
Problem: You have a simple neural network training on the CPU. Training is slow because it does not use GPU acceleration.
Current Metrics: Training time per epoch: 12 seconds, Training accuracy: 85%, Validation accuracy: 83%
Issue: Training is slow because the model and its tensors live on the CPU instead of the GPU, which limits speed and efficiency.
Your Task
Move the model and data tensors to GPU to speed up training time while maintaining or improving accuracy.
You must use PyTorch and keep the same model architecture.
You cannot change the dataset or batch size.
You must ensure the code runs without errors on a machine with CUDA-enabled GPU.
Solution
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Check device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Simple model
class SimpleNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)
    def forward(self, x):
        x = x.view(-1, 28*28)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Data
transform = transforms.ToTensor()
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

# Model
model = SimpleNN().to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters())

# Training loop
for epoch in range(3):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)  # Move to GPU
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        _, predicted = torch.max(outputs, 1)  # avoid .data; it bypasses autograd tracking
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    accuracy = 100 * correct / total
    print(f'Epoch {epoch+1}, Loss: {running_loss/len(train_loader):.4f}, Accuracy: {accuracy:.2f}%')
Key Changes
Added device detection with torch.device and torch.cuda.is_available().
Moved the model to the GPU with model.to(device).
Moved input tensors and labels to the GPU inside the training loop with images.to(device) and labels.to(device).
Results Interpretation

Before: Training time per epoch: 12 seconds, Accuracy: 85%

After: Training time per epoch: 3 seconds, Accuracy: 85%

Moving tensors and model to GPU speeds up training significantly without changing accuracy. Using .to('cuda') or .cuda() is essential for GPU acceleration in PyTorch.
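The difference between .to(device) and .cuda() can be seen in a minimal, device-agnostic sketch (the tensor shapes here are arbitrary, chosen only for illustration):

```python
import torch

# Device-agnostic setup: falls back to CPU when no GPU is present
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.randn(4, 4)
print(x.device)   # tensors are created on the CPU by default

y = x.to(device)  # safe on any machine: moves to GPU only if one exists
print(y.device)

# x.cuda() is equivalent to x.to('cuda'), but it raises a RuntimeError on a
# machine without a CUDA GPU, so the .to(device) pattern above is preferred
# for code that must run everywhere.
```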
Bonus Experiment
Try adding a validation loop on GPU to measure validation accuracy after each epoch.
💡 Hint
Move validation data to the same device and disable gradient calculation with torch.no_grad() during validation.
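One possible shape for that validation loop is sketched below. It reuses the same device-agnostic pattern as the training code; a small synthetic tensor dataset stands in for the real MNIST validation split so the sketch is self-contained (the evaluate helper and the synthetic data are illustrative, not part of the original solution):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Same architecture as the training example, written as a Sequential for brevity
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
).to(device)

def evaluate(model, loader, device):
    model.eval()  # switch layers like dropout/batch-norm to inference mode
    correct = total = 0
    with torch.no_grad():  # no gradients needed: saves memory and time
        for images, labels in loader:
            # Validation data must be on the same device as the model
            images, labels = images.to(device), labels.to(device)
            outputs = model(images)
            predicted = outputs.argmax(dim=1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total

# Synthetic stand-in for the MNIST validation set (256 fake 28x28 images)
val_data = TensorDataset(torch.randn(256, 1, 28, 28),
                         torch.randint(0, 10, (256,)))
val_loader = DataLoader(val_data, batch_size=64)

print(f'Validation accuracy: {evaluate(model, val_loader, device):.2f}%')
```

Calling evaluate(model, val_loader, device) at the end of each training epoch would report validation accuracy alongside the training metrics already printed in the solution.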