
Dropout (nn.Dropout) in PyTorch - ML Experiment: Train & Evaluate

Experiment - Dropout (nn.Dropout)

Problem: You are training a neural network to classify images from the FashionMNIST dataset. The current model achieves 98% accuracy on training data but only 75% on validation data.

Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85

Issue: The model is overfitting: it performs very well on training data but poorly on unseen validation data.

Your Task

Reduce overfitting by improving validation accuracy to at least 85% while keeping training accuracy below 92%.

You can only add dropout layers to the existing model.

Do not change the dataset or the optimizer.

Keep the number of epochs and batch size the same.

Solution

PyTorch

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define the neural network with dropout
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28*28, 256)
        self.dropout1 = nn.Dropout(0.3)
        self.fc2 = nn.Linear(256, 128)
        self.dropout2 = nn.Dropout(0.3)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = torch.relu(self.fc1(x))
        x = self.dropout1(x)
        x = torch.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x)
        return x

# Prepare data
transform = transforms.ToTensor()
train_dataset = datasets.FashionMNIST(root='.', train=True, download=True, transform=transform)
val_dataset = datasets.FashionMNIST(root='.', train=False, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64)

# Initialize model, loss, optimizer
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(10):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    train_loss = running_loss / total
    train_acc = 100 * correct / total

    model.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            outputs = model(images)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * images.size(0)
            _, predicted = torch.max(outputs, 1)
            val_total += labels.size(0)
            val_correct += (predicted == labels).sum().item()
    val_loss /= val_total
    val_acc = 100 * val_correct / val_total

    print(f'Epoch {epoch+1}: Train Loss={train_loss:.4f}, Train Acc={train_acc:.2f}%, Val Loss={val_loss:.4f}, Val Acc={val_acc:.2f}%')

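Added nn.Dropout layers with a dropout rate of 0.3 after the ReLU activation of each hidden layer; the output layer has no dropout.

During training, each hidden activation is zeroed independently with probability 0.3, and the remaining activations are scaled by 1/(1 - 0.3) so their expected value is unchanged. Calling model.eval() turns dropout off for validation.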
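
The difference between training and evaluation mode can be checked directly. The following is a minimal sketch, assuming an all-ones input tensor chosen purely for illustration (it is not part of the experiment): in training mode nn.Dropout(0.3) zeroes each element with probability 0.3 and scales the survivors by 1/(1 - 0.3), while in evaluation mode it returns the input unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random mask is reproducible

drop = nn.Dropout(p=0.3)
x = torch.ones(1000)  # illustrative input, not FashionMNIST data

drop.train()          # training mode: dropout is active
y_train = drop(x)

drop.eval()           # evaluation mode: dropout is a no-op
y_eval = drop(x)

kept = y_train != 0   # mask of elements that survived dropout
zero_fraction = 1.0 - kept.float().mean().item()

print(f"fraction zeroed in train mode: {zero_fraction:.2f}")
print(f"value of surviving elements:   {y_train[kept][0].item():.4f}")
print(f"eval output equals input:      {torch.equal(y_eval, x)}")
```

The zeroed fraction should land near 0.3 rather than exactly on it, because each element is dropped independently.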

Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, Training loss 0.05, Validation loss 0.85

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.25, Validation loss 0.40

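Adding dropout reduces overfitting by preventing the model from relying too heavily on specific neurons. The gap between training and validation accuracy narrows from 23 percentage points (98% vs. 75%) to 3 percentage points (90% vs. 87%), which indicates that the model generalizes better.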

Bonus Experiment

Try different dropout rates (e.g., 0.2, 0.4, 0.5) and observe how validation accuracy changes.

💡 Hint

Higher dropout rates increase regularization but may reduce training accuracy too much. Find a balance.
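
One way to set up this comparison is to make the dropout rate a constructor argument, so the same architecture can be rebuilt for each rate. The sketch below assumes a hypothetical class name, NetWithRate, and uses a random batch in place of FashionMNIST images. It only checks that each variant builds and produces 10 logits per image; the training loop from the solution would still need to be run once per rate to compare validation accuracy.

```python
import torch
import torch.nn as nn

class NetWithRate(nn.Module):
    """Same layers as the solution's Net, with a configurable dropout rate.

    The class name is hypothetical and used only for this sketch.
    """
    def __init__(self, p=0.3):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28 * 28, 256)
        self.dropout1 = nn.Dropout(p)
        self.fc2 = nn.Linear(256, 128)
        self.dropout2 = nn.Dropout(p)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.dropout1(torch.relu(self.fc1(self.flatten(x))))
        x = self.dropout2(torch.relu(self.fc2(x)))
        return self.fc3(x)

# One model per dropout rate to compare
rates = [0.2, 0.3, 0.4, 0.5]
models = {p: NetWithRate(p) for p in rates}

dummy_batch = torch.randn(4, 1, 28, 28)  # stand-in for a batch of images
for p, model in models.items():
    model.eval()  # disable dropout for a deterministic shape check
    with torch.no_grad():
        logits = model(dummy_batch)
    print(f"p={p}: output shape {tuple(logits.shape)}")
```

Keeping the rate as the only difference between runs makes it easier to attribute changes in validation accuracy to dropout rather than to other settings.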

Practice

1. What is the main purpose of using nn.Dropout in a PyTorch model?

easy

A. To increase the learning rate automatically

B. To add noise to the input data

C. To randomly disable neurons during training to prevent overfitting

D. To speed up the training process by skipping layers

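Solution

Step 1: Understand dropout's role in training. During training, nn.Dropout randomly zeroes activations so the network cannot depend on any single neuron, which reduces overfitting.

Step 2: Compare options with dropout purpose. Dropout does not change the learning rate (A), is not primarily a way to perturb the input data (B), and does not skip layers to speed up training (D). Only option C describes what dropout does.

Final Answer: C. To randomly disable neurons during training to prevent overfitting

Quick Check: Dropout disables neurons at random during training only, and its goal is better generalization.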
