
Dropout (nn.Dropout) in PyTorch - ML Experiment: Train & Evaluate

Experiment - Dropout (nn.Dropout)
Problem: You are training a neural network to classify images from the FashionMNIST dataset. The current model achieves 98% accuracy on training data but only 75% on validation data.
Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85
Issue: The model is overfitting: it performs very well on training data but poorly on unseen validation data.
Your Task
Reduce overfitting by improving validation accuracy to at least 85% while keeping training accuracy below 92%.
You can only add dropout layers to the existing model.
Do not change the dataset or the optimizer.
Keep the number of epochs and batch size the same.
Solution
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Define the neural network with dropout
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(28*28, 256)
        self.dropout1 = nn.Dropout(0.3)
        self.fc2 = nn.Linear(256, 128)
        self.dropout2 = nn.Dropout(0.3)
        self.fc3 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.flatten(x)
        x = torch.relu(self.fc1(x))
        x = self.dropout1(x)
        x = torch.relu(self.fc2(x))
        x = self.dropout2(x)
        x = self.fc3(x)
        return x

# Prepare data
transform = transforms.ToTensor()
train_dataset = datasets.FashionMNIST(root='.', train=True, download=True, transform=transform)
val_dataset = datasets.FashionMNIST(root='.', train=False, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=64)

# Initialize model, loss, optimizer
model = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(10):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    for images, labels in train_loader:
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * images.size(0)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    train_loss = running_loss / total
    train_acc = 100 * correct / total

    model.eval()
    val_loss = 0.0
    val_correct = 0
    val_total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            outputs = model(images)
            loss = criterion(outputs, labels)
            val_loss += loss.item() * images.size(0)
            _, predicted = torch.max(outputs, 1)
            val_total += labels.size(0)
            val_correct += (predicted == labels).sum().item()
    val_loss /= val_total
    val_acc = 100 * val_correct / val_total

    print(f'Epoch {epoch+1}: Train Loss={train_loss:.4f}, Train Acc={train_acc:.2f}%, Val Loss={val_loss:.4f}, Val Acc={val_acc:.2f}%')
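Added nn.Dropout layers with a 0.3 dropout rate after the activation of each hidden fully connected layer; the output layer has none.
During training, each activation is zeroed independently with probability 0.3 and the surviving activations are scaled by 1/(1 - 0.3), which reduces overfitting. In evaluation mode the layers pass their input through unchanged.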
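To make the mechanism concrete, here is a minimal sketch of what nn.Dropout(0.3) does to a tensor of ones. The tensor size and the seed are arbitrary choices for illustration and are not part of the solution above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for repeatability

p = 0.3
drop = nn.Dropout(p)
x = torch.ones(10_000)  # illustrative input, not FashionMNIST data

drop.train()                      # dropout is active in training mode
y_train = drop(x)
kept_value = 1.0 / (1.0 - p)      # survivors are rescaled to about 1.4286

zero_fraction = (y_train == 0).float().mean().item()
print(f"fraction zeroed: {zero_fraction:.3f}")             # close to 0.3
print(f"mean in train mode: {y_train.mean().item():.3f}")  # close to 1.0

drop.eval()                       # dropout is disabled in evaluation mode
y_eval = drop(x)
print(torch.equal(y_eval, x))     # True
```

The rescaling is why the mean stays near 1.0 in training mode: the expected value of each activation is the same with or without dropout, so no correction is needed at evaluation time.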
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, Training loss 0.05, Validation loss 0.85

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.25, Validation loss 0.40

Adding dropout reduces overfitting by preventing the model from relying too much on specific neurons. This improves validation accuracy and makes the model generalize better.
Bonus Experiment
Try different dropout rates (e.g., 0.2, 0.4, 0.5) and observe how validation accuracy changes.
💡 Hint
Higher dropout rates increase regularization but may reduce training accuracy too much. Find a balance.
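One way to set up this experiment is to make the dropout rate a parameter of the model. The sketch below assumes a hypothetical helper, make_net, that mirrors the layer sizes of Net from the solution; it only builds the models and checks their shapes. Training and evaluating each one would reuse the loop from the solution unchanged.

```python
import torch
import torch.nn as nn

def make_net(p: float) -> nn.Sequential:
    """Hypothetical helper: same layer sizes as Net, with dropout rate p."""
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, 256), nn.ReLU(), nn.Dropout(p),
        nn.Linear(256, 128), nn.ReLU(), nn.Dropout(p),
        nn.Linear(128, 10),
    )

for p in (0.2, 0.4, 0.5):
    model = make_net(p)
    # Collect the rate stored on every dropout layer in the model.
    rates = [m.p for m in model.modules() if isinstance(m, nn.Dropout)]
    print(p, rates)
    # Here you would train `model` with the same optimizer, epochs, and
    # batch size as in the solution, then record the validation accuracy.

# Shape check on a dummy batch of four 28x28 single-channel images.
logits = make_net(0.5)(torch.zeros(4, 1, 28, 28))
print(logits.shape)  # torch.Size([4, 10])
```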

Practice

1. What is the main purpose of using nn.Dropout in a PyTorch model?
easy
A. To increase the learning rate automatically
B. To add noise to the input data
C. To randomly disable neurons during training to prevent overfitting
D. To speed up the training process by skipping layers

Solution

  1. Step 1: Understand dropout's role in training

    Dropout randomly disables neurons during training to reduce overfitting by preventing co-adaptation of neurons.
  2. Step 2: Compare options with dropout purpose

    Only option C, "To randomly disable neurons during training to prevent overfitting", describes dropout's function; the other options describe unrelated concepts.
  3. Final Answer:

    To randomly disable neurons during training to prevent overfitting -> Option C
  4. Quick Check:

    Dropout = random neuron disabling [OK]
Hint: Dropout disables neurons randomly during training only [OK]
Common Mistakes:
  • Thinking dropout speeds up training
  • Confusing dropout with data augmentation
  • Believing dropout changes learning rate
2. Which of the following is the correct way to create a dropout layer with 30% dropout rate in PyTorch?
easy
A. nn.Dropout(30)
B. nn.Dropout(p=30)
C. nn.Dropout(rate=0.3)
D. nn.Dropout(0.3)

Solution

  1. Step 1: Check PyTorch dropout syntax

    The dropout layer takes a float between 0 and 1 as the probability of dropout, passed as the first argument or named 'p'.
  2. Step 2: Validate each option

    Option D, nn.Dropout(0.3), is correct. Option B passes p=30, which lies outside the valid range [0, 1] and raises a ValueError (it should be 0.3). Option C uses 'rate', which is not a valid argument name. Option A passes 30, which is out of range for the same reason as option B.
  3. Final Answer:

    nn.Dropout(0.3) -> Option D
  4. Quick Check:

    Dropout probability is float 0-1 [OK]
Hint: Dropout probability is a float between 0 and 1 [OK]
Common Mistakes:
  • Passing the percentage as a whole number (30) instead of a probability (0.3)
  • Using a wrong argument name such as 'rate'
  • Forgetting that p is the probability of dropping an element, not of keeping it
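The checks in this question can be reproduced directly. The sketch below constructs the layer both valid ways and records which exception each invalid call raises; the errors dictionary is just an illustrative way to collect the results.

```python
import torch.nn as nn

layer = nn.Dropout(0.3)     # probability passed positionally
same = nn.Dropout(p=0.3)    # probability passed by keyword
print(layer.p, same.p)      # 0.3 0.3

errors = {}
try:
    nn.Dropout(30)          # 30 is outside the valid range [0, 1]
except ValueError as exc:
    errors["nn.Dropout(30)"] = type(exc).__name__

try:
    nn.Dropout(rate=0.3)    # 'rate' is not a parameter of nn.Dropout
except TypeError as exc:
    errors["nn.Dropout(rate=0.3)"] = type(exc).__name__

print(errors)
# {'nn.Dropout(30)': 'ValueError', 'nn.Dropout(rate=0.3)': 'TypeError'}
```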
3. Consider the following PyTorch code snippet:
import torch
import torch.nn as nn

layer = nn.Dropout(0.5)
input_tensor = torch.ones(4)
layer.train()
output_train = layer(input_tensor)
layer.eval()
output_eval = layer(input_tensor)
print(output_train)
print(output_eval)

What will be the output of print(output_eval)?
medium
A. A tensor of all ones: tensor([1., 1., 1., 1.])
B. A tensor with some zeros randomly placed
C. A tensor of all zeros
D. An error because dropout is disabled in eval mode

Solution

  1. Step 1: Understand dropout behavior in eval mode

    In evaluation mode, nn.Dropout drops nothing and passes its input through unchanged.
  2. Step 2: Analyze output_eval value

    Since layer.eval() is called before output_eval, the output will be the same as input: all ones tensor.
  3. Final Answer:

    A tensor of all ones: tensor([1., 1., 1., 1.]) -> Option A
  4. Quick Check:

    Dropout off in eval mode = input unchanged [OK]
Hint: Dropout is disabled in eval mode, so the output equals the input [OK]
Common Mistakes:
  • Expecting dropout to apply in eval mode
  • Confusing train() and eval() modes
  • Thinking dropout outputs zeros always
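4. You wrote this PyTorch code but the dropout layer seems to have no effect during training:
import torch
import torch.nn as nn
layer = nn.Dropout(0.4)
layer.eval()                    # left over from an earlier validation pass
input_tensor = torch.ones(4)    # example input
output = layer(input_tensor)    # identical to input_tensor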

What is the most likely reason dropout is not working as expected?
medium
A. Dropout only works on GPU tensors
B. You forgot to call layer.train() to enable dropout
C. The dropout probability 0.4 is too low to see effect
D. You need to call layer.eval() to activate dropout

Solution

  1. Step 1: Recall dropout behavior in train vs eval modes

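    Dropout zeroes activations only in training mode. In eval mode it is a no-op.
  2. Step 2: Identify missing train mode call

    A module starts in training mode when it is created, so dropout only stops working after layer.eval() (or model.eval() on a parent module) has been called, for example for a validation pass, and train() was never called again. Calling layer.train() re-enables dropout.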
  3. Final Answer:

    You forgot to call layer.train() to enable dropout -> Option B
  4. Quick Check:

    Dropout active only in train mode [OK]
Hint: After eval(), call train() again to re-activate dropout for training [OK]
Common Mistakes:
  • Leaving the module in eval() mode after validation and not switching back with train()
  • Thinking dropout depends on tensor device
  • Calling eval() instead of train()
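The mode switch behind this question can be observed through the training attribute that every nn.Module carries. The sketch below uses a hypothetical toy model with arbitrary layer sizes to show the default mode, how eval() reaches the dropout layer through its parent, and how train() restores it.

```python
import torch
import torch.nn as nn

# Hypothetical toy model; the layer sizes are arbitrary.
model = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Dropout(0.4))
drop = model[2]

flag_new = drop.training        # modules start in training mode
model.eval()                    # e.g. before a validation pass
flag_eval = drop.training       # eval() propagates to every submodule

x = torch.ones(2, 8)
with torch.no_grad():
    # Two forward passes agree exactly because dropout is off in eval mode.
    repeatable = torch.equal(model(x), model(x))

model.train()                   # switch back before the next training epoch
flag_train = drop.training

print(flag_new, flag_eval, repeatable, flag_train)  # True False True True
```

This is why the training loop in the solution calls model.train() at the start of every epoch and model.eval() before validation: the second call would otherwise leave dropout disabled for all later epochs.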
5. You want to add dropout to a neural network to reduce overfitting. Which of the following is the best practice when using nn.Dropout in your model?
hard
A. Apply dropout only during training and disable it during evaluation
B. Apply dropout during both training and evaluation for consistency
C. Apply dropout only during evaluation to test robustness
D. Apply dropout only to the input layer and never to hidden layers

Solution

  1. Step 1: Understand dropout's intended use

    Dropout is designed to randomly disable neurons during training to prevent overfitting.
  2. Step 2: Recall dropout behavior during evaluation

    During evaluation, dropout is disabled to use the full network for predictions.
  3. Step 3: Evaluate options

    Option A correctly states that dropout is applied only during training and disabled during evaluation. Options B and C are incorrect because dropout should not be active during evaluation. Option D is incorrect because dropout is commonly applied to hidden layers as well.
  4. Final Answer:

    Apply dropout only during training and disable it during evaluation -> Option A
  5. Quick Check:

    Dropout active in train, off in eval [OK]
Hint: Dropout off during eval, on during training [OK]
Common Mistakes:
  • Applying dropout during evaluation
  • Limiting dropout only to input layer
  • Confusing dropout with data augmentation