PyTorch · ~20 mins

__init__ for layers in PyTorch - ML Experiment: Train & Evaluate

Experiment - __init__ for layers
Problem: You are building a simple neural network in PyTorch. The model currently has a single linear layer defined in the __init__ method. However, the model is underfitting and not learning well.
Current Metrics: Training accuracy: 60%, Validation accuracy: 58%, Training loss: 0.9, Validation loss: 0.95
Issue: The model is too simple, with only one linear layer. It lacks the capacity to learn the data patterns well.
Your Task
Improve the model by modifying the __init__ method to add more layers and non-linear activation functions to increase learning capacity and improve accuracy.
Change only the __init__ method, and update the forward method accordingly.
Do not change the dataset or training loop.
Keep the model simple enough to train quickly.
Solution
PyTorch
import torch
import torch.nn as nn
import torch.optim as optim

class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleNN, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.layer1(x)
        x = self.relu(x)
        x = self.layer2(x)
        return x

# Example usage:
# model = SimpleNN(input_size=784, hidden_size=128, output_size=10)
# optimizer = optim.Adam(model.parameters(), lr=0.001)
# criterion = nn.CrossEntropyLoss()

# Training loop remains unchanged
Added a second linear layer (layer2) in __init__ to increase model capacity.
Added ReLU activation function between layers for non-linearity.
Called super().__init__() to properly initialize the nn.Module base class.
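To sanity-check the improved model before training, you can run a fake batch through it and confirm the output shape. The sizes 784, 128, and 10 below are illustrative assumptions (a flattened 28x28 image and 10 classes, as in the example usage above); substitute your own dimensions.

```python
import torch
import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.relu(self.layer1(x))
        return self.layer2(x)

model = SimpleNN(input_size=784, hidden_size=128, output_size=10)
batch = torch.randn(32, 784)   # fake batch: 32 flattened 28x28 inputs
logits = model(batch)
print(logits.shape)            # torch.Size([32, 10])
```

If the last dimension does not match your number of classes, the model and the loss function (e.g. nn.CrossEntropyLoss) will disagree, so this one-line check catches wiring mistakes early.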
Results Interpretation

Before: Training accuracy 60%, Validation accuracy 58%, Loss around 0.9

After: Training accuracy 85%, Validation accuracy 82%, Loss around 0.35

Adding more layers and non-linear activation functions in the __init__ method allows the model to learn more complex patterns, reducing underfitting and improving accuracy.
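One way to see the capacity increase concretely is to count trainable parameters. This sketch (again assuming 784 inputs, 128 hidden units, and 10 outputs) compares the original single-layer model with the two-layer version:

```python
import torch.nn as nn

def count_params(m):
    # Total number of trainable scalar parameters in a module
    return sum(p.numel() for p in m.parameters())

single = nn.Linear(784, 10)                        # original model
stacked = nn.Sequential(                           # improved model
    nn.Linear(784, 128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
print(count_params(single))   # 7850   (784*10 + 10 biases)
print(count_params(stacked))  # 101770 (784*128 + 128 + 128*10 + 10)
```

The hidden layer raises the parameter count by roughly 13x, which is where the extra capacity to fit the data comes from; the ReLU contributes no parameters but makes the stack non-linear.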
Bonus Experiment
Try adding dropout layers in the __init__ method to reduce overfitting and improve validation accuracy.
💡 Hint
Insert nn.Dropout layers between linear layers and adjust the forward method accordingly.
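A minimal sketch of the bonus experiment, following that hint (the dropout probability p=0.5 is just a common starting point, not a prescribed value):

```python
import torch
import torch.nn as nn

class SimpleNNDropout(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, p=0.5):
        super().__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p)   # randomly zeroes activations during training
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = self.dropout(self.relu(self.layer1(x)))
        return self.layer2(x)

model = SimpleNNDropout(784, 128, 10)
model.eval()                  # dropout is a no-op in eval mode
out = model(torch.randn(4, 784))
print(out.shape)              # torch.Size([4, 10])
```

Remember to call model.train() before the training loop and model.eval() before validation; otherwise dropout stays active (or inactive) in the wrong phase and the metrics will be misleading.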