
How to Build an Autoencoder in PyTorch: Simple Guide

To build an autoencoder in PyTorch, define a neural network with an encoder and decoder using torch.nn.Module. Train it by minimizing the difference between input and output using a loss like nn.MSELoss and an optimizer such as torch.optim.Adam.
📐

Syntax

An autoencoder in PyTorch is a class that inherits from torch.nn.Module. It has two main parts: an encoder that compresses input data, and a decoder that reconstructs it. The forward method defines how data flows through these parts.

  • __init__: sets up layers for encoder and decoder.
  • forward: runs input through encoder then decoder.
  • Use nn.Linear for fully connected layers.
python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder layers
        self.encoder = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 12),
            nn.ReLU(),
            nn.Linear(12, 3)  # compressed representation
        )
        # Decoder layers
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.ReLU(),
            nn.Linear(12, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 784),
            nn.Sigmoid()  # output between 0 and 1
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
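To confirm the network wires up correctly, it helps to run a quick shape check before training. The sketch below rebuilds the same encoder and decoder stacks as standalone nn.Sequential modules (the batch size of 32 is an arbitrary choice for illustration):

```python
import torch
import torch.nn as nn

# Same layer stacks as the Autoencoder class above
encoder = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 12), nn.ReLU(),
    nn.Linear(12, 3),
)
decoder = nn.Sequential(
    nn.Linear(3, 12), nn.ReLU(),
    nn.Linear(12, 64), nn.ReLU(),
    nn.Linear(64, 128), nn.ReLU(),
    nn.Linear(128, 784), nn.Sigmoid(),
)

x = torch.rand(32, 784)      # a batch of 32 flattened 28x28 images
encoded = encoder(x)         # compressed to 3 dimensions
decoded = decoder(encoded)   # reconstructed back to 784 dimensions

print(encoded.shape)  # torch.Size([32, 3])
print(decoded.shape)  # torch.Size([32, 784])
```

Because the decoder ends in Sigmoid, every reconstructed value lands in [0, 1], which is why the input data should be normalized to the same range.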
💻

Example

This example shows a fully runnable autoencoder trained on random data shaped like flattened 28x28 images (like MNIST). It trains for 5 epochs and prints the loss to show learning progress.

python
import torch
import torch.nn as nn
import torch.optim as optim

# Define the Autoencoder class (same as Syntax section)
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 12),
            nn.ReLU(),
            nn.Linear(12, 3)
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.ReLU(),
            nn.Linear(12, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 784),
            nn.Sigmoid()
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

# Create model, loss, optimizer
model = Autoencoder()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Dummy data: 100 samples of 784 features
data = torch.rand(100, 784)

# Training loop
for epoch in range(5):
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, data)
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
Output
Epoch 1, Loss: 0.0867
Epoch 2, Loss: 0.0593
Epoch 3, Loss: 0.0462
Epoch 4, Loss: 0.0383
Epoch 5, Loss: 0.0329
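After training, inference is typically done with the model in eval mode and inside torch.no_grad(), so no gradients are tracked. This sketch uses a nn.Sequential stand-in with the same architecture as above (the names `codes` and `recon` are illustrative):

```python
import torch
import torch.nn as nn

# Same 784 -> 3 -> 784 architecture, built as encoder + decoder sub-modules
model = nn.Sequential(
    nn.Sequential(  # encoder
        nn.Linear(784, 128), nn.ReLU(),
        nn.Linear(128, 64), nn.ReLU(),
        nn.Linear(64, 12), nn.ReLU(),
        nn.Linear(12, 3),
    ),
    nn.Sequential(  # decoder
        nn.Linear(3, 12), nn.ReLU(),
        nn.Linear(12, 64), nn.ReLU(),
        nn.Linear(64, 128), nn.ReLU(),
        nn.Linear(128, 784), nn.Sigmoid(),
    ),
)

model.eval()                   # switch off training-only behavior
sample = torch.rand(10, 784)
with torch.no_grad():          # no gradient tracking during inference
    codes = model[0](sample)   # 3-dimensional latent representations
    recon = model(sample)      # full reconstructions

print(codes.shape, recon.shape)
```

The encoder output (`codes`) is the compressed representation, which is often the part you actually want from a trained autoencoder.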
⚠️

Common Pitfalls

Common mistakes when building autoencoders in PyTorch include:

  • Not flattening input data before feeding to linear layers.
  • Using activation functions incorrectly, e.g., missing Sigmoid on output for normalized data.
  • Forgetting to call optimizer.zero_grad() before loss.backward().
  • Using wrong loss function; nn.MSELoss is typical for reconstruction.

Always check input shapes and output ranges to match your data.

python
import torch
import torch.nn as nn

# Wrong: image-shaped input passed straight to linear layers
class BadAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(28*28, 12)  # expects 784 features per sample
        self.decoder = nn.Linear(12, 28*28)

    def forward(self, x):
        # x arrives as (batch, 28, 28); nn.Linear sees a last dimension
        # of 28 instead of 784 and raises a shape mismatch error
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

# Right: flatten input before linear layers
class GoodAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(784, 12)
        self.decoder = nn.Linear(12, 784)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
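The zero_grad pitfall from the list above is easy to demonstrate: PyTorch accumulates gradients across backward passes unless you clear them. This sketch uses a single hypothetical linear layer rather than the full autoencoder to keep the effect visible:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 1)
x = torch.ones(2, 4)
target = torch.zeros(2, 1)
loss_fn = nn.MSELoss()

# Two backward passes without clearing: gradients add up
loss_fn(layer(x), target).backward()
first = layer.weight.grad.clone()
loss_fn(layer(x), target).backward()   # accumulates on top of `first`
accumulated = layer.weight.grad.clone()

# Clearing first gives a clean gradient for the step
layer.zero_grad()
loss_fn(layer(x), target).backward()
fresh = layer.weight.grad.clone()

print(torch.allclose(accumulated, 2 * first))  # True: doubled gradient
print(torch.allclose(fresh, first))            # True: clean gradient
```

In a training loop, calling optimizer.zero_grad() at the start of each iteration serves the same purpose.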
📊

Quick Reference

Tips for building autoencoders in PyTorch:

  • Use nn.Sequential to stack layers cleanly.
  • Flatten inputs before feeding to linear layers with x.view(x.size(0), -1).
  • Use nn.MSELoss for reconstruction error.
  • Use Sigmoid activation on output if input data is normalized between 0 and 1.
  • Train with an optimizer like Adam and call optimizer.zero_grad() each step.
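The checklist above can be folded into a few quick sanity assertions before training. This sketch uses a deliberately small 784 → 3 → 784 model for illustration; the checks apply equally to the deeper architecture:

```python
import torch
import torch.nn as nn

# Minimal stand-in autoencoder for the sanity checks
model = nn.Sequential(
    nn.Linear(784, 3),
    nn.ReLU(),
    nn.Linear(3, 784),
    nn.Sigmoid(),
)

batch = torch.rand(8, 1, 28, 28)          # image-shaped input, like MNIST
flat = batch.view(batch.size(0), -1)      # flatten before linear layers
out = model(flat)

assert flat.shape == (8, 784), "input not flattened correctly"
assert out.shape == flat.shape, "output must match input for MSE loss"
assert float(out.min()) >= 0.0 and float(out.max()) <= 1.0, \
    "Sigmoid keeps outputs in [0, 1]"
```

If any of these fail, fix the data pipeline before touching the training loop.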

Key Takeaways

  • Define encoder and decoder as parts of a PyTorch nn.Module class.
  • Flatten input data before feeding it to linear layers.
  • Use MSE loss to measure reconstruction error.
  • Call optimizer.zero_grad() before the backward pass to avoid gradient accumulation.
  • Use Sigmoid activation on the output for normalized input data.