How to Build an Autoencoder in PyTorch: Simple Guide
To build an autoencoder in PyTorch, define a neural network with an encoder and a decoder using torch.nn.Module. Train it by minimizing the difference between input and output with a reconstruction loss such as nn.MSELoss and an optimizer such as torch.optim.Adam.

Syntax
An autoencoder in PyTorch is a class that inherits from torch.nn.Module. It has two main parts: the encoder compresses the input data, and the decoder reconstructs it. The forward method defines how data flows through these parts.

- __init__: sets up the layers for the encoder and decoder.
- forward: runs the input through the encoder, then the decoder.
- Use nn.Linear for fully connected layers.
```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder layers
        self.encoder = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 12),
            nn.ReLU(),
            nn.Linear(12, 3)  # compressed representation
        )
        # Decoder layers
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.ReLU(),
            nn.Linear(12, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 784),
            nn.Sigmoid()  # output between 0 and 1
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
```
Example
This example shows a full, runnable autoencoder trained on random data shaped like flattened 28x28 images (as in MNIST). It trains for 5 epochs and prints the loss to show learning progress.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Define the Autoencoder class (same as Syntax section)
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 12),
            nn.ReLU(),
            nn.Linear(12, 3)
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 12),
            nn.ReLU(),
            nn.Linear(12, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 784),
            nn.Sigmoid()
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

# Create model, loss, optimizer
model = Autoencoder()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Dummy data: 100 samples of 784 features
data = torch.rand(100, 784)

# Training loop
for epoch in range(5):
    optimizer.zero_grad()
    output = model(data)
    loss = criterion(output, data)
    loss.backward()
    optimizer.step()
    print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")
```
Output
Epoch 1, Loss: 0.0867
Epoch 2, Loss: 0.0593
Epoch 3, Loss: 0.0462
Epoch 4, Loss: 0.0383
Epoch 5, Loss: 0.0329
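After training, the encoder can be used on its own to get the compressed 3-dimensional representation of each sample. The sketch below assumes the Autoencoder class from the Example section and wraps inference in torch.no_grad() since no gradients are needed:

```python
import torch
import torch.nn as nn

# Same Autoencoder as in the Example section, repeated to be self-contained
class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 12), nn.ReLU(),
            nn.Linear(12, 3)
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 12), nn.ReLU(),
            nn.Linear(12, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 784), nn.Sigmoid()
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
batch = torch.rand(10, 784)

with torch.no_grad():                # inference only, no gradients
    latent = model.encoder(batch)    # compressed codes, shape (10, 3)
    recon = model(batch)             # reconstructions, shape (10, 784)

print(latent.shape)  # torch.Size([10, 3])
print(recon.shape)   # torch.Size([10, 784])
```

Because the decoder ends in Sigmoid, the reconstruction values always fall between 0 and 1, matching normalized image data.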
Common Pitfalls
Common mistakes when building autoencoders in PyTorch include:
- Not flattening input data before feeding it to linear layers.
- Using activation functions incorrectly, e.g., missing Sigmoid on the output for normalized data.
- Forgetting to call optimizer.zero_grad() before loss.backward().
- Using the wrong loss function; nn.MSELoss is typical for reconstruction.
Always check input shapes and output ranges to match your data.
```python
import torch
import torch.nn as nn

# Wrong: input is never flattened before the linear layers
class BadAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(28 * 28, 12)  # expects 784 features
        self.decoder = nn.Linear(12, 28 * 28)

    def forward(self, x):
        # x has shape (batch, 28, 28); nn.Linear(784, ...) sees a trailing
        # dimension of 28 and raises a shape-mismatch RuntimeError
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded

# Right: flatten input before the linear layers
class GoodAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(784, 12)
        self.decoder = nn.Linear(12, 784)

    def forward(self, x):
        x = x.view(x.size(0), -1)  # flatten (batch, 28, 28) -> (batch, 784)
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
```
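The zero_grad pitfall from the list above is easy to demonstrate in isolation. This minimal sketch (using a small stand-in layer, not the autoencoder itself) shows that gradients accumulate in .grad across backward() calls unless they are cleared:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 1)
loss_fn = nn.MSELoss()
x = torch.rand(8, 4)
target = torch.zeros(8, 1)

# First backward pass: .grad holds the gradient of one pass
loss_fn(layer(x), target).backward()
g1 = layer.weight.grad.clone()

# Second backward WITHOUT zeroing: .grad now holds the SUM of both passes
loss_fn(layer(x), target).backward()
g2 = layer.weight.grad.clone()

print(torch.allclose(g2, 2 * g1))  # True: gradient doubled, not replaced

# optimizer.zero_grad() does this clearing for every parameter it manages
layer.weight.grad.zero_()
```

This is why the training loop in the Example section calls optimizer.zero_grad() at the start of every iteration.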
Quick Reference
Tips for building autoencoders in PyTorch:
- Use nn.Sequential to stack layers cleanly.
- Flatten inputs before feeding them to linear layers with x.view(x.size(0), -1).
- Use nn.MSELoss for reconstruction error.
- Use a Sigmoid activation on the output if input data is normalized between 0 and 1.
- Train with an optimizer like Adam and call optimizer.zero_grad() each step.
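Putting the reconstruction-error tip to use at evaluation time, the sketch below computes a per-sample MSE with gradients disabled. The small stand-in model here is only for a self-contained run; in practice you would use your trained autoencoder:

```python
import torch
import torch.nn as nn

# Stand-in model (assumption: substitute your trained Autoencoder here)
model = nn.Sequential(
    nn.Linear(784, 12), nn.ReLU(),
    nn.Linear(12, 784), nn.Sigmoid()
)
model.eval()  # switch to eval mode (matters for dropout/batchnorm layers)

data = torch.rand(5, 784)
with torch.no_grad():
    recon = model(data)
    # mean squared reconstruction error per sample, shape (5,)
    per_sample_mse = ((recon - data) ** 2).mean(dim=1)

print(per_sample_mse.shape)  # torch.Size([5])
```

Per-sample errors like this are often more useful than the batch-average loss, e.g. for spotting inputs the autoencoder reconstructs poorly.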
Key Takeaways
- Define the encoder and decoder as parts of a PyTorch nn.Module class.
- Flatten input data before feeding it to linear layers.
- Use MSE loss to measure reconstruction error.
- Call optimizer.zero_grad() before the backward pass to avoid gradient accumulation.
- Use a Sigmoid activation on the output for normalized input data.