A Variational Autoencoder (VAE) learns to generate new data that resembles the training data by compressing inputs into a latent distribution and reconstructing them from samples of that distribution.
Variational Autoencoder in PyTorch
Introduction
A VAE is a good fit in situations like these:
When you want to generate new images that look like your training images.
When you want to reduce the size of data but keep important features.
When you want to learn hidden patterns in data for tasks like anomaly detection.
When you want to create smooth transitions between different data points, like morphing faces.
When you want to explore creative AI applications like generating music or text.
Syntax
PyTorch
```python
class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        # Define encoder layers
        # Define layers to get mean and log variance
        # Define decoder layers
        ...

    def encode(self, x):
        # Pass x through encoder; return mean and log variance
        ...

    def reparameterize(self, mu, logvar):
        # Sample from a normal distribution using mu and logvar
        ...

    def decode(self, z):
        # Pass z through the decoder
        ...

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar
```
The encoder compresses input into two values: mean (mu) and log variance (logvar).
The reparameterization trick allows backpropagation through random sampling.
Examples
This code samples a point from the distribution defined by mu and logvar.
PyTorch
```python
def reparameterize(self, mu, logvar):
    std = torch.exp(0.5 * logvar)   # logvar = log(sigma^2), so std = sigma
    eps = torch.randn_like(std)     # eps ~ N(0, I)
    return mu + eps * std           # differentiable sample from N(mu, sigma^2)
```
This loss combines reconstruction error and a penalty to keep the distribution close to normal.
PyTorch
```python
def loss_function(recon_x, x, mu, logvar):
    # Reconstruction term: how well the input is rebuilt
    BCE = F.binary_cross_entropy(recon_x, x, reduction='sum')
    # Regularization term: KL divergence to the standard normal prior
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return BCE + KLD
```
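For reference, the KLD line above is the closed-form KL divergence between the learned diagonal Gaussian and the standard normal prior, with logvar standing for log σ²:

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu,\sigma^2)\,\middle\|\,\mathcal{N}(0,1)\right)
  = -\frac{1}{2}\sum_{j=1}^{d}\left(1 + \log\sigma_j^2 - \mu_j^2 - \sigma_j^2\right)
```

Summing term by term over the d latent dimensions gives exactly the `-0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())` expression in the code.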
Sample Model
This program trains a simple VAE on MNIST digits for one epoch and prints the loss.
PyTorch
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

class VAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 400)
        self.fc21 = nn.Linear(400, 20)  # mu
        self.fc22 = nn.Linear(400, 20)  # logvar
        self.fc3 = nn.Linear(20, 400)
        self.fc4 = nn.Linear(400, 28*28)

    def encode(self, x):
        h1 = F.relu(self.fc1(x))
        return self.fc21(h1), self.fc22(h1)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        h3 = F.relu(self.fc3(z))
        return torch.sigmoid(self.fc4(h3))

    def forward(self, x):
        mu, logvar = self.encode(x.view(-1, 28*28))
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def loss_function(recon_x, x, mu, logvar):
    BCE = F.binary_cross_entropy(recon_x, x.view(-1, 28*28), reduction='sum')
    KLD = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return BCE + KLD

# Load MNIST dataset
train_loader = DataLoader(
    datasets.MNIST('.', train=True, download=True, transform=transforms.ToTensor()),
    batch_size=128, shuffle=True)

model = VAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model.train()
for epoch in range(1):  # 1 epoch for demo
    train_loss = 0
    for batch_idx, (data, _) in enumerate(train_loader):
        optimizer.zero_grad()
        recon_batch, mu, logvar = model(data)
        loss = loss_function(recon_batch, data, mu, logvar)
        loss.backward()
        train_loss += loss.item()
        optimizer.step()
        if batch_idx == 0:
            print(f'Train Epoch: {epoch+1} \tLoss: {loss.item() / len(data):.4f}')
    print(f'====> Epoch: {epoch+1} Average loss: {train_loss / len(train_loader.dataset):.4f}')
```
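The "smooth transitions" use case from the introduction comes from walking between points in the latent space. A minimal sketch of that walk, assuming the 20-dimensional latent space of the sample model: each interpolated row would be passed through a trained model's `decode` method to render one frame of the morph (the decoding step is omitted here so the snippet stands alone).

```python
import torch

# Two latent points, e.g. the encodings of two different digits
# (random stand-ins here, since no trained encoder is assumed).
z0 = torch.randn(20)
z1 = torch.randn(20)

# Linear interpolation in latent space: 8 evenly spaced steps from z0 to z1.
steps = 8
alphas = torch.linspace(0.0, 1.0, steps)
z_path = torch.stack([(1 - a) * z0 + a * z1 for a in alphas])  # shape (8, 20)
print(z_path.shape)  # torch.Size([8, 20])
```

With a trained VAE, `model.decode(z_path)` would produce 8 images that gradually morph from the first digit into the second, because nearby latent points decode to similar outputs.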
Important Notes
Training for more epochs generally improves the quality of reconstructions and generated samples; the single epoch above is only a demo.
A VAE learns a smooth latent space, so new data can be generated by sampling latent points and decoding them.
Use a GPU if available for faster training (move the model and each batch with .to(device)).
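The two notes above can be sketched together: sample latent points from the standard normal prior and decode them, with device handling included. This reuses the decoder layout of the sample model, but the weights here are freshly initialized (untrained), so the outputs are noise; in real use you would first train the model or load trained weights.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Pick GPU when available, otherwise fall back to CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Decoder mirroring the sample model's decode path (untrained here).
class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc3 = nn.Linear(20, 400)
        self.fc4 = nn.Linear(400, 28 * 28)

    def forward(self, z):
        return torch.sigmoid(self.fc4(F.relu(self.fc3(z))))

decoder = Decoder().to(device)
decoder.eval()
with torch.no_grad():
    z = torch.randn(16, 20, device=device)    # 16 latent points from N(0, I)
    samples = decoder(z).view(-1, 1, 28, 28)  # 16 generated 28x28 images
print(samples.shape)  # torch.Size([16, 1, 28, 28])
```

After training, the same three lines inside `torch.no_grad()` (with `model.decode` in place of `decoder`) produce new digit-like images.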
Summary
Variational Autoencoders learn to compress and generate data using a probabilistic approach.
The encoder predicts a mean and log variance; a latent point is sampled with the reparameterization trick and decoded back into data.
The loss combines a reconstruction term with a KL term that keeps the latent distribution close to a standard normal.