
Autoencoder for images in Computer Vision - Practice Problems & Coding Challenges

Challenge - 5 Problems
Model Choice
intermediate
Choosing the right architecture for an image autoencoder

You want to build an autoencoder to compress and reconstruct 28x28 grayscale images. Which model architecture is best suited for this task?

A. A convolutional neural network with convolutional layers followed by a bottleneck and deconvolutional layers
B. A fully connected neural network with input size 784 and a bottleneck layer of size 32
C. A recurrent neural network with LSTM layers to process the image pixels sequentially
D. A decision tree model trained to predict pixel values from other pixels
💡 Hint

Think about which model type can capture spatial patterns in images effectively.

Predict Output
intermediate
Output shape after encoding in a convolutional autoencoder

Given the following encoder code for 28x28 grayscale images, what is the shape of the encoded output?

import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, 3, stride=2, padding=1)  # output: 16 x 14 x 14
        self.conv2 = nn.Conv2d(16, 8, 3, stride=2, padding=1)  # output: 8 x 7 x 7
    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        return x

encoder = Encoder()
input_tensor = torch.randn(1, 1, 28, 28)
encoded = encoder(input_tensor)
print(encoded.shape)
A. torch.Size([1, 16, 14, 14])
B. torch.Size([1, 8, 7, 7])
C. torch.Size([1, 8, 14, 14])
D. torch.Size([1, 1, 28, 28])
💡 Hint

Calculate output size after each convolution using stride and padding.
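
The calculation in the hint can be sketched with the standard output-size formula for a convolution without dilation, floor((n + 2 * padding - kernel) / stride) + 1. The helper name `conv_out_size` and the 32x32 example input are assumptions made for illustration; they are not part of the challenge code.

```python
def conv_out_size(n, kernel, stride=1, padding=0):
    """Spatial size after one convolution (assumes dilation of 1)."""
    return (n + 2 * padding - kernel) // stride + 1


# Hypothetical example: a 32x32 input passed through two
# 3x3 convolutions with stride 2 and padding 1.
size = 32
for layer in (1, 2):
    size = conv_out_size(size, kernel=3, stride=2, padding=1)
    print(f"after conv {layer}: {size} x {size}")
```

The channel dimension is set by each layer's `out_channels` argument and does not depend on this formula; only the height and width do.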

Hyperparameter
advanced
Choosing the right latent dimension size in an autoencoder

You are training an autoencoder on 64x64 RGB images. What is the effect of choosing a very small latent dimension size?

A. The model will compress images too much, causing blurry or poor reconstructions
B. The model will overfit and memorize training images exactly
C. The model will run faster but use more memory
D. The model will reconstruct images perfectly with no loss
💡 Hint

Think about what happens if the compressed representation is too small to hold enough information.
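
To put rough numbers on the hint, the sketch below compares the number of values in a 64x64 RGB image with a few latent sizes. The helper name `compression_ratio` and the latent sizes 8, 64 and 512 are assumptions chosen for illustration only.

```python
def compression_ratio(height, width, channels, latent_dim):
    """How many input values each latent value has to summarize."""
    return (height * width * channels) / latent_dim


# A 64x64 RGB image holds 64 * 64 * 3 = 12288 values.
for latent_dim in (8, 64, 512):
    ratio = compression_ratio(64, 64, 3, latent_dim)
    print(f"latent_dim={latent_dim}: {ratio:.0f} input values per latent value")
```

The ratio is only a counting argument; how much detail survives in practice also depends on the data and the architecture.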

Metrics
advanced
Evaluating autoencoder reconstruction quality

Which metric is best to measure how well an autoencoder reconstructs images?

A. F1 score of reconstructed image pixels
B. Accuracy of predicted class labels
C. Cross-entropy loss between input and output pixels
D. Mean Squared Error (MSE) between original and reconstructed images
💡 Hint

Consider a metric that measures pixel-wise difference between images.
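
As a small illustration of a pixel-wise difference, the sketch below computes the mean squared error between two tiny images stored as nested lists. The function name `mse` and the 2x2 example values are assumptions for illustration; on tensors, PyTorch's `torch.nn.functional.mse_loss` computes the same quantity with its default mean reduction.

```python
def mse(original, reconstructed):
    """Mean squared error between two equal-sized images (nested lists)."""
    diffs = [
        (o - r) ** 2
        for row_o, row_r in zip(original, reconstructed)
        for o, r in zip(row_o, row_r)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical 2x2 grayscale images with pixel values in [0, 1].
original = [[0.0, 1.0], [1.0, 0.0]]
reconstructed = [[0.0, 0.5], [1.0, 0.5]]
print(mse(original, reconstructed))  # 0.125
```

A value of 0 means the two images are identical pixel for pixel; larger values mean a larger average squared difference.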

🔧 Debug
expert
Identifying the cause of poor autoencoder training

You trained an autoencoder on 32x32 grayscale images, but the loss does not decrease and the reconstructions look like random noise. What is the most likely cause?

A. The batch size is too large, causing slow convergence
B. The input images are not normalized before training
C. The learning rate is too high, causing unstable training
D. The autoencoder has too many layers, causing overfitting
💡 Hint

Think about what causes training loss to not improve and outputs to be random.