Computer Vision · ML · ~20 mins

Autoencoder for images in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Autoencoder for images
Problem: We want to compress and then reconstruct images using an autoencoder neural network. The current model trains well on the training images but performs poorly on new images, showing signs of overfitting.
Current Metrics: Training loss 0.02, training accuracy (reconstruction quality) 98%; validation loss 0.15, validation accuracy 75%.
Issue: The model overfits the training data. It reconstructs training images very well but fails to generalize to validation images, as indicated by the much higher validation loss and lower validation accuracy.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 95%.
You can only modify the model architecture and training parameters.
Do not change the dataset or image preprocessing steps.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping

# Load dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), 28, 28, 1))
x_test = x_test.reshape((len(x_test), 28, 28, 1))

# Define autoencoder model with dropout and smaller hidden layers
input_img = layers.Input(shape=(28, 28, 1))

# Encoder
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Dropout(0.2)(x)
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Dropout(0.2)(x)

# Bottleneck
encoded = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)

# Decoder
x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Dropout(0.2)(x)
x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
x = layers.UpSampling2D((2, 2))(x)
x = layers.Dropout(0.2)(x)

decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = models.Model(input_img, decoded)

# Compile model with a lower learning rate
autoencoder.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# Early stopping callback
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# Train model
history = autoencoder.fit(
    x_train, x_train,
    epochs=50,
    batch_size=128,
    shuffle=True,
    validation_data=(x_test, x_test),
    callbacks=[early_stop]
)

# Evaluate final metrics
train_loss, train_acc = autoencoder.evaluate(x_train, x_train, verbose=0)
val_loss, val_acc = autoencoder.evaluate(x_test, x_test, verbose=0)

print(f"Training loss: {train_loss:.3f}, Training accuracy: {train_acc*100:.1f}%")
print(f"Validation loss: {val_loss:.3f}, Validation accuracy: {val_acc*100:.1f}%")
Hint 1: Add dropout layers after the convolutional layers to reduce overfitting.
Hint 2: Reduce the number of filters in the convolutional layers to simplify the model.
Hint 3: Add early stopping to halt training when validation loss stops improving.
Hint 4: Lower the learning rate to 0.0001 for smoother convergence.
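As a quick standalone illustration of Hint 1 (not part of the original solution): a Keras `Dropout` layer only zeroes activations when called with `training=True`, and it rescales the surviving units by `1/(1-rate)` so the expected activation is unchanged. At inference time it is a no-op, which is why the trained model's reconstructions are deterministic.

```python
import numpy as np
import tensorflow as tf

# A standalone Dropout layer; rate=0.5 zeroes half the activations on average.
drop = tf.keras.layers.Dropout(0.5)
x = np.ones((1, 10), dtype='float32')

# Inference mode: dropout is a no-op, inputs pass through unchanged.
y_infer = drop(x, training=False).numpy()

# Training mode: surviving units are scaled by 1/(1-rate) = 2, so every
# output is either 0.0 (dropped) or 2.0 (kept and rescaled).
y_train = drop(x, training=True).numpy()

print(y_infer)                          # all ones
print(sorted(set(y_train.ravel())))     # subset of {0.0, 2.0}
```

During `model.fit`, Keras sets `training=True` automatically, so the solution code above never needs to pass the flag explicitly.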
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, large gap showing overfitting.

After: Training accuracy 93%, Validation accuracy 87%, gap reduced and validation improved.

Dropout and early stopping reduce overfitting by preventing the model from memorizing the training data; simplifying the architecture and lowering the learning rate further improve generalization.
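The early-stopping behavior used above can be sketched in plain Python. This is a minimal model of what `EarlyStopping(monitor='val_loss', patience=5)` does, using an illustrative (made-up) loss curve; the real callback also restores the best weights when `restore_best_weights=True`.

```python
def early_stop_epoch(val_losses, patience=5):
    """Return the index of the last epoch that would run, stopping once
    val_loss has failed to improve for `patience` consecutive epochs."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # stop here; the best weights came earlier
    return len(val_losses) - 1  # patience never exhausted

# Illustrative curve: improves until epoch 3, then plateaus.
losses = [0.30, 0.22, 0.18, 0.15, 0.16, 0.17, 0.16, 0.18, 0.19, 0.20]
print(early_stop_epoch(losses, patience=5))  # stops at epoch 8
```

Note that epoch 6's loss of 0.16 does not reset the counter: it beats the previous epoch but not the best value seen so far (0.15), which is exactly how the Keras callback counts patience.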
Bonus Experiment
Try using batch normalization layers instead of dropout to reduce overfitting and compare results.
💡 Hint
Batch normalization normalizes layer inputs and can stabilize training, sometimes improving generalization.
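A minimal sketch of the bonus variant, assuming the same MNIST setup as the solution above: each `Dropout` layer is swapped for `BatchNormalization` after the corresponding convolution, with the architecture, loss, and optimizer otherwise unchanged. Whether this actually beats dropout on the 85% validation target is an empirical question you would answer by rerunning the training loop.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_bn_autoencoder():
    input_img = layers.Input(shape=(28, 28, 1))

    # Encoder: BatchNormalization replaces the Dropout layers.
    x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2), padding='same')(x)
    x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D((2, 2), padding='same')(x)

    # Bottleneck (7x7x8), same as the dropout version
    encoded = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)

    # Decoder mirrors the encoder
    x = layers.Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = layers.BatchNormalization()(x)
    x = layers.UpSampling2D((2, 2))(x)
    x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(x)
    x = layers.BatchNormalization()(x)
    x = layers.UpSampling2D((2, 2))(x)
    decoded = layers.Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

    model = models.Model(input_img, decoded)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
        loss='binary_crossentropy',
        metrics=['accuracy']
    )
    return model

bn_autoencoder = build_bn_autoencoder()
bn_autoencoder.summary()
```

Unlike dropout, batch normalization adds trainable scale and shift parameters and keeps running statistics for inference, so training dynamics (and the learning rate that works best) may differ from the dropout version.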