Computer Vision · ML · ~20 mins

Hand and face landmark detection in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Hand and face landmark detection
Problem: Detect key points (landmarks) on hands and faces from images to enable gesture and expression recognition.
Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.25
Issue: The model overfits: training accuracy is very high but validation accuracy is much lower, indicating poor generalization.
Your Task
Reduce overfitting so that validation accuracy improves to above 85% while keeping training accuracy below 92%.
You can only modify the model architecture and training hyperparameters.
Do not change the dataset or input image size.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Data augmentation setup
# NOTE: geometric augmentations (rotation, shifts, zoom, flips) move the
# landmarks as well, so in a real pipeline the coordinate targets must be
# transformed alongside the images (e.g. with a keypoint-aware augmentation
# library). horizontal_flip is left out here because it would also require
# swapping left/right landmark indices in the targets.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    validation_split=0.2
)

# flow_from_directory yields class labels, which do not match the 42-value
# landmark targets; for regression, pair each image with its coordinate
# array directly. Here train_images (N, 128, 128, 3) and train_landmarks
# (N, 42) are assumed to be preloaded NumPy arrays.
train_generator = train_datagen.flow(
    train_images, train_landmarks,
    batch_size=32,
    subset='training'
)
validation_generator = train_datagen.flow(
    train_images, train_landmarks,
    batch_size=32,
    subset='validation'
)

# Model architecture with dropout to reduce overfitting
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(128,128,3)),
    layers.MaxPooling2D(2,2),
    layers.Dropout(0.25),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D(2,2),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(42, activation='linear')  # 21 landmarks * 2 coordinates
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
              loss='mean_squared_error',
              metrics=['mae'])  # mean absolute error is easier to interpret than repeating the MSE loss

history = model.fit(
    train_generator,
    epochs=30,
    validation_data=validation_generator
)
Added dropout layers after convolution and dense layers to reduce overfitting.
Applied data augmentation to increase training data variety.
Reduced the learning rate from Adam's default of 0.001 to 0.0005 for smoother training.
Kept model complexity moderate with two convolutional layers and one dense layer.
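One further overfitting control worth trying alongside the changes above (not part of the original solution, so treat it as an optional addition): early stopping, which halts training once validation loss stops improving and restores the best weights seen so far.

```python
import tensorflow as tf

# Stop training when validation loss has not improved for 5 epochs,
# then roll back to the weights from the best epoch.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=5,
    restore_best_weights=True
)

# Passed to fit alongside the generators, e.g.:
# model.fit(train_generator, epochs=30,
#           validation_data=validation_generator,
#           callbacks=[early_stop])
```

This caps the effective number of epochs automatically, so a generous `epochs=30` no longer risks training long past the point where validation loss bottoms out.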
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, Training loss 0.05, Validation loss 0.25

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.12, Validation loss 0.15

Dropout and data augmentation reduce overfitting by preventing the model from simply memorizing the training data, which narrows the gap between training and validation performance.
Bonus Experiment
Try using a pretrained model like MobileNetV2 as a feature extractor for landmark detection to improve accuracy further.
💡 Hint
Freeze the pretrained layers and add custom dense layers on top for landmark regression.