
Human pose estimation concept in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Human pose estimation concept
Problem: We want to teach a computer to find key points on a human body in images, such as elbows, knees, and wrists. Our current model finds these points on training images very well but does poorly on new images.
Current Metrics: Training accuracy 95%, validation accuracy 65%.
Issue: The model is overfitting. It learns the training images too well but does not generalize to new images.
Your Task
Reduce overfitting so that validation accuracy improves to at least 80%, while keeping training accuracy below 90%.
You may change only the model architecture and the training settings.
Do not change the dataset or add more data.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define data augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    validation_split=0.2
)

# Validation data generator without augmentation
val_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2
)

# Load training and validation data
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)
validation_generator = val_datagen.flow_from_directory(
    'data/train',
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)

# Build model with dropout
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(128,128,3)),
    layers.MaxPooling2D(2,2),
    layers.Dropout(0.25),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D(2,2),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(17, activation='softmax')  # 17-way classification head (one class per keypoint in this simplified setup)
])

# Compile with lower learning rate
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train model
history = model.fit(
    train_generator,
    epochs=30,
    validation_data=validation_generator
)
Key Changes
- Added dropout layers after the convolution blocks and the dense layer to reduce overfitting.
- Applied data augmentation to the training images to increase variety.
- Lowered the learning rate from the Keras default of 0.001 to 0.0005 for smoother training.
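Augmentation changes only the inputs, never the labels or the number of samples: each epoch, the generator yields a freshly transformed view of every training image. A quick sketch of what the ImageDataGenerator settings above produce for a single image (a random array here, purely for illustration):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Same transforms as the training generator (rescale omitted for clarity)
datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)

# One fake 128x128 RGB image standing in for a real training sample
img = np.random.rand(1, 128, 128, 3).astype('float32')

# Each call to next() returns a new randomly transformed copy
augmented = next(datagen.flow(img, batch_size=1))[0]
print(augmented.shape)  # (128, 128, 3): same shape, randomly transformed pixels
```

Because the transforms are sampled anew every epoch, the model effectively never sees the exact same image twice, which makes memorization harder.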
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 65% (overfitting)

After: Training accuracy 88%, Validation accuracy 82% (better generalization)

Dropout and data augmentation keep the model from memorizing the training data, and the lower learning rate makes each weight update smaller so training is more stable. Together, these reduce overfitting and improve performance on new images.
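To make the dropout part concrete, here is a minimal NumPy sketch of the "inverted dropout" rule that the Keras Dropout layer applies during training: each unit is zeroed with probability `rate`, and the survivors are scaled by 1/(1 - rate) so the expected activation is unchanged. At inference time the layer does nothing.

```python
import numpy as np

rng = np.random.default_rng(0)

def inverted_dropout(x, rate, training=True):
    """Zero units with probability `rate`; scale survivors by 1/(1 - rate)
    so the expected activation stays the same (inverted dropout)."""
    if not training:
        return x  # dropout is a no-op at inference time
    mask = rng.random(x.shape) >= rate
    return x * mask / (1.0 - rate)

x = np.ones(100_000)
y = inverted_dropout(x, rate=0.5)
print(round(float(y.mean()), 2))   # close to 1.0: expectation is preserved
print(round(float((y == 0).mean()), 2))  # about half the units are zeroed
```

Because a different random subset of units is silenced on every batch, no single unit can specialize in memorizing particular training images, which is why dropout acts as a regularizer.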
Bonus Experiment
Try using a pre-trained model like MobileNet or ResNet as a starting point for pose estimation.
💡 Hint
Use transfer learning by freezing early layers and training only the last layers on your dataset.
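A minimal sketch of that transfer-learning setup, assuming the same 128x128 inputs and 17-way softmax head as the solution above: MobileNetV2 pre-trained on ImageNet serves as a frozen feature extractor, and only the small new head is trained on your data.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MobileNetV2 pre-trained on ImageNet, without its classification head
base = tf.keras.applications.MobileNetV2(
    input_shape=(128, 128, 3),
    include_top=False,
    weights='imagenet'
)
base.trainable = False  # freeze all pre-trained layers

# Attach a small trainable head for our task
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.5),
    layers.Dense(17, activation='softmax')  # same 17-way head as before
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)
```

With the base frozen, only the few thousand head parameters are fitted, so the model is far less able to memorize a small training set; once the head converges, you can optionally unfreeze the last few base layers and fine-tune them at a very low learning rate.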