Human pose estimation concept in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Human pose estimation concept

Problem: We want to teach a computer to locate keypoints on the human body in images, such as elbows, knees, and wrists. The current model finds these points accurately on training images but performs poorly on new images.

Current Metrics: Training accuracy: 95%, Validation accuracy: 65%

Issue: The model is overfitting. It fits the training images closely but does not generalize to unseen images, which shows up as a 30-point gap between training and validation accuracy.

Your Task

Reduce overfitting so that validation accuracy improves to at least 80%, while keeping training accuracy below 90%.

You can only change model architecture and training settings.

Do not change the dataset or add more data.

Solution

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define data augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    validation_split=0.2
)

# Validation data generator without augmentation
val_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2
)

# Load training and validation data
train_generator = train_datagen.flow_from_directory(
    'data/train',
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)
validation_generator = val_datagen.flow_from_directory(
    'data/train',
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)

# Build model with dropout
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(128,128,3)),
    layers.MaxPooling2D(2,2),
    layers.Dropout(0.25),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D(2,2),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(17, activation='softmax')  # 17 output classes, one per keypoint type
])

# Compile with lower learning rate
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train model
history = model.fit(
    train_generator,
    epochs=30,
    validation_data=validation_generator
)

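Added dropout layers after each convolution block (rate 0.25) and after the dense layer (rate 0.5) to reduce overfitting.

Applied data augmentation (rotations, shifts, and horizontal flips) to the training images only, to increase variety. Validation images are only rescaled.

Lowered the learning rate from the Adam default of 0.001 to 0.0005 for smoother training.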
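To make the dropout change more concrete, here is a minimal pure-Python sketch of what a dropout layer does to a list of activations during training. The function name `inverted_dropout` is hypothetical and not part of the solution above. It follows the behavior documented for the Keras `Dropout` layer: each value is zeroed with probability `rate`, and the surviving values are scaled by 1 / (1 - rate) so the expected sum stays the same. At inference time the layer passes values through unchanged.

```python
import random


def inverted_dropout(activations, rate, rng):
    """Zero each activation with probability `rate` and rescale the rest.

    Hypothetical helper for illustration only. `rng` is a random.Random
    instance so the result can be reproduced with a seed.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_prob = 1.0 - rate
    dropped = []
    for value in activations:
        if rng.random() >= rate:
            # Kept unit: scale up so the expected value is unchanged.
            dropped.append(value / keep_prob)
        else:
            # Dropped unit: contributes nothing on this training step.
            dropped.append(0.0)
    return dropped


if __name__ == "__main__":
    rng = random.Random(0)
    print(inverted_dropout([1.0] * 8, rate=0.5, rng=rng))
```

Because a different random subset of units is silenced on every step, the network cannot rely on any single unit to memorize a training image, which is the intuition behind using it against overfitting.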

Results Interpretation

Before: Training accuracy 95%, Validation accuracy 65% (overfitting)

After: Training accuracy 88%, Validation accuracy 82% (better generalization)

Dropout and data augmentation make it harder for the model to memorize the training data. The lower learning rate makes each weight update smaller, which gives steadier training. Together, these changes narrow the gap between training and validation accuracy from 30 points to 6 points and improve performance on new images.
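The comparison can be written out as a small calculation. The sketch below uses the accuracies reported on this page. The helper names `generalization_gap` and `meets_task_target` are hypothetical, and the thresholds are the ones stated in the task (validation accuracy of at least 80%, training accuracy below 90%).

```python
def generalization_gap(train_acc, val_acc):
    """Difference between training and validation accuracy."""
    return train_acc - val_acc


def meets_task_target(train_acc, val_acc, min_val=0.80, max_train=0.90):
    """Check the task's criteria: validation >= 80% and training < 90%."""
    return val_acc >= min_val and train_acc < max_train


# Accuracies reported in the experiment, as fractions.
runs = {
    "before": {"train": 0.95, "val": 0.65},
    "after": {"train": 0.88, "val": 0.82},
}

if __name__ == "__main__":
    for label, acc in runs.items():
        gap = generalization_gap(acc["train"], acc["val"])
        ok = meets_task_target(acc["train"], acc["val"])
        print(f"{label}: gap = {gap:.0%}, target met = {ok}")
```

With a real training run, the same two numbers can be read from the last entries of `history.history['accuracy']` and `history.history['val_accuracy']`, which Keras records when the model is compiled with `metrics=['accuracy']`.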

Bonus Experiment

Try using a pre-trained model like MobileNet or ResNet as a starting point for pose estimation.

💡 Hint

Use transfer learning by freezing early layers and training only the last layers on your dataset.
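One way the hint might be sketched is shown below. It is not the page's reference solution. It assumes MobileNetV2 from `tf.keras.applications` as the pre-trained base, the same 128x128 RGB inputs scaled to [0, 1] as in the generators above, and the same 17-way softmax head as the solution model. The function name `build_transfer_model` is hypothetical. Passing `weights="imagenet"` downloads the pre-trained weights on first use.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_transfer_model(num_outputs=17, weights="imagenet"):
    """Frozen MobileNetV2 base with a small trainable head (illustrative)."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=(128, 128, 3),
        include_top=False,   # drop the ImageNet classification head
        weights=weights,
    )
    base.trainable = False   # freeze all pre-trained layers

    inputs = layers.Input(shape=(128, 128, 3))
    # MobileNetV2 expects inputs in [-1, 1]; the generators produce [0, 1].
    x = layers.Rescaling(2.0, offset=-1.0)(inputs)
    x = base(x, training=False)  # keep batch-norm statistics fixed
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(num_outputs, activation="softmax")(x)

    model = models.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_transfer_model()` returns a compiled model that can be passed to `model.fit` with the same generators as before. Only the final dense layer is trained, so far fewer parameters are fitted to the small dataset than in the model built from scratch.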

Practice

(1/5)

1. What is the main goal of human pose estimation in computer vision?

easy

A. To find the positions of body joints in images or videos

B. To classify objects into categories

C. To detect faces in images

D. To enhance image resolution

Solution

Step 1: Understand the task of human pose estimation

Step 2: Compare with other computer vision tasks

Solution

Step 1: Identify typical model outputs in pose estimation

Step 2: Eliminate other output types

Solution

Step 1: Analyze the output dictionary keys and values

Step 2: Understand what these coordinates mean

Solution

Step 1: Identify the cause of inconsistent keypoint order

Step 2: Fix by defining a consistent keypoint index mapping
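A consistent keypoint index mapping could look like the sketch below. The source only says there are 17 keypoints; the names and order here are an assumption based on the widely used COCO keypoint convention. The helper `to_ordered_keypoints` is hypothetical. It converts a dictionary of named predictions, whose key order may vary, into a list whose positions always mean the same joint.

```python
# Assumed ordering: the 17 keypoints of the COCO convention.
KEYPOINT_NAMES = [
    "nose",
    "left_eye", "right_eye",
    "left_ear", "right_ear",
    "left_shoulder", "right_shoulder",
    "left_elbow", "right_elbow",
    "left_wrist", "right_wrist",
    "left_hip", "right_hip",
    "left_knee", "right_knee",
    "left_ankle", "right_ankle",
]

# Name -> fixed index, defined once and reused everywhere.
KEYPOINT_INDEX = {name: i for i, name in enumerate(KEYPOINT_NAMES)}


def to_ordered_keypoints(named_points):
    """Return a 17-element list of (x, y) tuples in the fixed order.

    Keypoints missing from `named_points` are left as None. An unknown
    name raises KeyError so that labeling mistakes surface early.
    """
    ordered = [None] * len(KEYPOINT_NAMES)
    for name, xy in named_points.items():
        ordered[KEYPOINT_INDEX[name]] = xy
    return ordered


if __name__ == "__main__":
    prediction = {"right_wrist": (5, 6), "nose": (1, 2)}
    print(to_ordered_keypoints(prediction))
```

Downstream code can then refer to `KEYPOINT_INDEX["left_elbow"]` instead of relying on the order in which keypoints happen to arrive.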

Solution

Step 1: Understand multi-person pose estimation challenges

Step 2: Use part affinity fields to group keypoints correctly
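The grouping idea can be illustrated with a simplified scoring function. In the part affinity field approach, a network predicts, for each limb type, a 2D vector field that points along the limb. Two candidate keypoints are likely to belong to the same person if the field between them is aligned with the line that connects them. The sketch below is a minimal pure-Python version under stated assumptions: the field is given as two 2D lists indexed `[row][col]`, points are sampled evenly along the segment, and the function name `paf_connection_score` is hypothetical.

```python
import math


def paf_connection_score(paf_x, paf_y, start, end, num_samples=10):
    """Average alignment between a vector field and the segment start -> end.

    paf_x, paf_y: 2D lists indexed [row][col] holding the x and y
        components of the field for one limb type.
    start, end: (x, y) pixel coordinates of two candidate keypoints.
    """
    if num_samples < 2:
        raise ValueError("num_samples must be at least 2")
    dx = end[0] - start[0]
    dy = end[1] - start[1]
    length = math.hypot(dx, dy)
    if length == 0:
        return 0.0
    # Unit vector pointing from start to end.
    ux, uy = dx / length, dy / length

    total = 0.0
    for i in range(num_samples):
        t = i / (num_samples - 1)
        col = int(round(start[0] + t * dx))
        row = int(round(start[1] + t * dy))
        # Dot product of the field vector with the segment direction.
        total += paf_x[row][col] * ux + paf_y[row][col] * uy
    return total / num_samples


if __name__ == "__main__":
    # Toy 5x5 field that points in the +x direction everywhere.
    field_x = [[1.0] * 5 for _ in range(5)]
    field_y = [[0.0] * 5 for _ in range(5)]
    print(paf_connection_score(field_x, field_y, (0, 2), (4, 2)))
    print(paf_connection_score(field_x, field_y, (2, 0), (2, 4)))
```

A high score suggests the pair forms a real limb, and a score near zero or below suggests the two keypoints belong to different people. A full multi-person pipeline would compute this score for every candidate pair of a limb type and then pick a consistent matching, which this sketch does not cover.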
