
Training an image classifier in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Training an image classifier
Problem: Train a model to classify images of cats and dogs.
Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85
Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 92%.
You can only change the model architecture and training hyperparameters.
Do not add more data or use data augmentation.
Keep the dataset and preprocessing the same.
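The task targets can be checked mechanically once a run finishes. The sketch below is illustrative only: the helper name `meets_task_targets` and its return format are hypothetical, while the two thresholds come from the task statement above.

```python
def meets_task_targets(train_acc, val_acc, min_val=0.85, max_train=0.92):
    """Report the train/validation gap and whether the task targets are met.

    Accuracies are fractions in [0, 1]. The thresholds default to the
    values stated in the task (validation >= 85%, training < 92%).
    """
    gap = round(train_acc - val_acc, 4)  # a large positive gap suggests overfitting
    ok = val_acc >= min_val and train_acc < max_train
    return {"gap": gap, "ok": ok}


# Metrics reported for the overfitting baseline
print(meets_task_targets(0.98, 0.75))  # {'gap': 0.23, 'ok': False}
```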
Solution
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Load dataset (cats vs dogs) - placeholder for actual loading
# For demonstration, use tf.keras.datasets.cifar10 and keep only classes 3 (cat) and 5 (dog)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# CIFAR-10 labels have shape (N, 1); np.where(...)[0] returns the matching row indices
train_filter = np.where((y_train == 3) | (y_train == 5))[0]
test_filter = np.where((y_test == 3) | (y_test == 5))[0]

x_train, y_train = x_train[train_filter], y_train[train_filter]
x_test, y_test = x_test[test_filter], y_test[test_filter]

# Convert labels: cat=0, dog=1
y_train = (y_train == 5).astype(int)
y_test = (y_test == 5).astype(int)

# Normalize images
x_train = x_train / 255.0
x_test = x_test / 255.0

# Define model with dropout to reduce overfitting
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dropout(0.5),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
              loss='binary_crossentropy',
              metrics=['accuracy'])

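# Stop when validation loss has not improved for 5 epochs and restore the best weights
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

# validation_split=0.2 holds out the last 20% of the training arrays for validation
history = model.fit(x_train, y_train, epochs=50, batch_size=64, validation_split=0.2, callbacks=[early_stop])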

# Evaluate on test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)

print(f'Test accuracy: {test_acc*100:.2f}%', f'Test loss: {test_loss:.4f}')
Added dropout layers after the flatten and hidden dense layers to reduce overfitting.
Reduced the Adam learning rate to 0.0005 (below the Keras default of 0.001) for smoother training.
Added an early stopping callback that halts training when validation loss stops improving and restores the best weights.
Kept model complexity moderate with two convolutional layers and one hidden dense layer.
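The early stopping rule used in the solution can be sketched in plain Python. This is a simplified illustration of the patience idea, not the Keras implementation; the function name `early_stopping_epochs` and the sample loss values are made up for the example.

```python
def early_stopping_epochs(val_losses, patience=5):
    """Return (stop_epoch, best_epoch) for per-epoch validation losses.

    Epochs are 0-indexed. Training stops once the loss has failed to
    improve for `patience` consecutive epochs; the best epoch is the one
    whose weights would be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: the minimum is at epoch 2
losses = [0.9, 0.7, 0.6, 0.62, 0.61, 0.65, 0.63, 0.64, 0.5]
print(early_stopping_epochs(losses, patience=5))  # (7, 2)
```

With a patience of 5, training stops at epoch 7 and never reaches the lower loss at epoch 8, which is the trade-off a patience value controls.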
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, Validation loss 0.85

After: Training accuracy 90%, Validation accuracy 87%, Validation loss 0.30

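Dropout randomly disables units during training, so the network cannot simply memorize the training data, and early stopping halts training before validation loss starts to rise. Together they narrow the gap between training and validation accuracy and improve validation performance.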
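To make the dropout mechanism concrete, here is a minimal plain-Python sketch of inverted dropout, the variant in which kept activations are scaled up during training so that nothing changes at inference. The function name `inverted_dropout` and the list-based representation are assumptions for illustration; Keras applies the same idea to tensors.

```python
import random


def inverted_dropout(activations, rate, training, rng=random):
    """Zero each activation with probability `rate` while training.

    Kept values are divided by (1 - rate) so the expected sum is
    unchanged. At inference the input is returned as-is.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
train_out = inverted_dropout([1.0] * 8, rate=0.5, training=True, rng=rng)
eval_out = inverted_dropout([1.0] * 8, rate=0.5, training=False, rng=rng)
print(train_out)  # a mix of 0.0 and 2.0
print(eval_out)   # unchanged: eight values of 1.0
```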
Bonus Experiment
Try using data augmentation to further improve validation accuracy beyond 90%.
💡 Hint
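Use Keras preprocessing layers such as RandomFlip, RandomRotation, and RandomZoom (or the older, now deprecated ImageDataGenerator) to randomly flip, rotate, or zoom images during training.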
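What a random flip does can be shown without any framework. The sketch below treats an image as a list of pixel rows and is only an illustration of the idea; the helper name `random_horizontal_flip` is hypothetical, and a real pipeline would rely on the Keras augmentation utilities instead.

```python
import random


def random_horizontal_flip(image, rng, p=0.5):
    """Return a copy of `image`, mirrored left to right with probability p.

    `image` is a list of rows. The label is untouched, which is why a
    flip is a label-preserving augmentation for cats and dogs.
    """
    if rng.random() < p:
        return [list(reversed(row)) for row in image]
    return [list(row) for row in image]


rng = random.Random(0)
tiny_image = [[1, 2, 3],
              [4, 5, 6]]
print(random_horizontal_flip(tiny_image, rng, p=1.0))  # [[3, 2, 1], [6, 5, 4]]
```

Because the flip is drawn fresh each time an image is served, the model rarely sees exactly the same input twice across epochs.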

Practice

1. What is the main goal when training an image classifier?
easy
A. To convert images into text
B. To teach the model to recognize different categories of images
C. To increase the size of the images
D. To remove colors from images

Solution

  1. Step 1: Understand the purpose of image classification

    Image classification means teaching a model to identify what category an image belongs to, like cats or dogs.
  2. Step 2: Identify the correct goal

    The goal is to train the model to recognize image categories, not to change image size or color.
  3. Final Answer:

    To teach the model to recognize different categories of images -> Option B
  4. Quick Check:

    Image classification = recognize categories [OK]
Hint: A classifier sorts images into groups [OK]
Common Mistakes:
  • Confusing image classification with image editing
  • Thinking the goal is to change image colors
  • Assuming the model outputs text instead of categories
2. Which code snippet correctly adds a convolutional layer in a TensorFlow Keras model?
easy
A. model.add(MaxPooling2D(32, (3, 3)))
B. model.add(Dense(32, (3, 3), activation='relu'))
C. model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
D. model.add(Flatten(32, (3, 3)))

Solution

  1. Step 1: Identify the correct layer type for convolution

    Conv2D is the correct layer to extract image features using filters.
  2. Step 2: Check the syntax for Conv2D

    Conv2D takes the number of filters and the kernel size; an activation is usually passed as well, and input_shape on the first layer tells Keras the image dimensions.
  3. Final Answer:

    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3))) -> Option C
  4. Quick Check:

    Conv2D with filters and kernel size = Option C [OK]
Hint: Conv2D takes a filter count and a kernel size; the activation is optional but usually set [OK]
Common Mistakes:
  • Using Dense instead of Conv2D for images
  • Passing wrong arguments to Flatten or MaxPooling2D
  • Missing input_shape in first Conv2D layer
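To see what the Conv2D call in Option C produces, the output shape and parameter count can be worked out by hand. The helpers below are a plain-Python sketch, assuming the Keras defaults of 'valid' padding and stride 1; the function names are hypothetical and not part of Keras.

```python
def conv2d_output_shape(height, width, kernel, filters, stride=1):
    """Output (height, width, channels) of a conv layer with 'valid' padding."""
    out_h = (height - kernel) // stride + 1
    out_w = (width - kernel) // stride + 1
    return out_h, out_w, filters


def conv2d_param_count(in_channels, kernel, filters):
    """Trainable parameters: one kernel x kernel x in_channels filter plus a bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


# Conv2D(32, (3, 3)) applied to a 64x64 RGB image
print(conv2d_output_shape(64, 64, kernel=3, filters=32))  # (62, 62, 32)
print(conv2d_param_count(in_channels=3, kernel=3, filters=32))  # 896
```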
3. Given this code, what will be the printed accuracy after training?
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
  layers.Conv2D(16, (3,3), activation='relu', input_shape=(28,28,1)),
  layers.Flatten(),
  layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

import numpy as np
x_train = np.random.random((100, 28, 28, 1))
y_train = np.random.randint(0, 10, 100)

history = model.fit(x_train, y_train, epochs=1, verbose=0)
print(f"Accuracy: {history.history['accuracy'][0]:.2f}")
medium
A. Accuracy will be around 0.10 (random guessing)
B. Accuracy will be close to 1.00 (perfect)
C. Code will raise a syntax error
D. Accuracy will be exactly 0.50

Solution

  1. Step 1: Understand the data and labels

    The training data is random noise and labels are random integers from 0 to 9, so no real pattern exists.
  2. Step 2: Predict model accuracy on random data

    Since the model cannot learn meaningful features, accuracy will be close to random guessing, about 10% for 10 classes.
  3. Final Answer:

    Accuracy will be around 0.10 (random guessing) -> Option A
  4. Quick Check:

    Random data accuracy ≈ 1/number_of_classes = 0.10 [OK]
Hint: Random labels mean accuracy near chance level [OK]
Common Mistakes:
  • Expecting high accuracy on random data
  • Thinking code has syntax errors
  • Assuming accuracy is always 0.5
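The chance-level figure can be checked with a small simulation that needs no model at all. This is a sketch under the assumption that both labels and predictions are uniformly random; `chance_accuracy` is a hypothetical helper written for this illustration.

```python
import random


def chance_accuracy(num_classes, num_samples, seed=0):
    """Accuracy of uniformly random guesses against uniformly random labels."""
    rng = random.Random(seed)
    labels = [rng.randrange(num_classes) for _ in range(num_samples)]
    guesses = [rng.randrange(num_classes) for _ in range(num_samples)]
    correct = sum(label == guess for label, guess in zip(labels, guesses))
    return correct / num_samples


# With many samples the estimate settles near 1 / num_classes
print(f"{chance_accuracy(10, 100_000):.3f}")
# With only 100 samples, as in the question, the value varies noticeably by seed
print([chance_accuracy(10, 100, seed=s) for s in range(5)])
```

The second line is why the answer says "around 0.10": on 100 samples a single run can land a few points above or below chance.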
4. This code tries to train an image classifier but throws an error. What is the problem?
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv2D(32, 3, activation='relu'))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)

Assume x_train shape is (100, 28, 28, 1).
medium
A. Missing input_shape in first Conv2D layer
B. Dense layer should come before Conv2D
C. Loss function is incorrect for classification
D. Optimizer 'adam' is not supported

Solution

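  1. Step 1: Check what the first layer is given

    The first Conv2D layer declares no input_shape, so Keras cannot build the model or create its weights when the layers are added.
  2. Step 2: Rule out the other options

    Conv2D before Dense is the normal order, sparse_categorical_crossentropy suits integer class labels, and 'adam' is a valid optimizer. Note that Keras can defer building until fit sees the data, so this exact snippet may run without an error; declaring the shape, here input_shape=(28, 28, 1), is still the intended fix.
  3. Final Answer:

    Missing input_shape in first Conv2D layer -> Option A
  4. Quick Check:

    First Conv2D should declare input_shape [OK]
Hint: Declare the input shape on the first layer, via input_shape or a leading Input layer [OK]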
Common Mistakes:
  • Thinking Dense must come before Conv2D
  • Confusing loss function for classification
  • Believing 'adam' optimizer is invalid
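One way to see why the input shape matters: the Dense layer's weight matrix depends on how many values Flatten produces, and that depends on the image size. The sketch below is plain Python, assuming 'valid' padding and stride 1 for the convolution; `dense_kernel_shape` is a hypothetical helper, not a Keras function.

```python
def dense_kernel_shape(input_shape, kernel, filters, units):
    """Shape of the Dense weight matrix in a Conv2D -> Flatten -> Dense stack.

    `input_shape` is (height, width, channels), or None when it has not
    been declared, in which case no weights can be created yet.
    """
    if input_shape is None:
        raise ValueError("input shape unknown: weights cannot be created yet")
    height, width, _channels = input_shape
    # 'valid' convolution shrinks each spatial side by (kernel - 1)
    flat = (height - kernel + 1) * (width - kernel + 1) * filters
    return flat, units


# The model from the question, given 28x28 grayscale images
print(dense_kernel_shape((28, 28, 1), kernel=3, filters=32, units=10))  # (21632, 10)
```

Calling the helper with `None` raises a `ValueError`, mirroring the fact that a framework has to learn the shape from somewhere, either a declaration or the first batch of data.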
5. You want to improve your image classifier's accuracy on a small dataset. Which approach is best?
hard
A. Remove the activation functions from all layers
B. Reduce the number of convolutional layers to one
C. Train for only one epoch to avoid overfitting
D. Add data augmentation like rotations and flips during training

Solution

  1. Step 1: Understand challenges with small datasets

    Small datasets can cause overfitting, where the model memorizes instead of generalizing.
  2. Step 2: Identify best method to improve generalization

    Data augmentation applies random, label-preserving transformations such as rotations and flips, so the model sees more varied examples and generalizes better.
  3. Final Answer:

    Add data augmentation like rotations and flips during training -> Option D
  4. Quick Check:

    Data augmentation improves small dataset accuracy [OK]
Hint: Use data augmentation to expand small datasets [OK]
Common Mistakes:
  • Reducing layers too much loses learning power
  • Training only one epoch usually underfits
  • Removing activations breaks model learning