Handwriting recognition basics in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Handwriting recognition basics
Problem: Recognize handwritten digits from images using a simple neural network.
Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 90% while keeping training accuracy below 95%.
You can only change the model architecture and training parameters.
Do not change the dataset or preprocessing steps.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load data
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Normalize images
X_train, X_test = X_train / 255.0, X_test / 255.0

# Reshape for the model
X_train = X_train.reshape(-1, 28, 28, 1)
X_test = X_test.reshape(-1, 28, 28, 1)

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build model with dropout and smaller layers
model = models.Sequential([
    layers.Conv2D(16, (3,3), activation='relu', input_shape=(28,28,1)),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),
    layers.Conv2D(32, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Use early stopping
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train, epochs=30, batch_size=64, validation_split=0.2, callbacks=[early_stop])

# Evaluate on test data
loss, accuracy = model.evaluate(X_test, y_test)

print(f'Test accuracy: {accuracy*100:.2f}%', f'Test loss: {loss:.4f}')
Added dropout after each pooling layer (rate 0.25) and after the dense layer (rate 0.5) to reduce overfitting.
Reduced the number of filters in the convolution layers and neurons in the dense layer to simplify the model.
Added early stopping (patience of 3 epochs, restoring the best weights) to halt training when validation loss stops improving.
Kept the Adam optimizer at a learning rate of 0.001 for stable training.
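
The early-stopping rule used in the solution can be sketched in plain Python. This is a minimal illustration of the patience logic only, not the Keras implementation; the function name `early_stopping_epoch` and the validation losses are invented for the example.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training halts once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Never ran out of patience: training used every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, invented for illustration only.
val_losses = [0.60, 0.42, 0.31, 0.27, 0.25, 0.26, 0.28, 0.27, 0.30]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
print(f"stopped at epoch {stop_epoch}, best weights from epoch {best_epoch}")
# stopped at epoch 7, best weights from epoch 4
```

With restore_best_weights=True, the model kept after training corresponds to the best epoch rather than the last one.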
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 85%, Training loss 0.05, Validation loss 0.45

After: Training accuracy 93%, Validation accuracy 91%, Training loss 0.18, Validation loss 0.25

Adding dropout and simplifying the model reduces overfitting, improving validation accuracy and making the model generalize better to new data.
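
One way to read these numbers is through the gap between training and validation metrics. The short sketch below computes that gap from the figures reported above; the helper name `generalization_gap` is an assumption made for this example.

```python
# Metrics copied from the Before/After lines above.
before = {"train_acc": 0.98, "val_acc": 0.85, "train_loss": 0.05, "val_loss": 0.45}
after = {"train_acc": 0.93, "val_acc": 0.91, "train_loss": 0.18, "val_loss": 0.25}


def generalization_gap(metrics):
    """Return (accuracy gap in percentage points, loss gap)."""
    acc_gap = round((metrics["train_acc"] - metrics["val_acc"]) * 100, 2)
    loss_gap = round(metrics["val_loss"] - metrics["train_loss"], 4)
    return acc_gap, loss_gap


for name, metrics in (("before", before), ("after", after)):
    acc_gap, loss_gap = generalization_gap(metrics)
    print(f"{name}: accuracy gap {acc_gap} points, loss gap {loss_gap}")
```

The accuracy gap shrinks from 13 to 2 percentage points and the loss gap from 0.40 to 0.07, which is the pattern expected when overfitting is reduced.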
Bonus Experiment
Try using data augmentation to increase the variety of training images and see if validation accuracy improves further.
💡 Hint
Use image transformations like rotation, zoom, and shifts to create new training samples on the fly.
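
As a rough illustration of on-the-fly augmentation, the sketch below applies a random pixel shift to each image in a batch using NumPy only. The helpers `shift_image` and `augment_batch` are hypothetical names written for this example, and the batch is synthetic. In Keras, the same idea is available through preprocessing layers such as layers.RandomRotation, layers.RandomZoom, and layers.RandomTranslation.

```python
import numpy as np


def shift_image(image, dy, dx):
    """Shift a 2-D image by (dy, dx) pixels, filling exposed pixels with 0.

    Assumes |dy| < image height and |dx| < image width.
    """
    height, width = image.shape
    shifted = np.zeros_like(image)
    # Source and destination windows that still overlap after the shift.
    src_rows = slice(max(0, -dy), min(height, height - dy))
    dst_rows = slice(max(0, dy), min(height, height + dy))
    src_cols = slice(max(0, -dx), min(width, width - dx))
    dst_cols = slice(max(0, dx), min(width, width + dx))
    shifted[dst_rows, dst_cols] = image[src_rows, src_cols]
    return shifted


def augment_batch(images, max_shift, rng):
    """Return a new batch in which each image gets its own random shift."""
    out = np.empty_like(images)
    for i, image in enumerate(images):
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        out[i] = shift_image(image, int(dy), int(dx))
    return out


# Synthetic stand-in for a batch of 28x28 digits: a bright square on black.
rng = np.random.default_rng(seed=0)
batch = np.zeros((4, 28, 28), dtype=np.float32)
batch[:, 10:18, 10:18] = 1.0

augmented = augment_batch(batch, max_shift=3, rng=rng)
print(augmented.shape)  # (4, 28, 28)
```

Because a fresh shift is drawn every time a batch is augmented, the model rarely sees exactly the same image twice, which is what makes augmentation act as a regularizer.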

Practice

1. What is the main goal of handwriting recognition in computer vision?
easy
A. To convert images of handwritten text into digital text
B. To create handwritten images from typed text
C. To detect faces in handwritten notes
D. To enhance the colors of handwritten images

Solution

  1. Step 1: Understand handwriting recognition purpose

    Handwriting recognition aims to read and convert handwritten text images into machine-readable text.
  2. Step 2: Compare options with this goal

    Only option A, converting images of handwritten text into digital text, matches this goal; the other options describe unrelated tasks.
  3. Final Answer:

    To convert images of handwritten text into digital text -> Option A
  4. Quick Check:

    Handwriting recognition = convert handwriting to text [OK]
Hint: Think: handwriting recognition means reading handwriting [OK]
Common Mistakes:
  • Confusing recognition with image enhancement
  • Thinking it creates handwriting instead of reading it
  • Mixing handwriting with face detection
2. Which Python library is commonly used to load the MNIST dataset for handwriting recognition?
easy
A. pandas
B. matplotlib
C. tensorflow.keras.datasets
D. scikit-learn.preprocessing

Solution

  1. Step 1: Recall common MNIST loading methods

    The MNIST dataset is often loaded using tensorflow.keras.datasets for easy access.
  2. Step 2: Check options for dataset loading

    Only tensorflow.keras.datasets provides direct MNIST loading; others do not.
  3. Final Answer:

    tensorflow.keras.datasets -> Option C
  4. Quick Check:

    MNIST load = tensorflow.keras.datasets [OK]
Hint: Remember: TensorFlow has built-in MNIST loader [OK]
Common Mistakes:
  • Choosing matplotlib which is for plotting
  • Selecting pandas which handles tables, not images
  • Confusing preprocessing with dataset loading
3. What is the shape of x_train after loading the MNIST dataset with (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()?
medium
A. (28, 28, 60000)
B. (60000, 28, 28)
C. (60000, 784)
D. (60000, 28, 28, 1)

Solution

  1. Step 1: Understand MNIST image shape

    MNIST images are 28x28-pixel grayscale images, and the training set has 60000 samples.
  2. Step 2: Check output shape from load_data()

    Images are loaded as (60000, 28, 28) without channel dimension by default.
  3. Final Answer:

    (60000, 28, 28) -> Option B
  4. Quick Check:

    MNIST images shape = (60000, 28, 28) [OK]
Hint: MNIST images are 28x28 pixels, 60000 training samples [OK]
Common Mistakes:
  • Assuming images are flattened to 784 by default
  • Confusing channel dimension presence
  • Mixing sample count with image dimensions
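
The shapes in the answer options can be compared directly. The sketch below uses a zero-filled NumPy array as a stand-in for x_train (same shape and dtype as the array returned by load_data(), so nothing is downloaded) and shows how the other two plausible shapes are obtained by reshaping.

```python
import numpy as np

# Stand-in with the same shape and dtype as x_train from mnist.load_data().
x_train = np.zeros((60000, 28, 28), dtype=np.uint8)

# Add a trailing channel axis, as convolutional layers expect.
with_channel = x_train.reshape(-1, 28, 28, 1)

# Flatten each image to a vector, as a Dense-only model would need.
flattened = x_train.reshape(len(x_train), -1)

print(x_train.shape)       # (60000, 28, 28)
print(with_channel.shape)  # (60000, 28, 28, 1)
print(flattened.shape)     # (60000, 784)
```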
4. Identify the error in this simple neural network code for handwriting recognition, assuming x_train is passed exactly as returned by mnist.load_data():
model = tf.keras.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
medium
A. Optimizer name is invalid
B. Missing activation function in the last Dense layer
C. Wrong loss function for classification
D. Incorrect input_shape in Flatten layer

Solution

  1. Step 1: Recall the shape of the data

    MNIST images from load_data() have shape (60000, 28, 28), with no channel axis.
  2. Step 2: Check input_shape in Flatten

    input_shape=(28, 28, 1) declares inputs of shape (None, 28, 28, 1), but the MNIST data is (None, 28, 28), so the declared shape does not match the data.
  3. Final Answer:

    Incorrect input_shape in Flatten layer -> Option D
  4. Quick Check:

    x_train.shape = (60000, 28, 28), so input_shape should be (28, 28) [OK]
Hint: MNIST default shape is (60000, 28, 28), no channel dim [OK]
Common Mistakes:
  • Picking the missing output activation (it raises no error, though a logits output should be paired with a softmax or a loss built with from_logits=True)
  • Thinking loss is wrong (correct for integer labels)
  • Assuming optimizer string is invalid (strings work)
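
The last layer in the snippet, Dense(10), has no activation, so it returns raw scores (logits) rather than probabilities. The NumPy sketch below shows what a softmax followed by sparse categorical cross-entropy does with such scores. The logits and labels are invented for illustration, and the two functions are simplified stand-ins, not the Keras implementations. In Keras, tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True) applies the softmax inside the loss.

```python
import numpy as np


def softmax(logits):
    """Convert raw scores to probabilities along the last axis."""
    # Subtracting the row maximum keeps exp() numerically stable.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the integer labels."""
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())


# Hypothetical logits for two samples and three classes.
logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 3.0]])
labels = np.array([0, 2])  # integer class indices, not one-hot vectors

probs = softmax(logits)
loss = sparse_categorical_crossentropy(probs, labels)
print(probs.sum(axis=1))  # each row sums to 1
print(f"loss: {loss:.4f}")
```

Integer labels like these are what the sparse variant of the loss expects; one-hot labels go with categorical_crossentropy instead.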
5. You want to improve handwriting recognition accuracy by adding dropout to the model. Which code snippet correctly adds dropout after the first Dense layer?
hard
A. tf.keras.layers.Dense(128, activation='relu'), tf.keras.layers.Dropout(0.2)
B. tf.keras.layers.Dropout(0.2), tf.keras.layers.Dense(128, activation='relu')
C. tf.keras.layers.Dense(128, activation='relu', dropout=0.2)
D. tf.keras.layers.Dense(128, activation='relu', rate=0.2)

Solution

  1. Step 1: Understand dropout usage in Keras

    Dropout is a separate layer, placed after a Dense layer, that randomly sets a fraction of that layer's outputs to zero during training.
  2. Step 2: Check each option for correct syntax

    Option A places Dropout(0.2) directly after the Dense layer, which is correct. Options C and D pass dropout and rate to Dense, which accepts neither argument. Option B puts Dropout before the Dense layer, so it drops the layer's inputs rather than its outputs and does not answer the question.
  3. Final Answer:

    tf.keras.layers.Dense(128, activation='relu'), tf.keras.layers.Dropout(0.2) -> Option A
  4. Quick Check:

    Dropout is a separate layer after Dense [OK]
Hint: Dropout is its own layer placed after Dense layer [OK]
Common Mistakes:
  • Trying to add dropout as Dense layer argument
  • Placing Dropout before Dense layer
  • Using wrong parameter names for dropout
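
To make the behavior of a dropout layer concrete, here is a minimal NumPy sketch of inverted dropout, the scheme in which kept activations are scaled by 1 / (1 - rate) during training and nothing changes at inference. The function `dropout` below is written for this example and only approximates what a framework layer does.

```python
import numpy as np


def dropout(activations, rate, training, rng):
    """Inverted dropout: zero a random fraction `rate` of the activations
    during training and rescale the rest so the expected value is unchanged."""
    if not training or rate == 0.0:
        # At inference time the layer is an identity function.
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob


rng = np.random.default_rng(seed=0)
x = np.ones((100, 100))

train_out = dropout(x, rate=0.2, training=True, rng=rng)
eval_out = dropout(x, rate=0.2, training=False, rng=rng)

print(np.unique(train_out))    # kept units are scaled to 1.25, dropped units are 0
print(train_out.mean())        # close to 1.0 on average
print(np.array_equal(eval_out, x))  # True
```

Because each forward pass drops a different random subset, no single unit can be relied on, which is why dropout tends to reduce overfitting.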