Computer Vision · How-To · Beginner · 4 min read

How to Use MNIST for Classification in Computer Vision

To use MNIST for classification in computer vision, load the dataset of handwritten digits, preprocess the images by normalizing pixel values, then train a simple neural network to classify digits from 0 to 9 (a fully connected network is enough to start; a Convolutional Neural Network, or CNN, usually reaches higher accuracy). Finally, evaluate the model's accuracy on held-out test data to measure performance.
📝

Syntax

Here is the basic syntax to load MNIST, preprocess data, define a model, train it, and evaluate accuracy.

  • tf.keras.datasets.mnist.load_data(): Loads MNIST images and labels.
  • model = tf.keras.Sequential([...]): Defines a neural network model.
  • model.compile(): Sets optimizer, loss, and metrics.
  • model.fit(): Trains the model on training data.
  • model.evaluate(): Tests model performance on test data.
python
import tensorflow as tf

# Load MNIST data
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Normalize images
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model
model.fit(train_images, train_labels, epochs=5)

# Evaluate model
model.evaluate(test_images, test_labels)
💻

Example

This example shows a complete runnable script that loads MNIST, trains a simple neural network, and prints test accuracy.

python
import tensorflow as tf

# Load MNIST dataset
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to [0,1]
train_images = train_images / 255.0
test_images = test_images / 255.0

# Build a simple neural network model
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=5, verbose=2)

# Evaluate the model on test data
loss, accuracy = model.evaluate(test_images, test_labels, verbose=0)
print(f'Test accuracy: {accuracy:.4f}')
Output
Epoch 1/5
1875/1875 - 3s - loss: 0.2561 - accuracy: 0.9267
Epoch 2/5
1875/1875 - 3s - loss: 0.1157 - accuracy: 0.9663
Epoch 3/5
1875/1875 - 3s - loss: 0.0787 - accuracy: 0.9763
Epoch 4/5
1875/1875 - 3s - loss: 0.0577 - accuracy: 0.9817
Epoch 5/5
1875/1875 - 3s - loss: 0.0433 - accuracy: 0.9860
Test accuracy: 0.9789
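Once the model is trained, individual digits can be classified with model.predict plus argmax over the 10 class probabilities. The snippet below is a minimal sketch: it rebuilds the same architecture and feeds a random 28x28 array so it runs standalone, but in practice you would pass a real normalized image such as test_images[0].

python
import numpy as np
import tensorflow as tf

# Same architecture as the example above (untrained here, for illustration only)
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Stand-in for one normalized test image; use test_images[0] in practice
image = np.random.rand(28, 28).astype('float32')

# predict expects a batch, so add a leading batch dimension
probs = model.predict(image[np.newaxis, ...], verbose=0)  # shape (1, 10)
predicted_digit = int(np.argmax(probs, axis=1)[0])        # an integer in 0..9
print(predicted_digit)

Because the softmax output sums to 1, probs can also be read as per-digit confidence scores.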
⚠️

Common Pitfalls

Common mistakes when using MNIST for classification include:

  • Not normalizing pixel values, which slows training and reduces accuracy.
  • Omitting an output layer of 10 units with softmax activation, one unit per digit class.
  • Mixing up training and test data, causing misleading accuracy results.
  • Using inappropriate loss functions like mean squared error instead of sparse categorical crossentropy.
python
import tensorflow as tf

# Wrong: No normalization
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])  # Wrong loss: runs, but optimizes the wrong objective
model.fit(train_images, train_labels, epochs=1)

# Right way: normalize inputs and recompile with the correct loss
train_images = train_images / 255.0
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=1)
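The loss pitfall comes down to label format: sparse_categorical_crossentropy expects integer labels (0-9), while categorical_crossentropy expects one-hot vectors. The sketch below checks that equivalence on a tiny random batch standing in for MNIST, so it runs without downloading the dataset:

python
import numpy as np
import tensorflow as tf

# Tiny random batch standing in for MNIST data
images = np.random.rand(8, 28, 28).astype('float32')
int_labels = np.random.randint(0, 10, size=(8,))               # integer labels -> sparse_categorical_crossentropy
onehot_labels = tf.keras.utils.to_categorical(int_labels, 10)  # one-hot labels -> categorical_crossentropy

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10, activation='softmax')
])
probs = model(images)

# Both losses compute the same cross-entropy, just from different label formats
sparse_loss = tf.keras.losses.sparse_categorical_crossentropy(int_labels, probs)
onehot_loss = tf.keras.losses.categorical_crossentropy(onehot_labels, probs)
print(float(tf.reduce_mean(sparse_loss)), float(tf.reduce_mean(onehot_loss)))

Either pairing works; using sparse labels with the non-sparse loss (or mse) is what silently breaks training.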
📊

Quick Reference

Tips for using MNIST classification:

  • Always normalize images by dividing pixel values by 255.
  • Use a neural network with an output layer of 10 units and softmax activation.
  • Use sparse_categorical_crossentropy loss for integer labels.
  • Train for multiple epochs to improve accuracy.
  • Evaluate on test data to check real performance.
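For higher accuracy than the dense network above, a small Convolutional Neural Network is the usual next step. This is a minimal sketch of one common layout (layer sizes are illustrative, not tuned); note the extra channel dimension that Conv2D layers require:

python
import numpy as np
import tensorflow as tf

# Conv2D expects (height, width, channels), so MNIST images become 28x28x1
cnn = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
cnn.compile(optimizer='adam',
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])

# With real data: cnn.fit(train_images[..., np.newaxis], train_labels, epochs=5)
probs = cnn(np.random.rand(1, 28, 28, 1).astype('float32'))
print(probs.shape)  # (1, 10)

The reshape with np.newaxis adds the channel axis; forgetting it is the most common error when switching from the dense model to a CNN.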
✅

Key Takeaways

Normalize MNIST images by scaling pixel values to [0,1] before training.
Use a neural network with 10 output units and softmax activation for digit classification.
Compile the model with 'sparse_categorical_crossentropy' loss and 'accuracy' metric.
Train the model on training data and evaluate on test data for reliable accuracy.
Avoid common mistakes like skipping normalization or using wrong loss functions.