How to Use CNN for Image Classification in Computer Vision
To use a CNN (Convolutional Neural Network) for image classification, you build a model with convolutional layers that extract image features, followed by dense layers that classify the images into categories. You train the CNN on labeled images using a loss function and an optimizer, then use it to predict the classes of new images.
Syntax
A CNN model for image classification typically includes these parts:
- Input layer: Accepts image data (height, width, channels).
- Convolutional layers: Extract features using filters.
- Activation functions: Add non-linearity, usually ReLU.
- Pooling layers: Reduce spatial size to lower computation.
- Flatten layer: Converts 2D features to 1D vector.
- Dense (fully connected) layers: Learn to classify based on features.
- Output layer: Uses softmax activation for multi-class classification.
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```
Example
This example shows how to build, train, and evaluate a CNN on the CIFAR-10 dataset, which has 10 image classes like airplanes and cats.
```python
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Load data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Normalize pixel values
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model
history = model.fit(x_train, y_train, epochs=3, validation_split=0.2)

# Evaluate model
loss, accuracy = model.evaluate(x_test, y_test)
print(f'Test accuracy: {accuracy:.4f}')
```
Output
```
Epoch 1/3
1250/1250 [==============================] - 22s 17ms/step - loss: 1.4812 - accuracy: 0.4603 - val_loss: 1.1834 - val_accuracy: 0.5798
Epoch 2/3
1250/1250 [==============================] - 21s 17ms/step - loss: 1.0647 - accuracy: 0.6231 - val_loss: 1.0107 - val_accuracy: 0.6464
Epoch 3/3
1250/1250 [==============================] - 21s 17ms/step - loss: 0.9003 - accuracy: 0.6837 - val_loss: 0.9279 - val_accuracy: 0.6750
313/313 [==============================] - 2s 6ms/step - loss: 0.9279 - accuracy: 0.6750
Test accuracy: 0.6750
```
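After evaluation, the same model can predict the classes of new images, as described in the introduction. The sketch below shows the mechanics with a small, freshly built (untrained) model and random input data, so it runs on its own; in practice you would call `predict` on the fitted model from the example above.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Untrained stand-in for the trained model above (illustration only)
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(10, activation='softmax')
])

# A batch of "new" images: shape (batch, height, width, channels), scaled to [0, 1]
new_images = np.random.rand(4, 32, 32, 3).astype('float32')

probs = model.predict(new_images)             # shape (4, 10): one probability per class
predicted_classes = np.argmax(probs, axis=1)  # index of the most likely class per image
print(predicted_classes)
```

Each row of `probs` sums to 1 because of the softmax output layer; `argmax` picks the class index with the highest probability.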
Common Pitfalls
Common mistakes when using CNNs for image classification include:
- Not normalizing image pixel values, which slows training.
- Using too few convolutional layers, limiting feature learning.
- Overfitting by training too long without enough data or regularization.
- Incorrect input shape causing model errors.
- Using the wrong loss function for the classification task.
Always check data preprocessing and model architecture carefully.
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Rescaling, Conv2D, MaxPooling2D, Flatten, Dense

# Wrong: no normalization and no pooling
model_wrong = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    Flatten(),
    Dense(10, activation='softmax')
])

# Right: normalize pixel values inside the model and downsample with pooling
model_right = Sequential([
    Rescaling(1.0 / 255, input_shape=(32, 32, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(10, activation='softmax')
])
```
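The overfitting pitfall above can also be addressed in code. One common approach, sketched here, combines a `Dropout` layer with Keras's `EarlyStopping` callback; the specific dropout rate and patience values are illustrative choices, not requirements.

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Dropout randomly zeroes activations during training, which reduces overfitting
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dropout(0.5),  # drop 50% of the units during training only
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# EarlyStopping halts training when validation loss stops improving
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=2, restore_best_weights=True)
# Then pass callbacks=[early_stop] to model.fit(...)
```

With `restore_best_weights=True`, the model keeps the weights from the epoch with the best validation loss rather than the last epoch.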
Quick Reference
- Input shape: (height, width, channels), e.g., (32, 32, 3) for color images.
- Conv2D: Extracts features with filters.
- MaxPooling2D: Reduces feature map size.
- Flatten: Converts 2D features to 1D vector.
- Dense: Fully connected layer for classification.
- Activation: Use ReLU for hidden layers, softmax for output.
- Loss function: Use sparse_categorical_crossentropy for integer labels.
- Optimizer: Adam is a good default choice.
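To make the loss-function entry above concrete: `sparse_categorical_crossentropy` expects integer class labels (as CIFAR-10 provides), while `categorical_crossentropy` expects one-hot vectors. The short sketch below shows the difference between the two label formats.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

# Integer labels, one class index per image:
# use loss='sparse_categorical_crossentropy'
int_labels = np.array([3, 0, 7])

# One-hot encoded labels: use loss='categorical_crossentropy' instead
one_hot = to_categorical(int_labels, num_classes=10)
print(one_hot.shape)  # (3, 10)
print(one_hot[0])     # 1.0 at index 3, 0.0 elsewhere
```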
Key Takeaways
- Build CNNs with convolution, pooling, flatten, and dense layers for image classification.
- Normalize image data before training to improve model performance.
- Use softmax activation and an appropriate loss for multi-class classification.
- Avoid overfitting by using enough data and proper model complexity.
- Check input shapes and preprocessing carefully to prevent errors.