Computer Visionml~20 mins

ResNet and skip connections in Computer Vision - ML Experiment: Train & Evaluate

Choose your learning style10 modes available

Learn Why Deep Model Try Challenge Experiment Recall Metrics

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Experiment - ResNet and skip connections

Problem:Train a convolutional neural network to classify images from the CIFAR-10 dataset using a simple CNN model.

Current Metrics:Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85

Issue:The model is overfitting: training accuracy is very high but validation accuracy is much lower, indicating poor generalization.

Your Task

Reduce overfitting by implementing a ResNet architecture with skip connections to improve validation accuracy to above 85% while keeping training accuracy below 92%.

Use the CIFAR-10 dataset only.

Implement the ResNet model with skip connections from scratch or using TensorFlow/Keras.

Do not increase the number of training epochs beyond 30.

Do not use data augmentation or external datasets.

Hint 1

Hint 2

Hint 3

Hint 4

Solution

Computer Vision

import tensorflow as tf
from tensorflow.keras import layers, models, datasets, utils

# Load CIFAR-10 data
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train, y_test = utils.to_categorical(y_train, 10), utils.to_categorical(y_test, 10)

# Define a basic residual block
class ResidualBlock(layers.Layer):
    def __init__(self, filters, stride=1):
        super().__init__()
        self.conv1 = layers.Conv2D(filters, 3, strides=stride, padding='same', use_bias=False)
        self.bn1 = layers.BatchNormalization()
        self.relu = layers.ReLU()
        self.conv2 = layers.Conv2D(filters, 3, strides=1, padding='same', use_bias=False)
        self.bn2 = layers.BatchNormalization()
        if stride != 1:
            self.shortcut = models.Sequential([
                layers.Conv2D(filters, 1, strides=stride, padding='same', use_bias=False),
                layers.BatchNormalization()
            ])
        else:
            self.shortcut = layers.Layer()  # Changed from layers.Identity() to layers.Layer() as Identity is not available in Keras

    def call(self, inputs, training=False):
        x = self.conv1(inputs)
        x = self.bn1(x, training=training)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x, training=training)
        shortcut = self.shortcut(inputs, training=training) if hasattr(self.shortcut, 'call') else inputs
        x += shortcut
        return self.relu(x)

# Build a small ResNet model
inputs = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(64, 3, strides=1, padding='same', use_bias=False)(inputs)
x = layers.BatchNormalization()(x)
x = layers.ReLU()(x)

x = ResidualBlock(64)(x)
x = ResidualBlock(64)(x)

x = ResidualBlock(128, stride=2)(x)
x = ResidualBlock(128)(x)

x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(10, activation='softmax')(x)

model = models.Model(inputs, x)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

history = model.fit(x_train, y_train, epochs=30, batch_size=64, validation_data=(x_test, y_test))

Replaced simple CNN with a ResNet architecture using residual blocks with skip connections.

Added batch normalization and ReLU activations after convolutions.

Used a smaller learning rate (0.001) with Adam optimizer.

Kept training epochs to 30 and batch size to 64.

Replaced layers.Identity() with layers.Layer() and added conditional call to handle shortcut connection.

Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, high overfitting.

After: Training accuracy 90%, Validation accuracy 87%, better generalization and less overfitting.

Skip connections in ResNet help the model learn better by allowing gradients to flow easily, reducing overfitting and improving validation accuracy.

Bonus Experiment

Try adding dropout layers after residual blocks to see if validation accuracy improves further.

💡 Hint

Dropout randomly turns off neurons during training, which can reduce overfitting by making the model more robust.

Practice

(1/5)

1. What is the main purpose of skip connections in a ResNet model?

easy

A. To replace convolutional layers with fully connected layers

B. To reduce the number of layers in the network

C. To allow information to flow directly across layers, helping training

D. To increase the size of the input images

ResNet and skip connections in Computer Vision - ML Experiment: Train & Evaluate

Start learning this pattern below

Practice

Solution

Step 1: Understand skip connections role

Step 2: Connect to training deep networks

Final Answer:

Quick Check:

Solution

Step 1: Recall skip connection operation

Step 2: Match with correct syntax

Final Answer:

Quick Check:

Solution

Step 1: Analyze convolution output

Step 2: Add input and apply ReLU

Final Answer:

Quick Check:

Solution

Step 1: Understand error message

Step 2: Check convolution output channels

Final Answer:

Quick Check:

Solution

Step 1: Identify shape mismatch

Step 2: Match shapes for addition

Final Answer:

Quick Check: