Convolutional Neural Networks (CNNs) use filters (kernels) to scan images. Why do these filters usually have small sizes like 3x3 or 5x5?
Think about how images have small details like edges and textures.
Small filters focus on local features like edges or textures. This lets the CNN learn important visual patterns efficiently, with far fewer parameters than large filters would require.
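To make the parameter savings concrete, here is a small plain-Python sketch comparing a single 3x3 convolution against a single 7x7 one; the choice of 64 input and 64 output channels is an assumption for illustration:

```python
def conv_params(kernel_size, in_channels, out_channels):
    # Each filter has kernel_size * kernel_size * in_channels weights,
    # plus one bias term per filter.
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

small = conv_params(3, 64, 64)  # 3x3 filters
large = conv_params(7, 64, 64)  # 7x7 filters
print(small, large)  # 36928 200768
```

For the same channel counts, the 7x7 layer needs over five times as many parameters, which is one reason stacks of small filters are preferred.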
Given a 28x28 grayscale image input to a convolutional layer with 16 filters of size 3x3, stride 1, and 'valid' padding, what is the output shape?
import tensorflow as tf

# Batch size 1, height 28, width 28, channels 1
input_shape = (1, 28, 28, 1)
inputs = tf.random.normal(input_shape)
conv_layer = tf.keras.layers.Conv2D(filters=16, kernel_size=3, strides=1, padding='valid')
outputs = conv_layer(inputs)
print(outputs.shape)  # (1, 26, 26, 16)
Use formula: output_size = input_size - filter_size + 1 for 'valid' padding.
With 'valid' padding, output height and width = 28 - 3 + 1 = 26, and the 16 filters produce 16 output channels, so the output shape is (1, 26, 26, 16).
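The hint's formula can be checked without TensorFlow at all; a quick plain-Python sketch (stride generalized as an assumed parameter):

```python
def valid_output_size(input_size, filter_size, stride=1):
    # 'valid' padding adds no border, so the filter must fit entirely
    # inside the input; integer division handles strides > 1.
    return (input_size - filter_size) // stride + 1

h = valid_output_size(28, 3)
w = valid_output_size(28, 3)
print((1, h, w, 16))  # (1, 26, 26, 16): batch, height, width, filters
```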
You want to build a CNN to recognize textures in images. Which architecture choice helps the network better capture fine texture details?
Think about how stacking small filters helps build complex features step-by-step.
Stacking small filters in deeper layers helps the CNN learn simple to complex texture features gradually, improving recognition.
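One way to see this is through the receptive field: each stacked 3x3 layer lets a single output unit "see" a larger patch of the input. A minimal sketch in plain Python, assuming stride 1 throughout:

```python
def receptive_field(num_layers, kernel_size=3):
    # With stride 1, each extra conv layer widens the view by
    # (kernel_size - 1) pixels in each direction.
    rf = 1
    for _ in range(num_layers):
        rf += kernel_size - 1
    return rf

print(receptive_field(1))  # 3  -- one 3x3 layer
print(receptive_field(2))  # 5  -- two stacked 3x3 layers see a 5x5 patch
print(receptive_field(3))  # 7  -- three see a 7x7 patch
```

Two stacked 3x3 layers cover the same 5x5 region as one 5x5 filter, but with fewer parameters and an extra nonlinearity in between.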
During CNN training on image classification, you observe the training loss steadily decreases but validation accuracy plateaus early. What does this indicate?
Think about what it means when training improves but validation does not.
Decreasing training loss with stagnant validation accuracy usually means the model memorizes training data but fails to generalize, a sign of overfitting.
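A common response to this pattern is early stopping: halt training once the validation metric stops improving for a few epochs. A minimal plain-Python sketch of that patience logic (the loss values below are made-up illustration numbers):

```python
def should_stop(val_losses, patience=3):
    # Stop when the best validation loss is 'patience' or more epochs old.
    best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__)
    return len(val_losses) - 1 - best_epoch >= patience

history = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68]  # plateaus after epoch 2
print(should_stop(history))  # True
```

In Keras the same idea is available ready-made as the tf.keras.callbacks.EarlyStopping callback, typically monitoring 'val_loss'.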
Consider this TensorFlow CNN code snippet. After training, the model accuracy stays near random guessing. What is the main issue?
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# Training code omitted
Think about what helps CNNs train stably on image data.
Without normalization, e.g. scaling raw pixel values from [0, 255] down to [0, 1] before feeding them in, CNN training can be unstable or slow, leading to poor learning and accuracy near random guessing.
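A simple fix is to rescale the input pixels before training. A sketch using NumPy, with synthetic random integers standing in for real 8-bit image data:

```python
import numpy as np

raw = np.random.randint(0, 256, size=(1, 28, 28, 1))  # fake 8-bit image batch
x = raw.astype("float32") / 255.0                     # scale pixels to [0, 1]
print(x.min() >= 0.0 and x.max() <= 1.0)  # True
```

Alternatively, the scaling can live inside the model itself by adding tf.keras.layers.Rescaling(1./255) as the first layer, so the network always receives normalized inputs.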