CNN architecture review in Computer Vision - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to add a convolutional layer with 32 filters and a 3x3 kernel.

model.add(Conv2D([1], kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
A. 32
B. 64
C. 16
D. 128
Common Mistakes
Choosing 16 or 128 filters when the task asks for 32; too few filters can underfit, and too many add parameters and raise the risk of overfitting.
Confusing kernel size with number of filters.
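
The filter count and the kernel size are separate settings, and a quick parameter count makes the difference concrete. The following is a minimal sketch; `conv2d_param_count` is a hypothetical helper written for this illustration (it is not a Keras function), and it assumes one bias per filter.

```python
def conv2d_param_count(filters, kernel_size, in_channels, use_bias=True):
    """Trainable parameters of a 2D conv layer (hypothetical helper, not a Keras API)."""
    kernel_h, kernel_w = kernel_size
    weights_per_filter = kernel_h * kernel_w * in_channels
    bias_per_filter = 1 if use_bias else 0  # assumption: one bias term per filter
    return filters * (weights_per_filter + bias_per_filter)


# 32 filters, 3x3 kernel, grayscale input (1 channel), as in the task above
params_32 = conv2d_param_count(32, (3, 3), in_channels=1)

# Same 3x3 kernel, but 128 filters: only the filter count changed
params_128 = conv2d_param_count(128, (3, 3), in_channels=1)

print(params_32, params_128)
```

Changing the number of filters scales the parameter count linearly, while the kernel size controls how many weights each filter holds.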
2. Fill in the blank
medium

Complete the code to add a max pooling layer with pool size 2x2.

model.add(MaxPooling2D(pool_size=[1]))
A. (3, 3)
B. (4, 4)
C. (2, 2)
D. (1, 1)
Common Mistakes
Using (1, 1), which does not reduce the size.
Using a pool size that is too large, such as (4, 4), which may discard too much information.
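
To make the effect of the pool size visible, here is a small pure-Python sketch of non-overlapping max pooling. `max_pool_2d` is a hypothetical helper written for this illustration; it assumes the stride equals the pool size and that leftover rows or columns are dropped.

```python
def max_pool_2d(feature_map, pool_size):
    """Non-overlapping max pooling on a 2D list (stride equals the pool size)."""
    pool_h, pool_w = pool_size
    out_h = len(feature_map) // pool_h
    out_w = len(feature_map[0]) // pool_w
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect every value inside the current pooling window
            window = [
                feature_map[i * pool_h + di][j * pool_w + dj]
                for di in range(pool_h)
                for dj in range(pool_w)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]

pooled_2x2 = max_pool_2d(feature_map, (2, 2))  # 4x4 -> 2x2
pooled_1x1 = max_pool_2d(feature_map, (1, 1))  # same size: no downsampling
pooled_4x4 = max_pool_2d(feature_map, (4, 4))  # 4x4 -> 1x1: one value survives

print(pooled_2x2, pooled_4x4)
```

A (2, 2) window halves each spatial dimension while keeping the strongest response in every region, which is why it is the usual choice.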
3. Fill in the blank
hard

Fill in the blank to flatten the output before the dense layer.

model.add([1]())
A. Flatten
B. Dense
C. Conv2D
D. MaxPooling2D
Common Mistakes
Using Dense directly without flattening causes shape errors.
Using Conv2D or MaxPooling2D here is incorrect.
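
What flattening does to a single sample can be sketched in plain Python: a (height, width, channels) block of feature maps becomes one vector of length height * width * channels. `flatten_feature_maps` is a hypothetical helper for this illustration and assumes row-major ordering.

```python
def flatten_feature_maps(feature_maps):
    """Flatten a (height, width, channels) nested list into one flat list (row-major)."""
    return [value for row in feature_maps for pixel in row for value in pixel]


# A tiny 2x2 feature map with 3 channels (values are arbitrary)
feature_maps = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

flat = flatten_feature_maps(feature_maps)
print(len(flat))  # 2 * 2 * 3 values in a single vector
```

The dense layer that follows then sees one vector per image instead of a 3D block.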
4. Fill in the blank
hard

Fill both blanks to add a dropout layer with rate 0.5 and a dense output layer with 10 units.

model.add(Dropout([1]))
model.add(Dense([2], activation='softmax'))
A. 0.5
B. 10
C. 5
D. 0.25
Common Mistakes
Using a dropout rate other than the requested 0.5, such as 0.25.
Setting the number of output units to something other than the number of classes (10 here).
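
The two layers in this task can be illustrated with a short pure-Python sketch. It assumes "inverted" dropout (kept values are scaled by 1 / (1 - rate) during training) and a standard softmax; the helper names and the example values are made up for this illustration and are not Keras internals.

```python
import math
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, scale the rest."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else a * keep_scale for a in activations]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 8
dropped = dropout(activations, rate=0.5, rng=rng)  # each value is 0.0 or 2.0

# Ten output units, one score per class; class 3 gets the highest score
logits = [0.0] * 10
logits[3] = 2.0
probs = softmax(logits)
predicted_class = probs.index(max(probs))

print(dropped)
print(predicted_class, round(sum(probs), 6))
```

With 10 classes the output layer needs 10 units so that softmax yields one probability per class.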
5. Fill in the blank
hard

Fill all three blanks to compile the model with Adam optimizer, categorical crossentropy loss, and accuracy metric.

model.compile(optimizer='[1]', loss='[2]', metrics=['[3]'])
A. adam
B. categorical_crossentropy
C. accuracy
D. sgd
Common Mistakes
Using the wrong loss function, such as 'mse', for classification.
Choosing the wrong optimizer (for example 'sgd' when Adam is requested) or the wrong metric.
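
To show what the loss and metric named in this task measure, here is a minimal sketch of categorical cross-entropy and accuracy for one-hot labels. The function names, the clipping value `eps`, and the sample predictions are assumptions made for this illustration.

```python
import math


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    # Clip probabilities from below so log(0) never occurs (eps is an assumption)
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def accuracy(labels, predictions):
    """Fraction of samples whose most probable class matches the one-hot label."""
    correct = sum(
        1
        for t, p in zip(labels, predictions)
        if t.index(max(t)) == p.index(max(p))
    )
    return correct / len(labels)


labels = [[0, 1, 0], [1, 0, 0]]
predictions = [[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]]  # second sample is misclassified

losses = [categorical_crossentropy(t, p) for t, p in zip(labels, predictions)]
acc = accuracy(labels, predictions)

print(losses, acc)
```

The confident correct prediction gets a small loss and the wrong one a larger loss, while accuracy only counts whether the top class matches.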

Practice

1. What is the main purpose of a Convolutional Neural Network (CNN) in computer vision?
easy
A. To perform text translation
B. To sort numbers in a list
C. To generate random images
D. To detect patterns and features in images

Solution

  1. Step 1: Understand CNN function

    CNNs scan images to find important patterns like edges and shapes.
  2. Step 2: Match purpose to options

    Only option D, "To detect patterns and features in images", describes a CNN's main job; translation, sorting, and random image generation are unrelated to it.
  3. Final Answer:

    To detect patterns and features in images -> Option D
  4. Quick Check:

    CNN purpose = detect image patterns [OK]
Hint: CNNs find image features, not unrelated tasks like sorting [OK]
Common Mistakes:
  • Confusing CNNs with general neural networks
  • Thinking CNNs generate images
  • Mixing CNNs with text processing models
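
The idea of "scanning an image for patterns like edges" can be sketched with a hand-written sliding-window filter. This is an illustration of the core operation only, with a fixed, hand-picked kernel; in a real CNN the kernel values are learned, and the helper name `correlate2d_valid` is hypothetical.

```python
def correlate2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), summing elementwise products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kernel_h)
                for dj in range(kernel_w)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Dark left half, bright right half: a vertical edge in the middle
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hand-picked kernel that responds to left-to-right brightness increases
vertical_edge_kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

response = correlate2d_valid(image, vertical_edge_kernel)
print(response)
```

The response is large where the window straddles the edge and zero over the flat regions, which is the sense in which a filter "detects" a feature.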
2. Which of the following is the correct way to add a 2D convolutional layer in Keras?
easy
A. Dense(units=32, activation='relu')
B. Conv1D(filters=32, kernel_size=3, activation='relu')
C. Conv2D(filters=32, kernel_size=(3,3), activation='relu')
D. MaxPooling2D(pool_size=(2,2))

Solution

  1. Step 1: Identify Conv2D syntax

    Conv2D takes the number of filters and a kernel_size (a single int or a tuple of two ints); the activation argument is optional.
  2. Step 2: Compare options

    Conv2D(filters=32, kernel_size=(3,3), activation='relu') is a valid Conv2D call; Dense and MaxPooling2D are different layer types, and Conv1D convolves along one dimension only.
  3. Final Answer:

    Conv2D(filters=32, kernel_size=(3,3), activation='relu') -> Option C
  4. Quick Check:

    Conv2D syntax = Conv2D(filters=32, kernel_size=(3,3), activation='relu') [OK]
Hint: Images need Conv2D; kernel_size can be a tuple like (3,3) or the equivalent single int 3 [OK]
Common Mistakes:
  • Using Conv1D instead of Conv2D for images
  • Confusing Dense layer with Conv2D
  • Wrong kernel_size format
3. Given this Keras CNN snippet, what is the output shape after the Conv2D layer?
model = Sequential()
model.add(Conv2D(16, (3,3), input_shape=(28,28,1)))
medium
A. (26, 26, 16)
B. (28, 28, 16)
C. (30, 30, 16)
D. (28, 28, 1)

Solution

  1. Step 1: Calculate output size after Conv2D

    With the default 'valid' padding and stride 1, a kernel of size 3 gives output dims = input - kernel + 1 = 28 - 3 + 1 = 26.
  2. Step 2: Determine output channels

    Filters=16 means output depth is 16 channels.
  3. Final Answer:

    (26, 26, 16) -> Option A
  4. Quick Check:

    Output shape = (26,26,16) [OK]
Hint: Output size = input - kernel + 1 with 'valid' padding [OK]
Common Mistakes:
  • Assuming output size equals input size without padding
  • Confusing number of filters with spatial dimensions
  • Forgetting default padding is 'valid'
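
The output-size arithmetic from the solution can be wrapped in a small function. This is a sketch under the usual conventions ('valid' means no padding, 'same' pads so that the size is ceil(input / stride)); `conv_output_size` is a hypothetical helper, not a Keras function.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Spatial output size of a convolution along one dimension."""
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Padded so the output size is ceil(input_size / stride)
        return -(-input_size // stride)
    raise ValueError(f"unknown padding: {padding!r}")


valid_size = conv_output_size(28, 3)                  # 28 - 3 + 1
same_size = conv_output_size(28, 3, padding="same")   # size preserved at stride 1

# Full output shape for 16 filters on a 28x28 input with 'valid' padding
output_shape = (valid_size, valid_size, 16)
print(output_shape, same_size)
```

The number of filters only sets the last dimension; the spatial dimensions depend on the kernel size, stride, and padding.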
4. Identify the error in this CNN model code snippet:
model = Sequential()
model.add(Conv2D(32, (3,3), activation='relu', input_shape=(28,28)))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
medium
A. Dense layer should come before Flatten
B. input_shape missing channel dimension
C. Activation function 'relu' is invalid
D. Conv2D filters must be 64 or more

Solution

  1. Step 1: Check input_shape format

    Conv2D expects input_shape with 3 dimensions: height, width, channels. Here the channel dimension is missing; for a 28x28 grayscale image it should be (28, 28, 1).
  2. Step 2: Validate other parts

    Activation 'relu' is valid, Flatten before Dense is correct, filters can be any positive integer.
  3. Final Answer:

    input_shape missing channel dimension -> Option B
  4. Quick Check:

    Input shape must include channels [OK]
Hint: Conv2D input_shape needs (height, width, channels) [OK]
Common Mistakes:
  • Ignoring channel dimension in input_shape
  • Misordering Flatten and Dense layers
  • Thinking filters must be >=64
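
The missing channel dimension is a data-shape issue as much as a model issue: a 28x28 grayscale image has to be presented as 28x28x1. The sketch below shows that reshaping on plain nested lists; `shape_of` and `add_channel_axis` are hypothetical helpers for this illustration, and the pixel values are placeholders.

```python
def shape_of(nested):
    """Shape of a regular nested list, e.g. (28, 28) or (28, 28, 1)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


def add_channel_axis(image):
    """Wrap each pixel in a 1-element list: (height, width) -> (height, width, 1)."""
    return [[[pixel] for pixel in row] for row in image]


gray_image = [[0.0] * 28 for _ in range(28)]  # placeholder pixel values
with_channels = add_channel_axis(gray_image)

print(shape_of(gray_image), shape_of(with_channels))
```

The pixel values are unchanged; only the shape gains a trailing axis of length 1 so that it matches (height, width, channels).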
5. You want to build a CNN for classifying 64x64 RGB images into 5 classes. Which architecture choice is best?
hard
A. Conv2D(32, (3,3)) + MaxPooling2D + Conv2D(64, (3,3)) + Flatten + Dense(5, softmax)
B. Dense(128) + Dense(64) + Dense(5, softmax)
C. Conv1D(32, 3) + Flatten + Dense(5, softmax)
D. Flatten + Dense(5, softmax)

Solution

  1. Step 1: Identify suitable layers for image data

    Conv2D layers extract spatial features from 2D images; MaxPooling reduces size; Flatten prepares for Dense.
  2. Step 2: Evaluate options

    Conv2D(32, (3,3)) + MaxPooling2D + Conv2D(64, (3,3)) + Flatten + Dense(5, softmax) uses Conv2D and pooling correctly for images. The Dense-only option lacks feature extraction, Conv1D is unsuitable for 2D images, and Flatten + Dense skips convolutions.
  3. Final Answer:

    Conv2D(32, (3,3)) + MaxPooling2D + Conv2D(64, (3,3)) + Flatten + Dense(5, softmax) -> Option A
  4. Quick Check:

    Use Conv2D + pooling for images [OK]
Hint: Use Conv2D layers for images, not Dense-only or Conv1D [OK]
Common Mistakes:
  • Using Dense layers only for image input
  • Applying Conv1D to 2D images
  • Skipping pooling layers for downsampling
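
The shapes flowing through option A can be traced by hand. The following sketch does that bookkeeping for a 64x64 RGB input, assuming 'valid' padding and stride 1 for the convolutions and a 2x2 pool with stride 2; `trace_shapes` and its layer descriptions are hypothetical and only model shapes, not the layers themselves.

```python
def trace_shapes(input_shape, layers):
    """Return the output shape after each layer of a simple sequential stack.

    Assumes 'valid' padding and stride 1 for conv, and stride == pool size for pooling.
    """
    height, width, channels = input_shape
    shapes = []
    for kind, arg in layers:
        if kind == "conv":        # arg = (filters, kernel_size)
            filters, kernel = arg
            height, width = height - kernel + 1, width - kernel + 1
            channels = filters
            shapes.append((height, width, channels))
        elif kind == "pool":      # arg = pool size
            height, width = height // arg, width // arg
            shapes.append((height, width, channels))
        elif kind == "flatten":   # arg is unused
            shapes.append((height * width * channels,))
        elif kind == "dense":     # arg = number of units
            shapes.append((arg,))
        else:
            raise ValueError(f"unknown layer kind: {kind!r}")
    return shapes


option_a = [
    ("conv", (32, 3)),
    ("pool", 2),
    ("conv", (64, 3)),
    ("flatten", None),
    ("dense", 5),
]

shapes = trace_shapes((64, 64, 3), option_a)
for (kind, _), shape in zip(option_a, shapes):
    print(kind, shape)
```

The final shape has one entry per class, and the pooling step roughly quarters the number of values the second convolution has to process.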