Architecture search concepts in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Architecture search concepts
Problem: You have a convolutional neural network (CNN) for image classification. The current model has 3 convolutional layers and 2 dense layers. It achieves 90% training accuracy but only 75% validation accuracy on a small image dataset.
Current Metrics: Training accuracy: 90%, Validation accuracy: 75%, Training loss: 0.3, Validation loss: 0.8
Issue: The model is overfitting. It learns the training data well but does not generalize to new images.
Your Task
Improve validation accuracy to at least 85% while keeping training accuracy below 92% to reduce overfitting.
You can only change the model architecture (number of layers, layer sizes, activation functions).
Do not change the dataset or training procedure (optimizer, epochs, batch size).
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Load dataset (example with CIFAR-10 for demonstration)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Define improved CNN architecture
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),

    layers.Conv2D(64, (3,3), activation='relu'),
    layers.BatchNormalization(),
    layers.MaxPooling2D((2,2)),
    layers.Dropout(0.25),

    layers.Conv2D(64, (3,3), activation='relu'),
    layers.BatchNormalization(),
    layers.Flatten(),
    layers.Dropout(0.5),

    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(x_train, y_train, epochs=20, batch_size=64, validation_split=0.2, verbose=0)

# Evaluate on test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)

# Print final metrics
print(f"Training accuracy: {history.history['accuracy'][-1]*100:.2f}%")
print(f"Validation accuracy: {history.history['val_accuracy'][-1]*100:.2f}%")
print(f"Test accuracy: {test_acc*100:.2f}%")
Added batch normalization after convolutional layers to stabilize learning.
Added dropout layers after pooling and before dense layers to reduce overfitting.
Kept three convolutional layers but added max pooling to reduce spatial size.
Used ReLU activation for non-linearity.
Kept the dense layer smaller (64 units) to reduce model complexity.
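To make the point about model complexity concrete, the sketch below counts the parameters of the solution architecture by hand. It assumes the Keras defaults used in the code above ('valid' padding, stride-1 convolutions, 2x2 pooling) and counts four values per channel for batch normalization (two trainable, two moving statistics). The helper names are illustrative and not part of any library.

```python
# Hand count of parameters for the solution architecture (32x32x3 input).
# Assumes Keras defaults: 'valid' padding, stride-1 convs, 2x2 pooling.

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def bn_params(channels):
    """gamma, beta, moving mean, moving variance per channel."""
    return 4 * channels

def dense_params(in_units, out_units):
    """Weight matrix plus one bias per output unit."""
    return in_units * out_units + out_units

size, channels, total = 32, 3, 0
# (filters, followed_by_pooling) for the three conv blocks
for filters, pooled in [(32, True), (64, True), (64, False)]:
    total += conv_params(3, channels, filters) + bn_params(filters)
    size -= 2              # a 3x3 'valid' conv trims one pixel per side
    if pooled:
        size //= 2         # 2x2 max pooling halves each dimension
    channels = filters

flat = size * size * channels          # units after Flatten
total += dense_params(flat, 64) + dense_params(64, 10)

print(f"Flattened features: {flat}")    # 1024
print(f"Total parameters:   {total}")   # 123210
```

Under these assumptions the first dense layer accounts for 65,600 of the 123,210 parameters, which is why its width has a large effect on model size.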
Results Interpretation

Before: Training accuracy 90%, Validation accuracy 75%, Validation loss 0.8

After: Training accuracy 90%, Validation accuracy 86%, Validation loss 0.5

Dropout and batch normalization act as regularizers here: the gap between training and validation accuracy shrinks from 15 points to 4, while training accuracy stays at 90%.
Bonus Experiment
Try using a smaller or larger number of convolutional layers and compare validation accuracy.
💡 Hint
Fewer layers may reduce overfitting but hurt learning capacity; more layers may increase overfitting if not regularized.
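When varying depth, the input size limits how many blocks fit. The sketch below assumes each block is a 3x3 'valid' convolution followed by 2x2 max pooling, as in the first two blocks of the solution, and traces the feature-map side length after each block. `trace_spatial_size` is a hypothetical helper, not a Keras function.

```python
def trace_spatial_size(input_size, num_blocks):
    """Side length after each conv(3x3, valid) + maxpool(2x2) block.

    Stops early once another block would no longer fit.
    """
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        if size < 3:               # a 3x3 kernel no longer fits
            break
        size = (size - 2) // 2     # conv trims 2 pixels, pooling halves
        if size < 1:               # pooling left nothing to work with
            break
        sizes.append(size)
    return sizes

for depth in (1, 2, 3, 4):
    print(depth, trace_spatial_size(32, depth))
```

With a 32x32 input, only three such blocks fit (side lengths 15, 6, 2), so deeper variants would need `padding='same'` or fewer pooling layers.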

Practice

1. What is the main goal of architecture search in computer vision models?
easy
A. To collect more training data
B. To manually tune model parameters
C. To automatically find the best model design
D. To reduce image resolution

Solution

  1. Step 1: Understand architecture search purpose

    Architecture search aims to find the best model design automatically without manual trial and error.
  2. Step 2: Compare options

    Options A, B, and D do not describe the goal of architecture search. Only "To automatically find the best model design" matches it.
  3. Final Answer:

    To automatically find the best model design -> Option C
  4. Quick Check:

    Architecture search = automatic best design [OK]
Hint: Architecture search = automatic model design finder [OK]
Common Mistakes:
  • Confusing architecture search with data collection
  • Thinking it manually tunes parameters
  • Mixing it with image preprocessing
2. Which of the following is a correct way to describe a search space in architecture search?
easy
A. A set of possible model designs to explore
B. The training dataset used for the model
C. The final accuracy metric after training
D. The hardware used to run the model

Solution

  1. Step 1: Define search space

    Search space is the collection of all possible model designs or configurations that the search will try.
  2. Step 2: Eliminate incorrect options

    Options B, C, and D relate to data, metrics, or hardware, not the search space itself.
  3. Final Answer:

    A set of possible model designs to explore -> Option A
  4. Quick Check:

    Search space = possible designs [OK]
Hint: Search space = all model options to try [OK]
Common Mistakes:
  • Confusing search space with dataset
  • Thinking search space is a metric
  • Mixing search space with hardware details
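As a rough illustration of a search space, the sketch below represents it as a dictionary of design choices and enumerates every combination. The choices and their values are made up for the example.

```python
from itertools import product

# Hypothetical search space: each key is a design choice, each list its options.
search_space = {
    "num_conv_layers": [2, 3, 4],
    "filters": [32, 64],
    "dense_units": [64, 128],
    "activation": ["relu", "tanh"],
}

def enumerate_designs(space):
    """Yield every combination of choices as a dict (one candidate design)."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

designs = list(enumerate_designs(search_space))
print(len(designs))   # 3 * 2 * 2 * 2 = 24 candidate designs
print(designs[0])
```

Each dictionary yielded here is one point in the search space; a search strategy decides which of them to train.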
3. Consider this pseudocode for architecture search:
best_model = None
best_accuracy = 0
for model in search_space:
    accuracy = train_and_evaluate(model)
    if accuracy > best_accuracy:
        best_model = model
        best_accuracy = accuracy
print(best_accuracy)
What does this code output?
medium
A. The list of all models tested
B. The accuracy of the best model found
C. The training loss of the last model
D. The total number of models in search_space

Solution

  1. Step 1: Analyze the loop

    The loop trains and evaluates each model, updating best_accuracy if current accuracy is higher.
  2. Step 2: Understand the print statement

    After checking all models, it prints the highest accuracy found among them.
  3. Final Answer:

    The accuracy of the best model found -> Option B
  4. Quick Check:

    Prints best accuracy = highest accuracy [OK]
Hint: Code prints highest accuracy found during search [OK]
Common Mistakes:
  • Thinking it prints number of models
  • Confusing accuracy with loss
  • Assuming it prints all models
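A runnable version of this loop might look like the sketch below. The search space and the `train_and_evaluate` stub are placeholders: the stub returns made-up scores instead of training anything, so only the search logic is demonstrated.

```python
# Hypothetical candidate designs: (number of conv layers, filters per layer).
search_space = [(2, 32), (3, 32), (3, 64), (4, 64)]

def train_and_evaluate(model):
    """Stand-in scorer that returns a made-up validation accuracy.

    A real version would build, train, and validate the model.
    """
    placeholder_scores = {
        (2, 32): 0.71,
        (3, 32): 0.78,
        (3, 64): 0.83,
        (4, 64): 0.80,
    }
    return placeholder_scores[model]

best_model = None
best_accuracy = 0.0
for model in search_space:
    accuracy = train_and_evaluate(model)
    if accuracy > best_accuracy:   # keep the highest accuracy seen so far
        best_model = model
        best_accuracy = accuracy

print(best_model, best_accuracy)
```

With these placeholder scores the loop keeps the (3, 64) design, because its score is the largest in the table.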
4. The following code snippet is intended to find the best model architecture, but it has a bug:
best_accuracy = 0
for model in search_space:
    accuracy = train_and_evaluate(model)
    if accuracy < best_accuracy:
        best_model = model
        best_accuracy = accuracy
print(best_accuracy)
What is the bug?
medium
A. best_accuracy should start at 1 instead of 0
B. train_and_evaluate should return loss, not accuracy
C. The print statement should print best_model, not best_accuracy
D. The comparison operator should be > instead of <

Solution

  1. Step 1: Understand the goal

    The goal is to find the model with the highest accuracy, so we want to update when accuracy is greater than best_accuracy.
  2. Step 2: Identify the bug

    The code uses accuracy < best_accuracy, which updates for worse accuracy, so it should be accuracy > best_accuracy.
  3. Final Answer:

    The comparison operator should be > instead of < -> Option D
  4. Quick Check:

    Use > to find best accuracy [OK]
Hint: Best accuracy means use >, not < in comparison [OK]
Common Mistakes:
  • Starting best_accuracy at wrong value
  • Printing wrong variable
  • Confusing accuracy with loss
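One way to avoid this kind of comparison bug is to let Python's built-in `max` do the selection. The sketch below assumes a hypothetical `train_and_evaluate` stub with placeholder scores; it is an alternative formulation, not the code from the question.

```python
def train_and_evaluate(model):
    """Hypothetical stand-in returning a placeholder accuracy per design."""
    return {"small": 0.72, "medium": 0.81, "large": 0.79}[model]

search_space = ["small", "medium", "large"]

# Score each design once, then pick the one with the highest score.
scores = {model: train_and_evaluate(model) for model in search_space}
best_model = max(scores, key=scores.get)
best_accuracy = scores[best_model]

print(best_model, best_accuracy)
```

Because there is no hand-written `<` or `>`, the direction of the comparison cannot be reversed by accident.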
5. You want to speed up architecture search by reducing the search space size. Which strategy is best?
hard
A. Limit model depth and number of layers to a smaller range
B. Increase the number of training epochs for each model
C. Use a slower but more accurate optimizer
D. Train all models on the full dataset without sampling

Solution

  1. Step 1: Understand search space impact

    Reducing search space size means limiting the number of possible model designs to try.
  2. Step 2: Evaluate options

    Option A limits depth and layer count to a smaller range, which shrinks the search space. Options B, C, and D raise or keep the training cost per model without reducing the number of candidates, so they slow the search.
  3. Final Answer:

    Limit model depth and number of layers to a smaller range -> Option A
  4. Quick Check:

    Smaller search space = fewer model options [OK]
Hint: Shrink search space by limiting model complexity [OK]
Common Mistakes:
  • Thinking more training epochs speed up search
  • Choosing slower optimizers to improve speed
  • Assuming that training on the full dataset speeds up the search
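The effect of limiting ranges can be estimated by counting designs. In the sketch below, the ranges and the per-model training time are assumptions chosen for illustration, and `search_space_size` is a hypothetical helper.

```python
from math import prod

def search_space_size(space):
    """Number of distinct designs = product of option counts per choice."""
    return prod(len(options) for options in space.values())

# Hypothetical ranges before and after limiting depth and width.
full_space = {
    "num_conv_layers": list(range(1, 11)),   # 1 to 10 layers
    "filters": [16, 32, 64, 128],
    "dense_units": [32, 64, 128, 256],
}
limited_space = {
    "num_conv_layers": [2, 3, 4],
    "filters": [32, 64],
    "dense_units": [64, 128],
}

minutes_per_model = 5  # assumed training cost per candidate

for name, space in [("full", full_space), ("limited", limited_space)]:
    n = search_space_size(space)
    print(f"{name}: {n} designs, about {n * minutes_per_model} minutes")
```

Under these assumptions an exhaustive search drops from 160 candidates to 12, and the total training time falls in proportion.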