
Neural network architecture in ML Python - ML Experiment: Train & Evaluate

Experiment - Neural network architecture
Problem: You have a neural network that classifies images into 10 categories. The current model has 3 dense layers with many neurons, and it overfits the training data.
Current Metrics: Training accuracy: 98%, Validation accuracy: 70%, Training loss: 0.05, Validation loss: 1.2
Issue: Training accuracy is very high while validation accuracy is low, so the model is memorizing the training data and generalizing poorly.
Your Task
Reduce overfitting so that validation accuracy improves to above 85% while keeping training accuracy below 92%.
You cannot reduce the size of the training dataset.
You must keep the same number of layers but can change layer types or add regularization.
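The exercise doesn't show the original model's code; a hypothetical baseline with three oversized dense layers and no regularization (the layer sizes here are assumptions) might look like this:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical baseline: three wide dense layers with no regularization,
# the kind of architecture that tends to memorize the training set.
baseline = models.Sequential([
    layers.Dense(1024, activation='relu', input_shape=(28 * 28,)),
    layers.Dense(512, activation='relu'),
    layers.Dense(10, activation='softmax')
])
baseline.compile(optimizer='adam',
                 loss='sparse_categorical_crossentropy',
                 metrics=['accuracy'])
print(baseline.count_params())
```

With over 1.3 million parameters against 60,000 MNIST training images, this model has ample capacity to memorize rather than generalize.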
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Load the example dataset, scale pixels to [0, 1], and flatten 28x28 images
(X_train, y_train), (X_val, y_val) = tf.keras.datasets.mnist.load_data()
X_train, X_val = X_train / 255.0, X_val / 255.0
X_train = X_train.reshape(-1, 28*28)
X_val = X_val.reshape(-1, 28*28)

# Same number of layers, but smaller and regularized: batch normalization
# stabilizes activations, and dropout randomly disables 30% of units during training
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(28*28,)),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.BatchNormalization(),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Stop once validation loss has not improved for 5 epochs, restoring the best weights
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

history = model.fit(X_train, y_train, epochs=50, batch_size=64, validation_data=(X_val, y_val), callbacks=[early_stop])
What changed:
- Added dropout after each hidden dense layer to reduce overfitting by randomly ignoring neurons during training.
- Added batch normalization to stabilize and speed up training.
- Reduced the hidden layers to 128 and 64 neurons to shrink the model's capacity.
- Added early stopping so training halts when validation loss stops improving.
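A quick way to see why dropout helps: the layer is active only during training, zeroing a random fraction of activations (Keras uses inverted dropout, scaling the survivors by 1/(1 - rate)), and acts as an identity at inference. A minimal demonstration:

```python
import tensorflow as tf

# With rate 0.5, about half the activations are zeroed during training
# and the survivors are scaled by 1/(1 - 0.5) = 2 to preserve the mean.
drop = tf.keras.layers.Dropout(0.5)
x = tf.ones((1, 10))

train_out = drop(x, training=True)   # entries are either 0.0 or 2.0
infer_out = drop(x, training=False)  # unchanged: all ones
print(train_out.numpy())
print(infer_out.numpy())
```

At inference all neurons participate, so no scaling or masking is needed.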
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 70%, Training loss 0.05, Validation loss 1.2

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.25, Validation loss 0.35

Adding dropout and batch normalization, reducing model size, and using early stopping together reduce overfitting: validation accuracy improves because the model generalizes better to new data.
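One way to check generalization numerically is to compare the final training and validation accuracies from the history returned by model.fit. The numbers below are the illustrative "after" metrics from this exercise, hard-coded so the sketch runs standalone:

```python
# Illustrative values matching the "after" metrics above; in practice read
# them from the History object returned by model.fit (history.history).
history = {'accuracy': [0.80, 0.88, 0.90],
           'val_accuracy': [0.78, 0.85, 0.87]}

train_acc = history['accuracy'][-1]
val_acc = history['val_accuracy'][-1]
gap = train_acc - val_acc  # a small gap indicates good generalization
print(f"train: {train_acc:.0%}  val: {val_acc:.0%}  gap: {gap:.0%}")
```

A gap of a few percentage points, as here, is a healthy sign; the original model's 28-point gap was the overfitting signal.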
Bonus Experiment
Try replacing dense layers with convolutional layers to better capture image features and improve accuracy.
💡 Hint
Use Conv2D and MaxPooling2D layers before dense layers and reshape input to 28x28x1.
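Following that hint, one possible convolutional variant might look like the sketch below. The filter counts and kernel sizes are assumptions, and this bonus architecture does not keep the three-dense-layer constraint from the main task:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Reshape the flattened 784-pixel vectors back to 28x28x1 images so that
# Conv2D can exploit local spatial structure; MaxPooling2D downsamples.
cnn = models.Sequential([
    layers.Reshape((28, 28, 1), input_shape=(28 * 28,)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])
cnn.compile(optimizer='adam',
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy'])
print(cnn.output_shape)  # (None, 10)
```

Convolutions share weights across spatial positions, so this model also has far fewer parameters than wide dense layers while capturing image features better.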