TensorFlow · ~20 mins

Dense (fully connected) layers in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Dense (fully connected) layers
Problem: We want to classify handwritten digits from the MNIST dataset using a neural network with dense layers.
Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue: The model is overfitting: training accuracy is very high, but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 90% while keeping training accuracy below 95%.
You can only modify the dense layers and their configurations.
Do not change the dataset or preprocessing steps.
Keep the number of epochs to 20.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values
X_train = X_train.reshape(-1, 28*28) / 255.0
X_test = X_test.reshape(-1, 28*28) / 255.0

# Build model with dropout and smaller dense layers
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(28*28,)),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=20, batch_size=64, validation_split=0.2, verbose=0)

train_acc = history.history['accuracy'][-1] * 100
val_acc = history.history['val_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f'Training accuracy: {train_acc:.2f}%, Validation accuracy: {val_acc:.2f}%')
print(f'Training loss: {train_loss:.4f}, Validation loss: {val_loss:.4f}')
Added Dropout layers with rate 0.3 after each hidden dense layer to reduce overfitting.
Reduced the number of units in the hidden dense layers from 256 and 128 to 128 and 64, respectively.
Kept ReLU activations for the hidden layers and softmax for the output layer.
Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

After: Training accuracy: 93.5%, Validation accuracy: 91.2%, Training loss: 0.18, Validation loss: 0.28

Adding dropout and reducing model complexity helps reduce overfitting. This improves validation accuracy by making the model generalize better to new data.
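Dropout is not the only way to constrain the dense layers. A common alternative, still within the rule of only modifying the dense layers, is L2 weight regularization via the `kernel_regularizer` argument. The sketch below shows how it could be wired in; the regularization strength of `1e-4` is an assumed starting value, not a tuned result.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Sketch: L2 weight decay on each dense layer as an alternative
# (or complement) to dropout. The 1e-4 strength is an assumption
# and would normally be tuned on the validation set.
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(28*28,),
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dense(64, activation='relu',
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

L2 regularization penalizes large weights directly, whereas dropout randomly disables units during training; the two can also be combined.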
Bonus Experiment
Try adding batch normalization layers after each dense layer and observe the effect on training and validation accuracy.
💡 Hint
Batch normalization can stabilize and speed up training, sometimes improving generalization.
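One way to try the bonus experiment is sketched below: a BatchNormalization layer is inserted after each hidden dense layer, before the activation, with the dropout layers kept from the main solution. Placing normalization before the activation is one common convention, not the only valid one, and the layer sizes simply mirror the solution above.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Sketch: same architecture as the solution, with BatchNormalization
# inserted after each hidden Dense layer (before the activation).
model = models.Sequential([
    layers.Dense(128, input_shape=(28*28,)),
    layers.BatchNormalization(),   # normalize pre-activations
    layers.Activation('relu'),
    layers.Dropout(0.3),
    layers.Dense(64),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```

Training this variant for the same 20 epochs and comparing the accuracy/loss curves against the dropout-only model would show whether batch normalization helps on this setup.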