Dense (fully connected) layers in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Dense (fully connected) layers

Problem: We want to classify handwritten digits from the MNIST dataset using a neural network with dense layers.

Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.

Your Task

Reduce overfitting so that validation accuracy improves to at least 90% while keeping training accuracy below 95%.

You can only modify the dense layers and their configurations.

Do not change the dataset or preprocessing steps.

Keep the number of epochs to 20.
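The two numeric targets in the task can be written as a small check. This is only a sketch: the function name `meets_targets` and its default thresholds are illustrative, taken from the task statement above, and are not part of TensorFlow.

```python
def meets_targets(train_acc, val_acc, min_val=90.0, max_train=95.0):
    """Return True when validation accuracy is at least min_val and
    training accuracy is strictly below max_train (both in percent)."""
    return val_acc >= min_val and train_acc < max_train


# Current metrics from the problem statement: 98% train, 85% validation.
print(meets_targets(98.0, 85.0))  # False
```

After retraining, the same check can be applied to the final training and validation accuracy values.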

Solution

TensorFlow

import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Flatten each 28x28 image into a 784-element vector and scale pixels to [0, 1]
X_train = X_train.reshape(-1, 28*28) / 255.0
X_test = X_test.reshape(-1, 28*28) / 255.0

# Build model with dropout and smaller dense layers
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(28*28,)),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=20, batch_size=64, validation_split=0.2, verbose=0)

train_acc = history.history['accuracy'][-1] * 100
val_acc = history.history['val_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f'Training accuracy: {train_acc:.2f}%, Validation accuracy: {val_acc:.2f}%')
print(f'Training loss: {train_loss:.4f}, Validation loss: {val_loss:.4f}')

Added Dropout layers with rate 0.3 after each hidden dense layer to reduce overfitting.

Reduced the number of units in dense layers from 256 and 128 to 128 and 64 respectively.

Kept activation functions as ReLU for hidden layers and softmax for output.
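To get a feel for how much capacity the smaller layers remove, the trainable parameters of both architectures can be counted by hand. The sketch below is plain Python; the helper names `dense_params` and `mlp_params` are illustrative, and the "before" widths (256 and 128) are taken from the description of the changes above.

```python
def dense_params(n_in, n_out):
    # One weight per input-output pair, plus one bias per output unit.
    return n_in * n_out + n_out


def mlp_params(sizes):
    # sizes lists the layer widths from input to output, e.g. [784, 128, 64, 10].
    return sum(dense_params(a, b) for a, b in zip(sizes, sizes[1:]))


before = mlp_params([784, 256, 128, 10])
after = mlp_params([784, 128, 64, 10])
print(before, after)  # 235146 109386
```

Dropout layers add no trainable parameters, so they do not appear in the count.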

Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

After: Training accuracy: 93.5%, Validation accuracy: 91.2%, Training loss: 0.18, Validation loss: 0.28

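Adding dropout and reducing model complexity reduces overfitting: the gap between training and validation accuracy shrinks from 13 points (98% vs. 85%) to 2.3 points (93.5% vs. 91.2%), and validation loss falls from 0.45 to 0.28. The model gives up some training accuracy but generalizes better to new data.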
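For intuition about what a dropout layer does to activations, here is a minimal NumPy sketch of inverted dropout, the scheme Keras documents for its Dropout layer (dropped units are set to 0 and the rest are scaled by 1 / (1 - rate) during training only). The function `dropout` below is an illustrative stand-in, not the Keras implementation.

```python
import numpy as np


def dropout(x, rate, training, rng):
    """Inverted dropout: zero a random fraction `rate` of x during training
    and scale the survivors by 1 / (1 - rate); return x unchanged otherwise."""
    if not training:
        return x
    keep_mask = rng.random(x.shape) >= rate
    return np.where(keep_mask, x / (1.0 - rate), 0.0)


rng = np.random.default_rng(0)
activations = np.ones((4, 8))

train_out = dropout(activations, rate=0.3, training=True, rng=rng)
eval_out = dropout(activations, rate=0.3, training=False, rng=rng)

print(train_out)  # a mix of zeros and values scaled up by 1 / 0.7
print(eval_out)   # identical to the input
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit, which is one reason training accuracy drops while validation accuracy rises.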

Bonus Experiment

Try adding batch normalization layers after each dense layer and observe the effect on training and validation accuracy.

💡 Hint

Batch normalization can stabilize and speed up training, sometimes improving generalization.
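One possible starting point for this experiment is sketched below. It assumes batch normalization is placed after each hidden dense layer (not after the softmax output) and keeps the dropout layers from the solution; the helper name `build_bn_model` is hypothetical, and other placements (for example, before the activation) are equally valid things to try.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_bn_model():
    # Same layer widths as the solution, with BatchNormalization added
    # after each hidden Dense layer. The placement is an assumption.
    return models.Sequential([
        tf.keras.Input(shape=(28 * 28,)),
        layers.Dense(128, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(64, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(10, activation='softmax'),
    ])


model = build_bn_model()
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```

Training this model with the same `fit` call as the solution makes the comparison fair, since only the layer stack differs.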

Practice

(1/5)

1. What does a Dense (fully connected) layer do in a neural network?

easy

A. Does not connect any neurons, only passes data through

B. Connects every input neuron to every output neuron with weights

C. Connects neurons randomly without weights

D. Only connects input neurons to output neurons with zero weights

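Solution

Step 1: Understand the role of Dense layers. A Dense layer computes, for each output unit, a weighted sum of all its inputs, so every input is connected to every output by a learnable weight.

Step 2: Compare options with Dense layer behavior. Options A, C, and D describe no connections, random unweighted connections, or zero weights; only option B describes full weighted connectivity.

Final Answer: B. Connects every input neuron to every output neuron with weights

Quick Check: "Dense" means fully connected.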
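The full connectivity of a dense layer shows up directly in the shape of its weight matrix. The NumPy sketch below is an illustration under simple assumptions (4 inputs, 3 outputs, ReLU activation, random weights); `dense_forward` is a made-up helper, not a TensorFlow function.

```python
import numpy as np


def dense_forward(x, W, b):
    # Each output unit is a weighted sum over all inputs plus a bias, then ReLU.
    return np.maximum(x @ W + b, 0.0)


rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4))  # batch of 2 samples, 4 input features
W = rng.normal(size=(4, 3))  # one weight for every (input, output) pair
b = np.zeros(3)              # one bias per output unit

y = dense_forward(x, W, b)
print(W.shape, y.shape)  # (4, 3) (2, 3)
```

With 4 inputs and 3 outputs there are 4 × 3 = 12 weights, one for each input-output connection.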
