Dense (fully connected) layers in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Dense (fully connected) layers
Problem: We want to classify handwritten digits from the MNIST dataset using a neural network with dense layers.
Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 90% while keeping training accuracy below 95%.
You can only modify the dense layers and their configurations (adding regularization such as Dropout between them is allowed).
Do not change the dataset or preprocessing steps.
Keep the number of epochs to 20.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Flatten each 28x28 image to a 784-element vector and scale pixel values to [0, 1]
X_train = X_train.reshape(-1, 28*28) / 255.0
X_test = X_test.reshape(-1, 28*28) / 255.0

# Build model with dropout and smaller dense layers
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(28*28,)),
    layers.Dropout(0.3),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=20, batch_size=64, validation_split=0.2, verbose=0)

train_acc = history.history['accuracy'][-1] * 100
val_acc = history.history['val_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f'Training accuracy: {train_acc:.2f}%, Validation accuracy: {val_acc:.2f}%')
print(f'Training loss: {train_loss:.4f}, Validation loss: {val_loss:.4f}')
Added Dropout layers with rate 0.3 after each hidden dense layer (not after the output layer) to reduce overfitting.
Reduced the number of units in dense layers from 256 and 128 to 128 and 64 respectively.
Kept activation functions as ReLU for hidden layers and softmax for output.
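To make the effect of shrinking the layers concrete, the following minimal sketch counts trainable parameters for the two layer stacks described above (256 and 128 hidden units versus 128 and 64). The helper names `dense_params` and `mlp_params` are hypothetical and not part of TensorFlow; the count assumes each Dense layer has one weight per input-output pair plus one bias per unit, and that Dropout adds no parameters.

```python
def dense_params(n_inputs, units):
    """Trainable parameters in one Dense layer: weights plus one bias per unit."""
    return n_inputs * units + units


def mlp_params(n_inputs, layer_units):
    """Total trainable parameters for a stack of Dense layers (Dropout adds none)."""
    total = 0
    for units in layer_units:
        total += dense_params(n_inputs, units)
        n_inputs = units  # this layer's output feeds the next layer
    return total


original = mlp_params(784, [256, 128, 10])  # layer sizes before the change
reduced = mlp_params(784, [128, 64, 10])    # layer sizes used in the solution
print(original, reduced)  # 235146 109386
```

Under these assumptions the reduced model has fewer than half the parameters of the original, which leaves it less capacity to memorize the training set.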
Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

After: Training accuracy: 93.5%, Validation accuracy: 91.2%, Training loss: 0.18, Validation loss: 0.28

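Adding dropout and reducing model complexity narrows the gap between training and validation accuracy from 13 points (98% vs. 85%) to 2.3 points (93.5% vs. 91.2%), and validation loss falls from 0.45 to 0.28. The smaller, regularized model memorizes less of the training set and generalizes better to new data.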
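For intuition about what a Dropout layer does, here is a minimal pure-Python sketch of inverted dropout, the scheme Keras uses: during training each activation is zeroed with probability `rate` and the survivors are scaled by 1 / (1 - rate); at inference the input passes through unchanged. The function name `dropout` and its `rng` parameter are illustrative assumptions, not the Keras implementation.

```python
import random


def dropout(values, rate, training, rng=random):
    """Inverted dropout on a list of activations."""
    if not training or rate == 0.0:
        return list(values)  # inference: activations pass through unchanged
    keep_prob = 1.0 - rate
    # Zero each activation with probability `rate`; scale survivors by
    # 1 / keep_prob so the expected value matches inference.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # seeded only to make the demo repeatable
activations = [1.0] * 10
print(dropout(activations, rate=0.3, training=True, rng=rng))
print(dropout(activations, rate=0.3, training=False))
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit, which is one common explanation for why dropout reduces overfitting.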
Bonus Experiment
Try adding batch normalization layers after each hidden dense layer and observe the effect on training and validation accuracy.
💡 Hint
Batch normalization can stabilize and speed up training, sometimes improving generalization.
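One possible starting point for the bonus experiment is sketched below. It keeps the layer sizes and dropout rate from the solution and adds a `BatchNormalization` layer after each hidden Dense layer. The function name `build_model_with_batchnorm` is hypothetical, and placing normalization after the ReLU activation (rather than before it) and keeping Dropout alongside it are assumptions; other arrangements are equally worth trying.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_model_with_batchnorm():
    """Solution architecture with BatchNormalization after each hidden Dense layer."""
    return models.Sequential([
        tf.keras.Input(shape=(28 * 28,)),
        layers.Dense(128, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(64, activation='relu'),
        layers.BatchNormalization(),
        layers.Dropout(0.3),
        layers.Dense(10, activation='softmax'),
    ])


model = build_model_with_batchnorm()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()
```

Training this model with the same `model.fit` call as the solution allows a like-for-like comparison of the final training and validation metrics.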

Practice

1. What does a Dense (fully connected) layer do in a neural network?
easy
A. Does not connect any neurons, only passes data through
B. Connects every input neuron to every output neuron with weights
C. Connects neurons randomly without weights
D. Only connects input neurons to output neurons with zero weights

Solution

  1. Step 1: Understand the role of Dense layers

    A Dense layer connects each input neuron to every output neuron using weights and biases to learn patterns.
  2. Step 2: Compare options with Dense layer behavior

    Only option B, "Connects every input neuron to every output neuron with weights", correctly describes this full weighted connection; the other options are incorrect.
  3. Final Answer:

    Connects every input neuron to every output neuron with weights -> Option B
  4. Quick Check:

    Dense layer = full weighted connections [OK]
Hint: Dense means all inputs connect to all outputs [OK]
Common Mistakes:
  • Thinking Dense layers connect neurons randomly
  • Believing Dense layers have zero weights
  • Assuming Dense layers do not connect neurons
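The full connection described in question 1 can be sketched in a few lines of plain Python. Each output is the weighted sum of every input plus a bias; the activation is left out to keep the example short. The function name `dense_forward` and the sample weights are illustrative assumptions, not values from a trained model.

```python
def dense_forward(inputs, weights, biases):
    """Compute one Dense layer with no activation.

    inputs:  list of n_in values
    weights: n_in rows of n_out values; weights[i][j] links input i to output j
    biases:  list of n_out values
    """
    n_out = len(biases)
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(n_out)
    ]


x = [1.0, 2.0, 3.0]
W = [[0.5, -1.0],
     [0.0, 2.0],
     [1.0, 1.0]]
b = [0.5, -0.5]
print(dense_forward(x, W, b))  # [4.0, 5.5]
```

Every entry of `W` is used, which is what "fully connected" means: three inputs and two outputs give 3 x 2 = 6 weights plus 2 biases.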
2. Which of the following is the correct way to add a Dense layer with 10 neurons and ReLU activation in TensorFlow?
easy
A. tf.keras.layers.Dense(10, activation='relu')
B. tf.keras.DenseLayer(10, activation='relu')
C. tf.layers.Dense(activation='relu', units=10)
D. tf.keras.layers.Dense(activation='relu', neurons=10)

Solution

  1. Step 1: Recall TensorFlow Dense layer syntax

    The correct syntax is tf.keras.layers.Dense(units, activation='function').
  2. Step 2: Match options to correct syntax

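    Option A, tf.keras.layers.Dense(10, activation='relu'), matches this exactly. Option B uses a class that does not exist (tf.keras.DenseLayer), option C uses the TensorFlow 1.x tf.layers namespace, which was removed in TensorFlow 2, and option D passes neurons, which is not a Dense argument.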
  3. Final Answer:

    tf.keras.layers.Dense(10, activation='relu') -> Option A
  4. Quick Check:

    Correct Dense syntax = tf.keras.layers.Dense(10, activation='relu') [OK]
Hint: Use tf.keras.layers.Dense(units, activation) [OK]
Common Mistakes:
  • Using wrong class name like DenseLayer
  • Passing neurons instead of units as the argument name
  • Using the removed TensorFlow 1.x tf.layers namespace instead of tf.keras.layers
3. What will be the output shape of this model?
model = tf.keras.Sequential([
  tf.keras.layers.Dense(5, input_shape=(3,)),
  tf.keras.layers.Dense(2)
])
output = model(tf.constant([[1.0, 2.0, 3.0]]))
print(output.shape)
medium
A. (3, 2)
B. (1, 5)
C. (1, 2)
D. (3, 5)

Solution

  1. Step 1: Analyze model layers and input shape

    Input shape is (3,), first Dense outputs 5 units, second Dense outputs 2 units.
  2. Step 2: Determine output shape after second Dense

    Batch size is 1 (one input), final output shape is (1, 2).
  3. Final Answer:

    (1, 2) -> Option C
  4. Quick Check:

    Output shape = (batch_size, last layer units) = (1, 2) [OK]
Hint: Output shape = (batch, last Dense units) [OK]
Common Mistakes:
  • Confusing input shape with output shape
  • Mixing up units of first and second Dense layers
  • Ignoring batch dimension
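The shape reasoning in question 3 can be checked without TensorFlow. The sketch below assumes the standard Dense behavior of replacing the last axis with the number of units while leaving the leading (batch) axes alone; `dense_output_shape` is a hypothetical helper, not a TensorFlow function.

```python
def dense_output_shape(input_shape, units):
    """A Dense layer replaces the last axis with `units` and keeps the leading axes."""
    return tuple(input_shape[:-1]) + (units,)


shape = (1, 3)        # one sample with three features, as in the question
for units in (5, 2):  # Dense(5) followed by Dense(2)
    shape = dense_output_shape(shape, units)
print(shape)  # (1, 2)
```

Only the batch size and the last layer's unit count survive to the output; the intermediate width of 5 does not appear in the final shape.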
4. Identify the error in this code snippet (assume x_train and y_train are already defined):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, input_shape=(4,)))
model.add(tf.keras.layers.Dense(5, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(x_train, y_train, epochs=5)
medium
A. Loss function 'mse' is invalid
B. Input shape should be specified in the first layer only
C. Missing activation in the first Dense layer
D. No error, code is correct

Solution

  1. Step 1: Check Dense layer usage and input shape

    Input shape is correctly specified in the first Dense layer only.
  2. Step 2: Verify loss function and activation usage

    Loss 'mse' is valid for regression; the ReLU activation in the second layer is fine; activation is optional, and a Dense layer without one applies no activation (linear output).
  3. Final Answer:

    No error, code is correct -> Option D
  4. Quick Check:

    Code syntax and usage are correct [OK]
Hint: Input shape only in first layer; 'mse' is valid loss [OK]
Common Mistakes:
  • Thinking activation is mandatory in every Dense layer
  • Specifying input_shape in multiple layers
  • Believing 'mse' is invalid loss
5. You want to build a model to classify images into 3 categories. Which Dense layer setup is best for the output layer?
hard
A. Dense(3, activation='softmax')
B. Dense(1, activation='sigmoid')
C. Dense(3, activation='relu')
D. Dense(3)

Solution

  1. Step 1: Understand classification output needs

    For 3 categories, output layer should have 3 units, one per class.
  2. Step 2: Choose activation for multi-class classification

    Softmax activation outputs probabilities summing to 1, ideal for multi-class.
  3. Step 3: Evaluate options

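    Dense(3, activation='softmax') uses 3 units with softmax, which fits 3-class classification. Dense(1, activation='sigmoid') is for binary classification, ReLU outputs are not probabilities, and Dense(3) without an activation emits raw logits, which is workable only if the loss is configured with from_logits=True.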
  4. Final Answer:

    Dense(3, activation='softmax') -> Option A
  5. Quick Check:

    Multi-class output = units=classes + softmax [OK]
Hint: Use softmax with units = number of classes [OK]
Common Mistakes:
  • Using sigmoid for multi-class output
  • Omitting activation in output layer without setting from_logits=True on the loss
  • Using relu activation for output
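To see why softmax suits a multi-class output and ReLU does not, the following sketch applies both to the same three raw scores. The `logits` values are hypothetical outputs of a Dense(3) layer, and the `softmax` and `relu` functions are plain-Python stand-ins for the Keras activations of the same name.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def relu(values):
    """ReLU clips negatives to zero but does not normalize."""
    return [max(0.0, v) for v in values]


logits = [2.0, -1.0, 0.5]  # hypothetical raw outputs of Dense(3)
probs = softmax(logits)
print(probs, sum(probs))   # three positive values that sum to 1 (up to rounding)
print(relu(logits))        # [2.0, 0.0, 0.5]
```

The softmax outputs can be read as class probabilities, while the ReLU outputs have no fixed sum and cannot be interpreted that way.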