Sequential model API in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Sequential model API
Problem: You are training a neural network to classify images into 10 categories using the Sequential model API in TensorFlow.
Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85
Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 92%.
You must use the Sequential model API.
You can only change model architecture and training hyperparameters.
Do not change the dataset or preprocessing.
Solution
import tensorflow as tf
from tensorflow.keras import layers, models

# Load dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

# Normalize pixel values
x_train, x_test = x_train / 255.0, x_test / 255.0

# Build Sequential model with dropout
model = models.Sequential([
    layers.Flatten(input_shape=(32, 32, 3)),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax')
])

# Compile model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train model with validation split and early stopping
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(x_train, y_train, epochs=30, batch_size=64, validation_split=0.2, callbacks=[early_stop])

# Evaluate on test data
test_loss, test_acc = model.evaluate(x_test, y_test)

print(f'Test accuracy: {test_acc:.2f}', f'Test loss: {test_loss:.2f}')
Added Dropout layers after dense layers to reduce overfitting.
Reduced the number of neurons in the second dense layer from 128 to 64.
Added EarlyStopping callback to stop training when validation loss stops improving.
Used a batch size of 64 and a validation split of 0.2, so that 20% of the training data is held out for monitoring validation loss during training.
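To make the early stopping rule concrete, here is a minimal pure-Python sketch of the patience logic. It is a simplified illustration, not the Keras implementation: the function name `early_stopping_epoch` and the sample losses are hypothetical, and it assumes any decrease in validation loss counts as an improvement.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a list of per-epoch validation losses.

    Simplified illustration of patience-based early stopping:
    training stops once the loss has failed to improve for `patience`
    consecutive epochs, and the best epoch is the one whose weights
    would be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses (not measured results).
losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.61, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)  # 5 2
```

With patience=3, training stops three epochs after the best validation loss, and restore_best_weights=True corresponds to rolling back to the weights from `best_epoch`.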
Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.85

After: Training accuracy: 90%, Validation accuracy: 86%, Training loss: 0.25, Validation loss: 0.40

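Dropout and a smaller second dense layer limit the model's capacity to memorize the training set, and early stopping halts training once validation loss stops improving. Together they narrow the gap between training and validation accuracy from 23 percentage points (98% vs. 75%) to 4 (90% vs. 86%), which indicates that the model generalizes better to new data.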
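One way to quantify overfitting is the gap between training and validation accuracy. The sketch below computes it from a dictionary shaped like `history.history`, which Keras fills with the keys `accuracy` and `val_accuracy` when the model is compiled with metrics=['accuracy']. The helper name `accuracy_gap` is hypothetical, and the dictionaries simply hold the before and after figures quoted above.

```python
def accuracy_gap(history_dict):
    """Training accuracy minus validation accuracy at the last recorded epoch."""
    return history_dict["accuracy"][-1] - history_dict["val_accuracy"][-1]


# Final-epoch figures quoted in the results above, in the shape of history.history.
before = {"accuracy": [0.98], "val_accuracy": [0.75]}
after = {"accuracy": [0.90], "val_accuracy": [0.86]}

print(f"Gap before: {accuracy_gap(before):.2f}")  # Gap before: 0.23
print(f"Gap after:  {accuracy_gap(after):.2f}")   # Gap after:  0.04
```

In a real run you would pass `history.history` from model.fit() instead of the hand-written dictionaries. Note that the last epoch is not necessarily the best one when early stopping restores earlier weights.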
Bonus Experiment
Try using batch normalization layers in the Sequential model to improve training stability and possibly increase validation accuracy.
💡 Hint
Insert batch normalization layers after dense layers and before activation functions.
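A possible starting point for this experiment is sketched below. It follows the hint's ordering (Dense, then BatchNormalization, then the activation) and keeps the dropout rates from the solution. Two choices are assumptions rather than part of the original exercise: the input is declared with `layers.Input`, and the dense layers use use_bias=False because batch normalization adds its own learned offset. Whether this raises validation accuracy is something to measure, not a given.

```python
from tensorflow.keras import layers, models

# Sketch only: Dense -> BatchNormalization -> Activation, as the hint suggests.
model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Flatten(),
    layers.Dense(128, use_bias=False),   # bias omitted (assumption): BN has its own offset
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dropout(0.5),
    layers.Dense(64, use_bias=False),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.Dropout(0.3),
    layers.Dense(10, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
```

The model can then be trained with the same fit() call and EarlyStopping callback as in the solution, so the two runs are directly comparable.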

Practice

1. What is the main purpose of the Sequential model API in TensorFlow?
easy
A. To visualize the training process of a model
B. To create complex models with multiple inputs and outputs
C. To perform data preprocessing before training
D. To build a model by stacking layers in a linear order

Solution

  1. Step 1: Understand the Sequential API purpose

    The Sequential API is designed to build models by stacking layers one after another in a simple linear fashion.
  2. Step 2: Compare options with the API's function

    Options A, B, and C describe other functionality (visualization, multi-input/multi-output models, and data preprocessing) that is not the Sequential API's main purpose.
  3. Final Answer:

    To build a model by stacking layers in a linear order -> Option D
  4. Quick Check:

    Sequential API = linear stacking of layers [OK]
Hint: Sequential means layers stacked one after another [OK]
Common Mistakes:
  • Confusing Sequential with Functional API for complex models
  • Thinking Sequential handles data preprocessing
  • Assuming Sequential is for visualization
2. Which of the following is the correct way to create a Sequential model with one dense layer of 10 units in TensorFlow?
easy
A. model = Sequential(Dense(10))
B. model = Sequential([Dense(10)])
C. model = Sequential().add(Dense(10))
D. model = Sequential.add(Dense(10))

Solution

  1. Step 1: Recall correct Sequential model creation syntax

    The Sequential model can be created by passing a list of layers inside the constructor, e.g., Sequential([Dense(10)]).
  2. Step 2: Check each option's syntax validity

    model = Sequential([Dense(10)]) uses the correct list syntax. model = Sequential(Dense(10)) passes a bare layer instead of a list of layers. model = Sequential().add(Dense(10)) does build a model, but add() returns None, so the variable model ends up bound to None rather than to the model. model = Sequential.add(Dense(10)) incorrectly calls add() on the class, not an instance.
  3. Final Answer:

    model = Sequential([Dense(10)]) -> Option B
  4. Quick Check:

    Sequential needs list of layers in constructor [OK]
Hint: Pass layers as a list inside Sequential() [OK]
Common Mistakes:
  • Omitting brackets around layers list
  • Calling add() on class instead of instance
  • Assigning the result of a chained add() call, which is None
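The difference between the options can be checked directly. The short sketch below builds the same one-layer model in the two supported ways and shows what add() returns; the variable names are illustrative.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Form 1: pass a list of layers to the constructor (Option B).
model_a = Sequential([Dense(10)])

# Form 2: create an empty model, then add layers one at a time.
model_b = Sequential()
result = model_b.add(Dense(10))

print(len(model_a.layers), len(model_b.layers))  # 1 1
print(result)  # None: add() modifies the model in place
```

Because add() returns None, writing model = Sequential().add(Dense(10)) discards the model and leaves None in the variable.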
3. What will be the output shape of the model after running this code?
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(5, input_shape=(10,)),
    Dense(3)
])
print(model.output_shape)
medium
A. (None, 5)
B. (10, 3)
C. (None, 3)
D. (5, 3)

Solution

  1. Step 1: Understand input and output shapes in Sequential

    The input shape is (10,), so the first Dense layer outputs (None, 5). The second Dense layer outputs (None, 3) because it has 3 units.
  2. Step 2: Identify final output shape

    The model's output shape is the output of the last layer, which is (None, 3). None means batch size is flexible.
  3. Final Answer:

    (None, 3) -> Option C
  4. Quick Check:

    Last Dense units = output shape [OK]
Hint: Output shape matches last layer units with batch None [OK]
Common Mistakes:
  • Confusing input shape with output shape
  • Using batch size instead of None
  • Mixing layer output shapes
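The shape reasoning in this question can be reproduced without TensorFlow. The sketch below is plain Python with hypothetical helper names: a Dense layer replaces the last dimension with its number of units and leaves the batch dimension (None) untouched. It also counts weights and biases per layer.

```python
def dense_output_shape(input_shape, units):
    """A Dense layer keeps all leading dimensions and sets the last one to `units`."""
    return tuple(input_shape[:-1]) + (units,)


def dense_param_count(input_dim, units):
    """One weight per input-unit pair plus one bias per unit."""
    return input_dim * units + units


shape = (None, 10)   # None is the flexible batch dimension
total_params = 0
for units in (5, 3):  # Dense(5) followed by Dense(3)
    total_params += dense_param_count(shape[-1], units)
    shape = dense_output_shape(shape, units)

print(shape)         # (None, 3)
print(total_params)  # 73
```

The 73 parameters are 10*5 + 5 = 55 for the first layer and 5*3 + 3 = 18 for the second.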
4. Identify the error in this Sequential model code:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(10, input_shape=(5,)))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
model.fit(x_train, y_train, epochs=5)
medium
A. x_train and y_train are not defined
B. Sequential model cannot use add() method
C. Loss function 'mse' is invalid
D. Missing import for optimizer

Solution

  1. Step 1: Check code for missing definitions

    The code uses x_train and y_train in model.fit(), but they are not defined anywhere, so Python raises a NameError when that line runs.
  2. Step 2: Verify other parts

    Optimizer 'adam' and loss 'mse' are valid strings. add() method is valid for Sequential instances. Imports are sufficient.
  3. Final Answer:

    x_train and y_train are not defined -> Option A
  4. Quick Check:

    Undefined training data causes error [OK]
Hint: Check if training data variables are defined before fit() [OK]
Common Mistakes:
  • Assuming loss 'mse' is invalid
  • Thinking add() method is not allowed
  • Ignoring missing data variables
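For comparison, here is one way the snippet could be made runnable. The training arrays are random placeholder data invented for this sketch (32 samples with 5 features and one regression target); they only serve to show that the model trains once x_train and y_train exist. The input is declared with an `Input` layer, which is an alternative to the input_shape argument used in the question.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# Placeholder data (assumption): 32 samples, 5 features, 1 target value each.
rng = np.random.default_rng(0)
x_train = rng.random((32, 5)).astype("float32")
y_train = rng.random((32, 1)).astype("float32")

model = Sequential([
    Input(shape=(5,)),
    Dense(10),
    Dense(1),
])
model.compile(optimizer='adam', loss='mse')

# verbose=0 keeps the output short; the loss per epoch is stored in history.history.
history = model.fit(x_train, y_train, epochs=5, verbose=0)
print(len(history.history['loss']))  # 5
```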
5. You want to build a Sequential model for a classification task with 3 classes. Which of the following is the best final layer and loss combination?
hard
A. Dense(3, activation='softmax') with loss='categorical_crossentropy'
B. Dense(1, activation='sigmoid') with loss='mean_squared_error'
C. Dense(3, activation='relu') with loss='binary_crossentropy'
D. Dense(3) with loss='sparse_categorical_crossentropy'

Solution

  1. Step 1: Understand classification output requirements

    For 3 classes, the final layer should have 3 units with softmax activation to output class probabilities.
  2. Step 2: Match appropriate loss function

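    For one-hot encoded labels, 'categorical_crossentropy' is the correct loss function to use with softmax output. Option D omits the softmax activation, so the layer outputs raw scores, while the loss string 'sparse_categorical_crossentropy' expects probabilities by default.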
  3. Final Answer:

    Dense(3, activation='softmax') with loss='categorical_crossentropy' -> Option A
  4. Quick Check:

    Softmax + categorical_crossentropy = multi-class classification [OK]
Hint: Use softmax + categorical_crossentropy for multi-class tasks [OK]
Common Mistakes:
  • Using sigmoid for multi-class output
  • Using mean squared error for classification
  • Missing activation in final layer
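The two cross-entropy variants differ only in the label format they expect. The plain-Python sketch below, with made-up probabilities for a single example, computes the loss once from a one-hot label (as 'categorical_crossentropy' expects) and once from an integer class index (as 'sparse_categorical_crossentropy' expects) and shows that the values agree.

```python
import math

# Hypothetical softmax output for one example with 3 classes.
probs = [0.7, 0.2, 0.1]

# Same label in two formats.
one_hot_label = [1, 0, 0]   # format for categorical_crossentropy
class_index = 0             # format for sparse_categorical_crossentropy

# Categorical cross-entropy: -sum(t_i * log(p_i)) over classes.
cce = -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

# Sparse variant: -log of the probability assigned to the true class.
scce = -math.log(probs[class_index])

print(round(cce, 4), round(scce, 4))  # 0.3567 0.3567
```

Both formulas assume the model outputs probabilities, which is why the final layer needs a softmax activation.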