model.fit() training loop in TensorFlow - ML Experiment: Train & Evaluate

Experiment - model.fit() training loop
Problem: Train a simple neural network to classify handwritten digits from the MNIST dataset.
Current Metrics: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45
Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to at least 90% while keeping training accuracy below 95%.
You can only modify the model.fit() training loop parameters and add callbacks.
Do not change the model architecture or dataset.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize data
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Flatten images
x_train = x_train.reshape(-1, 28*28)
x_test = x_test.reshape(-1, 28*28)

# Build simple model
model = models.Sequential([
    layers.Dense(128, activation='relu', input_shape=(28*28,)),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Add early stopping callback
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train model with validation split and early stopping
history = model.fit(
    x_train, y_train,
    epochs=30,
    batch_size=64,
    validation_split=0.2,
    callbacks=[early_stop],
    verbose=2
)

# Evaluate on test data
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f'Test accuracy: {test_acc:.4f}, Test loss: {test_loss:.4f}')
Added validation_split=0.2 to monitor validation performance during training.
Added EarlyStopping callback to stop training early when validation loss stops improving.
Set batch_size to 64. Smaller batches add noise to the gradient updates, which can sometimes help generalization.
Raised the epoch limit to 30; early stopping usually ends training before that limit is reached.
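To make the effect of patience=3 and restore_best_weights=True concrete, here is a simplified pure-Python sketch of the patience logic. It is not Keras's implementation: the function name simulate_early_stopping and the validation losses are hypothetical, and min_delta is assumed to be 0.

```python
def simulate_early_stopping(val_losses, patience=3):
    """Return (stopped_epoch, best_epoch) as 0-based epoch indices."""
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Training stops here; the best epoch's weights would be restored.
                return epoch, best_epoch
    # Ran out of epochs without triggering the patience limit.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one value per epoch.
losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]
stopped_epoch, best_epoch = simulate_early_stopping(losses, patience=3)
print(stopped_epoch, best_epoch)  # 5 2
```

In this made-up run, training stops after three epochs without improvement and never reaches the later, lower loss of 0.30. That is the trade-off of a small patience value.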
Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 85%, Training loss: 0.05, Validation loss: 0.45

After: Training accuracy: 93%, Validation accuracy: 91%, Training loss: 0.15, Validation loss: 0.25

Using validation data during training and stopping early when validation loss stops improving helps reduce overfitting and improves the model's ability to generalize to new data.
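One simple way to read the before/after numbers is the gap between training and validation accuracy. The sketch below only does that arithmetic on the figures quoted above; the helper name generalization_gap is hypothetical.

```python
# Accuracy figures (in percent) taken from the before/after results above.
before = {"train_acc": 98, "val_acc": 85}
after = {"train_acc": 93, "val_acc": 91}


def generalization_gap(metrics):
    """Training accuracy minus validation accuracy, in percentage points."""
    return metrics["train_acc"] - metrics["val_acc"]


gap_before = generalization_gap(before)
gap_after = generalization_gap(after)
print(gap_before, gap_after)  # 13 2
```

A smaller gap suggests the model is relying less on memorizing the training set.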
Bonus Experiment
Try adding dropout layers to the model to further reduce overfitting and compare the results.
💡 Hint
Add a Dropout layer with rate 0.3 after the first Dense layer and retrain the model with the same training loop.
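As background for the hint, this NumPy sketch shows roughly what a dropout layer with rate 0.3 does to activations during training: it zeroes a random fraction of them and scales the survivors by 1 / (1 - rate). The function name dropout_train and the all-ones input are hypothetical; in the experiment itself you would use the Keras Dropout layer rather than this code.

```python
import numpy as np


def dropout_train(activations, rate, rng):
    """Zero out roughly `rate` of the values and rescale the rest."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)


rng = np.random.default_rng(0)
activations = np.ones((100, 128))  # placeholder output of a 128-unit layer
dropped = dropout_train(activations, rate=0.3, rng=rng)

zero_fraction = float(np.mean(dropped == 0.0))
print(f"fraction zeroed: {zero_fraction:.2f}")
```

At inference time dropout is switched off and the activations pass through unchanged.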

Practice

1. What does the epochs parameter control in the model.fit() training loop?
easy
A. The number of times the entire dataset is shown to the model
B. The size of each batch of data during training
C. The learning rate of the optimizer
D. The number of layers in the model

Solution

  1. Step 1: Understand the role of epochs in training

    Epochs define how many times the model sees the whole dataset during training.
  2. Step 2: Differentiate epochs from batch size and other parameters

    Batch size sets how many samples are processed per step, the learning rate sets the size of each weight update, and the number of layers defines model depth.
  3. Final Answer:

    The number of times the entire dataset is shown to the model -> Option A
  4. Quick Check:

    Epochs = full dataset passes [OK]
Hint: Epochs = full dataset passes through model [OK]
Common Mistakes:
  • Confusing epochs with batch size
  • Thinking epochs control learning rate
  • Mixing epochs with model architecture
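The relationship between epochs, batch size, and weight updates can be worked out with a little arithmetic. This sketch uses the settings from the experiment above (60,000 MNIST training images, validation_split=0.2, batch_size=64, epochs=30); the way the split is rounded here is a simplification.

```python
import math

num_samples = 60_000      # size of the MNIST training set
validation_split = 0.2
batch_size = 64
max_epochs = 30

val_samples = int(num_samples * validation_split)   # held out for validation
train_samples = num_samples - val_samples           # actually trained on

# One epoch = one full pass over the training samples, batch by batch.
steps_per_epoch = math.ceil(train_samples / batch_size)

# Upper bound on weight updates if early stopping never triggers.
max_updates = steps_per_epoch * max_epochs

print(train_samples, steps_per_epoch, max_updates)  # 48000 750 22500
```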
2. Which of the following is the correct way to call model.fit() with 10 epochs and batch size of 32?
easy
A. model.fit(x_train, y_train, epochs=10, batch_size=32)
B. model.fit(x_train, y_train, batch=10, size=32)
C. model.fit(x_train, y_train, epoch=10, batch=32)
D. model.fit(x_train, y_train, epochs=32, batch_size=10)

Solution

  1. Step 1: Recall correct parameter names for model.fit()

    The correct parameters are epochs and batch_size.
  2. Step 2: Check each option for correct syntax

    model.fit(x_train, y_train, epochs=10, batch_size=32) uses correct parameter names and values. Others use wrong names or swapped values.
  3. Final Answer:

    model.fit(x_train, y_train, epochs=10, batch_size=32) -> Option A
  4. Quick Check:

    Correct parameter names = model.fit(x_train, y_train, epochs=10, batch_size=32) [OK]
Hint: Use exact parameter names: epochs and batch_size [OK]
Common Mistakes:
  • Using wrong parameter names like 'batch' or 'epoch'
  • Swapping values of epochs and batch_size
  • Missing required parameters
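Misspelled keyword arguments fail because of ordinary Python rules for unknown keywords. The sketch below uses a stand-in function named fit, which is hypothetical and not the Keras method; it accepts only the keyword names epochs and batch_size, to show the kind of error the wrong options would raise.

```python
def fit(x, y, batch_size=None, epochs=1):
    """Stand-in for illustration only; it just echoes its settings."""
    return {"epochs": epochs, "batch_size": batch_size}


# Correct keyword names are accepted.
ok = fit([0], [0], epochs=10, batch_size=32)

# Unknown keyword names are rejected by Python itself.
try:
    fit([0], [0], batch=10, size=32)
    error = None
except TypeError as exc:
    error = exc

print(ok)
print(type(error).__name__)  # TypeError
```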
3. Given the code below, what will be printed after training?
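import numpy as np
import tensorflow as tf

# Single-unit linear model: learns y = w * x + b
model = tf.keras.Sequential([
  tf.keras.layers.Dense(1, input_shape=(1,))
])
model.compile(optimizer='sgd', loss='mse')
x = np.array([1, 2, 3, 4], dtype=float)
y = np.array([2, 4, 6, 8], dtype=float)
history = model.fit(x, y, epochs=3, batch_size=2, verbose=0)
# history.history['loss'] holds one loss value per epoch
print(history.history['loss'])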
medium
A. [0, 0, 0]
B. [3, 2, 1]
C. [some decreasing loss values over 3 epochs]
D. An error because batch_size is too large

Solution

  1. Step 1: Understand training with batch_size=2 and epochs=3

    The model makes 3 passes over the 4 samples in batches of 2, so the weights are updated twice per epoch.
  2. Step 2: Predict loss values behavior

    Loss starts high and typically decreases as the model learns; exact values depend on the random weight initialization, so they vary between runs.
  3. Final Answer:

    [some decreasing loss values over 3 epochs] -> Option C
  4. Quick Check:

    Loss decreases with training epochs [OK]
Hint: Loss decreases over epochs during training [OK]
Common Mistakes:
  • Expecting exact loss numbers
  • Thinking loss stays constant or zero
  • Assuming batch_size causes error here
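To see why the loss tends to fall, the same training run can be imitated by hand in NumPy. This is a sketch under stated assumptions, not what Keras does internally: the weight and bias start at zero (Keras initializes the weight randomly), the data is not shuffled (Keras shuffles by default), and the epoch loss is taken as the mean of the batch losses. The learning rate 0.01 matches the default of the Keras SGD optimizer.

```python
import numpy as np

x = np.array([1, 2, 3, 4], dtype=float)
y = np.array([2, 4, 6, 8], dtype=float)

w, b = 0.0, 0.0        # assumption: zero initialization
learning_rate = 0.01   # default learning rate of Keras SGD
batch_size = 2
epoch_losses = []

for epoch in range(3):
    batch_losses = []
    for start in range(0, len(x), batch_size):
        xb = x[start:start + batch_size]
        yb = y[start:start + batch_size]
        error = (w * xb + b) - yb
        batch_losses.append(np.mean(error ** 2))        # MSE on this batch
        # Gradient descent step on the MSE loss.
        w -= learning_rate * 2 * np.mean(error * xb)
        b -= learning_rate * 2 * np.mean(error)
    epoch_losses.append(float(np.mean(batch_losses)))

print(epoch_losses)  # three values, each smaller than the previous one
```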
4. What is wrong with this model.fit() call?
model.fit(x_train, y_train, epochs=5, batch_size=0)
medium
A. No validation data provided
B. epochs cannot be less than 10
C. x_train and y_train must be lists, not arrays
D. batch_size cannot be zero; it must be a positive integer

Solution

  1. Step 1: Check batch_size parameter validity

    Batch size must be a positive integer; zero is invalid and raises an error.
  2. Step 2: Verify other parameters

    Epochs can be any positive integer; data type arrays are allowed; validation data is optional.
  3. Final Answer:

    batch_size cannot be zero; it must be a positive integer -> Option D
  4. Quick Check:

    batch_size > 0 required [OK]
Hint: Batch size must be positive integer, not zero [OK]
Common Mistakes:
  • Setting batch_size to zero
  • Thinking epochs must be >=10
  • Confusing data types for inputs
5. You want to train a model and check its performance on new data after each epoch. Which model.fit() parameter helps you do this?
hard
A. steps_per_epoch
B. validation_data
C. batch_size
D. shuffle

Solution

  1. Step 1: Understand the purpose of validation_data

    Validation data is used to evaluate model performance after each epoch without training on it.
  2. Step 2: Differentiate from other parameters

    batch_size sets how many samples are processed per weight update, steps_per_epoch sets how many batches count as one epoch, and shuffle randomizes the order of the training data.
  3. Final Answer:

    validation_data -> Option B
  4. Quick Check:

    Validation data checks model after epochs [OK]
Hint: Use validation_data to check model after each epoch [OK]
Common Mistakes:
  • Confusing batch_size with validation
  • Using steps_per_epoch to validate
  • Thinking shuffle affects validation
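validation_data expects data you have held out yourself, whereas validation_split takes the last fraction of the arrays before any shuffling. A minimal sketch of building a shuffled hold-out set with NumPy follows; the placeholder data and the variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder data standing in for flattened images and digit labels.
x = rng.random((1000, 784)).astype("float32")
y = rng.integers(0, 10, size=1000)

# Shuffle the indices once, then cut off the last 20% as validation data.
indices = rng.permutation(len(x))
split_at = int(len(x) * 0.8)
train_idx, val_idx = indices[:split_at], indices[split_at:]

x_tr, y_tr = x[train_idx], y[train_idx]
x_val, y_val = x[val_idx], y[val_idx]

print(x_tr.shape, x_val.shape)  # (800, 784) (200, 784)

# The held-out arrays would then be passed as a tuple:
# model.fit(x_tr, y_tr, epochs=10, validation_data=(x_val, y_val))
```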