TensorFlow · ~20 mins

Binary classification model in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Binary classification model
Problem: Build a model to classify whether a flower is Iris Setosa or not based on petal and sepal measurements.
Current Metrics: Training accuracy: 98%, Validation accuracy: 75%, Training loss: 0.05, Validation loss: 0.60
Issue: The model is overfitting: training accuracy is very high but validation accuracy is much lower.
Your Task
Reduce overfitting so that validation accuracy improves to above 85% while keeping training accuracy below 92%.
You can only modify the model architecture and training parameters.
Do not change the dataset or preprocessing steps.
Solution
import tensorflow as tf
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np

# Load data
iris = load_iris()
X = iris.data
# Binary target: 1 if Setosa, else 0
y = (iris.target == 0).astype(int)

# Split data
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)

# Build model with dropout and smaller layers
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

# Early stopping callback
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)

# Train model
history = model.fit(X_train, y_train,
                    epochs=100,
                    batch_size=16,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop],
                    verbose=0)

# Evaluate
train_loss, train_acc = model.evaluate(X_train, y_train, verbose=0)
val_loss, val_acc = model.evaluate(X_val, y_val, verbose=0)

print(f'Training accuracy: {train_acc*100:.2f}%, Validation accuracy: {val_acc*100:.2f}%')
print(f'Training loss: {train_loss:.3f}, Validation loss: {val_loss:.3f}')
Added dropout layers (rate 0.3) after each dense layer to reduce overfitting.
Reduced the hidden layers to 16 and 8 neurons to simplify the model.
Added early stopping to halt training when validation loss stops improving.
Set the learning rate to 0.001 for stable training.
Results Interpretation

Before: Training accuracy 98%, Validation accuracy 75%, Training loss 0.05, Validation loss 0.60

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.25, Validation loss 0.30

Adding dropout and early stopping reduces overfitting by preventing the model from memorizing training data, which improves validation accuracy and generalization.
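Dropout's regularizing effect comes from it being active only during training: at training time a random fraction of activations is zeroed (and the rest scaled up to preserve the expected sum), while at inference the layer is the identity. A minimal sketch, using the same 0.3 rate as the solution:

```python
import numpy as np
import tensorflow as tf

# Dropout zeroes a random 30% of activations during training only;
# surviving values are scaled by 1/(1 - 0.3) so the expected sum is unchanged.
layer = tf.keras.layers.Dropout(0.3)
x = tf.ones((1, 8))

train_out = layer(x, training=True)   # entries are either 0.0 or 1/0.7
infer_out = layer(x, training=False)  # identity: all ones

print(train_out.numpy())
print(infer_out.numpy())
```

This is why `model.evaluate` and `model.predict` are unaffected by the dropout layers: Keras passes `training=False` to them automatically.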
Bonus Experiment
Try using L2 regularization instead of dropout to reduce overfitting and compare results.
💡 Hint
Add kernel_regularizer=tf.keras.regularizers.l2(0.01) to Dense layers and remove dropout layers.
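A minimal sketch of the bonus setup, following the hint: the same architecture as the solution, but with L2 weight penalties on the dense layers and no dropout (the 0.01 strength is the value the hint suggests; you may want to tune it):

```python
import tensorflow as tf

# L2 regularization adds 0.01 * sum(w^2) per layer to the loss,
# discouraging large weights instead of randomly dropping activations.
l2 = tf.keras.regularizers.l2(0.01)

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', kernel_regularizer=l2, input_shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu', kernel_regularizer=l2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
```

Training and evaluation proceed exactly as in the solution above; note that the reported training loss will include the regularization penalty, so it is not directly comparable to the dropout run's loss.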