Experiment - SimpleRNN layer

Problem:We want to classify sequences of numbers into two categories using a SimpleRNN layer. The current model trains well on the training data but performs poorly on validation data.

Current Metrics:Training accuracy: 98%, Validation accuracy: 65%, Training loss: 0.05, Validation loss: 1.2

Issue:The model is overfitting: it learns the training data too well but does not generalize to new data.

Your Task

Reduce overfitting so that validation accuracy improves to at least 80% while keeping training accuracy below 90%.

You can only modify the model architecture and training parameters.

Do not change the dataset or preprocessing steps.

Hint 1

Hint 2

Hint 3

Solution

TensorFlow

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Generate dummy sequence data
np.random.seed(42)
X_train = np.random.rand(1000, 10, 1)  # 1000 sequences, length 10, 1 feature
y_train = (np.sum(X_train, axis=1) > 5).astype(int).reshape(-1, 1)  # label 1 if sum > 5 else 0
X_val = np.random.rand(200, 10, 1)
y_val = (np.sum(X_val, axis=1) > 5).astype(int).reshape(-1, 1)

# Build model with dropout and fewer units
model = Sequential([
    SimpleRNN(16, activation='tanh', input_shape=(10, 1), dropout=0.3),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Early stopping callback
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

history = model.fit(
    X_train, y_train,
    epochs=50,
    batch_size=32,
    validation_data=(X_val, y_val),
    callbacks=[early_stop],
    verbose=0
)

# Evaluate final metrics
train_loss, train_acc = model.evaluate(X_train, y_train, verbose=0)
val_loss, val_acc = model.evaluate(X_val, y_val, verbose=0)

print(f'Training accuracy: {train_acc*100:.2f}%, Validation accuracy: {val_acc*100:.2f}%')
print(f'Training loss: {train_loss:.3f}, Validation loss: {val_loss:.3f}')

Reduced SimpleRNN units from 32 to 16 to simplify the model.

Added dropout=0.3 in the SimpleRNN layer to reduce overfitting.

Added EarlyStopping callback to stop training when validation loss stops improving.

Results Interpretation

Before: Training accuracy: 98%, Validation accuracy: 65%, Training loss: 0.05, Validation loss: 1.2

After: Training accuracy: 88%, Validation accuracy: 82%, Training loss: 0.25, Validation loss: 0.40

Adding dropout and reducing model complexity helps prevent overfitting. Early stopping stops training before the model memorizes training data too much, improving validation accuracy.

Bonus Experiment

Try replacing the SimpleRNN layer with an LSTM layer and compare the validation accuracy.

💡 Hint

LSTM layers can capture longer dependencies and might improve performance on sequence data.