TensorFlow · ~20 mins

Text generation with RNN in TensorFlow - ML Experiment: Train & Evaluate

Experiment - Text generation with RNN
Problem: Generate text sequences using a Recurrent Neural Network (RNN) trained on a small dataset of Shakespeare-like text.
Current Metrics: Training loss: 0.15, Validation loss: 0.45, Training accuracy: 92%, Validation accuracy: 70%
Issue: The model is overfitting: training accuracy is high but validation accuracy is much lower, indicating poor generalization.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 90%.
You can only modify the model architecture and training hyperparameters.
Do not change the dataset or preprocessing steps.
Solution
TensorFlow
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Assume text data is preprocessed into sequences X_train, y_train, X_val, y_val

vocab_size = 5000  # example vocabulary size
embedding_dim = 64
rnn_units = 64

model = Sequential([
    Embedding(vocab_size, embedding_dim),          # token IDs -> dense vectors
    SimpleRNN(rnn_units, return_sequences=False),  # keep only the final hidden state
    Dropout(0.3),                                  # regularization: randomly zero 30% of units
    Dense(vocab_size, activation='softmax')        # next-token probability distribution
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=30,
                    batch_size=64,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])
Added a Dropout layer (rate 0.3) after the RNN layer to reduce overfitting.
Reduced the RNN to 64 units to shrink model capacity.
Added an EarlyStopping callback (patience 3, restoring the best weights) so training stops once validation loss plateaus.
Kept the learning rate at 0.001, the Adam default, for stable training.
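What the Dropout layer does can be illustrated outside Keras. Below is a minimal NumPy sketch of inverted dropout, the scheme Keras uses; the function and array names are chosen for illustration only:

```python
import numpy as np

def inverted_dropout(x, rate, training, rng):
    """Zero a fraction `rate` of units during training and scale the
    survivors by 1/(1 - rate), so the expected activation is unchanged
    and no rescaling is needed at inference."""
    if not training:
        return x  # dropout is a no-op at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones((4, 64))  # toy activations from the RNN layer
train_out = inverted_dropout(activations, rate=0.3, training=True, rng=rng)
eval_out = inverted_dropout(activations, rate=0.3, training=False, rng=rng)
```

During training, roughly 30% of the entries become 0 and the rest are scaled to 1/0.7 ≈ 1.43, so the mean activation stays close to 1; at evaluation time the input passes through unchanged.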
Results Interpretation

Before: Training accuracy 92%, Validation accuracy 70%, Validation loss 0.45

After: Training accuracy 88%, Validation accuracy 86%, Validation loss 0.30

Adding dropout and early stopping helps reduce overfitting by preventing the model from memorizing training data, improving validation accuracy and generalization.
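The patience logic behind EarlyStopping can be mimicked with a simple counter. This is a schematic sketch of the idea, not the Keras internals; the function name and loss values are invented for illustration:

```python
def early_stop_epoch(val_losses, patience):
    """Return the epoch index at which training would stop: the first
    epoch where the best validation loss has not improved for `patience`
    consecutive epochs, or the last epoch otherwise."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1

# With patience=3, training stops three epochs after the minimum at 0.30,
# and restore_best_weights=True would roll the model back to that epoch.
losses = [0.60, 0.45, 0.38, 0.30, 0.33, 0.34, 0.35, 0.36]
```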
Bonus Experiment
Try replacing the SimpleRNN layer with an LSTM layer and observe how it affects text generation quality and metrics.
💡 Hint
LSTM layers can capture longer dependencies in text, which may improve generation but can also increase training time.
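One way to set up that comparison is to rebuild the solution model with the single layer swapped; a sketch, with layer sizes mirroring the solution above:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout

vocab_size = 5000
embedding_dim = 64
rnn_units = 64

# Identical to the solution model, with LSTM swapped in for SimpleRNN.
# LSTM gates let the model carry information over longer spans of text,
# at the cost of roughly 4x the recurrent parameters and slower epochs.
lstm_model = Sequential([
    Embedding(vocab_size, embedding_dim),
    LSTM(rnn_units, return_sequences=False),
    Dropout(0.3),
    Dense(vocab_size, activation='softmax')
])

lstm_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])
```

Train it with the same `model.fit(...)` call and EarlyStopping callback as before, then compare the validation curves and sampled text from the two models.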