
Bidirectional LSTM in NLP - ML Experiment: Train & Evaluate

Experiment - Bidirectional LSTM
Problem: We want to classify movie reviews as positive or negative using text data. The current model uses a simple LSTM layer.
Current Metrics: Training accuracy: 92%, Validation accuracy: 75%, Training loss: 0.25, Validation loss: 0.60
Issue: The model overfits: training accuracy is high, but validation accuracy is much lower.
Your Task
Reduce overfitting and improve validation accuracy to at least 80% by using a Bidirectional LSTM.
Keep the embedding layer and dataset the same.
Do not increase the number of epochs beyond 10.
Use TensorFlow/Keras for model building.
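Before looking at the solution, it helps to picture the starting point. The page does not show the baseline code, so the following is a sketch of what the simple-LSTM model described above might look like (layer sizes are assumptions chosen to match the solution below):

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Hypothetical baseline: a single unidirectional LSTM (sizes assumed)
max_features = 10000  # vocabulary size
maxlen = 100          # padded review length

baseline = Sequential([
    Embedding(max_features, 128),   # same embedding layer as the solution
    LSTM(64),                       # reads the sequence left-to-right only
    Dense(1, activation='sigmoid')  # binary positive/negative output
])
baseline.compile(optimizer='adam', loss='binary_crossentropy',
                 metrics=['accuracy'])
```

Trained on the same padded IMDB data, a model like this is what produced the overfitting pattern in the metrics above: it memorizes the training reviews faster than it learns to generalize.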
Solution
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.datasets import imdb

# Load data
max_features = 10000
maxlen = 100
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=max_features)

# Pad sequences
X_train = pad_sequences(X_train, maxlen=maxlen)
X_test = pad_sequences(X_test, maxlen=maxlen)

# Build model
model = Sequential([
    Embedding(max_features, 128, input_length=maxlen),
    Bidirectional(LSTM(64, return_sequences=False)),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_split=0.2, verbose=2)

# Evaluate on test data
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)

print(f'Test accuracy: {accuracy * 100:.2f}%', f'Test loss: {loss:.4f}')
Replaced the simple LSTM layer with a Bidirectional LSTM layer to capture context from both directions.
Added a Dropout layer with rate 0.5 after the Bidirectional LSTM to reduce overfitting.
Kept the embedding layer and dataset the same for fair comparison.
Used validation_split=0.2 during training to monitor validation performance.
Results Interpretation

Before: Training accuracy 92%, Validation accuracy 75%, Validation loss 0.60

After: Training accuracy 88%, Validation accuracy 82%, Validation loss 0.45

Using a Bidirectional LSTM helps the model understand text better by reading it forwards and backwards. Adding dropout reduces overfitting, improving validation accuracy and making the model more reliable on new data.
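A concrete way to see the "both directions" point: wrapping an LSTM in `Bidirectional` runs one copy forward and one backward over the sequence and, by default, concatenates their outputs, so the output dimension doubles. A minimal demonstration on a toy tensor:

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Bidirectional

# Toy input: batch of 2 sequences, 5 timesteps, 8 features per step
x = np.random.rand(2, 5, 8).astype('float32')

fwd = LSTM(64)(x)                  # forward pass only -> 64 units
both = Bidirectional(LSTM(64))(x)  # forward + backward, concatenated -> 128 units

print(fwd.shape)   # (2, 64)
print(both.shape)  # (2, 128)
```

This is why the `Dense` layer in the solution sees a 128-dimensional vector even though the wrapped LSTM has only 64 units.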
Bonus Experiment
Try adding a second Bidirectional LSTM layer stacked on top of the first one and see if validation accuracy improves further.
💡 Hint
Stacking layers can help the model learn more complex patterns but may increase training time and risk of overfitting. Use dropout and early stopping to control this.
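Putting the hint into practice, a stacked version might look like the sketch below. The key detail is `return_sequences=True` on the first Bidirectional LSTM, so the second layer receives the full sequence of hidden states rather than only the final one. Layer sizes and the early-stopping settings here are assumptions, not a tested recipe:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

max_features, maxlen = 10000, 100  # same vocabulary and padding as before

stacked = Sequential([
    Embedding(max_features, 128),
    Bidirectional(LSTM(64, return_sequences=True)),  # emit a state per timestep
    Dropout(0.5),
    Bidirectional(LSTM(32)),                         # second layer: final state only
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])
stacked.compile(optimizer='adam', loss='binary_crossentropy',
                metrics=['accuracy'])

# Stop training when validation loss stops improving; keep the best weights
early_stop = EarlyStopping(monitor='val_loss', patience=2,
                           restore_best_weights=True)
# stacked.fit(X_train, y_train, epochs=10, batch_size=64,
#             validation_split=0.2, callbacks=[early_stop])
```

Early stopping keeps the experiment within the 10-epoch budget while guarding against the extra overfitting risk that a deeper model brings.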