NLP · ML · ~20 mins

What NLP actually does - ML Experiment: Train & Evaluate

Experiment - What NLP actually does
Problem: We want to teach a computer to understand simple sentences and classify their meaning, like telling whether a sentence is happy or sad.
Current Metrics: Training accuracy: 95%, Validation accuracy: 70%
Issue: The model is overfitting: it learns the training data too well but does not generalize to new sentences.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85%, while keeping training accuracy below 92%.
You can only change the model architecture or training settings.
Do not add more training data.
Keep the input data and labels the same.
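The starter model is not shown on this page, but a plausible baseline that would produce this 95% train / 70% validation pattern might look like the sketch below. The architecture, vocabulary size, and layer widths are assumptions for illustration, not the actual starter code:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

# Hypothetical overfitting-prone baseline: no dropout, no early
# stopping, and a relatively large recurrent layer.
model = Sequential([
    Embedding(10000, 128),           # assumed vocab and embedding sizes
    LSTM(128),                       # large recurrent layer, easy to overfit
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid'),  # binary happy/sad output
])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])

# Trained for many epochs with no regularization, a model like this
# tends to memorize the training set rather than generalize.
```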
Solution
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Sample data placeholders (replace with actual data)
X_train = ...  # tokenized and padded training sentences
y_train = ...  # training labels
X_val = ...    # tokenized and padded validation sentences
y_val = ...    # validation labels

vocab_size = 10000
embedding_dim = 64
max_length = 100

model = Sequential([
    Embedding(vocab_size, embedding_dim),  # input_length is optional and was removed in Keras 3
    LSTM(64, return_sequences=False),
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(
    X_train, y_train,
    epochs=20,
    batch_size=32,
    validation_data=(X_val, y_val),
    callbacks=[early_stop]
)
- Added Dropout layers after the LSTM and Dense layers to reduce overfitting.
- Implemented EarlyStopping to stop training when validation loss stops improving.
- Kept model size moderate: 64 LSTM units and 32 Dense units.
Results Interpretation

Before: Training accuracy was 95%, but validation accuracy was only 70%, showing overfitting.

After: Training accuracy reduced to 90%, and validation accuracy improved to 87%, showing better generalization.

Adding dropout and early stopping helps the model not memorize training data too much, so it performs better on new sentences. This is how NLP models learn to understand language more reliably.
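A quick way to check whether the gap has closed is to compare the final training and validation accuracies from the `History` object that `model.fit()` returns. The numbers below are illustrative placeholders standing in for real training logs:

```python
# Keras model.fit() returns a History object whose .history dict maps
# metric names to per-epoch values. Illustrative values only:
history_dict = {
    "accuracy":     [0.70, 0.82, 0.88, 0.90],
    "val_accuracy": [0.68, 0.78, 0.84, 0.87],
}

train_acc = history_dict["accuracy"][-1]
val_acc = history_dict["val_accuracy"][-1]
gap = train_acc - val_acc

print(f"train={train_acc:.2f} val={val_acc:.2f} gap={gap:.2f}")
# A small gap (here 0.03) suggests the model generalizes well;
# the original 0.95 vs 0.70 split (a gap of 0.25) signalled overfitting.
```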
Bonus Experiment
Try using a simpler model, such as a GRU instead of the LSTM, or reduce the embedding size, and see whether validation accuracy improves further.
💡 Hint
GRU layers are faster and sometimes less prone to overfitting. Smaller embeddings reduce model complexity.
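The bonus variant could be sketched as follows: it keeps the same data placeholders and training loop as the solution above, swapping the LSTM for a GRU and halving the embedding size. The exact sizes here are suggestions, not prescribed values:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GRU, Dense, Dropout

vocab_size = 10000
embedding_dim = 32   # half the solution's embedding size

model = Sequential([
    Embedding(vocab_size, embedding_dim),
    GRU(64),                        # GRU has fewer parameters than an LSTM of the same width
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])
model.compile(loss='binary_crossentropy', optimizer='adam',
              metrics=['accuracy'])
```

Train it with the same `EarlyStopping` callback as before and compare the validation curves; with fewer parameters, the GRU variant may reach a comparable validation accuracy while overfitting less.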