NLP / ML · ~20 mins

Challenges in language processing in NLP - ML Experiment: Train & Evaluate

Experiment - Challenges in language processing
Problem: You are building a simple text classifier that labels a sentence as positive or negative. The current model achieves 95% accuracy on the training data but only 70% on the validation data.
Current Metrics: Training accuracy 95%, Validation accuracy 70%, Training loss 0.15, Validation loss 0.60
Issue: The model is overfitting the training data and does not generalize well to new sentences.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85%, while keeping training accuracy below 92%.
You can only modify the model architecture and training hyperparameters.
Do not change the dataset or add more data.
Solution
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Sample data placeholders (replace with actual data loading)
X_train = tf.random.uniform((1000, 100), maxval=10000, dtype=tf.int32)
y_train = tf.random.uniform((1000,), maxval=2, dtype=tf.int32)
X_val = tf.random.uniform((200, 100), maxval=10000, dtype=tf.int32)
y_val = tf.random.uniform((200,), maxval=2, dtype=tf.int32)

model = Sequential([
    Embedding(input_dim=10000, output_dim=64, input_length=100),
    LSTM(64, return_sequences=False),  # modest LSTM size to limit model capacity
    Dropout(0.5),                      # regularize the LSTM output
    Dense(32, activation='relu'),
    Dropout(0.5),                      # regularize the dense layer
    Dense(1, activation='sigmoid')     # binary positive/negative output
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=20, batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])
- Added Dropout layers after the LSTM and Dense layers to reduce overfitting.
- Used an EarlyStopping callback to stop training when validation loss stops improving.
- Kept the LSTM at a modest 64 units to lower model complexity.
- Set the learning rate to 0.001 for stable training.
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70%, Training loss 0.15, Validation loss 0.60

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.30, Validation loss 0.40

Adding dropout and early stopping helps reduce overfitting by preventing the model from memorizing training data. This improves validation accuracy and generalization.
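The before/after comparison above can be checked programmatically by comparing the final training and validation accuracy from the Keras `history.history` dict. A minimal sketch (the 10-percentage-point threshold is an illustrative assumption, not a standard cutoff):

```python
def overfitting_gap(history_dict):
    """Return the gap between the final training and validation accuracy."""
    return history_dict['accuracy'][-1] - history_dict['val_accuracy'][-1]

def is_overfitting(history_dict, threshold=0.10):
    """Flag overfitting when the accuracy gap exceeds the threshold."""
    return overfitting_gap(history_dict) > threshold

# The "before" and "after" metrics from this experiment:
before = {'accuracy': [0.95], 'val_accuracy': [0.70]}
after = {'accuracy': [0.90], 'val_accuracy': [0.87]}

print(is_overfitting(before))  # True  (25-point gap)
print(is_overfitting(after))   # False (3-point gap)
```

In practice you would pass `history.history` from `model.fit` directly to these helpers.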
Bonus Experiment
Try using a simpler model like a 1D convolutional neural network (CNN) instead of LSTM to see if it reduces overfitting further.
💡 Hint
CNNs can capture local patterns in text and often train faster with fewer parameters.
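One possible starting point for the bonus experiment, reusing the same vocabulary size and placeholder input shape as the LSTM example above. This is a sketch to build on, not a tuned solution:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, GlobalMaxPooling1D,
                                     Dense, Dropout)

cnn_model = Sequential([
    Embedding(input_dim=10000, output_dim=64),             # same vocab size as the LSTM model
    Conv1D(filters=64, kernel_size=5, activation='relu'),  # captures local n-gram patterns
    GlobalMaxPooling1D(),                                  # keeps the strongest response per filter
    Dropout(0.5),
    Dense(32, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

cnn_model.compile(optimizer='adam',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

# Sanity check on a dummy batch shaped like the placeholder data above.
dummy = tf.random.uniform((2, 100), maxval=10000, dtype=tf.int32)
print(cnn_model(dummy).shape)  # (2, 1)
```

Train it with the same `model.fit` call and EarlyStopping callback as the LSTM version, then compare the train/validation accuracy gap between the two models.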