
Part-of-speech tagging in NLP - ML Experiment: Train & Evaluate

Experiment - Part-of-speech tagging
Problem: Build a model to assign part-of-speech tags to words in sentences. The current model uses a simple neural network but shows signs of overfitting.
Current Metrics: Training accuracy 98%, validation accuracy 75%
Issue: The model overfits the training data, resulting in poor generalization on validation data.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85%, while keeping training accuracy below 92%.
You can only modify the model architecture and training hyperparameters.
Do not change the dataset or preprocessing steps.
Solution
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Assume X_train, y_train, X_val, y_val are preprocessed and ready

vocab_size = 10000  # example vocabulary size
embedding_dim = 64
max_len = 50  # max sentence length
num_tags = 17  # number of POS tags

model = Sequential([
    Embedding(vocab_size, embedding_dim, input_length=max_len),
    # Smaller bidirectional LSTM (64 units, down from 128) limits model capacity
    Bidirectional(LSTM(64, return_sequences=True)),
    Dropout(0.5),  # randomly zero half the activations during training
    Dense(num_tags, activation='softmax')  # applied per time step
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=20,
                    batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])
- Added a Dropout layer with rate 0.5 after the LSTM layer to reduce overfitting.
- Reduced LSTM units from 128 to 64 to decrease model complexity.
- Lowered the learning rate to 0.0005 for more stable training.
- Added an EarlyStopping callback to stop training when validation loss stops improving.
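Dropout is the main regularizer here: during training, each activation is zeroed with probability `rate`, and the survivors are rescaled by 1/(1 - rate) so the expected activation is unchanged. A minimal NumPy sketch of this "inverted dropout" idea (the function name and shapes are illustrative, not Keras internals):

```python
import numpy as np

def inverted_dropout(x, rate, rng):
    """Zero each activation with probability `rate`; rescale survivors."""
    mask = rng.random(x.shape) >= rate  # keep each unit with prob 1 - rate
    return x * mask / (1.0 - rate)     # rescaling keeps E[output] == E[input]

rng = np.random.default_rng(0)
x = np.ones(1000)
out = inverted_dropout(x, 0.5, rng)
# Roughly half the entries are zero, the rest are 2.0, so the mean stays near 1.0
```

At inference time Keras disables dropout automatically, which is why no rescaling is needed when calling `model.predict`.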
Results Interpretation

Before: Training accuracy was 98%, validation accuracy was 75%, showing strong overfitting.

After: Training accuracy dropped to 90%, validation accuracy improved to 87%, indicating better generalization.

Adding dropout, reducing model size, lowering learning rate, and using early stopping help reduce overfitting and improve validation accuracy in sequence tagging tasks.
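The task's target ("validation accuracy at least 85%, training accuracy below 92%") can be checked directly against the Keras-style `history.history` dict that `model.fit` returns when compiled with `metrics=['accuracy']`. A pure-Python sketch (no TensorFlow required; the helper name and thresholds mirror this experiment, not any library API):

```python
def check_target(history, min_val=0.85, max_train=0.92):
    """Return (train/val gap, target met?) from a Keras-style history dict."""
    train = history["accuracy"][-1]      # final-epoch training accuracy
    val = history["val_accuracy"][-1]    # final-epoch validation accuracy
    return round(train - val, 4), (val >= min_val and train < max_train)

before = {"accuracy": [0.98], "val_accuracy": [0.75]}
after = {"accuracy": [0.90], "val_accuracy": [0.87]}
check_target(before)  # -> (0.23, False): large gap, target missed
check_target(after)   # -> (0.03, True): small gap, target met
```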
Bonus Experiment
Try using a Conditional Random Field (CRF) layer on top of the LSTM to improve sequence tagging accuracy.
💡 Hint
CRF layers model tag dependencies and can improve POS tagging by considering tag sequences rather than independent predictions.
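What a CRF adds at decode time can be illustrated without a CRF library: Viterbi decoding combines per-position tag scores (emissions) with a tag-to-tag transition matrix, so the best tag *sequence* is chosen jointly rather than tag by tag. A NumPy sketch (the score matrices here are made up for illustration; a real CRF layer would learn the transitions during training):

```python
import numpy as np

def viterbi(emissions, transitions):
    """Best tag path. emissions: (T, K) scores; transitions: (K, K) scores."""
    T, K = emissions.shape
    score = emissions[0].copy()             # best score ending in each tag
    back = np.zeros((T, K), dtype=int)      # backpointers for path recovery
    for t in range(1, T):
        # cand[i, j] = score of ending at tag j via previous tag i
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):           # follow backpointers in reverse
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# With all-zero transitions, Viterbi reduces to per-position argmax
emissions = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
viterbi(emissions, np.zeros((2, 2)))  # -> [0, 1, 0]
```

With non-zero transitions the decoder can overrule a locally best tag when the surrounding tags make it implausible, which is exactly the benefit of a CRF layer for POS tagging.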