
Named Entity Recognition basics in Python - ML Experiment: Train & Evaluate

Experiment - Named Entity Recognition basics
Problem: We want to teach a computer to find names of people, places, and organizations in sentences. Our current model finds many names but also makes many mistakes.
Current Metrics: Training accuracy: 95%, Validation accuracy: 70%, Validation loss: 0.85
Issue: The model is overfitting: it performs very well on the training data but poorly on new sentences.
Your Task
Reduce overfitting so validation accuracy improves to at least 85%, while training accuracy stays below 92%.
You can only change the model architecture and training settings.
Do not change the dataset or labels.
Solution
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Sample data placeholders (replace with real data):
# 1000 sentences of 50 tokens each; each token gets one of 10 entity tags,
# encoded as one-hot vectors so they match categorical_crossentropy.
X_train = tf.random.uniform((1000, 50), maxval=1000, dtype=tf.int32)
y_train = tf.one_hot(tf.random.uniform((1000, 50), maxval=10, dtype=tf.int32), depth=10)
X_val = tf.random.uniform((200, 50), maxval=1000, dtype=tf.int32)
y_val = tf.one_hot(tf.random.uniform((200, 50), maxval=10, dtype=tf.int32), depth=10)

model = Sequential([
    Embedding(input_dim=1000, output_dim=64),
    Bidirectional(LSTM(64, return_sequences=True)),
    Dropout(0.5),  # randomly zeroes 50% of activations to reduce overfitting
    Dense(10, activation='softmax')  # per-token probabilities over the 10 tags
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=20, batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])
Added a Dropout layer (rate 0.5) after the Bidirectional LSTM to reduce overfitting.
Lowered the learning rate to 0.001 for smoother training.
Added an EarlyStopping callback to stop training once validation loss stops improving, restoring the best weights seen so far.
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70%, Validation loss 0.85

After: Training accuracy 90%, Validation accuracy 87%, Validation loss 0.45

Adding dropout and early stopping helped the model generalize better, reducing overfitting and improving validation accuracy.
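After a real training run you can quantify this improvement yourself by comparing the train/validation accuracy gap. The sketch below uses the metrics reported above as hard-coded values; with your own run, pull the last entries from `history.history['accuracy']` and `history.history['val_accuracy']` instead (the helper name `overfitting_gap` is just illustrative):

```python
def overfitting_gap(train_acc, val_acc):
    """Gap between training and validation accuracy (smaller = better generalization)."""
    return train_acc - val_acc

# Metrics from the experiment above; replace with values from history.history.
before = overfitting_gap(0.95, 0.70)
after = overfitting_gap(0.90, 0.87)

print(f"Gap before: {before:.2f}, after: {after:.2f}")
# → Gap before: 0.25, after: 0.03
```

A shrinking gap (here from 25 to 3 percentage points) is the clearest single sign that regularization is working.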
Bonus Experiment
Try using a smaller LSTM size or adding batch normalization to see if validation accuracy improves further.
💡 Hint
Reducing model size can prevent overfitting; batch normalization can stabilize training.
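One possible way to set up this bonus experiment is sketched below: it halves the LSTM units (64 → 32) and inserts a BatchNormalization layer, keeping the other hyperparameters from the solution above. Treat the exact sizes and layer order as starting points to tune, not a definitive answer:

```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Bidirectional, LSTM,
                                     Dense, Dropout, BatchNormalization)

# Bonus variant: smaller LSTM plus batch normalization.
bonus_model = Sequential([
    Embedding(input_dim=1000, output_dim=64),
    Bidirectional(LSTM(32, return_sequences=True)),  # half the original 64 units
    BatchNormalization(),  # normalizes activations, can stabilize training
    Dropout(0.5),
    Dense(10, activation='softmax')
])

bonus_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                    loss='categorical_crossentropy',
                    metrics=['accuracy'])
```

Train it with the same `model.fit(...)` call and EarlyStopping callback as before, and compare the validation curves against the original solution.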