NLP · ~20 mins

Why NLP bridges humans and computers - Experiment to Prove It

Problem: We want computers to understand human language so they can help us better. Currently, a simple model translates text but often makes mistakes and does not capture meaning well.
Current Metrics: Training accuracy: 95%, Validation accuracy: 70%, Loss: 0.5
Issue: The model overfits the training data and does not generalize to new sentences, as shown by the large gap between training and validation accuracy.
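To make the issue concrete, the gap between training and validation accuracy can be read directly off the metrics above. A quick sketch (the helper name is illustrative, not part of the task code):

```python
def generalization_gap(train_acc, val_acc):
    """Gap in percentage points between training and validation accuracy."""
    return train_acc - val_acc

# The metrics reported above: 95% training vs 70% validation accuracy.
gap = generalization_gap(95, 70)
print(f"Generalization gap: {gap} points")  # 25 points signals overfitting
```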
Your Task
Reduce overfitting and improve validation accuracy to at least 85% while keeping training accuracy below 90%.
You can only change model architecture and training settings.
Do not change the dataset or add more data.
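Before looking at the solution, it helps to see mechanically what one of the main tools, dropout, actually does. Here is a minimal NumPy sketch (not part of the task code) of inverted dropout: during training each activation is kept with probability `1 - rate`, and the survivors are scaled up so the expected activation is unchanged.

```python
import numpy as np

def inverted_dropout(activations, rate, rng):
    """Zero out roughly a fraction `rate` of activations and rescale the rest."""
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
x = np.ones(1000)
y = inverted_dropout(x, rate=0.5, rng=rng)
print(y.mean())  # close to 1.0: the expected activation is preserved
```

Because some units are randomly silenced each step, the network cannot rely on any single co-adapted pathway, which is why dropout reduces overfitting.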
Solution
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import Tokenizer

# Sample data (for demonstration only; replace with real data)
texts = ['Hello world', 'How are you', 'Good morning', 'Nice to meet you']
labels = np.array([1, 0, 1, 0])  # Keras expects array-like labels, not a plain list

# Tokenize and pad
tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
data = pad_sequences(sequences, maxlen=5)

# Model with dropout to reduce overfitting
model = Sequential([
    Embedding(input_dim=1000, output_dim=64, input_length=5),
    LSTM(32, return_sequences=False),
    Dropout(0.5),
    Dense(16, activation='relu'),
    Dropout(0.5),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train with validation split
history = model.fit(data, labels, epochs=20, batch_size=2, validation_split=0.25, verbose=0)

# Evaluate
train_acc = history.history['accuracy'][-1] * 100
val_acc = history.history['val_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f'Training accuracy: {train_acc:.2f}%')
print(f'Validation accuracy: {val_acc:.2f}%')
print(f'Training loss: {train_loss:.3f}')
print(f'Validation loss: {val_loss:.3f}')
Added Dropout layers after the LSTM and Dense layers to reduce overfitting.
Reduced the LSTM to 32 units to simplify the model and limit its capacity to memorize.
Kept the number of training epochs moderate and used a validation split to monitor generalization.
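Keeping the epoch count moderate can be automated with early stopping: watch the validation loss and stop once it has not improved for a few epochs. Keras provides this as the `EarlyStopping` callback; the minimal sketch below just illustrates the patience rule itself.

```python
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch (0-based) at which training would stop,
    or the last epoch if the patience limit is never reached."""
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch  # no improvement for `patience` epochs
    return len(val_losses) - 1

# Validation loss improves, then plateaus: stop 3 epochs after the best one.
losses = [0.9, 0.7, 0.5, 0.45, 0.46, 0.47, 0.48, 0.49]
print(early_stop_epoch(losses))  # stops at epoch 6
```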
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70%, Loss 0.5

After: Training accuracy 88%, Validation accuracy 86%, Loss 0.3

Adding dropout and simplifying the model reduced overfitting, so the model now generalizes better to new sentences. This is a small illustration of how well-regularized NLP models can bridge humans and computers more effectively.
Bonus Experiment
Try using a pre-trained language model like BERT to improve understanding without overfitting.
💡 Hint
Use transfer learning with a smaller learning rate and freeze some layers to keep the model stable.
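To make the "freeze some layers" idea concrete, here is a minimal NumPy sketch (not actual BERT fine-tuning code; the parameter names are illustrative): a frozen parameter is simply excluded from the gradient update, while trainable parameters move by a small learning-rate step.

```python
import numpy as np

def sgd_step(params, grads, trainable, lr=1e-4):
    """Apply one SGD update, skipping parameters marked as frozen."""
    return {
        name: (p - lr * grads[name]) if trainable[name] else p
        for name, p in params.items()
    }

# Hypothetical two-parameter model: a frozen pre-trained part and a task head.
params = {'pretrained_embedding': np.array([1.0]), 'task_head': np.array([1.0])}
grads = {'pretrained_embedding': np.array([10.0]), 'task_head': np.array([10.0])}
trainable = {'pretrained_embedding': False, 'task_head': True}

updated = sgd_step(params, grads, trainable)
print(updated['pretrained_embedding'])  # [1.]    unchanged: frozen
print(updated['task_head'])             # [0.999] moved by the small lr step
```

Freezing the pre-trained layers and using a small learning rate on the rest keeps the general language knowledge intact while adapting only the task-specific head, which is why transfer learning tends to resist overfitting on small datasets.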