
LSTM for text in NLP

Introduction

An LSTM (Long Short-Term Memory) network helps computers understand and remember words in sentences. It is good for tasks like predicting the next word or classifying text.

When you want to predict the next word in a sentence, like in text messaging apps.
When you want to classify emails as spam or not spam.
When you want to analyze the sentiment of a review (happy or sad).
When you want to generate text, like writing a story or poem.
When you want to understand the meaning of a sentence over time.
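The "remembering over time" in the last point is what sets an LSTM apart: at each word it updates a cell state through gates. A minimal NumPy sketch of a single LSTM step (weight shapes and names are illustrative, not Keras internals):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step for a single sample.

    x: current input vector; h_prev/c_prev: previous hidden and cell state.
    W, U, b hold the stacked weights for the input, forget,
    cell-candidate, and output gates.
    """
    z = W @ x + U @ h_prev + b          # all four gate pre-activations at once
    n = h_prev.size
    i = sigmoid(z[0:n])                 # input gate: what to add
    f = sigmoid(z[n:2*n])               # forget gate: what to keep
    g = np.tanh(z[2*n:3*n])             # candidate cell state
    o = sigmoid(z[3*n:4*n])             # output gate: what to expose
    c = f * c_prev + i * g              # cell state carries long-term memory
    h = o * np.tanh(c)                  # hidden state is the step's output
    return h, c

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
W = rng.normal(size=(4 * hidden, inputs))
U = rng.normal(size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
h = np.zeros(hidden)
c = np.zeros(hidden)
for x in rng.normal(size=(5, inputs)):  # run 5 time steps (5 "words")
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

Because the cell state c is carried from step to step, information from early words can influence the output many words later.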
Syntax
Python
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(LSTM(units=hidden_units))
model.add(Dense(units=num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

The Embedding layer turns word ids into dense vectors of numbers that the model can learn from.

The LSTM layer remembers important information from earlier in the text as it reads.
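Conceptually, the Embedding layer is just a trainable lookup table with one row per word id. A rough NumPy sketch (the vocabulary size and dimensions here are made up for illustration):

```python
import numpy as np

vocab_size, embedding_dim = 10, 4
rng = np.random.default_rng(42)
# One trainable row of numbers per word id (row 0 is reserved for padding)
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

sentence = np.array([3, 7, 1])        # a sentence as word ids
vectors = embedding_matrix[sentence]  # look up one vector per word
print(vectors.shape)                  # (3, 4): 3 words, 4 numbers each
```

During training, Keras adjusts the rows of this table so that words used in similar ways end up with similar vectors.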

Examples
This example builds an LSTM model for text classification with 5 classes.
Python
model = Sequential()
model.add(Embedding(10000, 64, input_length=100))
model.add(LSTM(128))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
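This model expects integer sequences of length 100, so raw text must be tokenized and padded first. A minimal hand-rolled preprocessing sketch (in a real project you would use a proper tokenizer, e.g. Keras' Tokenizer and pad_sequences):

```python
def texts_to_padded_sequences(texts, max_length):
    # Build a word index from the corpus (id 0 is reserved for padding)
    word_index = {}
    for text in texts:
        for word in text.lower().split():
            word_index.setdefault(word, len(word_index) + 1)
    # Convert each text to ids, then truncate/pad to max_length
    sequences = []
    for text in texts:
        seq = [word_index[w] for w in text.lower().split()][:max_length]
        seq += [0] * (max_length - len(seq))
        sequences.append(seq)
    return sequences, word_index

seqs, idx = texts_to_padded_sequences(['good movie', 'a very good movie'], 100)
print(len(seqs[0]), seqs[0][:5])  # 100 [1, 2, 0, 0, 0]
```

The resulting list of sequences can be wrapped in a NumPy array and passed straight to model.fit.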
This example uses two LSTM layers for binary sentiment analysis.
Python
model = Sequential()
model.add(Embedding(5000, 32, input_length=50))
model.add(LSTM(64, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
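With a single sigmoid unit, this model outputs one probability of "positive" per review, so classification reduces to applying a threshold (0.5 by convention). A small sketch with made-up model outputs:

```python
import numpy as np

# Hypothetical sigmoid outputs from model.predict: shape (samples, 1)
probabilities = np.array([[0.91], [0.08], [0.55]])
predicted = (probabilities > 0.5).astype(int).ravel()
print(predicted.tolist())  # [1, 0, 1]
```

Note also the return_sequences=True on the first LSTM layer: it makes that layer emit an output for every time step, which the second LSTM layer needs as its input.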
Sample Model

This program trains a small LSTM model to classify short sentences as positive or negative. It shows training accuracy and predicted classes.

Python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
from tensorflow.keras.utils import to_categorical

# Sample text data: sentences and labels
texts = ['I love machine learning', 'This is a great movie', 'I hate bad weather', 'This movie is terrible']
labels = [1, 1, 0, 0]  # 1=positive, 0=negative

# Simple word index
word_index = {'i':1, 'love':2, 'machine':3, 'learning':4, 'this':5, 'is':6, 'a':7, 'great':8, 'movie':9, 'hate':10, 'bad':11, 'weather':12, 'terrible':13}

# Convert texts to sequences of integers
max_length = 5
sequences = []
for text in texts:
    seq = [word_index[word.lower()] for word in text.split()]
    # Truncate to max_length, then pad with zeros if shorter
    seq = seq[:max_length]
    seq += [0] * (max_length - len(seq))
    sequences.append(seq)

X = np.array(sequences)
y = to_categorical(labels, num_classes=2)

# Build LSTM model
model = Sequential()
model.add(Embedding(input_dim=14, output_dim=8, input_length=max_length))
model.add(LSTM(16))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train model
history = model.fit(X, y, epochs=10, verbose=0)

# Predict on training data
predictions = model.predict(X)
predicted_classes = np.argmax(predictions, axis=1)

print(f'Training accuracy: {history.history["accuracy"][-1]:.2f}')
print('Predicted classes:', predicted_classes.tolist())
Important Notes

LSTMs are good at remembering the order of words, which matters for the meaning of a sentence.

Embedding size and LSTM units can be changed to improve model performance.

For real projects, use more data and proper text preprocessing.

Summary

LSTM models help understand text by remembering word order.

They are useful for tasks like sentiment analysis and text classification.

Embedding layers convert words into numbers before LSTM processes them.