
RNN for text classification in NLP

Introduction

RNNs (recurrent neural networks) help computers understand sentences by reading words one at a time while remembering what came before. This makes them well suited to deciding which category a text belongs to, such as spam or not spam.

When you want to tell if an email is spam or not by reading its words.
When you want to find out if a movie review is positive or negative.
When you want to sort news articles into topics like sports or politics.
When you want to detect the mood of a tweet, like happy or sad.
Syntax
Python
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(SimpleRNN(units=hidden_units))
model.add(Dense(units=num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

Embedding turns words into numbers the model can understand.

SimpleRNN reads the words one by one to learn the order and meaning.
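"Reading words one by one" can be sketched in plain NumPy: at each step, a SimpleRNN mixes the current word's embedding vector with its previous hidden state. This is a minimal illustration with random stand-in weights, not the trained Keras implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_dim, hidden_units = 8, 4

# Stand-in weights; in Keras these are learned during training
W_x = rng.normal(size=(embedding_dim, hidden_units))  # input-to-hidden
W_h = rng.normal(size=(hidden_units, hidden_units))   # hidden-to-hidden
b = np.zeros(hidden_units)

def rnn_step(x_t, h_prev):
    # One SimpleRNN step: combine current word vector with previous state
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

# Process a "sentence" of 5 word vectors one by one
sentence = rng.normal(size=(5, embedding_dim))
h = np.zeros(hidden_units)
for x_t in sentence:
    h = rnn_step(x_t, h)

print(h.shape)  # (4,) — the final state summarizes the whole sentence
```

The final hidden state `h` is what the Dense layer classifies; because it is updated at every step, it carries information about the order of the words.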

Examples
This example builds a small RNN for classifying texts into 2 categories.
Python
model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
model.add(SimpleRNN(32))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
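A quick sanity check on this first model's size is to count its trainable parameters by hand; the totals below should match what `model.summary()` reports in Keras for these layer sizes.

```python
# Parameter counts for the first example, computed by hand
embedding = 1000 * 64              # vocab_size x embedding_dim
rnn = 64 * 32 + 32 * 32 + 32       # input weights + recurrent weights + bias
dense = 32 * 2 + 2                 # weights + bias
print(embedding, rnn, dense)       # 64000 3104 66
```

Note that the Embedding layer dominates the parameter count, which is typical for small text models.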
This example uses a bigger vocabulary and longer texts for 3 classes.
Python
model = Sequential()
model.add(Embedding(5000, 128, input_length=20))
model.add(SimpleRNN(64))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
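Both examples assume every input has a fixed length (10 and 20 word indexes, respectively). Real texts vary in length, so shorter ones are padded with zeros and longer ones truncated. Keras provides `pad_sequences` for this; the idea can be sketched in plain NumPy:

```python
import numpy as np

def pad(sequences, max_length):
    # Pad short sequences with 0 and truncate long ones
    out = np.zeros((len(sequences), max_length), dtype=int)
    for i, seq in enumerate(sequences):
        trimmed = seq[:max_length]
        out[i, :len(trimmed)] = trimmed
    return out

# Made-up word indexes for three texts of different lengths
texts_as_indexes = [[4, 12, 7], [9, 3, 15, 2, 8, 1], [6]]
x = pad(texts_as_indexes, max_length=5)
print(x)
# [[ 4 12  7  0  0]
#  [ 9  3 15  2  8]
#  [ 6  0  0  0  0]]
```

Index 0 is conventionally reserved for padding, which is why the sample data below also uses 0 as filler.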
Sample Model

This program trains a simple RNN to classify 6 short texts into 2 groups. It shows training accuracy and predicted classes.

Python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense
from tensorflow.keras.utils import to_categorical

# Sample data: 6 texts, each with 5 words (word indexes)
x_train = np.array([
    [1, 2, 3, 4, 0],
    [2, 3, 4, 5, 0],
    [1, 3, 5, 0, 0],
    [4, 5, 6, 7, 0],
    [5, 6, 7, 8, 0],
    [6, 7, 8, 9, 0]
])

# Labels: 2 classes (0 or 1)
y_train = np.array([0, 0, 0, 1, 1, 1])
y_train_cat = to_categorical(y_train, num_classes=2)

vocab_size = 10  # words numbered 0-9
embedding_dim = 8
max_length = 5
hidden_units = 4
num_classes = 2

model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(SimpleRNN(units=hidden_units))
model.add(Dense(units=num_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train model
history = model.fit(x_train, y_train_cat, epochs=10, verbose=0)

# Predict on training data
predictions = model.predict(x_train)
predicted_classes = np.argmax(predictions, axis=1)

print(f"Training accuracy: {history.history['accuracy'][-1]:.2f}")
print(f"Predicted classes: {predicted_classes.tolist()}")
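The last two steps of the program, turning softmax probabilities into class labels with argmax, can be illustrated on their own with made-up scores:

```python
import numpy as np

def softmax(z):
    # Subtract the row max for numerical stability
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Made-up raw scores (logits) for 3 texts across 2 classes
logits = np.array([[2.0, 0.5], [0.1, 1.2], [3.0, 3.0]])
probs = softmax(logits)                 # each row sums to 1
predicted = np.argmax(probs, axis=1)    # pick the most likely class
print(predicted.tolist())               # [0, 1, 0]
```

Each row of `probs` is a probability distribution over the classes, and ties (like the last row) resolve to the first class by argmax convention.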
Important Notes

RNNs read words in order, so they capture sentence flow better than models that ignore word order, such as bag-of-words approaches.

The Embedding layer converts word indexes into dense vectors, so words with similar meanings can learn similar representations.

Training on small data is just for learning; real tasks need more data for good results.

Summary

RNNs are good for reading text word by word to classify it.

Embedding layers turn words into numbers the model can learn from.

SimpleRNN layer helps the model remember word order and context.