RNNs help computers understand sentences by reading words one by one, which makes them useful for deciding what category a text belongs to, such as spam or not spam.
RNN for text classification in NLP
Introduction
RNNs are a good fit for text classification tasks such as:
Telling whether an email is spam or not by reading its words.
Judging whether a movie review is positive or negative.
Sorting news articles into topics such as sports or politics.
Detecting the mood of a tweet, such as happy or sad.
Syntax
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(SimpleRNN(units=hidden_units))
model.add(Dense(units=num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
The Embedding layer turns word indexes into dense vectors the model can learn from.
SimpleRNN reads those vectors one by one, in order, to learn the order and meaning; the final Dense layer with softmax outputs a probability for each class.
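The Embedding layer expects each text to arrive as a fixed-length list of word indexes. As a minimal pure-Python sketch of that preprocessing step (in practice you would use Keras's Tokenizer and pad_sequences utilities, which also handle lowercasing and punctuation), building a vocabulary and padding might look like this:

```python
# Minimal sketch of tokenization + padding; index 0 is reserved for padding.
texts = ["this movie was great", "this movie was terrible and boring"]

# Build a word-to-index vocabulary from the training texts.
vocab = {}
for text in texts:
    for word in text.split():
        if word not in vocab:
            vocab[word] = len(vocab) + 1

max_length = 6

def encode(text):
    """Map words to indexes, then pad (or truncate) to max_length."""
    ids = [vocab[w] for w in text.split()][:max_length]
    return ids + [0] * (max_length - len(ids))

x = [encode(t) for t in texts]
print(x)  # [[1, 2, 3, 4, 0, 0], [1, 2, 3, 5, 6, 7]]
```

The resulting lists of indexes are exactly the kind of input the `x_train` array in the sample model below contains.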
Examples
This example builds a small RNN for classifying texts into 2 categories.
model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
model.add(SimpleRNN(32))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
This example uses a bigger vocabulary and longer texts for 3 classes.
model = Sequential()
model.add(Embedding(5000, 128, input_length=20))
model.add(SimpleRNN(64))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
Sample Model
This program trains a simple RNN to classify 6 short texts into 2 groups, then prints the final training accuracy and the predicted class for each text.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, Dense
from tensorflow.keras.utils import to_categorical

# Sample data: 6 texts, each with 5 words (word indexes)
x_train = np.array([
    [1, 2, 3, 4, 0],
    [2, 3, 4, 5, 0],
    [1, 3, 5, 0, 0],
    [4, 5, 6, 7, 0],
    [5, 6, 7, 8, 0],
    [6, 7, 8, 9, 0]
])

# Labels: 2 classes (0 or 1)
y_train = np.array([0, 0, 0, 1, 1, 1])
y_train_cat = to_categorical(y_train, num_classes=2)

vocab_size = 10  # words numbered 0-9
embedding_dim = 8
max_length = 5
hidden_units = 4
num_classes = 2

model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length))
model.add(SimpleRNN(units=hidden_units))
model.add(Dense(units=num_classes, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train model
history = model.fit(x_train, y_train_cat, epochs=10, verbose=0)

# Predict on training data
predictions = model.predict(x_train)
predicted_classes = np.argmax(predictions, axis=1)

print(f"Training accuracy: {history.history['accuracy'][-1]:.2f}")
print(f"Predicted classes: {predicted_classes.tolist()}")
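The program above converts integer labels to one-hot rows with to_categorical before training. As a minimal pure-Python sketch of what that conversion does (the Keras utility returns a NumPy array, but the values are the same):

```python
# Minimal sketch of one-hot encoding, as done by to_categorical.
def one_hot(labels, num_classes):
    """Turn a list of class ids into one-hot rows."""
    return [[1 if i == label else 0 for i in range(num_classes)]
            for label in labels]

y_train = [0, 0, 0, 1, 1, 1]
print(one_hot(y_train, 2))
# [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
```

One-hot rows are what the softmax output layer and the categorical_crossentropy loss expect: one probability slot per class.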
Important Notes
Because RNNs read words in order, they capture sentence flow better than models that ignore word order.
The Embedding layer converts words into dense vectors that capture their meaning.
Training on such a tiny dataset is only for learning the mechanics; real tasks need much more data for good results.
Summary
RNNs are good for reading text word by word to classify it.
Embedding layers turn words into numbers the model can learn from.
SimpleRNN layer helps the model remember word order and context.