
Visualizing embeddings (t-SNE) in NLP

Introduction

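t-SNE (t-distributed Stochastic Neighbor Embedding) projects high-dimensional word or sentence embeddings down to 2D or 3D points that can be plotted. The plot gives a simple view of which words or sentences the model treats as similar and which it treats as different.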

You want to see how words group together by meaning.
You have many embedding vectors and want to show them in 2D or 3D.
You want to check whether your embedding model learned sensible relationships.
You want to explain word similarities to friends or teammates.
Syntax
from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, perplexity=30, random_state=42)
embeddings_2d = tsne.fit_transform(embeddings)

n_components sets the output dimension (2D or 3D).

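perplexity is roughly the number of nearest neighbors each point takes into account. Low values emphasize local structure, and higher values bring in more global structure. Typical values are between 5 and 50, and perplexity must be smaller than the number of samples.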
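Because perplexity has to stay below the number of samples, small data sets need a smaller value than the default of 30. The sketch below is one possible way to pick a safe value. The helper safe_perplexity is hypothetical (it is not part of scikit-learn), the (n_samples - 1) / 3 cap is a rule-of-thumb assumption, and the random vectors only stand in for real embeddings.

```python
import numpy as np
from sklearn.manifold import TSNE


def safe_perplexity(n_samples, preferred=30.0):
    """Hypothetical helper: return a perplexity strictly below n_samples."""
    if n_samples < 2:
        raise ValueError("t-SNE needs at least 2 samples")
    # Rule-of-thumb cap (an assumption, not a scikit-learn requirement)
    return float(min(preferred, max(1.0, (n_samples - 1) / 3)))


# 20 made-up 8-dimensional vectors standing in for real embeddings
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 8))

perplexity = safe_perplexity(len(embeddings))
tsne = TSNE(n_components=2, perplexity=perplexity, random_state=42)
embeddings_2d = tsne.fit_transform(embeddings)

print('perplexity used:', perplexity)
print('output shape:', embeddings_2d.shape)
```

With larger data sets the helper simply returns the preferred value, so the same call works for both toy and real inputs.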

Examples
Basic 2D t-SNE visualization of embeddings.
tsne = TSNE(n_components=2)
embeddings_2d = tsne.fit_transform(embeddings)
3D t-SNE with higher perplexity for more global structure.
tsne = TSNE(n_components=3, perplexity=40)
embeddings_3d = tsne.fit_transform(embeddings)
Sample Model

This code shows how to use t-SNE to turn 4D word embeddings into 2D points. It prints the new 2D points and draws a simple plot with word labels.

import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Toy 4-dimensional embeddings for 5 words (hand-picked for the demo)
embeddings = np.array([
    [0.1, 0.3, 0.5, 0.7],  # word1
    [0.2, 0.1, 0.4, 0.6],  # word2
    [0.9, 0.8, 0.7, 0.6],  # word3
    [0.85, 0.75, 0.65, 0.55],  # word4
    [0.15, 0.25, 0.35, 0.45]   # word5
])

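# Create t-SNE model. perplexity must be less than the number of samples
# (5 here), so the default of 30 raises a ValueError in recent scikit-learn.
tsne = TSNE(n_components=2, perplexity=2, random_state=0)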

# Transform embeddings to 2D
embeddings_2d = tsne.fit_transform(embeddings)

# Print 2D embeddings
print('2D embeddings:')
print(embeddings_2d)

# Plot the 2D embeddings
plt.scatter(embeddings_2d[:, 0], embeddings_2d[:, 1])
for i, txt in enumerate(['word1', 'word2', 'word3', 'word4', 'word5']):
    plt.annotate(txt, (embeddings_2d[i, 0], embeddings_2d[i, 1]))
plt.title('t-SNE visualization of word embeddings')
plt.xlabel('Dimension 1')
plt.ylabel('Dimension 2')
plt.grid(True)
plt.show()
Important Notes

t-SNE is slow for very large data sets; use a smaller sample or other methods for big data.
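One way to keep the run time down is to plot a random sample instead of the full set. The sketch below also reduces the dimensions with PCA before t-SNE, which is a common extra step and an addition to the advice above, not part of it. The array sizes, the sample size of 200, and the 10 PCA components are all assumptions chosen for a quick demo.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Made-up stand-in for a large embedding matrix (1000 items, 50 dimensions)
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 50))

# Step 1: keep a random sample of rows, without repeats
sample_size = 200
idx = rng.choice(len(embeddings), size=sample_size, replace=False)
sample = embeddings[idx]

# Step 2 (optional): shrink the feature dimension with PCA first
reduced = PCA(n_components=10, random_state=0).fit_transform(sample)

# Step 3: run t-SNE on the smaller input
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
sample_2d = tsne.fit_transform(reduced)

print('plotted points:', sample_2d.shape)
```

Keep idx around if you need to look up which words the plotted points belong to.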

Results can change each run unless you set random_state for repeatability.
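A quick way to check repeatability is to run t-SNE twice with the same seed and compare the results. This is a minimal sketch on made-up vectors. It sets init='random' so the starting layout depends on the seed, and method='exact' only because the data is tiny; both are choices made for this demo.

```python
import numpy as np
from sklearn.manifold import TSNE

# 30 made-up 10-dimensional vectors standing in for real embeddings
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(30, 10))


def project(seed):
    """Run t-SNE with a fixed seed (demo settings, see the text above)."""
    tsne = TSNE(n_components=2, perplexity=5, init='random',
                method='exact', random_state=seed)
    return tsne.fit_transform(embeddings)


run_a = project(0)
run_b = project(0)

# Same seed, same input: the two layouts should match
same_layout = bool(np.allclose(run_a, run_b, atol=1e-3))
print('same seed gives same layout:', same_layout)
```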

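t-SNE preserves local neighborhoods rather than exact distances. The axes have no units, and the size of a cluster or the gap between two clusters is not reliable. Use it to explore patterns, not to measure exact values.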
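Since the plot is only a guide, it can help to confirm a grouping you see there with a measurement in the original space. The sketch below is one possible cross-check, assuming cosine similarity is a suitable measure for your embeddings. It reuses the toy vectors and labels from the sample above and lists each word's closest neighbor.

```python
import numpy as np

words = ['word1', 'word2', 'word3', 'word4', 'word5']
embeddings = np.array([
    [0.1, 0.3, 0.5, 0.7],      # word1
    [0.2, 0.1, 0.4, 0.6],      # word2
    [0.9, 0.8, 0.7, 0.6],      # word3
    [0.85, 0.75, 0.65, 0.55],  # word4
    [0.15, 0.25, 0.35, 0.45],  # word5
])

# Scale each row to length 1 so a dot product equals cosine similarity
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
similarity = unit @ unit.T

# Ignore each word's similarity with itself
np.fill_diagonal(similarity, -np.inf)

# Closest neighbor of every word in the original 4D space
nearest = {word: words[j] for word, j in zip(words, similarity.argmax(axis=1))}
for word, neighbor in nearest.items():
    print(word, '->', neighbor)
```

If two words sit together in the plot but are not close by this measure, treat the grouping in the picture with caution.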

Summary

t-SNE turns high-dimensional word embeddings into easy-to-read 2D or 3D pictures.

You can use it to check if words with similar meanings group together.

It works best on small to medium data sets and needs some tuning, especially of perplexity.