
t-SNE for visualization in ML Python

Introduction
t-SNE (t-distributed Stochastic Neighbor Embedding) helps us see complex data by mapping many features down to just two or three dimensions, so patterns and groups become easier to spot.
Use t-SNE when you want to:
See how different groups of data look in a simple 2D or 3D picture.
Find hidden clusters or groups in data with many features.
Check whether your data points are similar or different before building a model.
Explore high-dimensional data such as images or text in a visual way.
Syntax
ML Python
from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_embedded = tsne.fit_transform(X)
n_components sets how many dimensions you want to reduce to (usually 2 or 3 for visualization).
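perplexity is roughly the number of nearest neighbors each point pays attention to, which sets how t-SNE balances local and global data structure; typical values are between 5 and 50, and it must be smaller than the number of samples.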
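To get a feel for the two parameters described above, the sketch below runs t-SNE on the Iris data at three perplexity values and records the output shape and the final KL divergence (the `kl_divergence_` attribute of the fitted model). The choice of dataset and the specific perplexity values are assumptions made for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

X = load_iris().data  # 150 samples, 4 features

results = {}
for perplexity in (5, 30, 50):  # illustrative values; each must be < number of samples
    tsne = TSNE(n_components=2, perplexity=perplexity, random_state=42)
    X_embedded = tsne.fit_transform(X)
    # The output always has shape (n_samples, n_components)
    results[perplexity] = (X_embedded.shape, tsne.kl_divergence_)
    print(f"perplexity={perplexity}: shape={X_embedded.shape}, "
          f"KL divergence={tsne.kl_divergence_:.3f}")
```

KL divergence values obtained at different perplexities are not directly comparable, so treat them as a convergence check rather than a way to rank settings; plotting each embedding is usually the more informative comparison.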
Examples
Basic 2D t-SNE with default settings.
ML Python
tsne = TSNE(n_components=2)
X_embedded = tsne.fit_transform(X)
3D t-SNE with higher perplexity and fixed random state for reproducibility.
ML Python
tsne = TSNE(n_components=3, perplexity=40, random_state=0)
X_embedded = tsne.fit_transform(X)
Sample Model
This code loads the Iris flower dataset, applies t-SNE to reduce its 4 features to 2, and then plots the points colored by their species to show how t-SNE groups similar flowers.
ML Python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Load sample data
iris = load_iris()
X = iris.data
labels = iris.target

# Create t-SNE model
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
X_embedded = tsne.fit_transform(X)

# Plot the results
plt.figure(figsize=(6,5))
for label in set(labels):
    plt.scatter(X_embedded[labels == label, 0], X_embedded[labels == label, 1], label=iris.target_names[label])
plt.legend()
plt.title('t-SNE visualization of Iris dataset')
plt.xlabel('t-SNE feature 1')
plt.ylabel('t-SNE feature 2')
plt.tight_layout()
plt.show()
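A plot shows the groups visually, but it can help to pair it with a number. One option, sketched here under the assumption that a neighborhood-preservation score is what you care about, is scikit-learn's `trustworthiness` function, which returns a value between 0 and 1 indicating how well each point's nearest neighbors in the 2D embedding match its neighbors in the original data. The choice of `n_neighbors=5` is illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE, trustworthiness

X = load_iris().data

# Same settings as the plotting example
X_embedded = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(X)

# Values closer to 1 mean local neighborhoods are better preserved
score = trustworthiness(X, X_embedded, n_neighbors=5)
print(f"trustworthiness: {score:.3f}")
```

This score only describes local structure; distances between well-separated groups in a t-SNE plot should still be read with caution.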
Important Notes
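t-SNE is good for visualization but not for general dimensionality reduction for modeling: scikit-learn's TSNE has no transform method, so new data cannot be projected into an existing embedding.
It can be slow on very large datasets; consider sampling, reducing the number of features first (for example with PCA), or using faster alternatives if needed.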
Results can vary between runs unless you set random_state for reproducibility.
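The notes on speed and reproducibility can be sketched together. The example below draws a random subsample of the digits dataset, reduces it with PCA before t-SNE, and then compares runs with the same and different `random_state` values. The dataset, the sample size of 500, and the 30 PCA components are assumptions chosen for illustration. `init="random"` is set explicitly so that the seed visibly matters; recent scikit-learn releases default to a PCA-based initialization, which tends to be much less sensitive to the seed.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X = load_digits().data  # 1797 samples, 64 features

# Sampling: keep a random subset of rows to cut run time
rng = np.random.default_rng(0)
idx = rng.choice(X.shape[0], size=500, replace=False)
X_small = X[idx]

# Pre-reduction: shrink the feature count before running t-SNE
X_reduced = PCA(n_components=30, random_state=0).fit_transform(X_small)

def embed(data, seed):
    """Run 2D t-SNE with random initialization and a fixed seed."""
    tsne = TSNE(n_components=2, perplexity=30, init="random", random_state=seed)
    return tsne.fit_transform(data)

emb_a = embed(X_reduced, seed=0)
emb_b = embed(X_reduced, seed=0)  # same seed as emb_a
emb_c = embed(X_reduced, seed=1)  # different seed

print("same seed, matching embeddings:", np.allclose(emb_a, emb_b))
print("different seed, matching embeddings:", np.allclose(emb_a, emb_c))
```

Embeddings from different seeds usually show the same groups but rotated, flipped, or rearranged, so compare the cluster structure rather than the raw coordinates.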
Summary
t-SNE turns complex data into simple 2D or 3D pictures to help us see patterns.
It works well to find groups or clusters in data with many features.
Use it mainly for exploring and understanding data visually, not for final modeling.