Dimensionality reduction lets us visualize complex, high-dimensional data as simple 2D or 3D pictures. It makes datasets with many features easier to understand and explore.
Dimensionality reduction visualization in Python (ML)
Introduction
When you want to explore and understand high-dimensional data like images or text.
When you want to show data patterns or groups clearly in a simple plot.
When you want to prepare data for other machine learning tasks by reducing noise.
When you want to check if your data has natural clusters or groups.
When you want to explain your data to others using easy-to-see visuals.
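As a sketch of the noise-reduction use case above (the data and variable names here are synthetic, chosen only for illustration): PCA can keep just the components that explain most of the variance, and inverse_transform projects the reduced data back into the original feature space as a smoothed approximation.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data for illustration: 200 samples, 10 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# A float n_components keeps enough components to explain 95% of the variance
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

# Project back to the original feature space: a lower-noise approximation of X
X_denoised = pca.inverse_transform(X_reduced)
print(X.shape, X_reduced.shape, X_denoised.shape)
```

The reconstructed array has the original shape but only contains the variation captured by the kept components, which is why this works as a simple denoising step.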
Syntax
Python

from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# For PCA
pca = PCA(n_components=2)
reduced_data = pca.fit_transform(data)

# For t-SNE
tsne = TSNE(n_components=2, random_state=42)
reduced_data = tsne.fit_transform(data)
PCA is a fast linear method that finds the directions of greatest variance in the data.
t-SNE is slower but often separates clusters more clearly in complex, nonlinear data.
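To make "directions of greatest variance" concrete, here is a minimal sketch on synthetic data: after fitting, pca.components_ holds one unit-length direction vector per component, and explained_variance_ratio_ reports how much variance each direction captures.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic 2-feature data, stretched much more along the first axis
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

pca = PCA(n_components=2).fit(X)

# Each row of components_ is a unit vector giving a principal direction
print(pca.components_)
# The first direction captures most of the variance here
print(pca.explained_variance_ratio_)
```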
Examples
This reduces data X to 2 dimensions using PCA.
Python

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
reduced = pca.fit_transform(X)
This reduces data X to 2 dimensions using t-SNE with a fixed random seed.
Python

from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, random_state=0)
reduced = tsne.fit_transform(X)
Sample Program
This program loads the Iris flower data, reduces its 4 features to 2 using both PCA and t-SNE, then plots the two results side by side. It also prints how much of the data's variance PCA kept.
Python

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Load sample data
iris = load_iris()
X = iris.data
labels = iris.target

# Reduce dimensions with PCA
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Reduce dimensions with t-SNE
tsne = TSNE(n_components=2, random_state=42)
X_tsne = tsne.fit_transform(X)

# Plot PCA result
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=labels, cmap='viridis')
plt.title('PCA Visualization')
plt.xlabel('PC1')
plt.ylabel('PC2')

# Plot t-SNE result
plt.subplot(1, 2, 2)
plt.scatter(X_tsne[:, 0], X_tsne[:, 1], c=labels, cmap='viridis')
plt.title('t-SNE Visualization')
plt.xlabel('Dim 1')
plt.ylabel('Dim 2')

plt.tight_layout()
plt.show()

# Print explained variance ratio for PCA
print(f"PCA explained variance ratios: {pca.explained_variance_ratio_}")
Important Notes
t-SNE results can change between runs unless you set random_state.
PCA shows how much data variance is kept with explained_variance_ratio_.
Visualizing helps find groups or patterns that are hard to see in many dimensions.
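Building on the notes above, here is a quick check of how much variance PCA kept on the Iris data (the same dataset used in the sample program):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

pca = PCA(n_components=2)
pca.fit(X)

# Fraction of total variance captured by each of the two components
print(pca.explained_variance_ratio_)
# Total information kept by the 2D projection
print(pca.explained_variance_ratio_.sum())
```

For Iris, the two components together keep well over 90% of the variance, which is why the 2D PCA plot is a faithful picture of the 4-feature data.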
Summary
Dimensionality reduction turns many features into 2 or 3 for easy viewing.
PCA is fast and shows main data directions; t-SNE shows clusters well but is slower.
Visual plots help understand and explain complex data simply.