
UMAP for dimensionality reduction in ML Python - ML Experiment: Train & Evaluate

Experiment - UMAP for dimensionality reduction
Problem: You have a dataset with many features, which makes it hard to visualize or analyze. You want to reduce the data to 2 dimensions while keeping its important structure.
Current Metrics: No dimensionality reduction has been applied yet; the visualization is cluttered and unclear.
Issue: High-dimensional data is difficult to visualize and understand. The dimensions need to be reduced without losing important information.
Your Task
Use UMAP to reduce the dataset from high dimensions to 2 dimensions and visualize the result clearly.
Use UMAP only for dimensionality reduction.
Keep the original dataset unchanged.
Visualize the 2D output with a scatter plot colored by class labels.
Solution
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
import umap  # provided by the umap-learn package

# Load example dataset
iris = load_iris()
X = iris.data
y = iris.target

# Create UMAP reducer (n_components=2 makes the 2D target explicit)
reducer = umap.UMAP(n_components=2, n_neighbors=15, min_dist=0.1, random_state=42)

# Fit and transform data
X_umap = reducer.fit_transform(X)

# Plot the 2D UMAP result
plt.figure(figsize=(8,6))
scatter = plt.scatter(X_umap[:, 0], X_umap[:, 1], c=y, cmap='viridis', s=50)
plt.title('UMAP projection of Iris dataset')
plt.xlabel('UMAP Dimension 1')
plt.ylabel('UMAP Dimension 2')
plt.colorbar(scatter, label='Class label', ticks=[0, 1, 2])
plt.grid(True)
plt.show()
Imported umap library and used UMAP class for dimensionality reduction.
Set n_neighbors=15 and min_dist=0.1 (UMAP's default values) to balance local and global structure, and fixed random_state=42 so the result is reproducible.
Transformed original data to 2D using UMAP.
Visualized the 2D output with a scatter plot colored by class labels.
Results Interpretation

Before UMAP: Data in 4D space, hard to visualize or interpret.

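After UMAP: Data projected to 2D, where points group largely by class label. On Iris, setosa typically forms a clearly separate cluster, while versicolor and virginica sit close together and may partly overlap.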

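UMAP reduces high-dimensional data to a few dimensions while preserving much of its neighborhood structure, which helps us visualize and understand complex data. The two UMAP axes have no meaning of their own, so read the plot in terms of which points end up near each other.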
Bonus Experiment
Try changing the n_neighbors and min_dist parameters to see how the UMAP visualization changes.
💡 Hint
A lower n_neighbors focuses on local structure, while higher values capture more global structure. min_dist controls how tightly points are packed in the embedding: smaller values produce denser clusters.