Introduction
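UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique. It maps data with many features down to a few dimensions, usually two or three, so that patterns in the data are easier to see and plot.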
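import umap
# data is an array of shape (n_samples, n_features)
# 15 neighbors, 2D output, Euclidean distance
reducer = umap.UMAP(n_neighbors=15, n_components=2, metric='euclidean')
embedding = reducer.fit_transform(data)

import umap
# 10 neighbors, 2D output, default metric
reducer = umap.UMAP(n_neighbors=10, n_components=2)
embedding = reducer.fit_transform(data)

import umap
# 30 neighbors, 3D output, Manhattan distance
reducer = umap.UMAP(n_neighbors=30, n_components=3, metric='manhattan')
embedding = reducer.fit_transform(data)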
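To make the n_neighbors and metric parameters more concrete, here is a minimal pure-Python sketch of a nearest-neighbor lookup. It is an illustration only and not how umap-learn works internally (the library uses a faster approximate search). The function names and sample points are made up for this example.

```python
import math

def euclidean(a, b):
    # Straight-line distance between two points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # Sum of absolute differences along each axis
    return sum(abs(x - y) for x, y in zip(a, b))

def nearest_neighbors(points, index, n_neighbors, metric):
    """Return indices of the n_neighbors points closest to points[index]."""
    distances = [
        (metric(points[index], other), j)
        for j, other in enumerate(points)
        if j != index  # a point is not counted as its own neighbor here
    ]
    distances.sort()
    return [j for _, j in distances[:n_neighbors]]

# Hypothetical toy data: four 2D points
points = [(0.0, 0.0), (3.0, 3.0), (0.0, 5.0), (10.0, 10.0)]

# Same point, same n_neighbors, different metric
print(nearest_neighbors(points, 0, 1, euclidean))
print(nearest_neighbors(points, 0, 1, manhattan))
```

With these points, the closest neighbor of the first point changes with the metric: (3, 3) is nearer under Euclidean distance, while (0, 5) is nearer under Manhattan distance. This is why changing metric can change the neighborhoods UMAP works with, and therefore the embedding.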
import numpy as np
import umap
from sklearn.datasets import load_iris

# Load sample data
iris = load_iris()
data = iris.data

# Create UMAP reducer
reducer = umap.UMAP(n_neighbors=15, n_components=2, metric='euclidean', random_state=42)

# Fit and transform data
embedding = reducer.fit_transform(data)

# Print shape and first 5 points
print('Embedding shape:', embedding.shape)
print('First 5 points of embedding:')
print(embedding[:5])
Quick check: with n_components=2, a dataset with 100 samples and 50 features is reduced to an embedding of shape (100, 2).
When the goal is a 3D embedding, watch for these parameter choices:
n_components=2, n_neighbors=50 (for maximum neighbor info) gives a 2D embedding, not 3D.
n_components=3, n_neighbors=1000 (to use all samples as neighbors) uses too many neighbors, slowing computation.
n_components=10, n_neighbors=5 (for detailed high dimensions) gives 10 components, not 3D.
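The parameter checks above can be sketched as a small helper that runs before fitting. Both function names are hypothetical and not part of umap-learn. The only rules encoded are that the embedding has one row per sample and n_components columns, and that a point cannot have more neighbors than there are other points in the dataset.

```python
def embedding_shape(n_samples, n_components):
    # fit_transform returns one row per sample and one column per component
    return (n_samples, n_components)

def check_umap_settings(n_samples, n_components, n_neighbors, target_dims):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if n_components != target_dims:
        problems.append(
            f"n_components={n_components} gives a {n_components}D embedding, "
            f"not {target_dims}D"
        )
    if n_neighbors > n_samples - 1:
        problems.append(
            f"n_neighbors={n_neighbors} exceeds the {n_samples - 1} other "
            f"points available"
        )
    return problems

print(embedding_shape(100, 2))
print(check_umap_settings(n_samples=1000, n_components=2, n_neighbors=50, target_dims=3))
print(check_umap_settings(n_samples=1000, n_components=3, n_neighbors=15, target_dims=3))
```

A helper like this only catches mismatches in the settings. It says nothing about whether a given n_neighbors value produces a useful embedding, which still has to be judged by looking at the result.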