0
0
ML Pythonml~20 mins

t-SNE for visualization in ML Python - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - t-SNE for visualization
Problem:You want to visualize high-dimensional data in 2D to understand its structure. Currently, you use t-SNE with default parameters on the digits dataset, but the clusters overlap and are not well separated.
Current Metrics:Visual inspection shows overlapping clusters with unclear separation in the 2D plot.
Issue:The t-SNE visualization is not clear enough to distinguish different digit groups due to default perplexity and learning rate settings.
Your Task
Improve the t-SNE visualization so that clusters of different digits are more distinct and separated in the 2D plot.
You can only change t-SNE hyperparameters like perplexity, learning rate, and number of iterations.
Do not change the dataset or use other dimensionality reduction methods.
Hint 1
Hint 2
Hint 3
Solution
ML Python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Load digits dataset
digits = load_digits()
X = digits.data
labels = digits.target

# Configure t-SNE with improved hyperparameters
tsne = TSNE(n_components=2, perplexity=30, learning_rate=200, n_iter=1000, random_state=42)
X_embedded = tsne.fit_transform(X)

# Plot the 2D t-SNE result
plt.figure(figsize=(8, 6))
scatter = plt.scatter(X_embedded[:, 0], X_embedded[:, 1], c=labels, cmap='tab10', s=15)
plt.colorbar(scatter, ticks=range(10), label='Digit Label')
plt.title('t-SNE Visualization of Digits Dataset')
plt.xlabel('t-SNE Dimension 1')
plt.ylabel('t-SNE Dimension 2')
plt.grid(True)
plt.show()
Set perplexity to 30 to balance local and global data structure.
Increased learning rate to 200 for better optimization steps.
Increased number of iterations to 1000 for more stable convergence.
Results Interpretation

Before: Clusters overlap and are hard to distinguish.

After: Clusters are more distinct and separated, making digit groups easier to identify.

Adjusting t-SNE hyperparameters like perplexity, learning rate, and iterations can significantly improve the quality of visualization by better capturing the data's structure.
Bonus Experiment
Try using PCA to reduce the data to 50 dimensions before applying t-SNE and observe if the visualization improves.
💡 Hint
Reducing dimensions with PCA can remove noise and speed up t-SNE while preserving important structure.