t-SNE for visualization in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding t-SNE's Purpose

What is the main goal of using t-SNE in data analysis?

A. To perform linear regression on high-dimensional data
B. To increase the number of features in the dataset for better model training
C. To reduce high-dimensional data to a lower dimension for visualization while preserving local structure
D. To cluster data points into predefined groups based on labels
💡 Hint

Think about what t-SNE does to data dimensions and what it tries to keep intact.

Predict Output
intermediate
t-SNE Output Shape

Given the following Python code using sklearn's t-SNE, what is the shape of tsne_result?

ML Python
from sklearn.manifold import TSNE
import numpy as np

X = np.random.rand(100, 50)  # 100 samples, 50 features
model = TSNE(n_components=2, random_state=42)
tsne_result = model.fit_transform(X)
print(tsne_result.shape)
A. (100, 2)
B. (50, 2)
C. (2, 100)
D. (100, 50)
💡 Hint

Remember, t-SNE reduces features but keeps the number of samples.

Hyperparameter
advanced
Effect of Perplexity in t-SNE

Which statement best describes the effect of increasing the perplexity parameter in t-SNE?

A. Higher perplexity reduces the number of iterations needed for convergence
B. Higher perplexity decreases the number of output dimensions
C. Higher perplexity increases the learning rate automatically
D. Higher perplexity considers more neighbors, leading to a more global view of the data structure
💡 Hint

Think about how perplexity relates to neighborhood size in t-SNE.
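
To make the neighborhood-size idea concrete, here is a minimal sketch that embeds the same data at several perplexity values. The synthetic dataset and the particular values (5, 30, 50) are assumptions chosen for illustration, not recommendations.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic data (an assumption for illustration): 150 samples, 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 10))

embeddings = {}
for perplexity in (5, 30, 50):
    # Perplexity loosely controls how many neighbors each point takes into
    # account; it must stay below the number of samples.
    model = TSNE(n_components=2, perplexity=perplexity, random_state=0)
    embeddings[perplexity] = model.fit_transform(X)
    # kl_divergence_ is the final value of the objective for this run.
    print(perplexity, embeddings[perplexity].shape, model.kl_divergence_)
```

Plotting the three embeddings side by side is the usual way to compare them. KL divergence values from runs with different perplexities are not directly comparable, so treat them only as a per-run diagnostic.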

Metrics
advanced
Evaluating t-SNE Visualization Quality

Which metric is commonly used to evaluate how well t-SNE preserves local structure in the reduced space?

A. K-nearest neighbor preservation score
B. Mean squared error between original and reduced data
C. Silhouette score of clusters in original high-dimensional space
D. Accuracy of a classification model trained on reduced data
💡 Hint

Consider metrics that measure neighborhood consistency after reduction.
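
One closely related built-in measure in scikit-learn is sklearn.manifold.trustworthiness, which scores how well the nearest neighbors in the embedding match those in the original space. Treating it as a stand-in for neighborhood preservation is an assumption of this sketch, and the random data is purely illustrative.

```python
import numpy as np
from sklearn.manifold import TSNE, trustworthiness

# Synthetic data (an assumption for illustration): 200 samples, 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))

X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# The score lies in [0, 1]; higher means points that are neighbors in the
# 2D embedding were also neighbors in the original space.
score = trustworthiness(X, X_2d, n_neighbors=5)
print(f"trustworthiness (k=5): {score:.3f}")
```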

🔧 Debug
expert
Identifying t-SNE Runtime Error

What happens when this code runs, and why?

from sklearn.manifold import TSNE
import numpy as np

X = np.random.rand(10, 5)  # 10 samples, 5 features
# perplexity must be smaller than the number of samples (10 here)
model = TSNE(n_components=3, perplexity=5, random_state=0)
result = model.fit_transform(X)
print(result.shape)
A. ValueError because n_components cannot be greater than 2 for t-SNE
B. No error, output shape will be (10, 3)
C. TypeError because input data X is not a pandas DataFrame
D. RuntimeWarning due to random_state not being set
💡 Hint

Check the allowed output dimensions for t-SNE in sklearn.
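
For context on the output-dimension limit: in scikit-learn the default method="barnes_hut" requires n_components to be less than 4, while method="exact" does not have that restriction. The sketch below probes both; the dataset size and perplexity=5 are assumptions picked to keep the run small.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
X = rng.random((30, 10))  # small illustrative dataset (assumption)

barnes_hut_error = None
try:
    # The default method is "barnes_hut", which supports at most 3 output dimensions.
    TSNE(n_components=4, perplexity=5, random_state=0).fit_transform(X)
except ValueError as exc:
    barnes_hut_error = str(exc)
print("barnes_hut with 4 components:", barnes_hut_error)

# The slower "exact" method accepts more output dimensions.
result = TSNE(n_components=4, perplexity=5, method="exact", random_state=0).fit_transform(X)
print(result.shape)
```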

Practice

1. What is the main purpose of using t-SNE in machine learning?
easy
A. To increase the number of features in the dataset
B. To train a predictive model for classification
C. To visualize high-dimensional data in 2D or 3D to find patterns
D. To clean and preprocess data by removing missing values

Solution

  1. Step 1: Understand t-SNE's function

    t-SNE is a tool that reduces many features into 2 or 3 dimensions for easy visualization.
  2. Step 2: Identify its main use

    It helps us see groups or clusters in complex data; it is not meant for training models or cleaning data.
  3. Final Answer:

    To visualize high-dimensional data in 2D or 3D to find patterns -> Option C
  4. Quick Check:

    t-SNE = visualization tool [OK]
Hint: t-SNE = visualize complex data simply [OK]
Common Mistakes:
  • Thinking t-SNE trains prediction models
  • Confusing t-SNE with data cleaning methods
  • Assuming t-SNE increases feature count
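
As a small illustration of the visualization use case, the sketch below embeds part of scikit-learn's bundled digits dataset into 2D and prints where each digit class lands on average. Using the first 300 images and summarizing with class centroids instead of a plot are choices made here to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()
X, y = digits.data[:300], digits.target[:300]  # 300 images, 64 features each

X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Mean 2D position of each digit class; well-separated centroids suggest
# that the classes form distinct groups in the embedding.
centroids = {int(d): X_2d[y == d].mean(axis=0) for d in np.unique(y)}
for digit, center in centroids.items():
    print(digit, np.round(center, 1))
```

In practice the two columns of X_2d would be drawn as a scatter plot colored by y.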
2. Which of the following is the correct way to import t-SNE from scikit-learn in Python?
easy
A. from sklearn.manifold import TSNE
B. import tsne from sklearn
C. from sklearn.decomposition import TSNE
D. import TSNE from sklearn.manifold

Solution

  1. Step 1: Recall correct import syntax

    scikit-learn's t-SNE is in the manifold module and imported as TSNE.
  2. Step 2: Check each option

    from sklearn.manifold import TSNE uses valid Python import syntax and the correct module. Options B and D use invalid syntax (import ... from ...), and option C names the wrong module (sklearn.decomposition).
  3. Final Answer:

    from sklearn.manifold import TSNE -> Option A
  4. Quick Check:

    Correct import = from sklearn.manifold import TSNE [OK]
Hint: t-SNE is in sklearn.manifold, import as TSNE [OK]
Common Mistakes:
  • Using wrong module like sklearn.decomposition
  • Incorrect import syntax causing errors
  • Confusing lowercase and uppercase in TSNE
3. What will be the shape of the output from the following code snippet?
from sklearn.manifold import TSNE
import numpy as np
X = np.random.rand(100, 50)
tsne = TSNE(n_components=2, random_state=42)
X_embedded = tsne.fit_transform(X)
print(X_embedded.shape)
medium
A. (50, 2)
B. (2, 100)
C. (100, 50)
D. (100, 2)

Solution

  1. Step 1: Understand input and t-SNE output

    Input X has 100 samples and 50 features. t-SNE reduces features to 2 dimensions.
  2. Step 2: Determine output shape

    Output shape is (number of samples, n_components) = (100, 2).
  3. Final Answer:

    (100, 2) -> Option D
  4. Quick Check:

    Output shape = (samples, components) [OK]
Hint: Output shape = (samples, n_components) [OK]
Common Mistakes:
  • Confusing features with samples in output shape
  • Swapping rows and columns in shape
  • Assuming output shape matches input shape
4. You run t-SNE on your dataset but get a ValueError: 'perplexity must be less than n_samples'. What is the likely cause and fix?
medium
A. Input data is not scaled; apply normalization
B. Perplexity is set too high; reduce it below number of samples
C. Random state is not set; set random_state parameter
D. Data contains missing values; remove or fill them

Solution

  1. Step 1: Understand the error message

    The error says perplexity must be less than the number of samples, so the configured perplexity is too large for this dataset.
  2. Step 2: Fix by adjusting perplexity

    Reduce perplexity parameter to a value smaller than the number of samples in your data.
  3. Final Answer:

    Perplexity is set too high; reduce it below number of samples -> Option B
  4. Quick Check:

    Perplexity < samples [OK]
Hint: Keep perplexity less than sample count [OK]
Common Mistakes:
  • Ignoring perplexity limits and increasing it
  • Trying to fix by scaling data instead
  • Changing unrelated parameters like random_state
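
The fix can be sketched as follows. Recent scikit-learn versions raise this ValueError when perplexity is not below the sample count (older versions may not). The helper pick_perplexity is hypothetical, and its "one third of the samples" cap is a heuristic assumed here, not a library rule.

```python
import numpy as np
from sklearn.manifold import TSNE

def pick_perplexity(n_samples, desired=30.0):
    """Hypothetical helper: cap perplexity so it stays well below n_samples."""
    return max(1.0, min(desired, (n_samples - 1) / 3))

rng = np.random.default_rng(0)
X = rng.random((10, 5))  # only 10 samples

try:
    # The default perplexity is 30, which is not less than 10 samples.
    TSNE(n_components=2, random_state=0).fit_transform(X)
except ValueError as exc:
    print("ValueError:", exc)

# Retry with a perplexity that fits the dataset size.
perplexity = pick_perplexity(X.shape[0])
X_2d = TSNE(n_components=2, perplexity=perplexity, random_state=0).fit_transform(X)
print(perplexity, X_2d.shape)
```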
5. You have a dataset with 1000 samples and 100 features. You want to visualize it with t-SNE but also keep track of clusters found by KMeans. Which approach is best?
hard
A. Run KMeans first, then apply t-SNE on original data, color points by cluster
B. Apply t-SNE first, then run KMeans on the 2D t-SNE output
C. Use t-SNE only, no clustering needed for visualization
D. Run KMeans on original data and use PCA instead of t-SNE

Solution

  1. Step 1: Understand the goal

    You want to visualize data and show meaningful clusters clearly on the 2D plot.
  2. Step 2: Choose correct order

    Running KMeans first on the original high-dimensional data groups points using their original feature distances; t-SNE then supplies the 2D layout, with each point colored by its cluster label.
  3. Step 3: Why not other options?

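    Clustering on t-SNE output (B) is suboptimal because t-SNE distorts distances and is meant for visualization only, not modeling. Option C drops the cluster information you wanted to track, and option D swaps out t-SNE even though t-SNE is the requested visualization.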
  4. Final Answer:

    Run KMeans first, then apply t-SNE on original data, color points by cluster -> Option A
  5. Quick Check:

    Cluster high-dim first, visualize after [OK]
Hint: Cluster original data first, then t-SNE visualize [OK]
Common Mistakes:
  • Clustering t-SNE output causing distorted clusters
  • Skipping clustering and missing group info
  • Using PCA instead of t-SNE unnecessarily
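
The recommended order can be sketched as below. The data is a synthetic stand-in generated with make_blobs (300 samples and 20 features instead of the 1000 by 100 in the question, to keep the run short), and n_clusters=4 is an assumption that matches how the data was generated.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Synthetic stand-in (assumption): 300 samples, 20 features, 4 groups.
X, _ = make_blobs(n_samples=300, n_features=20, centers=4, random_state=0)

# Step 1: cluster in the original high-dimensional space.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Step 2: embed the same original data in 2D for plotting.
X_2d = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Each row of X_2d is a point to draw; labels gives its color.
print(X_2d.shape, labels.shape)
```

To draw the result, X_2d[:, 0] and X_2d[:, 1] can be passed to a scatter plot with labels as the color argument, for example matplotlib's plt.scatter(X_2d[:, 0], X_2d[:, 1], c=labels).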