
t-SNE for visualization in ML Python - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to import the t-SNE class from scikit-learn.

ML Python
from sklearn.manifold import [1]
A. TSNE
B. tSNE
C. tsne
D. TSne
Common Mistakes
Using lowercase or incorrect capitalization for TSNE.
Trying to import from sklearn.cluster instead of sklearn.manifold.
2. Fill in the blank
medium

Complete the code to create a t-SNE object with 2 output dimensions.

ML Python
tsne = TSNE(n_components=[1])
A. 1
B. 3
C. 0
D. 2
Common Mistakes
Setting n_components to 1 or 3, which yields a 1D or 3D embedding rather than a 2D one.
Using 0, which is not a valid number of dimensions.
3. Fill in the blank
hard

Complete the code so that t-SNE is fitted to the data and the embedded points are returned.

ML Python
X_embedded = tsne.[1](X)
A. fit
B. fit_transform
C. transform
D. fit_predict
Common Mistakes
Using fit() alone, which returns the fitted estimator rather than the embedded data.
Using transform(), which TSNE does not provide.
Using fit_predict(), which is not a method of TSNE.
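The difference between these methods can be checked directly. The sketch below is a minimal illustration, assuming a small random dataset (30 samples, 8 features) and a perplexity of 5 chosen only to suit that size; none of these values come from the exercise.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical toy data: 30 samples with 8 features each.
rng = np.random.default_rng(0)
X = rng.random((30, 8))

# Perplexity must stay below the number of samples, so use a small value here.
tsne = TSNE(n_components=2, perplexity=5, random_state=0)

# fit_transform() fits the model and returns the embedding in one call.
X_embedded = tsne.fit_transform(X)

# The same embedding is also stored on the fitted estimator.
same_as_attribute = np.array_equal(X_embedded, tsne.embedding_)

# TSNE has no transform() method, so new points cannot be projected later.
has_transform = hasattr(tsne, "transform")

print(X_embedded.shape, same_as_attribute, has_transform)
```

Because there is no transform(), t-SNE has to be re-run whenever the dataset changes.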
4. Fill in the blank
hard

Fill both blanks to create a scatter plot of the t-SNE results with colors.

ML Python
plt.scatter(X_embedded[:, [1]], X_embedded[:, [2]], c=labels, cmap='viridis')
A. 0
B. 1
C. 2
D. 3
Common Mistakes
Using indices 1 and 2; index 2 is out of bounds for a two-column array and raises an IndexError.
Using 2 or 3, which are both out of range for a 2D embedding.
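For context, a complete plotting script might look like the sketch below. It assumes synthetic data from make_blobs (60 samples, 3 groups), a perplexity of 10, and the non-interactive Agg backend so that it runs without a display; those choices are illustrative and not part of the exercise.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no window is needed
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Hypothetical data: 60 points in 10 dimensions, drawn from 3 groups.
X, labels = make_blobs(n_samples=60, n_features=10, centers=3, random_state=0)

# Reduce to 2 dimensions; the result has exactly two columns (0 and 1).
X_embedded = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)

fig, ax = plt.subplots()
# Column 0 goes on the x-axis, column 1 on the y-axis; color encodes the label.
points = ax.scatter(X_embedded[:, 0], X_embedded[:, 1], c=labels, cmap="viridis")
ax.set_xlabel("t-SNE dimension 1")
ax.set_ylabel("t-SNE dimension 2")
fig.colorbar(points, ax=ax, label="label")
```

The axes of a t-SNE plot have no physical meaning, so the labels only identify which embedding column is shown.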
5. Fill in the blank
hard

Fill all three blanks to create a dictionary comprehension that maps each label to the count of points with that label.

ML Python
label_counts = {label: sum(1 for x in labels if x [1] label) for label in set(labels) if label [2] 0 and label [3] -1}
A. ==
B. >
C. !=
D. <
Common Mistakes
Using wrong comparison operators causing logic errors.
Confusing != and == in conditions.
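As a side note, counting points per label is often done with collections.Counter, which makes a single pass over the labels instead of rescanning the list for every distinct label. The sketch below is an alternative formulation, not the answer to the exercise; the sample labels and the choice to drop -1 (a common convention for noise points) are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical labels; -1 is treated as a "noise" marker in this sketch.
labels = [0, 1, 1, -1, 2, 2, 2, -1, 0]
NOISE_LABEL = -1  # assumption, not taken from the exercise

# Counter tallies every label in one pass; the comprehension drops the noise label.
label_counts = {
    label: count
    for label, count in Counter(labels).items()
    if label != NOISE_LABEL
}

print(label_counts)  # {0: 2, 1: 2, 2: 3}
```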

Practice

1. What is the main purpose of using t-SNE in machine learning?
easy
A. To increase the number of features in the dataset
B. To train a predictive model for classification
C. To visualize high-dimensional data in 2D or 3D to find patterns
D. To clean and preprocess data by removing missing values

Solution

  1. Step 1: Understand t-SNE's function

    t-SNE is a tool that reduces many features into 2 or 3 dimensions for easy visualization.
  2. Step 2: Identify its main use

    It helps us see groups or clusters in complex data; it is not meant for training models or cleaning data.
  3. Final Answer:

    To visualize high-dimensional data in 2D or 3D to find patterns -> Option C
  4. Quick Check:

    t-SNE = visualization tool [OK]
Hint: t-SNE = visualize complex data simply [OK]
Common Mistakes:
  • Thinking t-SNE trains prediction models
  • Confusing t-SNE with data cleaning methods
  • Assuming t-SNE increases feature count
2. Which of the following is the correct way to import t-SNE from scikit-learn in Python?
easy
A. from sklearn.manifold import TSNE
B. import tsne from sklearn
C. from sklearn.decomposition import TSNE
D. import TSNE from sklearn.manifold

Solution

  1. Step 1: Recall correct import syntax

    scikit-learn's t-SNE is in the manifold module and imported as TSNE.
  2. Step 2: Check each option

    Option A uses valid Python import syntax and the correct module. Options B and D use invalid syntax (import ... from ...), and option C names the wrong module.
  3. Final Answer:

    from sklearn.manifold import TSNE -> Option A
  4. Quick Check:

    Correct import = from sklearn.manifold import TSNE [OK]
Hint: t-SNE is in sklearn.manifold, import as TSNE [OK]
Common Mistakes:
  • Using wrong module like sklearn.decomposition
  • Incorrect import syntax causing errors
  • Confusing lowercase and uppercase in TSNE
3. What will be the shape of the output from the following code snippet?
from sklearn.manifold import TSNE
import numpy as np
X = np.random.rand(100, 50)
tsne = TSNE(n_components=2, random_state=42)
X_embedded = tsne.fit_transform(X)
print(X_embedded.shape)
medium
A. (50, 2)
B. (2, 100)
C. (100, 50)
D. (100, 2)

Solution

  1. Step 1: Understand input and t-SNE output

    Input X has 100 samples and 50 features. t-SNE reduces features to 2 dimensions.
  2. Step 2: Determine output shape

    Output shape is (number of samples, n_components) = (100, 2).
  3. Final Answer:

    (100, 2) -> Option D
  4. Quick Check:

    Output shape = (samples, components) [OK]
Hint: Output shape = (samples, n_components) [OK]
Common Mistakes:
  • Confusing features with samples in output shape
  • Swapping rows and columns in shape
  • Assuming output shape matches input shape
4. You run t-SNE on your dataset but get a ValueError: 'perplexity must be less than n_samples'. What is the likely cause and fix?
medium
A. Input data is not scaled; apply normalization
B. Perplexity is set too high; reduce it below number of samples
C. Random state is not set; set random_state parameter
D. Data contains missing values; remove or fill them

Solution

  1. Step 1: Understand the error message

    The error says perplexity must be less than number of samples, so perplexity is too large.
  2. Step 2: Fix by adjusting perplexity

    Reduce perplexity parameter to a value smaller than the number of samples in your data.
  3. Final Answer:

    Perplexity is set too high; reduce it below number of samples -> Option B
  4. Quick Check:

    Perplexity < samples [OK]
Hint: Keep perplexity less than sample count [OK]
Common Mistakes:
  • Ignoring perplexity limits and increasing it
  • Trying to fix by scaling data instead
  • Changing unrelated parameters like random_state
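One way to avoid this error is to derive the perplexity from the dataset size. The sketch below is a minimal illustration: pick_perplexity is a hypothetical helper, and its one-third cap is a conservative choice made for this example, not a scikit-learn rule. Recent scikit-learn versions raise the ValueError shown in the question; older versions may not, so the try block only prints the message if it occurs.

```python
import numpy as np
from sklearn.manifold import TSNE

def pick_perplexity(n_samples, requested=30.0):
    """Return a perplexity that stays strictly below n_samples.

    The one-third cap is a conservative choice for this sketch,
    not a scikit-learn requirement.
    """
    cap = max(1.0, (n_samples - 1) / 3.0)
    return min(requested, cap)

# Hypothetical tiny dataset: only 10 samples.
rng = np.random.default_rng(0)
X_small = rng.random((10, 5))

try:
    # The default perplexity (30) is not below 10 samples.
    TSNE(n_components=2, random_state=0).fit_transform(X_small)
except ValueError as err:
    print("t-SNE rejected the default perplexity:", err)

# Retry with a perplexity derived from the sample count.
perplexity = pick_perplexity(X_small.shape[0])
X_embedded = TSNE(
    n_components=2, perplexity=perplexity, random_state=0
).fit_transform(X_small)

print(perplexity, X_embedded.shape)
```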
5. You have a dataset with 1000 samples and 100 features. You want to visualize it with t-SNE but also keep track of clusters found by KMeans. Which approach is best?
hard
A. Run KMeans first, then apply t-SNE on original data, color points by cluster
B. Apply t-SNE first, then run KMeans on the 2D t-SNE output
C. Use t-SNE only, no clustering needed for visualization
D. Run KMeans on original data and use PCA instead of t-SNE

Solution

  1. Step 1: Understand the goal

    You want to visualize data and show meaningful clusters clearly on the 2D plot.
  2. Step 2: Choose correct order

    Running KMeans first on the original high-dimensional data finds clusters based on the true feature distances; t-SNE then visualizes them, with each point colored by its cluster label.
  3. Step 3: Why not other options?

    Clustering on t-SNE output (B) is suboptimal as t-SNE distorts distances and is for visualization only, not modeling.
  4. Final Answer:

    Run KMeans first, then apply t-SNE on original data, color points by cluster -> Option A
  5. Quick Check:

    Cluster high-dim first, visualize after [OK]
Hint: Cluster original data first, then t-SNE visualize [OK]
Common Mistakes:
  • Clustering t-SNE output causing distorted clusters
  • Skipping clustering and missing group info
  • Using PCA instead of t-SNE unnecessarily
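The workflow from option A can be sketched as follows. This is a minimal illustration, assuming synthetic make_blobs data scaled down to 150 samples and 20 features so that it runs quickly, and assuming 3 clusters; the question's dataset (1000 samples, 100 features) would follow the same steps.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.manifold import TSNE

# Hypothetical stand-in for the real dataset.
X, _ = make_blobs(n_samples=150, n_features=20, centers=3, random_state=0)

# Step 1: cluster in the original feature space.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_labels = kmeans.fit_predict(X)

# Step 2: embed the same original data in 2D for plotting only.
X_embedded = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# Step 3: color each embedded point by its KMeans cluster, for example:
# plt.scatter(X_embedded[:, 0], X_embedded[:, 1], c=cluster_labels, cmap="viridis")
print(X_embedded.shape, sorted(set(cluster_labels.tolist())))
```

The cluster labels never depend on the t-SNE coordinates, so re-running t-SNE with a different random_state changes the picture but not the clusters.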