
Principal Component Analysis (PCA) in Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual · intermediate
What does PCA primarily do to the data?

Imagine you have a dataset with many features. What is the main goal of applying Principal Component Analysis (PCA) to this data?

A. It removes all noise from the data by deleting random samples.
B. It increases the number of features by adding random values to the dataset.
C. It finds new features that are combinations of the original ones, capturing the most variance.
D. It sorts the data points based on their distance from the origin.
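A minimal sketch (not part of the original problem) of the idea behind option C: on two correlated features, PCA's components are linear combinations of the originals, and the first one captures most of the variance. The toy data below is an assumption chosen for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: the second feature is roughly twice the first, so almost all
# variance lies along a single direction in feature space.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=100)])

pca = PCA(n_components=2).fit(X)
# Each row of components_ is a linear combination of the original features.
print(pca.components_.shape)         # (2, 2)
# The first component captures nearly all the variance here.
print(pca.explained_variance_ratio_)
```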
Predict Output · intermediate
Output of PCA transformation on a simple dataset

What is the output of the following Python code using PCA?

Python
from sklearn.decomposition import PCA
import numpy as np

X = np.array([[2, 0], [0, 2], [3, 3]])
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)
print(X_pca.round(2))
A.
[[-0.94]
 [-0.94]
 [ 1.89]]
B.
[[2. 0.]
 [0. 2.]
 [3. 3.]]
C.
[[ 2.83]
 [-1.41]
 [ 1.41]]
D.
[[0. 0.]
 [0. 0.]
 [0. 0.]]
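One way to check a prediction like this (a sketch, not part of the original problem): `fit_transform` centers the data and projects it onto the principal axes, so the result can be reproduced by hand from `components_`.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.array([[2, 0], [0, 2], [3, 3]])
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X)

# fit_transform subtracts the column means, then projects onto the
# first principal axis stored in components_.
X_manual = (X - X.mean(axis=0)) @ pca.components_[0]
print(np.allclose(X_pca.ravel(), X_manual))  # True
```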
Hyperparameter · advanced
Choosing the number of components in PCA

You want to reduce your dataset's dimensions using PCA but keep at least 90% of the variance. Which approach correctly helps you decide the number of components?

A. Set n_components to the total number of original features, always.
B. Set n_components to a random number less than the number of features.
C. Set n_components to 1 regardless of variance explained.
D. Set n_components to 0.9 in PCA to keep 90% variance automatically.
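A short sketch of the float-valued `n_components` behavior this question is about (the Iris dataset is an assumption used here only as a convenient example): scikit-learn's PCA accepts a fraction in (0, 1) and keeps the smallest number of components whose cumulative explained variance reaches it.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data  # 4 original features

# A float in (0, 1) tells PCA to choose the number of components
# automatically so that at least that fraction of variance is kept.
pca = PCA(n_components=0.9).fit(X)
print(pca.n_components_)                           # chosen automatically
print(pca.explained_variance_ratio_.sum() >= 0.9)  # True
```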
Metrics · advanced
Interpreting explained variance ratio from PCA

After fitting PCA on a dataset, you get explained variance ratios: [0.6, 0.3, 0.1]. What does this mean?

A. Each component explains the same amount of variance.
B. The first component explains 60% of variance, the second 30%, and the third 10%.
C. The total variance explained is 30%.
D. The components are sorted by increasing variance explained.
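A quick sketch using the ratios from the question: `explained_variance_ratio_` is always sorted in decreasing order, and its cumulative sum shows how much variance the first k components retain together.

```python
import numpy as np

# The explained variance ratios given in the question.
ratios = np.array([0.6, 0.3, 0.1])

# Cumulative variance retained by the first 1, 2, 3 components.
print(np.cumsum(ratios))  # approximately [0.6, 0.9, 1.0]
```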
🔧 Debug · expert
Why does PCA output have unexpected shape?

You run this code but get an output shape of (5, 5) instead of (5, 2) as expected:

from sklearn.decomposition import PCA
import numpy as np

X = np.random.rand(5, 3)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print(X_pca.shape)

What is the most likely reason?

A. You accidentally imported PCA from a different library that returns all components by default.
B. The input data X has 5 features, so PCA returns 5 components, ignoring n_components.
C. You set n_components to 2 but the data has only 2 samples, so PCA returns 5 components.
D. You did not fit PCA before transforming, so it returns the original shape.
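For contrast (a sketch, not part of the original problem): when `PCA` really does come from `sklearn.decomposition`, the code in the question behaves as expected, which is why the import is the place to look when the shape is wrong.

```python
import numpy as np
from sklearn.decomposition import PCA

# scikit-learn's PCA respects n_components: 5 samples x 3 features in,
# 5 samples x 2 components out.
X = np.random.rand(5, 3)
X_pca = PCA(n_components=2).fit_transform(X)
print(X_pca.shape)  # (5, 2)
```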