Challenge - 5 Problems

🧠 Conceptual

intermediate

What is the main goal of clustering in machine learning?

Clustering groups data points based on their similarities. What is the primary purpose of clustering?

A. To reduce the number of features in the dataset

B. To predict the output for new data points

C. To group similar data points without using labeled data

D. To split data into training and testing sets

❓ Predict Output

intermediate

What is the output of this K-means clustering code snippet?

Given the following Python code using scikit-learn, what is the predicted cluster label for the point [1, 2]?

from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
pred = kmeans.predict([[1, 2]])
print(pred[0])

D. Error: n_clusters must be less than or equal to number of samples
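
Cluster indices in K-means are arbitrary identifiers, so the exact integer printed can depend on initialization and on the scikit-learn version. The following is a minimal sketch, assuming a recent scikit-learn and setting `n_init=10` explicitly (an addition that is not in the snippet above). It checks the property that holds regardless of how the clusters are numbered: the point [1, 2] lands in the same cluster as the other points at x = 1, and in a different cluster from the points at x = 10.

```python
from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

# n_init=10 is set explicitly here (assumption, not in the original snippet)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

pred = kmeans.predict([[1, 2]])[0]

# The first three training points have x = 1, the last three have x = 10
left_labels = kmeans.labels_[:3]
right_labels = kmeans.labels_[3:]

same_as_left = bool(np.all(left_labels == pred))
differs_from_right = bool(np.all(right_labels != pred))
print(same_as_left, differs_from_right)
```

Comparing against `kmeans.labels_` rather than a hard-coded integer keeps the check valid even if the label numbering flips between runs or versions.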

❓ Model Choice

advanced

Which clustering algorithm is best for detecting clusters of varying shapes?

You have a dataset with clusters that are not spherical but have irregular shapes. Which algorithm is most suitable?

A. DBSCAN clustering

B. K-means clustering

C. Hierarchical clustering with single linkage

D. Linear regression
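
To see how a density-based method and a centroid-based method behave on non-spherical data, here is a rough sketch on scikit-learn's synthetic two-moons dataset. The dataset, the `eps=0.2` and `min_samples=5` settings, and the use of the adjusted Rand index are all assumptions chosen for illustration. The true labels are used only to score the result, never to fit the models.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaving half-circles: clusters that are not spherical
X, y_true = make_moons(n_samples=300, noise=0.05, random_state=0)

# Density-based clustering; eps and min_samples are illustrative choices
db_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# Centroid-based clustering with the same number of groups
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN marks noise points with the label -1, so exclude it when counting
n_db_clusters = len(set(db_labels.tolist()) - {-1})

ari_dbscan = adjusted_rand_score(y_true, db_labels)
ari_kmeans = adjusted_rand_score(y_true, km_labels)
print(n_db_clusters, round(ari_dbscan, 2), round(ari_kmeans, 2))
```

On other datasets the `eps` value would likely need tuning, since it sets the neighborhood radius that defines density.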

❓ Hyperparameter

advanced

What effect does increasing the number of clusters (k) have in K-means?

In K-means clustering, what happens if you increase the number of clusters k too much?

A. Clusters become too general and lose detail

B. The algorithm runs faster

C. The model automatically finds the best k

D. Clusters become smaller and may overfit the data
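
The effect of k can be explored empirically. The sketch below, assuming synthetic data from `make_blobs` with three hand-picked centers (an illustrative setup, not part of the question), fits K-means for several values of k and records the inertia (within-cluster sum of squared distances) and the size of the smallest cluster.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic data with three well-separated groups (illustrative assumption)
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [8, 8], [-8, 8]],
    cluster_std=1.0,
    random_state=0,
)

ks = [1, 3, 10, 30]
inertias = []
smallest_sizes = []
for k in ks:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)
    # Count how many points fall in each cluster and keep the minimum
    counts = np.bincount(model.labels_, minlength=k)
    smallest_sizes.append(int(counts.min()))

for k, inertia, size in zip(ks, inertias, smallest_sizes):
    print(f"k={k:2d}  inertia={inertia:10.2f}  smallest cluster={size}")
```

A lower inertia at large k does not by itself mean a better model: with enough clusters, each one can end up describing only a handful of points.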

❓ Metrics

expert

Which metric evaluates clustering quality without true labels?

You want to measure how well your clustering algorithm performed but you do not have true labels. Which metric can you use?

A. Silhouette score

B. Accuracy

C. Mean squared error

D. Confusion matrix
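
As an illustration of evaluating a clustering from the data and the assigned labels alone, the following sketch computes scikit-learn's `silhouette_score` on synthetic blobs. The dataset and its three hand-picked centers are assumptions for the example. The generated ground-truth labels are discarded and never used.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data; the generated ground-truth labels are thrown away
X, _ = make_blobs(
    n_samples=150,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# The score lies in [-1, 1]; higher values indicate tighter, better-separated clusters
score = silhouette_score(X, labels)
print(round(score, 3))
```

The score needs at least two distinct cluster labels to be defined, so it cannot be used to assess a single-cluster solution.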
