Challenge - 5 Problems
Clustering Master
🧠 Conceptual
Difficulty: intermediate
What is the main goal of clustering in machine learning?
Clustering groups data points based on their similarities. What is the primary purpose of clustering?
💡 Hint
Think about whether clustering uses labels or not.
Explanation: Clustering is an unsupervised learning method that groups data points based on similarity without using labels.
❓ Predict Output
Difficulty: intermediate
What is the output of this K-means clustering code snippet?
Given the following Python code using scikit-learn, what is the predicted cluster label for the point [1, 2]?
from sklearn.cluster import KMeans
import numpy as np

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
pred = kmeans.predict([[1, 2]])
print(pred[0])
💡 Hint
Look at how the data points are grouped and which cluster [1, 2] is closer to.
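Explanation: The three points with x = 1, including [1, 2], form one cluster, and the three points with x = 10 form the other. The expected answer here is 0, but the label number itself is arbitrary: it depends on the centroid initialization and can differ between scikit-learn versions. What stays fixed is that [1, 2] receives the same label as [1, 4] and [1, 0].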
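The following is a minimal sketch of how one might check the grouping without relying on a particular label number. It reuses the data from the question; setting `n_init=10` explicitly is an assumption added here so the run does not depend on the library default.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

# n_init is set explicitly (an assumption of this sketch) so the result
# does not depend on the default, which has changed across releases.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

label_of_point = kmeans.predict(np.array([[1, 2]]))[0]
left_labels = kmeans.labels_[:3]    # the three points with x = 1
right_labels = kmeans.labels_[3:]   # the three points with x = 10

# Facts that hold regardless of which cluster is numbered 0 or 1:
same_as_left = bool(np.all(left_labels == label_of_point))
differs_from_right = bool(np.all(right_labels != label_of_point))

# Centroid of the cluster that [1, 2] was assigned to.
centroid = kmeans.cluster_centers_[label_of_point]
print(same_as_left, differs_from_right, centroid)
```

Comparing memberships and centroids, rather than raw label numbers, keeps the check valid even if the numbering flips.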
❓ Model Choice
Difficulty: advanced
Which clustering algorithm is best for detecting clusters of varying shapes?
You have a dataset with clusters that are not spherical but have irregular shapes. Which algorithm is most suitable?
💡 Hint
Consider algorithms that do not assume cluster shape.
Explanation: DBSCAN can find clusters of arbitrary shape by linking dense regions, whereas K-means assigns each point to its nearest centroid and therefore favors compact, roughly spherical clusters.
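As a rough illustration of the difference, the sketch below runs both algorithms on a synthetic two-moons dataset. The dataset, the `eps=0.2` and `min_samples=5` values, and the use of the adjusted Rand index against the generator's labels are all assumptions chosen for this demonstration, not settings taken from the quiz.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved crescents: a standard non-spherical toy dataset.
X, true_labels = make_moons(n_samples=300, noise=0.05, random_state=0)

kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
# eps and min_samples are illustrative values picked for this data scale.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# The generator's labels are used only to score the demo; neither
# algorithm sees them during fitting.
kmeans_ari = adjusted_rand_score(true_labels, kmeans_labels)
dbscan_ari = adjusted_rand_score(true_labels, dbscan_labels)
print(f"K-means ARI: {kmeans_ari:.2f}")
print(f"DBSCAN ARI:  {dbscan_ari:.2f}")
```

On data like this, K-means tends to cut each crescent with a straight boundary, while DBSCAN follows the dense curve of each one. DBSCAN's result is sensitive to `eps`, so the values above would need retuning for other datasets.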
❓ Hyperparameter
Difficulty: advanced
What effect does increasing the number of clusters (k) have in K-means?
In K-means clustering, what happens if you increase the number of clusters k too much?
💡 Hint
Think about what happens when you split data into many small groups.
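Explanation: Increasing k creates many small clusters that may capture noise, leading to overfitting. The within-cluster sum of squares keeps shrinking as k grows (it reaches zero when every point is its own cluster), so a lower value alone does not mean a better clustering.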
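One way to see this effect is to fit K-means for several values of k and watch the inertia (within-cluster sum of squares). The sketch below assumes synthetic data with three well-separated blobs; the centers, spread, and range of k are illustrative choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Three well-separated groups (hypothetical data for this sketch).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 0], [0, 6]],
    cluster_std=0.5,
    random_state=0,
)

inertia_by_k = {}
for k in range(1, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertia_by_k[k] = model.inertia_

for k, inertia in inertia_by_k.items():
    print(f"k={k}: inertia={inertia:.1f}")

# Improvement gained by adding one more cluster, before and after k = 3.
drop_2_to_3 = inertia_by_k[2] - inertia_by_k[3]
drop_3_to_4 = inertia_by_k[3] - inertia_by_k[4]
```

The inertia keeps falling past k = 3, but the gain from each extra cluster becomes small once the real groups are covered. That flattening is the basis of the elbow heuristic for choosing k.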
❓ Metrics
Difficulty: expert
Which metric evaluates clustering quality without true labels?
You want to measure how well your clustering algorithm performed but you do not have true labels. Which metric can you use?
💡 Hint
Look for a metric that works without knowing the correct groups.
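Explanation: The silhouette score needs no true labels. For each point it compares the mean distance to the other points in its own cluster (a) with the mean distance to the points in the nearest other cluster (b), giving (b - a) / max(a, b). Values range from -1 to 1, and higher values indicate tighter, better-separated clusters.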
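A short sketch of how the silhouette score might be used to compare candidate values of k follows. The data is synthetic (three blobs with assumed centers and spread), and only `X` and the predicted labels are passed to the metric.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Hypothetical data: three compact, well-separated groups.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [6, 0], [0, 6]],
    cluster_std=0.5,
    random_state=0,
)

scores = {}
for k in range(2, 7):  # the silhouette score needs at least 2 clusters
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")
print("best k by silhouette:", best_k)
```

Unlike inertia, the silhouette score does not improve automatically as k grows, which makes it a more useful guide when choosing the number of clusters.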