Challenge - 5 Problems
Flat Clustering Master
❓ Predict Output
Difficulty: intermediate
Output of fcluster with distance threshold
What array of cluster labels does fcluster return with t=1.5 and criterion='distance' on the linkage matrix below?

    from scipy.cluster.hierarchy import linkage, fcluster
    import numpy as np

    # Four 2-D observations
    X = np.array([[1, 2], [2, 2], [5, 5], [6, 5]])
    # Single-linkage hierarchy
    Z = linkage(X, method='single')
    # Cut the dendrogram at height 1.5
    clusters = fcluster(Z, t=1.5, criterion='distance')
    print(clusters)
💡 Hint
Clusters are formed by cutting the dendrogram at the given distance threshold.
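Explanation
With single linkage, points 0 and 1 merge at distance 1, and points 2 and 3 also merge at distance 1. Both merges are below the threshold of 1.5. The two pairs would only join at about 4.24, the distance between (2, 2) and (5, 5), which is above the threshold. So fcluster returns the labels [1 1 2 2].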
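For single linkage, cutting at height t is equivalent to linking every pair of points that lie no farther apart than t and taking the connected components of that graph. The sketch below checks this on the four points above using only the standard library. The helper name `threshold_components` is hypothetical and not part of SciPy, and it numbers clusters in order of first appearance, which need not match SciPy's labelling in general.

```python
from itertools import combinations
from math import dist


def threshold_components(points, t):
    """Label points by connected component of the 'distance <= t' graph.

    Hypothetical helper: it mirrors a single-linkage cut at height t,
    not SciPy's fcluster implementation.
    """
    parent = list(range(len(points)))

    def find(i):
        # Follow parent pointers to the component's root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) <= t:
            parent[find(i)] = find(j)

    # Number the components 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(len(points)):
        root = find(i)
        labels.append(seen.setdefault(root, len(seen) + 1))
    return labels


X = [(1, 2), (2, 2), (5, 5), (6, 5)]
print(threshold_components(X, 1.5))
```

This equivalence is specific to single linkage; complete, average, and Ward linkage define the distance between clusters differently, so the shortcut does not carry over to them.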
❓ Data Output
Difficulty: intermediate
Number of clusters formed by fcluster
Using the linkage matrix Z from the code below, how many clusters are formed when calling fcluster with t=3 and criterion='distance'?

    from scipy.cluster.hierarchy import linkage, fcluster
    import numpy as np

    # Five 2-D observations
    X = np.array([[1, 1], [2, 1], [4, 4], [5, 5], [10, 10]])
    # Complete-linkage hierarchy
    Z = linkage(X, method='complete')
    # Cut the dendrogram at height 3
    clusters = fcluster(Z, t=3, criterion='distance')
    # Count the distinct labels
    num_clusters = len(set(clusters))
    print(num_clusters)
💡 Hint
Count unique cluster labels after applying fcluster.
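Explanation
With complete linkage, (1, 1) and (2, 1) merge at distance 1, and (4, 4) and (5, 5) merge at about 1.41. The next merge, between those two pairs, happens at about 5.66, the distance between (1, 1) and (5, 5), which is above the threshold of 3. Cutting at 3 therefore leaves three clusters: the pair near (1, 1), the pair near (4, 4), and the lone point at (10, 10).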
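The count can be cross-checked against the merge heights stored in the linkage matrix. The sketch below assumes the heights are monotonic, which holds for complete linkage, so every merge at or below the threshold reduces the cluster count by one. The variable names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

X = np.array([[1, 1], [2, 1], [4, 4], [5, 5], [10, 10]])
Z = linkage(X, method='complete')

# Column 2 of Z holds the height (distance) at which each merge happens.
heights = Z[:, 2]
print(np.round(heights, 2))

# Each merge at or below the threshold removes one cluster from the initial n.
t = 3
num_by_heights = len(X) - int(np.sum(heights <= t))
num_by_fcluster = len(set(fcluster(Z, t=t, criterion='distance')))
print(num_by_heights, num_by_fcluster)
```

Reading the heights directly is a quick way to see which thresholds would change the answer: any t between the second and third merge heights yields the same three clusters.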
🔧 Debug
Difficulty: advanced
Identify the error in fcluster usage
What error does this code raise when it calls fcluster with criterion='maxclust' but without passing t?

    from scipy.cluster.hierarchy import linkage, fcluster
    import numpy as np

    # Three 2-D observations
    X = np.array([[0, 0], [1, 1], [5, 5]])
    # Ward-linkage hierarchy
    Z = linkage(X, method='ward')
    clusters = fcluster(Z, criterion='maxclust')
    print(clusters)
💡 Hint
Check the required arguments for fcluster function.
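Explanation
The parameter t is a required positional argument of fcluster for every criterion. With criterion='maxclust' it is the maximum number of flat clusters to form rather than a distance. Omitting it raises a TypeError reporting the missing argument 't'.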
🧠 Conceptual
Difficulty: advanced
Understanding fcluster criterion 'inconsistent'
Which statement best describes how fcluster forms clusters when using criterion='inconsistent'?
💡 Hint
Inconsistency measures how different a link is compared to links below it.
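Explanation
With criterion='inconsistent', fcluster computes an inconsistency coefficient for each link, comparing its height with the heights of the links within depth levels below it (depth=2 by default). If a node and all the links below it have a coefficient no greater than t, all of that node's leaves are placed in the same flat cluster.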
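The coefficients themselves can be inspected with scipy.cluster.hierarchy.inconsistent. The sketch below uses an assumed six-point dataset, since the question supplies none, and a deliberately generous threshold of t=10 to show the case where every link is accepted.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, inconsistent, linkage

# Illustrative data (assumed; the question itself supplies no dataset).
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [6, 5], [5, 6]])
Z = linkage(X, method='average')

# One row per link: mean height, std of heights, number of links used,
# and the inconsistency coefficient, computed over depth=2 levels.
R = inconsistent(Z, d=2)
print(np.round(R, 3))

# A generous threshold accepts every link, so all points share one cluster.
loose = fcluster(Z, t=10, criterion='inconsistent', depth=2)
print(loose)
```

A link that joins two single observations has no links below it, so its standard deviation and coefficient are both 0. Lowering t toward the coefficients in the last column of R is what starts splitting the hierarchy.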
🚀 Application
Difficulty: expert
Predict cluster labels for a custom dataset
Given the dataset and linkage below, which option shows the correct cluster labels when using fcluster with t=2 and criterion='distance'?

    from scipy.cluster.hierarchy import linkage, fcluster
    import numpy as np

    # Six 2-D observations
    X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [6, 5], [5, 6]])
    # Average-linkage hierarchy
    Z = linkage(X, method='average')
    # Cut the dendrogram at height 2
    clusters = fcluster(Z, t=2, criterion='distance')
    print(clusters)
💡 Hint
Points close together form clusters within the distance threshold.
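Explanation
With average linkage, the first three points are fully merged at a height of about 1.21, and so are the last three. The two groups only join at about 7.12, the mean of the nine pairwise distances between them. Cutting at 2 therefore gives two clusters: the first three points get label 1 and the last three get label 2, so the output is [1 1 1 2 2 2].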
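The height of the final merge can be verified by hand, since average linkage defines the distance between two clusters as the mean of all pairwise distances between their members. A minimal sketch, with the split into `group_a` and `group_b` assumed from the explanation above:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import cdist

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [6, 5], [5, 6]])
Z = linkage(X, method='average')

# Average linkage: the distance between two clusters is the mean of all
# pairwise distances between their members.
group_a, group_b = X[:3], X[3:]
manual_top = cdist(group_a, group_b).mean()

print(np.round(Z[:, 2], 3))   # merge heights, in merge order
print(round(manual_top, 3))   # hand-computed height of the final merge
```

Four of the five merge heights fall below 2 and the last is far above it, which is why any threshold in that wide gap produces the same two clusters.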