Why might the Calinski-Harabasz Index give misleading results on clusters with very different sizes?

Hard · Conceptual · Q10 of 15

SciPy - Clustering and Distance

A. It is sensitive to the number of features, not cluster size

B. It assumes clusters have similar variance and size

C. It only measures cluster compactness, ignoring separation

D. It requires true labels to work correctly

Step-by-Step Solution

Solution:

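Step 1: Understand Calinski-Harabasz assumptions
The index is the ratio of between-cluster dispersion to within-cluster dispersion, each divided by its degrees of freedom (k - 1 and n - k for k clusters and n points). Scores are most meaningful when clusters are roughly similar in size and variance.
Step 2: Explain why different sizes cause issues
Both dispersion terms are sums over points, so a large or widely spread cluster dominates them. A small cluster contributes little to either term, and a partition that splits the large cluster can score higher than one that isolates the small cluster.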
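The effect described in Step 2 can be sketched with a small, contrived example. The `calinski_harabasz` helper below is a hand-written NumPy version of the standard variance-ratio formula, not a SciPy function (scikit-learn ships an equivalent as `sklearn.metrics.calinski_harabasz_score`). The data, the variable names, and the two labelings are assumptions chosen for illustration: one wide cluster of 1001 points and one tight cluster of 11 points. Note that the large cluster here is also the wider one, so size and variance differ together.

```python
import numpy as np

def calinski_harabasz(X, labels):
    """Variance ratio: [B / (k - 1)] / [W / (n - k)].

    B is the between-cluster sum of squares (each centroid's squared
    distance to the overall mean, weighted by cluster size) and W is
    the within-cluster sum of squares.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = X.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    overall_mean = X.mean(axis=0)

    between = 0.0
    within = 0.0
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)
    return (between / (k - 1)) / (within / (n - k))

# Illustrative data (an assumption, not from the quiz):
# a large, wide cluster and a small, tight cluster on one axis.
large = np.linspace(-10.0, 10.0, 1001).reshape(-1, 1)
small = np.linspace(13.9, 14.1, 11).reshape(-1, 1)
X = np.vstack([large, small])

# Labeling 1: the intended grouping (large cluster vs. small cluster).
intended = np.r_[np.zeros(1001, dtype=int), np.ones(11, dtype=int)]

# Labeling 2: cut the large cluster in half at x = 0; the small
# cluster is absorbed into the right-hand half.
halved = (X[:, 0] >= 0).astype(int)

score_intended = calinski_harabasz(X, intended)
score_halved = calinski_harabasz(X, halved)
print(f"intended grouping: {score_intended:.1f}")
print(f"large cluster halved: {score_halved:.1f}")
```

On this data the halved labeling receives a much higher score than the intended one, because cutting the large cluster removes most of the within-cluster dispersion while the 11-point cluster barely affects either term.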
Final Answer:
It assumes clusters have similar variance and size -> Option B
Quick Check:
Calinski-Harabasz assumption = B [OK]

Quick Trick: Calinski-Harabasz assumes similar cluster sizes [OK]

Common Mistakes:
