When we group data points into clusters, we want to know how good those groups are. Unlike supervised tasks, clustering usually has no ground-truth labels to compare against. So we use internal metrics that check whether points in the same cluster are similar to each other and points in different clusters are dissimilar.
Common metrics include:
- Silhouette Score: Measures how close each point is to the other points in its own cluster compared with the points in the nearest other cluster. It ranges from -1 to 1, and higher means better grouping.
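- Davies-Bouldin Index: For each cluster, compares its within-cluster spread to its distance from the most similar other cluster, then averages the result over all clusters. Lower values mean better clusters, and 0 is the lowest possible score.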
- Calinski-Harabasz Index: The ratio of variance between clusters to variance within clusters. Higher is better.
These metrics help us decide if our clusters make sense without needing true labels.
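As a rough illustration of how the three metrics behave, here is a minimal sketch using scikit-learn. The synthetic blob data, the choice of k-means, and the `evaluate` helper are assumptions made for this example and are not part of the discussion above. It scores a sensible clustering against randomly assigned labels on the same data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Assumption: toy data with three well-separated blobs, for illustration only.
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=1.0,
    random_state=0,
)

# A reasonable clustering: k-means with the matching number of clusters.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# A poor clustering: labels assigned at random, ignoring the data.
rng = np.random.default_rng(0)
random_labels = rng.integers(0, 3, size=len(X))


def evaluate(X, labels):
    """Hypothetical helper: compute the three internal metrics for one labeling."""
    return {
        "silhouette": silhouette_score(X, labels),                # higher is better
        "davies_bouldin": davies_bouldin_score(X, labels),        # lower is better
        "calinski_harabasz": calinski_harabasz_score(X, labels),  # higher is better
    }


good = evaluate(X, kmeans_labels)
bad = evaluate(X, random_labels)

for name in good:
    print(f"{name:>18}: k-means={good[name]:.3f}  random={bad[name]:.3f}")
```

On data like this, the k-means labels should score higher on Silhouette and Calinski-Harabasz and lower on Davies-Bouldin than the random labels. None of the three calls uses true labels, only the data and the cluster assignments.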