
Flat clustering (fcluster) in SciPy

Introduction

Flat clustering cuts a hierarchical cluster tree into non-overlapping groups, giving each data point a single cluster label. In SciPy, fcluster does this using a threshold such as a maximum distance or a maximum number of clusters.

When you want to cut a hierarchical cluster tree into flat groups.
When you need to assign each data point to a cluster based on a distance threshold.
When you want to analyze or visualize clusters at a specific similarity level.
When you want to simplify complex hierarchical clusters into easy-to-understand groups.
Syntax
SciPy
scipy.cluster.hierarchy.fcluster(Z, t, criterion='inconsistent', depth=2, R=None, monocrit=None)

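Z is the linkage matrix returned by linkage.

t is the threshold used to cut the tree. Its meaning depends on criterion: with 'distance' it is the maximum cophenetic distance allowed within a cluster, and with 'maxclust' it is the maximum number of clusters.

criterion selects how t is applied. The default is 'inconsistent', so pass criterion='distance' or criterion='maxclust' explicitly when you want those behaviors.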
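As a quick illustration of what Z looks like, the sketch below builds a linkage matrix for a small made-up array (the `points` data is an assumption for illustration) and inspects it. For n observations, Z has n - 1 rows and 4 columns: the two merged cluster indices, the merge distance, and the size of the newly formed cluster.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Hypothetical data for illustration: 4 points in 2-D
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]])

Z = linkage(points, method='single')

# One row per merge: n - 1 rows for n observations, always 4 columns
print(Z.shape)

# Column index 2 holds the merge distances; with criterion='distance',
# t is on this same scale
merge_distances = Z[:, 2]
print(merge_distances)
```

Looking at the merge distances first is a simple way to see which values of t would actually change the result.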

Examples
Cut the tree at distance 1.5 to form flat clusters.
SciPy
from scipy.cluster.hierarchy import linkage, fcluster

# data is assumed to be an (n_samples, n_features) array of observations
Z = linkage(data, method='ward')
clusters = fcluster(Z, t=1.5, criterion='distance')
Form at most 3 clusters from the hierarchical tree.
SciPy
clusters = fcluster(Z, t=3, criterion='maxclust')
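Note that 'maxclust' guarantees no more than t clusters, not exactly t. The sketch below uses made-up, evenly spaced 1-D points (an assumption chosen so that every single-linkage merge ties at the same distance) to show a case where a request for 2 clusters can return fewer.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: evenly spaced points, so all single-linkage merges tie at distance 1
points = np.array([[0.0], [1.0], [2.0], [3.0]])
Z_tied = linkage(points, method='single')

labels = fcluster(Z_tied, t=2, criterion='maxclust')

# Count the distinct labels actually produced; this never exceeds the requested t
n_found = len(np.unique(labels))
print(labels, n_found)
```

If an exact cluster count matters, it is worth checking the number of distinct labels after the call rather than assuming it equals t.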
Sample Program

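This code groups 5 points by cutting the hierarchical tree at distance 15. With Ward linkage, each of the two close pairs merges at a distance of about 1.41, and those pairs would only merge with each other at about 17.03, so the cut leaves three clusters: the first two points, the next two points, and the last point on its own.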

SciPy
from scipy.cluster.hierarchy import linkage, fcluster
import numpy as np

# Sample data points
data = np.array([[1, 2], [2, 3], [10, 10], [11, 11], [50, 50]])

# Create linkage matrix using Ward method
Z = linkage(data, method='ward')

# Form flat clusters by cutting at distance 15
clusters = fcluster(Z, t=15, criterion='distance')

print(clusters)
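To make the label array easier to read, one option is to group the point indices by their cluster label. This is a small sketch on the same five points; the `groups` dictionary is a helper introduced here for illustration, not part of SciPy.

```python
from collections import defaultdict

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

data = np.array([[1, 2], [2, 3], [10, 10], [11, 11], [50, 50]])
Z = linkage(data, method='ward')
clusters = fcluster(Z, t=15, criterion='distance')

# Map each cluster label to the list of row indices assigned to it
groups = defaultdict(list)
for index, label in enumerate(clusters):
    groups[int(label)].append(index)

for label in sorted(groups):
    print(label, groups[label])
```

Grouping by index rather than by coordinates keeps the result usable for looking up rows in the original array.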
Important Notes

The cluster labels start at 1, not 0.
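Because labels are 1-based, subtracting 1 gives zero-based labels that can be used directly as array indices. The sketch below, on the same sample points, counts cluster sizes with np.bincount; the variable names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

data = np.array([[1, 2], [2, 3], [10, 10], [11, 11], [50, 50]])
Z = linkage(data, method='ward')
clusters = fcluster(Z, t=15, criterion='distance')

# fcluster labels run from 1 upward; shift to 0-based for indexing
zero_based = clusters - 1

# sizes[k] is the number of points in cluster k (0-based)
sizes = np.bincount(zero_based)
print(zero_based.min(), sizes)
```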

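Choosing the threshold t matters. With criterion='distance', t is on the same scale as the merge distances in the third column of Z: a value that is too small leaves every point in its own cluster, and a value that is too large merges everything into one.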
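One rough way to pick t for criterion='distance' is to look at the merge distances in the third column of Z and cut inside the largest gap between consecutive merges. This is only a heuristic sketch, not a method prescribed by SciPy; a dendrogram or domain knowledge may suggest a different cut.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

data = np.array([[1, 2], [2, 3], [10, 10], [11, 11], [50, 50]])
Z = linkage(data, method='ward')

# Merge distances in merge order (non-decreasing for Ward linkage)
heights = Z[:, 2]

# Heuristic (an assumption of this sketch): cut halfway across the largest jump
gaps = np.diff(heights)
i = int(np.argmax(gaps))
t = (heights[i] + heights[i + 1]) / 2

clusters = fcluster(Z, t=t, criterion='distance')
print(t, clusters)
```

On this data the largest jump is the final merge, so the heuristic separates the distant point from the other four. A smaller t, such as the 15 used in the sample program, keeps the two close pairs apart as well.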

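Common criteria are 'distance' (cut at a distance threshold) and 'maxclust' (at most t clusters). The default, 'inconsistent', cuts on the inconsistency coefficient instead, so set criterion explicitly.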

Summary

Flat clustering cuts a hierarchical tree into simple groups.

Use fcluster with a threshold to assign cluster labels.

It helps turn complex cluster trees into easy-to-use clusters.