A dendrogram shows how data points merge into groups step by step. It is a tree diagram that displays both the order of the merges and the distance at which each merge happens.
Dendrogram visualization in SciPy
Introduction
Use a dendrogram when you want to:
Find groups or clusters in unlabeled data.
See how similar or different the items in a dataset are.
Choose the number of clusters by inspecting the tree.
Visualize hierarchical relationships, such as family trees or document topics.
Explore data before applying other machine learning methods.
Syntax
SciPy
from scipy.cluster.hierarchy import dendrogram

dendrogram(Z, p=30, truncate_mode=None, color_threshold=None,
           get_leaves=True, orientation='top', labels=None,
           leaf_rotation=None, leaf_font_size=None,
           show_contracted=False, show_leaf_counts=True)
Z is the linkage matrix from hierarchical clustering.
You can customize the dendrogram look with options like truncate_mode to shorten the tree or orientation to change its direction.
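To make the role of Z concrete, the sketch below builds a linkage matrix with scipy.cluster.hierarchy.linkage and inspects its layout. The five sample points and the variable names are assumptions chosen for illustration, not part of the syntax above.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Illustrative data (an assumption for this sketch): 5 points in 2D
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])

# Each row of Z records one merge:
# [index of cluster 1, index of cluster 2, merge distance, size of new cluster]
Z = linkage(X, method='ward')

print(Z.shape)        # n points produce n - 1 merges, each with 4 columns
print(int(Z[-1, 3]))  # the final merge contains every observation
```

This is the matrix that dendrogram expects as its first argument; the third column supplies the heights at which branches join.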
Examples
Basic dendrogram plot from linkage matrix Z.
SciPy
from scipy.cluster.hierarchy import dendrogram
import matplotlib.pyplot as plt

# Z is a linkage matrix computed beforehand
dendrogram(Z)
plt.show()
Truncates the tree to 5 leaves to simplify it; clusters formed by earlier merges are collapsed into single leaves.
SciPy
dendrogram(Z, truncate_mode='lastp', p=5)
plt.show()
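As a rough check on what truncation does, the sketch below calls dendrogram with no_plot=True, which returns the layout dictionary without drawing anything, and prints the leaf labels of the truncated tree. The random data set and the seed are assumptions made only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Hypothetical data for this sketch: 20 random points in 2D
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Z = linkage(X, method='ward')

# no_plot=True skips drawing and only returns the computed layout
info = dendrogram(Z, truncate_mode='lastp', p=5, no_plot=True)

# 'ivl' holds the leaf labels; a contracted cluster is labeled with
# its observation count in parentheses when show_leaf_counts is True
print(info['ivl'])
print(len(info['ivl']))  # number of leaves left after truncation
```

Inspecting the returned dictionary this way can be handy for confirming how many leaves a given p leaves in place before committing to a plot.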
Draws the dendrogram with the root on the left and the leaves on the right, with leaf labels rotated by 90 degrees and set in a larger font.
SciPy
dendrogram(Z, orientation='left', leaf_rotation=90, leaf_font_size=12)
plt.show()
Sample Program
This code creates a dendrogram from 5 points. It shows how points group together by distance. Labels A to E identify each point.
SciPy
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

# Sample data: 5 points in 2D
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])

# Create linkage matrix using 'ward' method
Z = linkage(X, method='ward')

# Plot dendrogram
plt.figure(figsize=(6, 4))
dendrogram(Z, labels=['A', 'B', 'C', 'D', 'E'], leaf_rotation=45)
plt.title('Dendrogram Example')
plt.xlabel('Sample')
plt.ylabel('Distance')
plt.tight_layout()
plt.show()
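One possible way to turn a tree like this into flat cluster assignments is scipy.cluster.hierarchy.fcluster. The sketch below reuses the five sample points and asks for two clusters; the choice of two is an assumption read off the tree, where the final merge happens at a much larger distance than the earlier ones.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Same five sample points and labels as the program above
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])
names = ['A', 'B', 'C', 'D', 'E']

Z = linkage(X, method='ward')

# criterion='maxclust' cuts the tree so that at most t flat clusters remain
cluster_ids = fcluster(Z, t=2, criterion='maxclust')

for name, cid in zip(names, cluster_ids):
    print(name, cid)
```

Points A, B, and C should share one cluster id and D and E the other, matching the two main branches of the dendrogram.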
Important Notes
Make sure to import matplotlib.pyplot to display the dendrogram plot.
The linkage matrix Z is created from your data with scipy.cluster.hierarchy.linkage, using a method such as 'ward', 'single', or 'complete'.
Labels help identify leaves; if none are given, numeric indices are used.
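The labeling behavior in the last note can be checked without drawing anything. This sketch, again on the assumed five sample points, compares the default leaf labels with custom ones through the dictionary that dendrogram returns when no_plot=True is set.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Illustrative data (an assumption for this sketch): 5 points in 2D
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])
Z = linkage(X, method='ward')

# Without labels, each leaf is named by its observation index
default = dendrogram(Z, no_plot=True)

# With labels, the i-th label is attached to the i-th observation
named = dendrogram(Z, labels=['A', 'B', 'C', 'D', 'E'], no_plot=True)

print(default['ivl'])   # observation indices as strings
print(named['ivl'])     # the same leaves, shown with the custom labels
print(named['leaves'])  # leaf order as integer indices into X
```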
Summary
Dendrograms visualize hierarchical clustering as a tree.
They help understand data grouping and distances between clusters.
Use scipy.cluster.hierarchy.dendrogram with a linkage matrix to create them.