Dendrogram visualization in SciPy

Introduction

A dendrogram shows how data points merge into groups step by step. It is a tree diagram that records the order in which groups are joined and the distance at which each merge happens.

Use a dendrogram when you want to:
Find groups or clusters in unlabeled data.
See how similar or different the items in a dataset are.
Decide on the number of clusters by looking at the tree.
Visualize hierarchical relationships, such as family trees or document topics.
Explore data before applying other machine learning methods.
Syntax
SciPy
from scipy.cluster.hierarchy import dendrogram

dendrogram(Z, p=30, truncate_mode=None, color_threshold=None, get_leaves=True, orientation='top', labels=None, leaf_rotation=None, leaf_font_size=None, show_contracted=False, show_leaf_counts=True)

Z is the linkage matrix produced by hierarchical clustering, for example by scipy.cluster.hierarchy.linkage. Each row of Z describes one merge.
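To make the linkage matrix less abstract, the sketch below builds one and prints its rows. It reuses the five points from the sample program on this page; the variable names are only illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# The same five 2D points used in the sample program on this page
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])
Z = linkage(X, method='ward')

# One row per merge: n - 1 rows for n points, four columns each
print(Z.shape)

# Columns: first cluster id, second cluster id, merge distance,
# number of original points in the newly formed cluster
for left, right, dist, size in Z:
    print(int(left), int(right), round(float(dist), 3), int(size))
```

Ids below n refer to the original points; ids n and above refer to clusters created by earlier rows. The last row is the final merge, so its size column equals the total number of points.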

You can customize the dendrogram look with options like truncate_mode to shorten the tree or orientation to change its direction.

Examples
Basic dendrogram plot from linkage matrix Z.
SciPy
from scipy.cluster.hierarchy import dendrogram
import matplotlib.pyplot as plt

# Z is assumed to be an existing linkage matrix, e.g. Z = linkage(X, method='ward')
dendrogram(Z)
plt.show()
Shows only the top of the tree: the plot has 5 leaves, and all earlier merges are collapsed into them.
SciPy
dendrogram(Z, truncate_mode='lastp', p=5)
plt.show()
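One way to check what truncation does, without drawing anything, is to compare the leaf labels that dendrogram returns. This is a minimal sketch on made-up random data (X_demo and Z_demo are hypothetical names, not part of the examples above); no_plot=True asks dendrogram to return its layout dictionary without plotting.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Hypothetical data: 20 random 2D points, seeded so the run is repeatable
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 2))
Z_demo = linkage(X_demo, method='ward')

# no_plot=True skips drawing and only returns the layout dictionary
full = dendrogram(Z_demo, no_plot=True)
short = dendrogram(Z_demo, truncate_mode='lastp', p=5, no_plot=True)

print(len(full['ivl']))   # one leaf label per original point
print(len(short['ivl']))  # one leaf label per leaf left after truncation
print(short['ivl'])       # collapsed leaves are labelled with a point count in parentheses
```

With show_leaf_counts left at its default of True, a collapsed leaf is labelled with the number of original points it contains, which gives a rough sense of cluster sizes in the truncated plot.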
Draws the dendrogram sideways, with the root on the left and the leaves on the right, using rotated labels and a larger label font.
SciPy
dendrogram(Z, orientation='left', leaf_rotation=90, leaf_font_size=12)
plt.show()
Sample Program

This code creates a dendrogram from 5 points. It shows how points group together by distance. Labels A to E identify each point.

SciPy
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

# Sample data: 5 points in 2D
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])

# Create linkage matrix using 'ward' method
Z = linkage(X, method='ward')

# Plot dendrogram
plt.figure(figsize=(6, 4))
dendrogram(Z, labels=['A', 'B', 'C', 'D', 'E'], leaf_rotation=45)
plt.title('Dendrogram Example')
plt.xlabel('Sample')
plt.ylabel('Distance')
plt.tight_layout()
plt.show()
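Once the tree suggests a sensible number of clusters, the linkage matrix can be cut into flat cluster labels. The sketch below uses scipy.cluster.hierarchy.fcluster on the same five points; the cut height of 4.0 is an illustrative value chosen for this data, not a general recommendation.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Same five points as the sample program
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])
Z = linkage(X, method='ward')

# Option 1: ask for at most 2 flat clusters
two_groups = fcluster(Z, t=2, criterion='maxclust')
print(two_groups)

# Option 2: cut the tree at a height read off the plot (4.0 is illustrative)
by_height = fcluster(Z, t=4.0, criterion='distance')
print(by_height)
```

Both calls return one integer label per input point. For this data, points A, B and C end up in one cluster and D and E in the other.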
Important Notes

Make sure to import matplotlib.pyplot to display the dendrogram plot.

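The linkage matrix Z is created from your data with scipy.cluster.hierarchy.linkage, using a method such as 'ward', 'single', or 'complete'. The method determines how the distance between two clusters is measured, so it changes the merge heights in the tree.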
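As a rough illustration of how the method affects the tree, the sketch below builds a linkage matrix with each of the three methods on the sample points and prints the height of the final merge. The heights dictionary is just a helper for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Same five points as the sample program
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])

heights = {}
for method in ('single', 'complete', 'ward'):
    Z_m = linkage(X, method=method)
    # Column 2 of the last row is the distance at which the final merge happens
    heights[method] = float(Z_m[-1, 2])
    print(method, round(heights[method], 3))
```

For this data the grouping is the same in all three cases, but the final merge height differs: 'single' uses the closest pair of points between the two clusters, 'complete' uses the farthest pair, and 'ward' uses a variance-based criterion.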

Labels help identify leaves; if none are given, numeric indices are used.
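The effect of labels can be checked without plotting by reading the 'ivl' entry of the dictionary that dendrogram returns. This is a small sketch on the sample points; no_plot=True only suppresses the drawing.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Same five points as the sample program
X = np.array([[1, 2], [2, 3], [3, 2], [8, 7], [7, 8]])
Z = linkage(X, method='ward')

# Without labels, leaves are named by their row index in X (as strings)
default = dendrogram(Z, no_plot=True)
print(default['ivl'])

# With labels, the i-th label is attached to the i-th row of X
named = dendrogram(Z, labels=['A', 'B', 'C', 'D', 'E'], no_plot=True)
print(named['ivl'])
```

The labels appear in leaf order, which is the order the leaves are drawn along the axis, not the order of the rows in X.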

Summary

Dendrograms visualize hierarchical clustering as a tree.

They help you understand how data points group together and how far apart the resulting clusters are.

Use scipy.cluster.hierarchy.dendrogram with a linkage matrix to create them.