
Distance matrix computation in SciPy - Step-by-Step Execution

Concept Flow - Distance matrix computation
Start with data points → Choose distance metric → Compute pairwise distances → Store results in matrix → Use matrix for analysis
We start with data points, pick a distance type, compute distances between all pairs, and save them in a matrix for further use.
Execution Sample
from scipy.spatial import distance_matrix
import numpy as np

points = np.array([[1, 2], [4, 6], [7, 8]])
dist_mat = distance_matrix(points, points)
print(dist_mat)
This code calculates the distance matrix for three 2D points using Euclidean distance.
Execution Table
| Step | Action | Points involved | Distance computed | Matrix update |
|---|---|---|---|---|
| 1 | Start with points | [[1,2],[4,6],[7,8]] | - | Matrix empty |
| 2 | Compute distance between point 0 and 0 | [1,2] & [1,2] | 0.0 | dist_mat[0,0] = 0.0 |
| 3 | Compute distance between point 0 and 1 | [1,2] & [4,6] | 5.0 | dist_mat[0,1] = 5.0 |
| 4 | Compute distance between point 0 and 2 | [1,2] & [7,8] | 8.48528137423857 | dist_mat[0,2] = 8.48528137423857 |
| 5 | Compute distance between point 1 and 0 | [4,6] & [1,2] | 5.0 | dist_mat[1,0] = 5.0 |
| 6 | Compute distance between point 1 and 1 | [4,6] & [4,6] | 0.0 | dist_mat[1,1] = 0.0 |
| 7 | Compute distance between point 1 and 2 | [4,6] & [7,8] | 3.605551275463989 | dist_mat[1,2] = 3.605551275463989 |
| 8 | Compute distance between point 2 and 0 | [7,8] & [1,2] | 8.48528137423857 | dist_mat[2,0] = 8.48528137423857 |
| 9 | Compute distance between point 2 and 1 | [7,8] & [4,6] | 3.605551275463989 | dist_mat[2,1] = 3.605551275463989 |
| 10 | Compute distance between point 2 and 2 | [7,8] & [7,8] | 0.0 | dist_mat[2,2] = 0.0 |
| 11 | All pairs computed | - | - | Distance matrix complete |
💡 All pairwise distances computed and stored in the matrix.
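The table above can be reproduced with an explicit double loop. The following is a minimal sketch of that conceptual procedure, not a description of how SciPy computes the result internally (the library may well use a vectorized approach). The names `manual` and `library` are chosen here for illustration.

```python
import numpy as np
from scipy.spatial import distance_matrix

points = np.array([[1, 2], [4, 6], [7, 8]])
n = len(points)

# Fill the matrix one cell at a time, in the same order as the table rows.
manual = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        # Euclidean distance: square root of the summed squared differences.
        manual[i, j] = np.sqrt(np.sum((points[i] - points[j]) ** 2))

# The library call should agree with the hand-written loop.
library = distance_matrix(points, points)
print(np.allclose(manual, library))  # True
```

The loop makes the nine cell updates visible; in practice the single `distance_matrix` call is the simpler choice.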
Variable Tracker
| Stage | dist_mat |
|---|---|
| Start | empty |
| After Step 2 | [[0.0, 0, 0], [0, 0, 0], [0, 0, 0]] |
| After Step 4 | [[0.0, 5.0, 8.485], [0, 0, 0], [0, 0, 0]] |
| After Step 7 | [[0.0, 5.0, 8.485], [5.0, 0.0, 3.606], [0, 0, 0]] |
| After Step 10 | [[0.0, 5.0, 8.485], [5.0, 0.0, 3.606], [8.485, 3.606, 0.0]] |
| Final | [[0.0, 5.0, 8.48528137], [5.0, 0.0, 3.60555128], [8.48528137, 3.60555128, 0.0]] |
Key Moments - 3 Insights
Why is the distance between a point and itself always zero?
Because a distance measures how far apart two points are, and a point is zero units away from itself. See steps 2, 6, and 10 of the execution table, where the computed distance is 0.0.
Why is the distance matrix symmetric?
The distance from point A to point B is the same as from B to A, so dist_mat[i,j] equals dist_mat[j,i]. The execution table shows this in steps 3 and 5, 4 and 8, and 7 and 9.
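Both properties, the zero diagonal and the symmetry, can be checked directly on the computed matrix. This is a small sketch using the same three points; the variable names `diagonal_is_zero` and `is_symmetric` are illustrative only.

```python
import numpy as np
from scipy.spatial import distance_matrix

points = np.array([[1, 2], [4, 6], [7, 8]])
dist_mat = distance_matrix(points, points)

# Each point is zero units away from itself, so the diagonal is all zeros.
diagonal_is_zero = np.allclose(np.diag(dist_mat), 0.0)

# dist(A, B) == dist(B, A), so the matrix equals its own transpose.
is_symmetric = np.allclose(dist_mat, dist_mat.T)

print(diagonal_is_zero, is_symmetric)  # True True
```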
What happens if we use different sets of points for rows and columns?
The matrix then holds the distance from each point in the first set to each point in the second set. With m points in the first set and n in the second, it has m rows and n columns, so in general it is neither square nor symmetric. Here both sets are the same, so the matrix is square and symmetric.
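To illustrate the two-set case, the sketch below pairs the three points from the lesson with a second set. The array `set_b` is a hypothetical second set invented for this example, not data from the lesson.

```python
import numpy as np
from scipy.spatial import distance_matrix

set_a = np.array([[1, 2], [4, 6], [7, 8]])  # the three points from the lesson
set_b = np.array([[0, 0], [3, 4]])          # hypothetical second set

# rect[i, j] is the distance from set_a[i] to set_b[j].
rect = distance_matrix(set_a, set_b)

print(rect.shape)  # (3, 2)
```

Rows follow the first argument and columns follow the second, so swapping the arguments gives the transposed matrix.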
Visual Quiz - 3 Questions
Test your understanding
In the execution table, what distance is computed at step 4 between points [1,2] and [7,8]?
A. 8.48528137423857
B. 5.0
C. 3.605551275463989
D. 0.0
💡 Hint
Check the 'Distance computed' column at step 4 in execution_table.
At which step does the distance between points [4,6] and [7,8] get computed?
A. Step 9
B. Step 6
C. Step 7
D. Step 3
💡 Hint
Look for the row where points involved are [4,6] & [7,8] in execution_table.
If we add a new point [0,0], how will the size of the distance matrix change?
A. It will remain 3x3
B. It will become 4x4
C. It will become 3x4
D. It will become 4x3
💡 Hint
When both inputs are the same set, the distance matrix has one row and one column per point. See the variable tracker for the current matrix size.
Concept Snapshot
Distance matrix computation:
- Input: list/array of points
- Use scipy.spatial.distance_matrix(pointsA, pointsB)
- Computes pairwise Minkowski distances (default p=2, which is Euclidean)
- Returns matrix with distances between each pair
- Matrix is square and symmetric if pointsA == pointsB
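The "choose distance metric" step can be sketched with the `p` argument of `distance_matrix`, which selects the Minkowski order. The example below uses `p=1` (Manhattan distance) and cross-checks it against `scipy.spatial.distance.cdist` with `metric="cityblock"`; using `cdist` is an addition for illustration and is not part of the lesson's own code.

```python
import numpy as np
from scipy.spatial import distance_matrix
from scipy.spatial.distance import cdist

points = np.array([[1, 2], [4, 6], [7, 8]])

# p=1 sums absolute coordinate differences instead of using the
# Euclidean formula, e.g. |4 - 1| + |6 - 2| = 7 for points 0 and 1.
manhattan = distance_matrix(points, points, p=1)

# cdist accepts named metrics; "cityblock" is its name for Manhattan.
manhattan_cdist = cdist(points, points, metric="cityblock")

print(np.allclose(manhattan, manhattan_cdist))  # True
```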
Full Transcript
Distance matrix computation starts with a set of points. We pick a distance metric, usually Euclidean. Then, for every pair of points, we calculate how far apart they are. These distances fill a matrix where each cell shows the distance between two points. The diagonal is zero because a point is zero distance from itself. The matrix is symmetric because distance from A to B equals distance from B to A. This matrix helps in many data science tasks like clustering or nearest neighbor search.
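As one example of such a downstream task, the sketch below finds each point's nearest neighbor from the distance matrix. It is a brute-force illustration under the assumption that the full matrix fits in memory, not a recommended approach for large datasets; the names `masked` and `nearest` are illustrative.

```python
import numpy as np
from scipy.spatial import distance_matrix

points = np.array([[1, 2], [4, 6], [7, 8]])
dist_mat = distance_matrix(points, points)

# Ignore the zero diagonal so a point is never its own nearest neighbor.
masked = dist_mat.copy()
np.fill_diagonal(masked, np.inf)

# For each row, take the column index of the smallest remaining distance.
nearest = masked.argmin(axis=1)
print(nearest)  # [1 2 1]
```

Point 0 is closest to point 1 (distance 5.0), while points 1 and 2 are each other's closest neighbors (distance about 3.606).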