0
0
SciPydata~5 mins

Distance matrix computation in SciPy - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: Distance matrix computation
O(n²)
Understanding Time Complexity

When we compute a distance matrix, we find distances between many pairs of points.

We want to know how the time needed grows as we add more points.

Scenario Under Consideration

Analyze the time complexity of the following code snippet.


import numpy as np
from scipy.spatial import distance_matrix

n = 100  # example number of points
points = np.random.rand(n, 2)  # n points in 2D
D = distance_matrix(points, points)
    

This code creates n random points and computes the full distance matrix between them.

Identify Repeating Operations

Identify the loops, recursion, array traversals that repeat.

  • Primary operation: Calculating distance between each pair of points.
  • How many times: For n points, distances are computed for all n x n pairs.
How Execution Grows With Input

As the number of points grows, the number of distance calculations grows quickly.

Input Size (n)Approx. Operations
10100
10010,000
10001,000,000

Pattern observation: The operations grow roughly by the square of n, so doubling points makes calculations about four times more.

Final Time Complexity

Time Complexity: O(n²)

This means the time needed grows roughly with the square of the number of points.

Common Mistake

[X] Wrong: "Computing distances grows linearly with the number of points."

[OK] Correct: Each point's distance is calculated to every other point, so the total pairs grow much faster than just the number of points.

Interview Connect

Understanding how distance matrix computation scales helps you explain performance in data tasks involving many points.

Self-Check

"What if we only compute distances between points in two different sets instead of one set to itself? How would the time complexity change?"