Sorting along axes in NumPy - Time & Space Complexity
Sorting is a common step in data science workflows for organizing information. When sorting along an axis in NumPy, we want to know how the running time grows as the array grows.
We ask: How does sorting time change when we have bigger arrays or sort along different axes?
Analyze the time complexity of the following code snippet.
```python
import numpy as np

arr = np.random.rand(1000, 1000)
sorted_arr = np.sort(arr, axis=1)
```
This code creates a 1000x1000 array of random numbers and sorts each row independently.
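To see the row-wise behavior concretely, here is a minimal sketch on a small array (values chosen only for illustration):

```python
import numpy as np

# A tiny 2x3 array makes the per-row behavior easy to inspect.
small = np.array([[3, 1, 2],
                  [9, 7, 8]])

# axis=1 sorts within each row; values never move between rows.
print(np.sort(small, axis=1))
# [[1 2 3]
#  [7 8 9]]
```

Each row ends up in ascending order on its own, which is exactly the operation whose cost we are counting.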
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Sorting each row of the 2D array.
- How many times: Sorting is done once per row, so 1000 times for 1000 rows.
Sorting each row takes time depending on the row length. Doing this for all rows adds up.
| Input Size (n x m) | Approx. Operations |
|---|---|
| 10 x 10 | 10 rows × (10 log 10) per row |
| 100 x 100 | 100 rows × (100 log 100) per row |
| 1000 x 1000 | 1000 rows × (1000 log 1000) per row |
Pattern observation: The time grows roughly with the number of rows times the sorting cost per row, which depends on the row length times its logarithm.
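As a rough empirical check of this pattern, one can time the sort at a couple of sizes. This is a sketch only: absolute times depend on your hardware and are noisy, so only the relative growth is meaningful.

```python
import numpy as np
import time

# Time np.sort along axis=1 at two square sizes (illustrative, not a benchmark).
for n in (100, 1000):
    arr = np.random.rand(n, n)
    start = time.perf_counter()
    result = np.sort(arr, axis=1)
    elapsed = time.perf_counter() - start
    print(f"{n}x{n}: {elapsed:.4f}s")
```

Going from 100x100 to 1000x1000 multiplies n·m log m by well over 100x, and the measured times should grow by roughly that factor.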
Time Complexity: O(n * m log m)
Sorting one row of length m takes O(m log m), and repeating this for all n rows gives a total of O(n · m log m).
[X] Wrong: "Sorting the whole 2D array is just O(n log n) because it's one big sort."
[OK] Correct: Sorting along an axis sorts each row separately, so the cost is n independent sorts of length m. Sorting all n·m elements as one flat list would instead cost O(nm log(nm)), which is a different operation.
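The two operations above can be contrasted directly: `axis=1` sorts each row in place within its row, while `axis=None` flattens the array and sorts all elements as one list. A small sketch (illustrative values):

```python
import numpy as np

arr = np.array([[5, 1],
                [4, 2]])

# axis=1: each row sorted independently; the 2D shape is preserved.
print(np.sort(arr, axis=1))
# [[1 5]
#  [2 4]]

# axis=None: the array is flattened and sorted as a single 1D list.
print(np.sort(arr, axis=None))
# [1 2 4 5]
```

Note that 5 stays in the first row under `axis=1` but moves to the end under `axis=None`: the two results are genuinely different, not just differently shaped.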
Understanding how sorting time grows with data size helps you explain performance in real tasks. It shows you can think about how algorithms work on multi-dimensional data.
"What if we sorted along axis=0 (columns) instead of axis=1? How would the time complexity change?"