NumPy array vs Python list performance
We want to understand how quickly operations run on NumPy arrays and Python lists when they hold many items.
How does the time to complete tasks grow as we increase the number of elements?
Analyze the time complexity of the following code snippet.
import numpy as np
size = 1000000
# Create a NumPy array and a Python list
np_array = np.arange(size)
py_list = list(range(size))
# Sum all elements
np_sum = np.sum(np_array)
py_sum = sum(py_list)
This code creates a large NumPy array and a Python list, then sums all their elements.
Identify any loops, recursion, or array traversals that repeat.
- Primary operation: Summing all elements by visiting each item once.
- How many times: Exactly once per element, so n times where n is the number of elements.
As the number of elements grows, the time to sum them grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 |
| 100 | 100 |
| 1000 | 1000 |
Pattern observation: Doubling the input size roughly doubles the work needed.
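The counts in the table can be reproduced with a small sketch. The helper `count_visits` is a hypothetical name introduced only for illustration; it spells out the loop that `sum` performs internally and counts how many elements it touches.

```python
def count_visits(items):
    """Sum the items with an explicit loop and count the elements visited."""
    total = 0
    visits = 0
    for value in items:
        total += value   # the actual work done per element
        visits += 1      # one visit per element
    return total, visits


for n in (10, 100, 1000):
    total, visits = count_visits(range(n))
    print(f"n={n:>5}  visits={visits:>5}  sum={total}")
```

The visit count always equals n, which is the linear pattern the table describes.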
Time Complexity: O(n)
This means the time to sum grows linearly with the number of elements.
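One way to check the linear growth empirically is to time both sums at sizes that double each step. This is a minimal sketch: `time_sums` and the chosen sizes are assumptions made for illustration, and the timings depend on the machine, so no expected output is shown.

```python
import timeit

import numpy as np


def time_sums(size, repeats=5):
    """Return (numpy_seconds, list_seconds), the best of `repeats` runs each."""
    np_array = np.arange(size)
    py_list = list(range(size))
    # Take the minimum over several runs to reduce noise from other processes.
    np_time = min(timeit.repeat(lambda: np.sum(np_array), number=1, repeat=repeats))
    py_time = min(timeit.repeat(lambda: sum(py_list), number=1, repeat=repeats))
    return np_time, py_time


# Each size is double the previous one.
for size in (100_000, 200_000, 400_000):
    np_time, py_time = time_sums(size)
    print(f"n={size:>7}  np.sum: {np_time:.6f}s  sum(list): {py_time:.6f}s")
```

If the growth is linear, each row should take roughly twice as long as the one before it, although small inputs can be dominated by fixed call overhead.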
[X] Wrong: "NumPy sum is always slower because it does more work internally."
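[OK] Correct: NumPy runs the summation loop in optimized, compiled C code over a contiguous block of fixed-size values, so for large inputs it usually runs faster than Python's built-in sum over a list, even though both visit every element once. Both are O(n); what differs is the cost of each visit.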
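The difference in storage can be inspected directly. The sketch below uses a smaller size than the original snippet to keep it quick, and prints properties whose exact values depend on the platform, so no output is shown.

```python
import sys

import numpy as np

size = 1_000
np_array = np.arange(size)
py_list = list(range(size))

# Both approaches compute the same total.
np_total = int(np.sum(np_array))
py_total = sum(py_list)
print(np_total == py_total)

# The array holds fixed-width values in a single contiguous buffer.
print(np_array.dtype, np_array.itemsize, np_array.flags["C_CONTIGUOUS"])

# The list holds references to separate Python int objects,
# each of which carries its own object overhead.
print(type(py_list[0]), sys.getsizeof(py_list[0]))
```

Because every array element has the same type and size, the compiled loop can add them without inspecting Python objects one by one.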
Understanding how data structures affect performance helps you choose the right tool and explain your choices clearly in interviews.
"What if we replaced the Python list with a linked list? How would the time complexity for summing change?"
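To reason about this question, it can help to write the traversal out. Python has no built-in linked list, so the `Node` class and both helper functions below are hypothetical, written only as a sketch of a singly linked list.

```python
class Node:
    """One element of a singly linked list (hypothetical helper class)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


def build_linked_list(values):
    """Build a linked list holding `values` in order and return its head."""
    head = None
    for value in reversed(list(values)):
        head = Node(value, head)
    return head


def sum_linked_list(head):
    """Follow the next pointers from the head, adding each value once."""
    total = 0
    node = head
    while node is not None:
        total += node.value
        node = node.next_node
    return total


print(sum_linked_list(build_linked_list(range(1000))))
```

The loop still touches each node exactly once, so counting the iterations of the `while` loop gives the growth rate to compare with the list and array versions.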