Practical uses of structured arrays in NumPy - Time & Space Complexity
We want to understand how the time to work with structured arrays grows as the data size increases.
Specifically, how does accessing and processing fields in structured arrays scale with more data?
Analyze the time complexity of the following code snippet.
```python
import numpy as np

# Create a structured array with 3 fields
data = np.zeros(1000, dtype=[('name', 'U10'), ('age', 'i4'), ('score', 'f4')])

# Access the 'age' field and compute the mean
mean_age = np.mean(data['age'])

# Filter entries where score > 50
high_scores = data[data['score'] > 50]
```
This code creates a structured array, accesses one field to compute a mean, and filters rows based on a field condition.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Traversing every element of the array to compute the mean and to evaluate the filter condition on the 'score' field.
- How many times: Once per element for each operation; the mean calculation and the filtering each make a single pass over all n records.
As the number of elements grows, the time to process a field (compute the mean, evaluate the filter) grows proportionally.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 field accesses and comparisons |
| 100 | About 100 field accesses and comparisons |
| 1000 | About 1000 field accesses and comparisons |
Pattern observation: The operations grow linearly with the number of elements.
Time Complexity: O(n)
This means the time to access or filter data grows directly in proportion to the number of records.
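The linear pattern can be sketched with a small helper. This is an illustrative extension of the original snippet: the helper name `process` and the random fill of the 'age' and 'score' fields are assumptions added for demonstration, not part of the original code.

```python
import numpy as np

def process(n, rng):
    # Build a structured array of n records (same schema as the snippet)
    data = np.zeros(n, dtype=[('name', 'U10'), ('age', 'i4'), ('score', 'f4')])
    data['age'] = rng.integers(18, 65, size=n)
    data['score'] = rng.uniform(0, 100, size=n)
    # Both operations below make a single pass over all n records: O(n)
    mean_age = np.mean(data['age'])
    high_scores = data[data['score'] > 50]
    return mean_age, len(high_scores)

rng = np.random.default_rng(0)
for n in (10, 100, 1000):
    mean_age, kept = process(n, rng)
    # The filter inspected each record exactly once, so at most n survive
    assert 0 <= kept <= n
```

Doubling n doubles the work in each pass, which is exactly the O(n) growth the table above describes.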
[X] Wrong: "Computing a mean or filtering a structured array is instant regardless of size."
[OK] Correct: Selecting a field (e.g. `data['age']`) returns an O(1) view, but any computation over that field, such as a mean or a boolean filter, must visit every element, so time grows linearly with the number of records.
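The view-versus-pass distinction can be checked directly. A minimal sketch: `np.shares_memory` confirms that field selection does not copy data, while the mean still has to read every value.

```python
import numpy as np

data = np.zeros(1000, dtype=[('name', 'U10'), ('age', 'i4'), ('score', 'f4')])

# Field selection is O(1): it returns a view over the same buffer,
# not a per-element copy.
ages = data['age']
assert np.shares_memory(ages, data)

# Writing through the view is visible in the structured array,
# confirming that no copy was made on access.
ages[0] = 42
assert data['age'][0] == 42

# The O(n) cost lives in operations that must read each element:
assert np.mean(ages) == 42 / 1000
```

So "accessing a field" is cheap; it is the subsequent reduction or filter over n elements that scales linearly.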
Understanding how structured arrays scale helps you explain data handling efficiency clearly in interviews.
"What if we used a regular 2D array instead of a structured array? How would the time complexity change?"
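One way to explore that question: with a plain 2D numeric array the asymptotics do not change, since a mean and a boolean filter are still single O(n) passes. What changes is expressiveness, as a 2D array cannot mix the string 'name' column with numeric data. The sketch below is an assumed translation of the snippet's numeric fields (column 0 = age, column 1 = score), not code from the original.

```python
import numpy as np

# Numeric fields only: a plain 2D array forces one shared dtype,
# so the 'name' strings from the structured version cannot be stored here.
table = np.zeros((1000, 2))
table[:, 0] = 30    # age column
table[:, 1] = 75.0  # score column

# Same asymptotics as the structured version: one O(n) pass each.
mean_age = np.mean(table[:, 0])
high_scores = table[table[:, 1] > 50]

assert mean_age == 30
assert len(high_scores) == 1000
```

The takeaway for an interview: switching to a regular 2D array keeps the time complexity at O(n); the trade-off is about named, mixed-type fields, not speed class.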