Saving and loading data (scipy.io) - Time & Space Complexity
When saving or loading data with scipy.io, we want to know how the time needed grows as the data gets bigger. The question: how does the time to save or load scale with the size of the data?
Analyze the time complexity of the following code snippet.
```python
import numpy as np
from scipy import io

data = np.random.rand(1000, 1000)            # Create a large array
io.savemat('datafile.mat', {'array': data})  # Save data to a .mat file
loaded = io.loadmat('datafile.mat')          # Load data back from the file
```
This code creates a large array, saves it to a file, and then loads it back into memory.
- Primary operation: Reading or writing each element of the array to or from disk.
- How many times: Once for each element in the array (all 1,000,000 elements).
As the data size grows, the time to save or load grows roughly in proportion to the number of elements.
| Input Size (n x n) | Approx. Operations |
|---|---|
| 10 x 10 | 100 |
| 100 x 100 | 10,000 |
| 1000 x 1000 | 1,000,000 |
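The table above can be reproduced with a short loop; the operation count is just the total element count of each square array:

```python
# Total elements (= approximate I/O operations) for each square array size
for n in (10, 100, 1000):
    print(f"{n} x {n} -> {n * n:,} elements")
```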
Pattern observation: Doubling the size in each dimension quadruples the total number of elements, and the save/load time grows in proportion to that total. In other words, time is linear in the number of elements, not in the side length.
Time Complexity: O(N), where N is the total number of elements (N = n × n for an n x n array).
This means the time to save or load data grows directly with the number of elements in the data.
[X] Wrong: "Saving or loading data takes the same time no matter how big the data is."
[OK] Correct: The time depends on how many elements are saved or loaded, so bigger data takes more time.
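One way to check this claim empirically is to time a save/load round trip at different sizes. The sketch below (the helper name `roundtrip_time` and the use of a temporary directory are illustrative choices, not part of the original snippet) also verifies the round trip is lossless:

```python
import os
import tempfile
import time

import numpy as np
from scipy import io

def roundtrip_time(n):
    """Save and reload an n x n array; return elapsed seconds."""
    data = np.random.rand(n, n)
    path = os.path.join(tempfile.mkdtemp(), 'datafile.mat')
    start = time.perf_counter()
    io.savemat(path, {'array': data})
    loaded = io.loadmat(path)
    elapsed = time.perf_counter() - start
    # The reloaded array should match what was saved
    assert np.array_equal(loaded['array'], data)
    return elapsed

# Absolute timings vary by machine and disk, but the larger array
# should take roughly 100x longer (100x the elements).
print(roundtrip_time(100), roundtrip_time(1000))
```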
Understanding how saving and loading time grows helps you handle large datasets efficiently and shows you know how data size affects performance.
"What if we compressed the data before saving? How would the time complexity change?"