Basic histogram with plt.hist in Matplotlib - Time & Space Complexity
We want to understand how the time it takes to create a histogram changes as we add more data points.
How does the work grow when the input data gets bigger?
Analyze the time complexity of the following code snippet.
import matplotlib.pyplot as plt
import numpy as np
data = np.random.randn(1000)
plt.hist(data, bins=30)
plt.show()
This code creates a histogram of 1000 random numbers divided into 30 bins.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Counting how many data points fall into each bin.
- How many times: Each of the n data points is checked once to find its bin.
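To make the counting step concrete, here is a minimal pure-Python sketch of equal-width binning. It is not Matplotlib's actual implementation; the function name `count_into_bins` is hypothetical, and the sketch assumes the data contains at least two distinct values.

```python
def count_into_bins(data, bins):
    """Count values into `bins` equal-width bins using a single counting loop."""
    lo, hi = min(data), max(data)       # two extra passes over the data, still O(n)
    if lo == hi:
        raise ValueError("need at least two distinct values")
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:                      # runs once per data point (n times)
        idx = int((x - lo) / width)     # constant-time bin lookup
        idx = min(idx, bins - 1)        # the maximum value belongs to the last bin
        counts[idx] += 1
    return counts

counts = count_into_bins([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], bins=5)
print(counts)  # [2, 2, 2, 2, 2]
```

The loop body does a fixed amount of work per value, which is why the total work follows the number of data points.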
As the number of data points increases, the time to count them into bins grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks |
| 100 | About 100 checks |
| 1000 | About 1000 checks |
Pattern observation: Doubling the data roughly doubles the work.
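One way to check this pattern is to time the counting step for growing inputs. The sketch below times `np.histogram`, which does the counting without drawing anything, so it leaves rendering cost out. The helper name `time_histogram` and the chosen input sizes are assumptions for illustration, and the measured times will vary by machine.

```python
import time
import numpy as np

def time_histogram(n, bins=30, repeats=5):
    """Return the best-of-`repeats` time in seconds to bin n random values."""
    rng = np.random.default_rng(0)      # fixed seed so runs use the same data
    data = rng.standard_normal(n)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.histogram(data, bins=bins)   # counting only, no plotting
        best = min(best, time.perf_counter() - start)
    return best

for n in (100_000, 200_000, 400_000, 800_000):
    print(f"n = {n:>7}: {time_histogram(n) * 1e3:.3f} ms")
```

If the growth is linear, each printed time should be roughly twice the previous one, although small inputs can be dominated by fixed overhead.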
Time Complexity: O(n)
This means that, for a fixed number of bins, the time to build the histogram grows linearly with the number of data points.
[X] Wrong: "The number of bins affects the time complexity a lot."
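[OK] Correct: The number of bins is usually fixed and small compared to the data size (30 bins here), so the main work is assigning the n data points to bins. Drawing the bars adds work proportional to the number of bins, which only matters when the bin count grows along with the data.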
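The bin count does control how many bars are drawn. The sketch below, which assumes the non-interactive `Agg` backend so it runs without a display, shows that the number of rectangles returned by `hist` matches `bins` no matter how many data points there are. The helper name `bars_drawn` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")                   # render off-screen; no window needed
import matplotlib.pyplot as plt
import numpy as np

def bars_drawn(n, bins):
    """Return how many bar patches a histogram of n values creates."""
    data = np.random.default_rng(0).standard_normal(n)
    fig, ax = plt.subplots()
    counts, edges, patches = ax.hist(data, bins=bins)
    plt.close(fig)                      # free the figure once we have the count
    return len(patches)

print(bars_drawn(1_000, bins=30))       # 30
print(bars_drawn(100_000, bins=30))     # 30
print(bars_drawn(1_000, bins=200))      # 200
```

With 30 bins the drawing work stays the same as n grows, so the counting step is what scales with the data.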
Understanding how data size affects plotting helps you explain performance clearly and shows you can think about efficiency in real tasks.
"What if we increased the number of bins to be the same as the number of data points? How would the time complexity change?"