Violin plot with plt.violinplot in Matplotlib - Time & Space Complexity
We want to understand how the time to draw a violin plot changes as we add more data points.
How does the plotting time grow when the input data size increases?
Analyze the time complexity of the following code snippet.
```python
import matplotlib.pyplot as plt
import numpy as np

# Three independent samples of 1,000 points each from a standard normal.
data = [np.random.normal(size=1000) for _ in range(3)]

plt.violinplot(data)  # draws one violin per dataset
plt.show()
```
This code creates three sets of random data and draws violin plots for each set.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Processing each data point to estimate its distribution for the violin shape.
- How many times: Each data set is processed once per point, so the total work is (points per set) × (number of sets).
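To make that per-point work concrete, here is a hypothetical hand-rolled Gaussian KDE (matplotlib uses SciPy's `gaussian_kde` internally; this sketch just mimics the idea). For each of g grid positions we sum a kernel contribution from all n points, so the cost is O(n × g); with the grid size fixed, the work grows linearly in n.

```python
import numpy as np

def kde_estimate(points, grid, bandwidth=0.3):
    """Density estimate at each grid position; cost is O(len(points) * len(grid))."""
    diffs = (grid[:, None] - points[None, :]) / bandwidth   # (g, n) pairwise distances
    kernel = np.exp(-0.5 * diffs**2) / np.sqrt(2 * np.pi)   # Gaussian kernel weights
    return kernel.sum(axis=1) / (len(points) * bandwidth)   # average over all points

rng = np.random.default_rng(0)
grid = np.linspace(-3, 3, 100)
density = kde_estimate(rng.normal(size=1000), grid)
```

The function name and bandwidth value here are illustrative choices, not matplotlib's actual internals, but the n × g traversal pattern is the source of the linear growth discussed below.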
As the number of data points grows, the kernel density estimate (KDE) that gives each violin its shape must visit every point, so the time to compute the distribution and draw the plot grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 operations per data set |
| 100 | About 100 operations per data set |
| 1000 | About 1000 operations per data set |
Pattern observation: Doubling data points roughly doubles the work needed.
Time Complexity: O(n), where n is the total number of data points across all sets.
This means the time to draw the violin plot grows linearly with the number of data points.
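You can check the linear trend empirically with a rough timing sketch (absolute numbers will vary by machine; the `Agg` backend is used here so no window needs to open):

```python
import time
import matplotlib
matplotlib.use("Agg")  # render off-screen; plt.show() is not needed
import matplotlib.pyplot as plt
import numpy as np

timings = {}
for n in (1_000, 10_000, 100_000):
    data = [np.random.normal(size=n) for _ in range(3)]
    fig, ax = plt.subplots()
    start = time.perf_counter()
    ax.violinplot(data)                      # time only the violin computation/draw
    timings[n] = time.perf_counter() - start
    plt.close(fig)

for n, t in timings.items():
    print(f"n={n:>7}: {t:.4f} s")
```

Fixed per-figure overhead dominates at small n, so the proportionality shows most clearly between the larger sizes.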
[X] Wrong: "Adding more data points won't affect the plotting time much because the plot looks similar."
[OK] Correct: Even if the plot looks similar, the program must process every data point to estimate the distribution, so more points mean more work.
Understanding how plotting time grows with data size helps you explain performance in data visualization tasks clearly and confidently.
What if we changed the number of data sets from 3 to 10? How would the time complexity change?
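One way to think about this: with m data sets of n points each, the total work is O(m × n). Going from m = 3 to m = 10 multiplies the work by a constant factor (about 10/3), but the growth with respect to the number of points stays linear. A minimal sketch, again using the off-screen `Agg` backend:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no window needed
import matplotlib.pyplot as plt
import numpy as np

# 10 data sets now instead of 3: constant-factor more work, still O(m * n).
data = [np.random.normal(size=1000) for _ in range(10)]

fig, ax = plt.subplots()
parts = ax.violinplot(data)
print(len(parts["bodies"]))  # one violin body per data set -> prints 10
plt.close(fig)
```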