Why scatter plots show relationships in Matplotlib - Performance Analysis
We want to understand how the time it takes to create a scatter plot changes as we add more points.
How does the number of points affect the work matplotlib does to show relationships?
Analyze the time complexity of the following code snippet.
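import matplotlib.pyplot as plt
import numpy as np

n = 1000               # number of points to plot
x = np.random.rand(n)  # n random x-coordinates in [0, 1)
y = np.random.rand(n)  # n random y-coordinates in [0, 1)

plt.scatter(x, y)      # one marker per (x, y) pair
plt.show()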
This code creates a scatter plot with n points randomly placed on the graph.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Plotting each point on the graph.
- How many times: Once for each of the n points.
Each additional point adds roughly the same amount of plotting work, so the total work grows in step with the number of points.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 plotting actions |
| 100 | 100 plotting actions |
| 1000 | 1000 plotting actions |
Pattern observation: The work grows in a straight line with the number of points.
Time Complexity: O(n)
This means the time to create the scatter plot grows roughly in proportion to the number of points: doubling n roughly doubles the drawing work.
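The linear growth can be checked with a rough timing experiment. The sketch below is a minimal example, not a rigorous benchmark: the helper name `time_scatter`, the chosen sizes, and the use of the non-interactive `Agg` backend are assumptions made for illustration. It forces a render with `fig.canvas.draw()` so that the drawing work is included in the measurement.

```python
import time

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np


def time_scatter(n, seed=0):
    """Return the seconds spent creating and rendering a scatter plot of n points."""
    rng = np.random.default_rng(seed)
    x = rng.random(n)
    y = rng.random(n)

    fig, ax = plt.subplots()
    start = time.perf_counter()
    ax.scatter(x, y)
    fig.canvas.draw()  # force the actual drawing work to happen now
    elapsed = time.perf_counter() - start
    plt.close(fig)     # release the figure so memory does not accumulate
    return elapsed


sizes = [1_000, 10_000, 100_000]  # hypothetical input sizes
timings = {n: time_scatter(n) for n in sizes}
for n, seconds in timings.items():
    print(f"n={n:>7}: {seconds:.4f} s")
```

Expect the numbers to be noisy. For small n, fixed costs such as setting up the axes and ticks can dominate, so the proportional trend usually becomes visible only at larger sizes.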
[X] Wrong: "Adding more points doesn't affect the time much because the plot is just one image."
[OK] Correct: Each point requires drawing work, so more points mean more drawing steps and longer time.
Understanding how plotting time grows helps you explain performance in data visualization tasks clearly and confidently.
"What if we added color or size variations for each point? How would the time complexity change?"
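One way to explore this question is to pass per-point arrays to the `c` and `s` parameters of `scatter`. The sketch below is illustrative only: the variable names and value ranges are assumptions, and the `Agg` backend is used so it runs without a display. Each extra array holds one value per point, which suggests the added work is still proportional to n.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

n = 1000
rng = np.random.default_rng(0)
x = rng.random(n)
y = rng.random(n)
colors = rng.random(n)                  # one color value per point, mapped through the colormap
marker_sizes = 10 + 90 * rng.random(n)  # one marker area per point, in points squared

fig, ax = plt.subplots()
points = ax.scatter(x, y, c=colors, s=marker_sizes)
fig.canvas.draw()  # render so the per-point styling is actually applied

# Every per-point array stored on the collection has n entries.
n_offsets = len(points.get_offsets())
n_sizes = len(points.get_sizes())
n_colors = len(points.get_array())
print(n_offsets, n_sizes, n_colors)
plt.close(fig)
```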