
Scatter plots in Pandas - Time & Space Complexity

Time Complexity: Scatter plots
O(n)
Understanding Time Complexity

We want to understand how the time to create a scatter plot changes as the data size grows.

How does plotting many points affect the time it takes?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

import pandas as pd
import matplotlib.pyplot as plt

n = 1000  # example size

data = pd.DataFrame({
    'x': range(n),
    'y': range(n)
})

plt.scatter(data['x'], data['y'])
plt.show()

This code builds a pandas DataFrame with n rows and draws one marker per row with matplotlib's plt.scatter, so the plot contains n points.
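The snippet above calls matplotlib directly. As a minimal sketch, the same plot can also be produced through pandas' own plotting accessor, DataFrame.plot.scatter, which delegates to matplotlib. The Agg backend and the num_points variable are assumptions added here so the example runs without a display and so the point count can be checked.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

n = 1000  # example size, same as the snippet above

data = pd.DataFrame({"x": range(n), "y": range(n)})

# DataFrame.plot.scatter hands the two columns to matplotlib
# and returns the Axes object it drew on.
ax = data.plot.scatter(x="x", y="y")

# The scatter is stored as one collection holding one offset per point.
num_points = len(ax.collections[0].get_offsets())
print(num_points)

plt.close(ax.figure)  # release the figure once we are done
```

Either way, one marker is created per row, so the amount of work is tied to n in the same manner.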

Identify Repeating Operations
  • Primary operation: Plotting each point on the scatter plot.
  • How many times: Once for each of the n points in the data.
How Execution Grows With Input

As the number of points increases, the time to plot grows roughly in direct proportion.

Input Size (n)    Approx. Operations
10                10 operations (plotting 10 points)
100               100 operations (plotting 100 points)
1000              1000 operations (plotting 1000 points)

Pattern observation: The time grows linearly as the number of points increases.
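One way to check this pattern empirically is to time the plot at several sizes. The sketch below is illustrative only: time_scatter, sizes, and timings are hypothetical names, the Agg backend is assumed so that rendering happens off-screen, and the measured numbers will vary by machine. At small n the fixed cost of creating the figure tends to dominate, so the times are unlikely to be exactly proportional to n.

```python
import time

import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen instead of opening a window
import matplotlib.pyplot as plt
import pandas as pd


def time_scatter(n):
    """Return the seconds taken to build and render a scatter plot of n points."""
    data = pd.DataFrame({"x": range(n), "y": range(n)})
    start = time.perf_counter()
    fig, ax = plt.subplots()
    ax.scatter(data["x"], data["y"])
    fig.canvas.draw()  # force the actual drawing work to happen now
    elapsed = time.perf_counter() - start
    plt.close(fig)  # free the figure so repeated runs do not pile up
    return elapsed


sizes = [1_000, 10_000, 100_000]
timings = {n: time_scatter(n) for n in sizes}

for n, seconds in timings.items():
    print(f"n={n:>7}: {seconds:.4f} s")
```

Comparing the largest sizes gives the clearest picture, since the per-point cost outweighs the setup cost there.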

Final Time Complexity

Time Complexity: O(n)

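This means the time to create the scatter plot grows in proportion to the number of points: once n is large enough that the fixed cost of setting up the figure is small by comparison, doubling the points roughly doubles the time.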

Common Mistake

[X] Wrong: "Plotting a scatter plot takes the same time no matter how many points there are."

[OK] Correct: Each point must be drawn, so more points mean more work and more time.

Interview Connect

Understanding how plotting time grows helps you explain performance when working with large datasets in real projects.

Self-Check

"What if we used a sampling method to plot only a fraction of points? How would the time complexity change?"
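As a starting point for exploring this question, here is a minimal sketch of two sampling strategies using DataFrame.sample. The names sampled_frac, sampled_cap, and max_points are hypothetical, and the 10% fraction and 5,000-point cap are arbitrary choices for illustration. The Agg backend is assumed so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen instead of opening a window
import matplotlib.pyplot as plt
import pandas as pd

n = 100_000
data = pd.DataFrame({"x": range(n), "y": range(n)})

# Strategy A: keep a fixed fraction of the rows.
# The number of plotted points is still proportional to n.
sampled_frac = data.sample(frac=0.1, random_state=0)

# Strategy B: keep at most a fixed number of rows (hypothetical cap).
# The number of plotted points stops growing once n exceeds the cap.
max_points = 5_000
sampled_cap = data.sample(n=min(max_points, len(data)), random_state=0)

fig, ax = plt.subplots()
ax.scatter(sampled_cap["x"], sampled_cap["y"])
fig.canvas.draw()  # force rendering of the sampled points
plt.close(fig)

print(len(sampled_frac), len(sampled_cap))
```

Notice how the count of points handed to the plot depends on n under each strategy; the sampling step itself also takes time, which is worth considering separately from the drawing step.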