Exporting to JSON in Python Data Analysis - Time & Space Complexity
We want to understand how the time to export data to JSON grows as the data size increases.
How does the work change when we have more data to save?
Analyze the time complexity of the following code snippet.
```python
import pandas as pd

n = 1000  # Define n before using it

# Create a sample DataFrame
data = pd.DataFrame({
    'A': range(n),
    'B': [str(x) for x in range(n)]
})

# Export the DataFrame to a JSON file
data.to_json('output.json', orient='records')
```
This code creates a table of data with n rows and saves it as a JSON file.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Going through each row of the DataFrame to convert it to JSON format.
- How many times: Once for each of the n rows in the data.
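Conceptually, the export behaves like the hand-rolled loop below. This is a simplified model for illustration, not pandas' actual implementation (which is vectorized in C), but it shows why every row must be visited once:

```python
import json

import pandas as pd

n = 1000
data = pd.DataFrame({
    'A': range(n),
    'B': [str(x) for x in range(n)]
})

# Simplified model of orient='records': build one dict per row.
records = []
for _, row in data.iterrows():  # visits each of the n rows exactly once -> O(n)
    records.append({'A': int(row['A']), 'B': row['B']})

# Writing the list out is also proportional to n.
with open('output_manual.json', 'w') as f:
    json.dump(records, f)
```

No matter how the serializer is implemented, it cannot skip rows: each of the n rows contributes to the output, so the work is at least linear in n.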
As the number of rows grows, the time to export grows roughly the same way.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 operations |
| 100 | About 100 operations |
| 1000 | About 1000 operations |
Pattern observation: The work grows directly with the number of rows.
Time Complexity: O(n)
Space Complexity: O(n) as well, since the serialized representation of all n rows is built before (or while) being written to disk.
This means if you double the data size, the time to export roughly doubles too.
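You can check the doubling claim empirically with a rough timing sketch. Absolute numbers depend on your machine, and small inputs are dominated by fixed overhead, so use reasonably large n values and expect only approximate doubling:

```python
import time

import pandas as pd

def export_time(n):
    """Build an n-row DataFrame and return the time spent exporting it to JSON."""
    df = pd.DataFrame({
        'A': range(n),
        'B': [str(x) for x in range(n)]
    })
    start = time.perf_counter()
    df.to_json(f'output_{n}.json', orient='records')
    return time.perf_counter() - start

for n in (10_000, 20_000, 40_000):
    # Each time should be roughly double the previous one.
    print(f"n={n:>6}: {export_time(n):.4f}s")
```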
[X] Wrong: "Exporting to JSON is instant no matter how big the data is."
[OK] Correct: The program must look at every row to write it out, so bigger data takes more time.
Understanding how data export time grows helps you explain performance in real projects clearly and confidently.
"What if we export the data in chunks instead of all at once? How would the time complexity change?"