
Jupyter Notebook setup and usage in Data Analysis Python - Time & Space Complexity

Time Complexity: Jupyter Notebook setup and usage
O(n)
Understanding Time Complexity

We want to understand how the time it takes to run code in a Jupyter Notebook changes as we add more cells or data.

How does the notebook's performance grow when we do more work inside it?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

import pandas as pd

n = 10  # Define n before using it

data = pd.DataFrame({'numbers': range(n)})

result = []
for i in data['numbers']:
    result.append(i * 2)

This code builds a DataFrame with a single column holding the numbers 0 to n-1, then loops over that column, doubling each number and appending the result to a list.

Identify Repeating Operations
  • Primary operation: Looping through each number in the data.
  • How many times: Exactly n times, once for each number.
How Execution Grows With Input

As the number of items n grows, the time to double each number grows too.

Input Size (n)    Approx. Operations
10                10 operations
100               100 operations
1000              1000 operations

Pattern observation: The operations grow directly with the input size. Double the input means double the work.
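The table above can be reproduced with a small sketch that counts loop iterations directly. The helper name count_doubling_steps is hypothetical and is not part of the original snippet; it simply wraps the same loop and adds a counter.

```python
import pandas as pd


def count_doubling_steps(n):
    """Run the doubling loop on n rows and return how many iterations it took.

    Hypothetical helper for illustration; it mirrors the loop in the scenario.
    """
    data = pd.DataFrame({'numbers': range(n)})
    result = []
    steps = 0
    for i in data['numbers']:
        result.append(i * 2)
        steps += 1  # one doubling per row
    return steps


# Count the iterations for the same input sizes used in the table.
counts = {n: count_doubling_steps(n) for n in (10, 100, 1000)}
for n, steps in counts.items():
    print(f"n={n}: {steps} operations")
```

Each count equals n, which is the linear pattern described above.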

Final Time Complexity

Time Complexity: O(n)

This means the time to run the code grows in a straight line with the number of items you process.
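To see the straight-line growth in practice, one option is to time the loop for a few input sizes. This is a rough sketch, not a rigorous benchmark: the helper time_doubling_loop and the chosen sizes are assumptions made for illustration, and the measured times will vary between machines and runs.

```python
import time

import pandas as pd


def time_doubling_loop(n):
    """Time the doubling loop for n rows (hypothetical helper for illustration)."""
    data = pd.DataFrame({'numbers': range(n)})
    start = time.perf_counter()
    result = []
    for i in data['numbers']:
        result.append(i * 2)
    elapsed = time.perf_counter() - start
    return elapsed, len(result)


# Each input size is 10x the previous one; the time should grow by
# roughly the same factor, although small inputs are noisy.
timings = {}
for n in (10_000, 100_000, 1_000_000):
    elapsed, size = time_doubling_loop(n)
    timings[n] = elapsed
    print(f"n={n:>9,}  results={size:>9,}  seconds={elapsed:.4f}")
```

In a Jupyter Notebook, the %timeit magic gives steadier measurements because it repeats the statement several times and reports a summary.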

Common Mistake

[X] Wrong: "Adding more cells or data in a Jupyter Notebook does not affect performance much."

[OK] Correct: Each cell that processes more data takes more time, so the total time grows with the amount of work done.

Interview Connect

Understanding how your code's running time grows helps you write better data analysis scripts and shows you think about efficiency, a key skill in data science.

Self-Check

"What if we replaced the for-loop with a vectorized operation such as data['numbers'] * 2, or with pandas' apply or map? How would the time complexity change?"
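One way to explore the question above is to run the alternatives side by side. The sketch below reuses the scenario's data; the variable names loop_result, vectorized_result, and mapped_result are illustrative. All three approaches still touch every element once, so the complexity stays O(n). The vectorized multiplication runs its loop in compiled code and so usually has a much smaller constant factor, while map (like apply) with a Python function still calls that function once per element.

```python
import pandas as pd

n = 10
data = pd.DataFrame({'numbers': range(n)})

# Original approach: an explicit Python loop, one iteration per row.
loop_result = []
for i in data['numbers']:
    loop_result.append(i * 2)

# Vectorized approach: pandas doubles the whole column in one call.
vectorized_result = data['numbers'] * 2

# map with a Python function: still one function call per element.
mapped_result = data['numbers'].map(lambda x: x * 2)

# All three produce the same values.
print(vectorized_result.tolist() == loop_result)
print(mapped_result.tolist() == loop_result)
```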