Selecting rows by condition in Pandas - Time & Space Complexity

Time complexity of selecting rows by condition: O(n)

Understanding Time Complexity

We want to know how the time to select rows by a condition changes as the data grows.

How does the work increase when the table gets bigger?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

import pandas as pd

df = pd.DataFrame({'age': [23, 45, 12, 36, 27] * 200})

selected = df[df['age'] > 30]

This code creates a one-column table of 1,000 ages (five values repeated 200 times) and selects the rows where age is greater than 30.
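To make the work visible, the one-line selection can be split into its two steps: building a boolean mask, then indexing with it. This is only an illustrative sketch; the variable name mask is an assumption introduced here and is not part of the original snippet.

```python
import pandas as pd

df = pd.DataFrame({'age': [23, 45, 12, 36, 27] * 200})

# Step 1: compare every 'age' value with 30, producing one boolean per row.
mask = df['age'] > 30

# Step 2: keep only the rows where the mask is True.
selected = df[mask]

print(len(df))        # rows in the table: 1000
print(len(mask))      # booleans in the mask: 1000
print(len(selected))  # rows with age > 30 (45 and 36): 400
```

The mask always has as many entries as the table has rows, which is where the per-row cost comes from.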

Identify Repeating Operations

  • Primary operation: Checking each row's 'age' value against 30.
  • How many times: Once for every row in the table.

How Execution Grows With Input

Each check takes roughly the same time, so as the number of rows grows, the total number of checks grows at the same rate.

Input Size (n) | Approx. Operations
10             | 10 checks
100            | 100 checks
1000           | 1000 checks

Pattern observation: The work grows directly with the number of rows.
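The counts in the table can be reproduced with a small sketch. The helper count_checks is hypothetical, introduced only for illustration; it treats each entry of the boolean mask as one check.

```python
import pandas as pd

def count_checks(n):
    """Hypothetical helper: build an n-row table and count one check per row."""
    ages = [23, 45, 12, 36, 27]
    df = pd.DataFrame({'age': [ages[i % 5] for i in range(n)]})
    mask = df['age'] > 30
    return mask.size  # one comparison result per row

for n in (10, 100, 1000):
    print(n, count_checks(n))  # prints: 10 10, then 100 100, then 1000 1000
```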

Final Time Complexity

Time Complexity: O(n)

This means the time to select rows grows linearly with the number of rows: doubling the size of the table roughly doubles the work.
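A rough way to observe the linear trend is to time the same selection on tables of increasing size. The sketch below makes several assumptions: the helper time_selection is hypothetical, the ages are random integers rather than the original five values, and measured times vary by machine. For small tables, fixed overhead can hide the trend, so no expected output is shown.

```python
import time

import numpy as np
import pandas as pd

def time_selection(n, repeats=5):
    """Hypothetical helper: best wall-clock time of the selection on n rows."""
    rng = np.random.default_rng(0)
    df = pd.DataFrame({'age': rng.integers(0, 100, size=n)})
    best = float('inf')
    for _ in range(repeats):
        start = time.perf_counter()
        df[df['age'] > 30]  # the operation being measured
        best = min(best, time.perf_counter() - start)
    return best

for n in (10_000, 100_000, 1_000_000):
    print(f"{n:>9} rows: {time_selection(n):.6f} s")
```

Taking the best of several runs reduces noise from other processes, but the numbers are still only indicative.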

Common Mistake

[X] Wrong: "Selecting rows by condition is instant no matter the size."

[OK] Correct: The code must check each row to see if it meets the condition, so more rows mean more work.

Interview Connect

Understanding how filtering data scales helps you write efficient code and explain your choices clearly.

Self-Check

"What if we select rows using multiple conditions combined with AND? How would the time complexity change?"
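One way to explore this question is to combine two conditions with the & operator and look at what gets computed. The second condition (age < 40) is an assumption chosen for illustration. In this sketch each condition produces its own mask over all rows, and & combines them element by element, so the work is a small constant number of passes over the table and still grows in proportion to n.

```python
import pandas as pd

df = pd.DataFrame({'age': [23, 45, 12, 36, 27] * 200})

older_than_30 = df['age'] > 30    # first pass: one check per row
younger_than_40 = df['age'] < 40  # second pass: one check per row
both = older_than_30 & younger_than_40  # third pass: combine per row

selected = df[both]

print(len(both))      # one boolean per row: 1000
print(len(selected))  # only age 36 satisfies both conditions: 200
```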