
iloc for position-based selection in Pandas - Time & Space Complexity

Time complexity of iloc for position-based selection: O(r x c)

Understanding Time Complexity

We want to understand how the time it takes to select data using iloc changes as the data size grows.

How does the number of rows or columns affect the work done when using iloc?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

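import pandas as pd

# Build a DataFrame with 1000 rows and 3 integer columns
df = pd.DataFrame({
    'A': range(1000),
    'B': range(1000, 2000),
    'C': range(2000, 3000)
})

# Select row positions 100-199 and column positions 1-2 (slice ends are exclusive)
subset = df.iloc[100:200, 1:3]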

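This code creates a DataFrame with 1000 rows and 3 columns, then uses iloc to select the rows at positions 100 to 199 and the columns at positions 1 and 2 (B and C). The end of each slice is exclusive, so the result has 100 rows and 2 columns.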
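As a quick check of the description above, the following sketch rebuilds the same DataFrame and inspects the selection. It only uses the names from the original snippet (df, subset).

```python
import pandas as pd

df = pd.DataFrame({
    'A': range(1000),
    'B': range(1000, 2000),
    'C': range(2000, 3000)
})

subset = df.iloc[100:200, 1:3]

print(subset.shape)          # (rows, columns) in the selection
print(list(subset.columns))  # labels of the columns at positions 1 and 2
print(subset.iloc[0, 0])     # first selected cell: column B at row position 100
```

The shape confirms that the selection holds 100 x 2 = 200 cells, which is the quantity the rest of the analysis counts.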

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: Accessing each cell in the selected slice of the DataFrame.
  • How many times: Once per selected cell, that is, (selected rows) x (selected columns) times.

How Execution Grows With Input

When you select more rows or columns, the work grows with the size of the slice.

Input Size (rows x columns)    Approx. Operations
10 x 2                         20
100 x 2                        200
100 x 3                        300

Pattern observation: The operations grow roughly in direct proportion to the number of cells selected.
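The operation counts in the table above are simply rows times columns. A minimal sketch, assuming the same 1000 x 3 DataFrame as in the scenario, reproduces them by reading the size attribute of each selection (the names selections and cell_counts are illustrative, not part of the original example):

```python
import pandas as pd

df = pd.DataFrame({
    'A': range(1000),
    'B': range(1000, 2000),
    'C': range(2000, 3000)
})

# (row count, column count) pairs matching the table above
selections = [(10, 2), (100, 2), (100, 3)]

cell_counts = []
for r, c in selections:
    subset = df.iloc[0:r, 0:c]
    cell_counts.append(subset.size)  # size = rows * columns
    print(f"{r} x {c} -> {subset.size} cells")
```

Doubling the rows or the columns doubles the number of cells, which is why the estimate is written as r x c.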

Final Time Complexity

Time Complexity: O(r x c)

This means the time grows proportionally to the number of rows (r) times the number of columns (c) you select.

Common Mistake

[X] Wrong: "Selecting data with iloc always takes the same time no matter how much data is selected."

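[OK] Correct: In general the time depends on how many rows and columns you pick, because pandas has to produce every cell of the selection. Pure slices can sometimes return a view that avoids copying the data, but you should not count on constant time.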
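One way to see this for yourself is to time a small and a large selection. The sketch below is an illustration under assumptions, not a benchmark: the DataFrame big, its size, and the use of integer-list indexers (chosen so that pandas builds a new object with the selected cells rather than possibly returning a view) are all choices made for this example. Actual timings depend on your machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame so the difference is measurable
big = pd.DataFrame(
    np.arange(600_000).reshape(200_000, 3),
    columns=['A', 'B', 'C']
)

small_rows = list(range(1_000))
large_rows = list(range(200_000))
cols = [1, 2]

# Integer-list indexers ask pandas to gather the selected cells into a new object
small_time = timeit.timeit(lambda: big.iloc[small_rows, cols], number=5)
large_time = timeit.timeit(lambda: big.iloc[large_rows, cols], number=5)

print(f"1,000 x 2 cells:   {small_time:.4f} s")
print(f"200,000 x 2 cells: {large_time:.4f} s")
```

If the selection time were constant, the two printed values would be about the same; on a typical setup the larger selection should take noticeably longer.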

Interview Connect

Understanding how data selection scales helps you write efficient code and explain your choices clearly in real projects and interviews.

Self-Check

What if we changed the selection to only one column but many rows? How would the time complexity change?
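To explore this question, you could start from a sketch like the one below, which selects a single column across all rows of the scenario's DataFrame. The names one_col and one_col_frame are illustrative. With c fixed at 1, the cell count r x c reduces to r.

```python
import pandas as pd

df = pd.DataFrame({
    'A': range(1000),
    'B': range(1000, 2000),
    'C': range(2000, 3000)
})

one_col = df.iloc[:, 1]          # integer column indexer -> Series
one_col_frame = df.iloc[:, 1:2]  # slice column indexer -> one-column DataFrame

print(type(one_col).__name__, one_col.shape)
print(type(one_col_frame).__name__, one_col_frame.shape)
```

Either form covers 1000 cells here, one per row, so the amount of selected data now depends only on the number of rows.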