Merging on index in Pandas - Time & Space Complexity

Time Complexity: Merging on index
O(n)
Understanding Time Complexity

When we combine two DataFrames by matching their row labels (their indexes), the running time depends on how many rows each one has.

We want to understand how that time grows as the DataFrames get bigger.

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

import pandas as pd

n = 10  # Example size
# Create two dataframes with indexes
left = pd.DataFrame({'A': range(n)}, index=range(n))
right = pd.DataFrame({'B': range(n)}, index=range(n))

# Merge on index
result = pd.merge(left, right, left_index=True, right_index=True)

This code merges two dataframes by matching their indexes, combining rows with the same index label.
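To make "combining rows with the same index label" concrete, here is a small sketch in which the two indexes only partly overlap. The sizes, the offset index on `right`, and the values in column `B` are illustrative assumptions, not part of the original snippet. Because `pd.merge` defaults to an inner join, only labels present in both dataframes should survive.

```python
import pandas as pd

# Illustrative data: left has labels 0..4, right has labels 2..6.
left = pd.DataFrame({'A': range(5)}, index=range(5))
right = pd.DataFrame({'B': [label * 10 for label in range(2, 7)]},
                     index=range(2, 7))

# Merge on index; the default how='inner' keeps only shared labels.
result = pd.merge(left, right, left_index=True, right_index=True)

print(result)
print("Rows kept:", len(result))
```

Labels 0 and 1 exist only in `left`, and labels 5 and 6 only in `right`, so they are dropped from the result.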

Identify Repeating Operations
  • Primary operation: For each index label in one dataframe, looking up the matching label in the other.
  • How many times: Once for each row in the first dataframe, so n times when both dataframes have n rows.
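The repeating operation above can be sketched in plain Python. This is a conceptual model only, not how pandas implements merging internally: the function name `merge_on_index_sketch` and the use of dictionaries to stand in for indexed rows are assumptions made for illustration.

```python
def merge_on_index_sketch(left_rows, right_rows):
    """Conceptual inner merge of two {index_label: value} dictionaries.

    Returns the merged rows and the number of lookups performed.
    """
    lookups = 0
    merged = {}
    for label, left_value in left_rows.items():
        lookups += 1  # one lookup per row of the left table
        if label in right_rows:  # dictionary lookup, constant time on average
            merged[label] = (left_value, right_rows[label])
    return merged, lookups


left_rows = {0: 'a', 1: 'b', 2: 'c'}
right_rows = {1: 'x', 2: 'y', 3: 'z'}
merged, lookups = merge_on_index_sketch(left_rows, right_rows)
print(merged)
print("Lookups:", lookups)
```

The lookup counter always equals the number of rows in the left table, which is the "n times" counted above.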
How Execution Grows With Input

As the number of rows grows, the work needed to match indexes grows in proportion.

Input Size (n)    Approx. Operations
10                About 10 index matches
100               About 100 index matches
1000              About 1000 index matches

Pattern observation: The number of operations grows directly with the number of rows.
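One way to check this pattern empirically is to time the merge at a few sizes. The sketch below is a rough measurement, not a benchmark: the helper name `time_index_merge` and the chosen sizes are assumptions, and absolute timings will vary by machine and pandas version. For small n, fixed overhead tends to dominate, so the linear trend is easier to see at larger sizes.

```python
import time

import pandas as pd


def time_index_merge(n):
    """Return the seconds taken to merge two n-row dataframes on their index."""
    left = pd.DataFrame({'A': range(n)}, index=range(n))
    right = pd.DataFrame({'B': range(n)}, index=range(n))
    start = time.perf_counter()
    pd.merge(left, right, left_index=True, right_index=True)
    return time.perf_counter() - start


# Time the merge at three sizes, each ten times larger than the last.
timings = {n: time_index_merge(n) for n in (1_000, 10_000, 100_000)}
for n, seconds in timings.items():
    print(f"n={n:>7}: {seconds:.6f} s")
```

A single run per size is noisy; repeating each measurement and taking the median would give a steadier picture.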

Final Time Complexity

Time Complexity: O(n)

This means the time to merge grows linearly with the number of rows, assuming each index label appears at most once in each dataframe.

Common Mistake

[X] Wrong: "Merging on index is instant no matter how big the dataframes are."

[OK] Correct: Even though indexes help find matches quickly, the operation still needs to check each row, so it takes longer as the data grows.

Interview Connect

Understanding how merging on index scales helps you explain data combining tasks clearly and shows you know how data size affects performance.

Self-Check

"What if the indexes were not sorted or unique? How would the time complexity change?"
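A starting point for exploring this question is to inspect the index properties and see what happens to the output size when labels repeat. The sketch below uses a deliberately extreme case, where every row shares the same label; the size `k` and the data are illustrative assumptions.

```python
import pandas as pd

k = 3  # Illustrative size: every row carries the same index label
left = pd.DataFrame({'A': range(k)}, index=[0] * k)
right = pd.DataFrame({'B': range(k)}, index=[0] * k)

# Index properties that the question asks about
print("Unique:", left.index.is_unique)
print("Sorted:", left.index.is_monotonic_increasing)

# Each left row matches every right row with the same label
result = pd.merge(left, right, left_index=True, right_index=True)
print("Input rows per side:", k)
print("Output rows:", len(result))
```

Here each of the 3 left rows pairs with each of the 3 right rows, giving 9 output rows, so the size of the result (and the work to build it) is no longer tied to n alone.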