
append equivalent with concat in Pandas - Time & Space Complexity

Time Complexity: append equivalent with concat
O(n)
Understanding Time Complexity

We want to understand how the time needed to combine data grows when using pandas concat in place of the append method (DataFrame.append was deprecated and later removed in pandas 2.0).

How does the work change as the data size gets bigger?

Scenario Under Consideration

Analyze the time complexity of this pandas code that combines two dataframes.

import pandas as pd

n = 10  # example value for n
df1 = pd.DataFrame({'A': range(n)})
df2 = pd.DataFrame({'A': range(n, 2*n)})

result = pd.concat([df1, df2], ignore_index=True)

This code concatenates two dataframes of n rows each into one dataframe with 2n rows.
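As a side note, the snippet passes ignore_index=True. A minimal sketch of what that flag changes about the result's index:

```python
import pandas as pd

df1 = pd.DataFrame({'A': [0, 1]})
df2 = pd.DataFrame({'A': [2, 3]})

# Without ignore_index, concat keeps each frame's original labels,
# so the combined index can contain duplicates.
kept = pd.concat([df1, df2])
print(list(kept.index))   # [0, 1, 0, 1]

# With ignore_index=True, the result gets a fresh 0..2n-1 index.
fresh = pd.concat([df1, df2], ignore_index=True)
print(list(fresh.index))  # [0, 1, 2, 3]
```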

Identify Repeating Operations

Look at what repeats as data grows.

  • Primary operation: copying rows from both dataframes into a new combined dataframe.
  • How many times: once for each row in both dataframes, so about 2n times.

How Execution Grows With Input

As the number of rows n grows, the work to combine grows roughly in direct proportion.

Input Size (n) | Approx. Operations
10             | About 20 (copying rows)
100            | About 200
1000           | About 2000

Pattern observation: doubling the input doubles the work needed.
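A rough timing sketch of that pattern (numbers vary by machine and pandas version, so treat the output as a trend rather than a benchmark):

```python
import time
import pandas as pd

def combine(n):
    """Concatenate two n-row frames and return the elapsed seconds."""
    df1 = pd.DataFrame({'A': range(n)})
    df2 = pd.DataFrame({'A': range(n, 2 * n)})
    start = time.perf_counter()
    result = pd.concat([df1, df2], ignore_index=True)
    elapsed = time.perf_counter() - start
    assert len(result) == 2 * n
    return elapsed

# Doubling n should roughly double the time (noisy for small n).
for n in (100_000, 200_000, 400_000):
    print(n, combine(n))
```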

Final Time Complexity

Time Complexity: O(n)

This means the time to combine data grows linearly with the number of rows.

Common Mistake

[X] Wrong: "Using concat is faster because it just links data without copying."

[OK] Correct: concat actually copies data to create a new dataframe, so time grows with data size.
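A small sketch to confirm the copy behavior (using concat's default settings): mutating an input frame after the concat leaves the combined frame unchanged, which it could not do if concat merely linked to the original data.

```python
import pandas as pd

df1 = pd.DataFrame({'A': [1, 2, 3]})
df2 = pd.DataFrame({'A': [4, 5, 6]})
combined = pd.concat([df1, df2], ignore_index=True)

# Mutate the original after concatenating.
df1.loc[0, 'A'] = 99

# The combined frame holds its own copy of the data.
print(combined.loc[0, 'A'])  # 1
```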

Interview Connect

Understanding how data combining scales helps you write efficient code and explain your choices clearly.

Self-Check

What if we combined many small dataframes in a loop using concat each time? How would the time complexity change?
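One way to explore that question (a sketch with deliberately tiny frames): calling concat inside the loop recopies every row accumulated so far on each iteration, so combining k pieces costs roughly 1 + 2 + ... + k row copies, i.e. O(k^2) work, while collecting the pieces in a list and concatenating once copies each row a single time.

```python
import pandas as pd

pieces = [pd.DataFrame({'A': [i]}) for i in range(100)]

# Pattern 1: concat inside the loop. Each call copies everything
# accumulated so far -- quadratic total work.
grown = pieces[0]
for piece in pieces[1:]:
    grown = pd.concat([grown, piece], ignore_index=True)

# Pattern 2: collect the pieces, then concat once. Each row is
# copied a single time -- linear total work.
batched = pd.concat(pieces, ignore_index=True)

print(len(grown), len(batched))  # 100 100
```

Both patterns produce the same dataframe; only the amount of copying differs.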