
info() for column types in Data Analysis Python - Time & Space Complexity

Time Complexity of info() for column types: O(n)
Understanding Time Complexity

We want to understand how the time to run info() on a data table changes as the table grows.

Specifically, how does checking column types scale with more rows and columns?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

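import pandas as pd

n = 10  # number of rows; increase to see how the cost scales

# Three columns of different types, each holding n values
df = pd.DataFrame({
    'A': range(n),                      # integers
    'B': [str(i) for i in range(n)],    # strings
    'C': [float(i) for i in range(n)]   # floats
})

# Print the per-column summary (dtype and non-null count)
df.info()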

This code creates a table with n rows and 3 columns, then calls info() to print each column's dtype and non-null count, along with the index range and memory usage.
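To inspect the report in code rather than reading it off the console, the output can be captured in a text buffer. This is a minimal sketch using the `buf` parameter of `DataFrame.info()`; the names `buffer` and `report` are illustrative only.

```python
import io

import pandas as pd

n = 10  # same example size as in the scenario

df = pd.DataFrame({
    'A': range(n),
    'B': [str(i) for i in range(n)],
    'C': [float(i) for i in range(n)],
})

# info() writes to sys.stdout by default; passing buf redirects the report.
buffer = io.StringIO()
df.info(buf=buffer)
report = buffer.getvalue()

print(report)
```

The captured text contains one line per column with its non-null count and dtype, which is the part of the report whose cost is analyzed here.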

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: Counting the non-null values in each column. The dtype itself is stored with the column and is read without scanning the rows.
  • How many times: For each of the 3 columns, it makes one pass over all n rows.
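The per-column work can be sketched by hand: the dtype is read from column metadata, while the non-null count needs a pass over the rows. The helper `summarize_columns` below is hypothetical and only approximates what info() reports; it is not how pandas implements it internally.

```python
import pandas as pd


def summarize_columns(df):
    """Hypothetical helper: return (name, non-null count, dtype) per column."""
    summary = []
    for name in df.columns:
        col = df[name]
        dtype = str(col.dtype)             # metadata lookup, no row scan
        non_null = int(col.notna().sum())  # one pass over all n rows
        summary.append((name, non_null, dtype))
    return summary


example = pd.DataFrame({
    'A': [1.0, None, 3.0],
    'B': ['x', 'y', None],
})

for name, non_null, dtype in summarize_columns(example):
    print(f"{name}: {non_null} non-null, dtype={dtype}")
```

The loop runs once per column, and the counting step inside it touches every row, which is where the "columns × rows" count comes from.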
How Execution Grows With Input

As the number of rows n grows, the time to check each column grows roughly in direct proportion.

Input Size (n)    Approx. Operations
10                About 30 checks (3 columns × 10 rows)
100               About 300 checks (3 columns × 100 rows)
1000              About 3000 checks (3 columns × 1000 rows)

Pattern observation: The work grows linearly with the number of rows.
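The trend can also be checked empirically. The sketch below times info() for a few table sizes; the helper `time_info` and the chosen sizes are assumptions made for illustration, and `show_counts=True` (available in pandas 1.2 and later) is passed so that non-null counting always runs. Actual numbers depend on the machine and pandas version, and for small n the fixed overhead of formatting the report tends to dominate.

```python
import io
import time

import pandas as pd


def time_info(n, repeats=3):
    """Return the best wall-clock time in seconds of df.info() for n rows."""
    df = pd.DataFrame({
        'A': range(n),
        'B': [str(i) for i in range(n)],
        'C': [float(i) for i in range(n)],
    })
    best = float('inf')
    for _ in range(repeats):
        sink = io.StringIO()  # discard the printed report
        start = time.perf_counter()
        df.info(buf=sink, show_counts=True)  # force non-null counting
        best = min(best, time.perf_counter() - start)
    return best


timings = {n: time_info(n) for n in (1_000, 10_000, 100_000)}
for n, seconds in timings.items():
    print(f"n={n:>7}: {seconds:.6f} s")
```

Taking the best of several repeats reduces noise from other processes; roughly linear growth should become visible only once n is large enough to outweigh the fixed overhead.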

Final Time Complexity

Time Complexity: O(n)

This means the time to run info() grows linearly with the number of rows, assuming the number of columns stays fixed.
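The counting behind this result can be written as a tiny cost model. The function `approx_checks` is a hypothetical illustration of the "one check per row per column" estimate used in this lesson, not a measurement of what pandas does.

```python
def approx_checks(n_rows, n_cols=3):
    """Estimated number of cell checks: one per row in each column."""
    return n_rows * n_cols


for n in (10, 100, 1000):
    print(f"n={n}: about {approx_checks(n)} checks")
```

With the column count held at 3, multiplying n by 10 multiplies the estimate by 10, which is the linear growth that O(n) describes.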

Common Mistake

[X] Wrong: "The time to run info() depends mostly on the number of columns, not rows."

[OK] Correct: While columns matter, info() checks every row in each column, so more rows mean more work.

Interview Connect

Understanding how data size affects analysis speed helps you write efficient code and explain your choices clearly.

Self-Check

"What if the DataFrame had 100 columns instead of 3? How would the time complexity change?"