
nunique() for unique counts in Pandas - Time & Space Complexity

Time Complexity of nunique() for unique counts: O(n)
Understanding Time Complexity

We want to understand how the time needed to count unique values changes as the data grows.

How does pandas' nunique() method scale with bigger data?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.

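import pandas as pd

# Eight rows holding five distinct values: 1, 2, 3, 4, 5
df = pd.DataFrame({
    'A': [1, 2, 2, 3, 4, 4, 4, 5]
})

# Count the distinct non-missing values in column 'A'
unique_count = df['A'].nunique()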

This code counts how many unique values are in column 'A' of the DataFrame. Here the result is 5. By default, nunique() does not count missing values (NaN).

Identify Repeating Operations

Identify the loops, recursion, or array traversals that repeat.

  • Primary operation: pandas passes over the column once, checking each value against the values it has already seen (a hash-table lookup that takes constant time on average).
  • How many times: once per row, so n checks for a column of n rows.
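
The single pass described above can be sketched in plain Python. This is only an illustration of the idea, not pandas' actual implementation, which runs in compiled code and also drops missing values by default. The function name count_unique is a hypothetical helper introduced for this example.

```python
def count_unique(values):
    """Count distinct values in one pass using a hash set (illustrative only)."""
    seen = set()
    for value in values:   # visits each of the n values exactly once
        seen.add(value)    # average O(1) insert; duplicates leave the set unchanged
    return len(seen)


data = [1, 2, 2, 3, 4, 4, 4, 5]
print(count_unique(data))  # 5
```

The loop body does a constant amount of work on average, so the total work is proportional to the number of values.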

How Execution Grows With Input

As the number of rows grows, the time to count unique values grows roughly in direct proportion.

Input Size (n)    Approx. Operations
10                About 10 checks
100               About 100 checks
1000              About 1000 checks

Pattern observation: The number of checks matches the number of rows, so making the data 10 times larger makes the work about 10 times larger.
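
One way to check this pattern yourself is to time nunique() on Series of increasing size. The sketch below makes several assumptions: the helper name time_nunique, the repeats and seed parameters, and the choice of random integers below 1000 are all illustrative. Actual timings depend on your machine, dtype, and pandas version, and at small sizes fixed overhead can hide the linear trend.

```python
import time

import numpy as np
import pandas as pd


def time_nunique(n, repeats=5, seed=0):
    """Return the best-of-`repeats` wall time (seconds) for nunique() on n rows."""
    rng = np.random.default_rng(seed)
    series = pd.Series(rng.integers(0, 1000, size=n))  # random ints in [0, 1000)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        series.nunique()
        best = min(best, time.perf_counter() - start)
    return best


for n in (10_000, 100_000, 1_000_000):
    print(f"n={n:>9,}: {time_nunique(n):.6f} s")
```

If the growth is linear, each tenfold increase in n should raise the measured time by roughly a factor of ten once n is large enough.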

Final Time Complexity

Time Complexity: O(n)

This means the time to count unique values grows in direct proportion to the number of rows: doubling the rows roughly doubles the time.

Common Mistake

[X] Wrong: "Counting unique values is instant no matter how big the data is."

[OK] Correct: pandas must look at each value to know if it is new or repeated, so bigger data takes more time.

Interview Connect

Understanding how counting unique values scales helps you explain data processing speed clearly and confidently.

Self-Check

"What if we used nunique() on multiple columns at once? How would the time complexity change?"
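
A starting point for exploring this question is DataFrame.nunique(), which returns one count per column. The second column 'B' below is made up for the example. A reasonable hypothesis to test by timing is that the work grows with both the number of rows and the number of columns, since each column is counted separately.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 2, 3, 4, 4, 4, 5],
    'B': ['x', 'y', 'x', 'y', 'x', 'y', 'x', 'y'],  # hypothetical second column
})

# Calling nunique() on the DataFrame gives a Series indexed by column name
per_column = df.nunique()
print(per_column['A'])  # 5
print(per_column['B'])  # 2
```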