
nunique() for cardinality in Data Analysis Python - Step-by-Step Execution

Concept Flow - nunique() for cardinality
Start with DataFrame
Call nunique()
Count unique values per column
Return counts as Series
Use counts for cardinality analysis
The nunique() method counts the distinct values in each column of a DataFrame, which tells us each column's cardinality.
Execution Sample
import pandas as pd

data = {'A': [1, 2, 2, 3], 'B': ['x', 'y', 'x', 'z']}
df = pd.DataFrame(data)
unique_counts = df.nunique()
This code creates a DataFrame and uses nunique() to count unique values in each column.
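To see what the returned Series looks like, the sample can be extended as in the sketch below. It reuses the same data; the names count_a and count_b are introduced here only for illustration.

```python
import pandas as pd

data = {'A': [1, 2, 2, 3], 'B': ['x', 'y', 'x', 'z']}
df = pd.DataFrame(data)
unique_counts = df.nunique()

# The result is a Series whose index holds the column names
print(unique_counts)

# Individual counts can be looked up by column label
# (count_a and count_b are illustrative names, not part of the sample)
count_a = int(unique_counts['A'])  # 3 distinct values: 1, 2, 3
count_b = int(unique_counts['B'])  # 3 distinct values: x, y, z
```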
Execution Table
Step | Action | DataFrame State | nunique() Result | Explanation
1 | Create DataFrame | {'A':[1,2,2,3], 'B':['x','y','x','z']} | N/A | DataFrame with 4 rows and 2 columns created
2 | Call df.nunique() | Same DataFrame | A: 3, B: 3 | Counts unique values: A has 3 unique (1,2,3), B has 3 unique (x,y,z)
3 | Store result in unique_counts | Same DataFrame | unique_counts = A:3, B:3 | Result stored as Series for further use
💡 All columns processed, unique counts returned as Series
Variable Tracker
Variable | Start | After Step 1 | After Step 2 | Final
df | None | {'A':[1,2,2,3], 'B':['x','y','x','z']} | Same | Same
unique_counts | None | None | A: 3, B: 3 | A: 3, B: 3
Key Moments - 2 Insights
Why does column 'A' show 3 unique values even though it has 4 rows?
Because nunique() counts distinct values, not rows. The value 2 appears twice, so the distinct values are 1, 2, and 3 (3 in total). See Step 2 of the Execution Table.
Does nunique() count missing values (NaN) as unique?
No, not by default. nunique() has a dropna parameter that defaults to True, so NaN is left out of the count. Passing dropna=False counts NaN as one additional distinct value.
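The NaN behaviour can be checked with a small sketch. The DataFrame df_nan and its values are made up for this illustration and are not part of the sample above.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value
df_nan = pd.DataFrame({'A': [1, 2, 2, np.nan]})

# Default (dropna=True): NaN is ignored, so only 1 and 2 are counted
default_count = int(df_nan['A'].nunique())

# dropna=False: NaN is counted as one extra distinct value
with_nan_count = int(df_nan['A'].nunique(dropna=False))

print(default_count, with_nan_count)
```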
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at Step 2, what is the unique count for column 'B'?
A. 3
B. 2
C. 4
D. 1
💡 Hint
Check the 'nunique() Result' column in execution_table row 2.
At which step is the unique count stored in a variable?
A. Step 1
B. Step 2
C. Step 3
D. After all steps
💡 Hint
Look at the 'Action' column in execution_table to see when unique_counts is assigned.
If column 'A' had a missing value (NaN), how would nunique() count change?
A. It would count NaN as a unique value
B. It would ignore NaN and count only actual unique values
C. It would count NaN twice
D. It would cause an error
💡 Hint
Recall the key moment about how nunique() treats NaN values.
Concept Snapshot
nunique() counts unique values per DataFrame column
Returns a Series with counts
Ignores NaN by default
Useful for cardinality analysis
Syntax: df.nunique()
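Beyond the basic call, two related forms may be useful: calling nunique() on a single column (a Series), and passing axis=1 to count across each row instead. The sketch below reuses the sample data; a_count and row_counts are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3], 'B': ['x', 'y', 'x', 'z']})

# Series.nunique() returns a single integer for one column
a_count = df['A'].nunique()

# axis=1 counts distinct values across each row rather than down each column
row_counts = df.nunique(axis=1)

print(a_count)
print(row_counts)
```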
Full Transcript
We start with a DataFrame containing columns A and B. Calling df.nunique() counts how many distinct values each column has. Column A holds 1, 2, 2, 3, so its unique count is 3 because 2 repeats. Column B holds 'x', 'y', 'x', 'z', so its unique count is also 3 because 'x' repeats. The result is stored in unique_counts as a Series indexed by column name. Note that nunique() ignores missing values (NaN) by default. The number of distinct values in a column is called its cardinality, and these counts tell us how varied the data in each column is.
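One way to put the counts to work in a cardinality analysis is to compare them with the number of rows. The sketch below is only an illustration under stated assumptions: the ratio heuristic and the HIGH_CARDINALITY_THRESHOLD cutoff are not part of the lesson, and a suitable cutoff depends on the dataset.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3], 'B': ['x', 'y', 'x', 'z']})

unique_counts = df.nunique()

# Distinct values per column divided by the number of rows
cardinality_ratio = unique_counts / len(df)

# Hypothetical cutoff chosen for illustration only
HIGH_CARDINALITY_THRESHOLD = 0.9

# Columns whose ratio exceeds the cutoff (none in this small example)
high_cardinality_cols = cardinality_ratio[
    cardinality_ratio > HIGH_CARDINALITY_THRESHOLD
].index.tolist()

print(cardinality_ratio)
print(high_cardinality_cols)
```

A ratio near 1 suggests an identifier-like column, while a low ratio suggests a categorical one.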