How to Use nunique in pandas to Count Unique Values
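Use nunique() in pandas to count the number of unique values in a Series or in each column of a DataFrame. Called on a Series, it returns an integer; called on a DataFrame, it returns a Series of counts, one per column (or one per row with axis=1). Missing values are ignored by default; pass dropna=False to count them.
Syntax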
The basic syntax of nunique() is:
- Series.nunique(dropna=True): Counts unique values in a Series, ignoring NaN by default.
- DataFrame.nunique(axis=0, dropna=True): Counts unique values per column (axis=0) or per row (axis=1), ignoring NaN by default.
Parameters:
- dropna: Boolean, default True. If True, NaN values are not counted as unique.
- axis: For DataFrame, 0 counts unique values per column, 1 per row.
python
Series.nunique(dropna=True)
DataFrame.nunique(axis=0, dropna=True)
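To make the axis parameter concrete, here is a minimal sketch that runs nunique() in both directions. The DataFrame and its column names (x, y, z) are made up for illustration only.

```python
import pandas as pd

# Hypothetical sample data, chosen only to illustrate the axis parameter
df = pd.DataFrame({
    "x": [1, 1, 2],
    "y": [1, 2, 2],
    "z": [1, 3, 2],
})

# axis=0 (the default): one count per column
per_column = df.nunique()

# axis=1: one count per row
per_row = df.nunique(axis=1)

print(per_column.to_dict())  # {'x': 2, 'y': 2, 'z': 3}
print(per_row.to_dict())     # {0: 1, 1: 3, 2: 1}
```

With axis=0 the result is indexed by column name; with axis=1 it is indexed by the row labels of the DataFrame.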
Example
This example shows how to use nunique() on a pandas DataFrame to count unique values in each column and in a Series.
python
import pandas as pd

data = {
    'A': [1, 2, 2, 3, 4, 4, None],
    'B': ['x', 'y', 'x', 'z', 'y', 'y', 'z'],
    'C': [True, True, False, False, True, None, None],
}
df = pd.DataFrame(data)

# Count unique values per column
unique_counts = df.nunique()

# Count unique values in column 'B'
unique_in_B = df['B'].nunique()

unique_counts, unique_in_B
Output
(A    4
B    3
C    2
dtype: int64, 3)
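Because df.nunique() returns an ordinary Series indexed by column name, the counts can be looked up or compared like any other Series. The sketch below reuses the data from the example; the variable names (counts, a_count, most_varied, subset_counts) are illustrative choices, not part of the pandas API.

```python
import pandas as pd

# Same data as in the example above
data = {
    'A': [1, 2, 2, 3, 4, 4, None],
    'B': ['x', 'y', 'x', 'z', 'y', 'y', 'z'],
    'C': [True, True, False, False, True, None, None],
}
df = pd.DataFrame(data)

counts = df.nunique()

# Look up the count for a single column by its label
a_count = int(counts['A'])

# Find the column with the most distinct values
most_varied = counts.idxmax()

# Count unique values for a subset of columns only
subset_counts = df[['A', 'B']].nunique()

print(a_count)                  # 4
print(most_varied)              # A
print(subset_counts.to_dict())  # {'A': 4, 'B': 3}
```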
Common Pitfalls
Common mistakes when using nunique() include:
- Not realizing that dropna=True is the default, so missing values are not counted unless you set dropna=False.
- Calling nunique() on a whole DataFrame and expecting a single overall count. It returns one count per column (axis=0 by default), not one number for the entire DataFrame.
- Expecting nunique() to return the unique values themselves instead of a count.
python
import pandas as pd

s = pd.Series([1, 2, 2, None, None])

# Wrong: expecting NaN to be counted
count_with_nan = s.nunique(dropna=True)  # returns 2

# Right: count NaN as unique
count_with_nan_correct = s.nunique(dropna=False)  # returns 3

count_with_nan, count_with_nan_correct
Output
(2, 3)
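The other two pitfalls can be sketched the same way. The DataFrame below is hypothetical, and flattening the cells into a single Series is just one possible way to get an overall count; it is an assumption of this sketch, not something nunique() does on its own.

```python
import pandas as pd

# Hypothetical sample data for illustration
df = pd.DataFrame({"a": [1, 2, 2], "b": [2, 3, None]})

# unique() returns the distinct values; nunique() returns only how many there are
distinct_values = df["a"].unique()
distinct_count = df["a"].nunique()

# df.nunique() counts per column, so the same value in two columns is counted twice
per_column = df.nunique()

# For a single overall count, flatten every cell into one Series first
overall = pd.Series(df.to_numpy().ravel()).nunique()

print(distinct_values.tolist())  # [1, 2]
print(distinct_count)            # 2
print(per_column.to_dict())      # {'a': 2, 'b': 2}
print(overall)                   # 3
```

Here the per-column counts add up to 4, but only three distinct values (1, 2, and 3) appear across the whole DataFrame, because 2 occurs in both columns.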
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| dropna | Whether to exclude NaN values from count | True |
| axis | Axis to count unique values on (0=columns, 1=rows) | 0 (DataFrame only) |
Key Takeaways
- Use nunique() to count distinct values in pandas Series or DataFrame columns.
- By default, nunique() ignores missing (NaN) values unless dropna=False is set.
- For DataFrames, nunique() counts unique values per column (axis=0) or per row (axis=1).
- Remember that nunique() returns counts, not the unique values themselves.
- Use dropna=False to include NaN as a unique value in the count.