How to Find Unique Values in pandas DataFrame or Series
To find unique values in pandas, use the unique() method on a Series or DataFrame column. It returns an array of the distinct values, with duplicates removed.
Syntax
The unique() method is called on a pandas Series or DataFrame column to get unique values.
- Series.unique(): Returns unique values in the Series as a NumPy array.
- DataFrame['column'].unique(): Returns unique values from a specific column.
python
unique_values = df['column_name'].unique()
Example
This example shows how to find unique values in a DataFrame column using unique(). It prints the unique values found.
python
import pandas as pd

data = {'fruits': ['apple', 'banana', 'apple', 'orange', 'banana', 'kiwi']}
df = pd.DataFrame(data)

unique_fruits = df['fruits'].unique()
print(unique_fruits)
Output
['apple' 'banana' 'orange' 'kiwi']
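The output above lists values in the order they first appear, not in sorted order. The same holds when unique() is called directly on a standalone Series. A minimal sketch, assuming a hypothetical Series named s that is not part of the example above:

```python
import pandas as pd

# Hypothetical standalone Series, used only for illustration
s = pd.Series([3, 1, 3, 2, 1])

# unique() keeps the order of first appearance; it does not sort
unique_values = s.unique()

# Sort explicitly when a sorted result is needed
sorted_values = sorted(unique_values.tolist())

print(unique_values.tolist())  # [3, 1, 2]
print(sorted_values)           # [1, 2, 3]
```

Sorting is left as a separate, explicit step here so the original order stays available when it matters.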
Common Pitfalls
Common mistakes when finding unique values include:
- Calling unique() on a DataFrame instead of a Series or column, which raises an AttributeError.
- Expecting unique() to return a list instead of a NumPy array.
- Not handling missing values, which appear as nan or None in the output.
python
import pandas as pd

data = {'fruits': ['apple', 'banana', 'apple', None, 'banana', 'kiwi']}
df = pd.DataFrame(data)

# Wrong: calling unique on DataFrame
# unique_values = df.unique()  # This will raise an AttributeError

# Correct: call unique on a column
unique_values = df['fruits'].unique()
print(unique_values)
Output
['apple' 'banana' None 'kiwi']
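The other two pitfalls, the missing value in the output and the array return type, can be handled in one short step. The following is a sketch of one possible approach, reusing the same fruits data; the variable names unique_array and unique_list are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'fruits': ['apple', 'banana', 'apple', None, 'banana', 'kiwi']})

# Drop missing values first so they do not show up among the unique values
unique_array = df['fruits'].dropna().unique()

# Convert the array to a plain Python list when a list is required
unique_list = unique_array.tolist()

print(unique_list)  # ['apple', 'banana', 'kiwi']
```

Whether to drop missing values depends on the analysis: if the presence of missing data is itself of interest, keep them and skip the dropna() call.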
Quick Reference
Summary tips for finding unique values in pandas:
- Use Series.unique() to get unique values as a NumPy array.
- Use DataFrame['column'].unique() to find unique values in a specific column.
- To count unique values, use nunique().
- Missing values appear as nan or None in the output.
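As a short illustration of counting with nunique(), the sketch below reuses the fruits data from the pitfalls example. By default nunique() leaves missing values out of the count; passing dropna=False includes them. The variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'fruits': ['apple', 'banana', 'apple', None, 'banana', 'kiwi']})

# Missing values are excluded from the count by default
count_without_missing = df['fruits'].nunique()

# dropna=False counts the missing value as one more distinct value
count_with_missing = df['fruits'].nunique(dropna=False)

print(count_without_missing)  # 3
print(count_with_missing)     # 4
```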
Key Takeaways
Use the unique() method on a pandas Series or DataFrame column to get unique values.
unique() returns a NumPy array of distinct values including None or nan if present.
Do not call unique() directly on a DataFrame; select a column first.
Use nunique() to count the number of unique values quickly.
Handle missing values carefully as they appear in the unique values output.