How to Use value_counts in pandas with Python
Use value_counts() on a pandas Series to count unique values and their frequencies. It returns a Series sorted by count in descending order by default, so you can quickly see how often each value appears.
Syntax
The basic syntax of value_counts() is:
Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Explanation of parameters:
- normalize: If True, returns relative frequencies (proportions) instead of counts.
- sort: If True, sorts the result by count.
- ascending: Sort order of the counts; False means descending.
- bins: If set, groups numeric values into equal-width bins instead of counting each distinct value.
- dropna: If True, excludes NaN values from the result.
python
series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
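As a rough illustration of the parameters listed above, the following sketch applies normalize, ascending, and bins to small made-up Series. The variable names and sample values are assumptions chosen for the demonstration and do not come from any real dataset.

```python
import pandas as pd

# Hypothetical sample data for illustration
fruits = pd.Series(["apple", "banana", "apple", "orange", "banana", "banana"])

# normalize=True returns proportions (summing to 1) instead of raw counts
proportions = fruits.value_counts(normalize=True)
print(proportions)

# ascending=True lists the least frequent value first
rarest_first = fruits.value_counts(ascending=True)
print(rarest_first)

# bins applies to numeric data only: values are grouped into
# equal-width intervals and each interval is counted
ages = pd.Series([1, 2, 3, 10, 11, 20])
binned = ages.value_counts(bins=2, sort=False)
print(binned)
```

With bins, the index of the result holds intervals rather than the original values, so it is best suited to getting a quick histogram-like summary.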
Example
This example shows how to count unique values in a pandas Series using value_counts(). It also demonstrates sorting and including NaN values.
python
import pandas as pd

# Create a Series with repeated and NaN values
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'banana', None])

# Count unique values excluding NaN
counts = data.value_counts()

# Count unique values including NaN
counts_with_nan = data.value_counts(dropna=False)

print("Counts excluding NaN:")
print(counts)
print("\nCounts including NaN:")
print(counts_with_nan)
Output
Counts excluding NaN:
banana 3
apple 2
orange 1
dtype: int64
Counts including NaN:
banana 3
apple 2
orange 1
NaN 1
dtype: int64
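The example above relies on the default ordering. As a small sketch, reusing the same data, the ordering and the effect of dropna can also be checked programmatically. Using sort_index() to order by label is one possible choice here, not something the example above requires.

```python
import pandas as pd

data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'banana', None])

counts = data.value_counts()

# Default order is by count, so the first label is the most frequent value
most_common = counts.index[0]
print(most_common)

# sort_index() reorders the result by label instead of by count
by_label = counts.sort_index()
print(by_label)

# dropna=False adds one entry for missing values,
# so the total then matches the length of the Series
counts_with_nan = data.value_counts(dropna=False)
print(counts.sum(), counts_with_nan.sum(), len(data))
```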
Common Pitfalls
Common mistakes when using value_counts() include:
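- Assuming that value_counts() on a DataFrame counts the values of each column. In pandas 1.1 and later, DataFrame.value_counts() counts unique rows; in older versions the method does not exist on DataFrames. Select a column (a Series) to count its values.
- Not handling NaN values properly; by default, they are excluded.
- Expecting the result to be unsorted; by default it is sorted by count in descending order.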
Example of a wrong and right way:
python
import pandas as pd

df = pd.DataFrame({'fruits': ['apple', 'banana', 'apple']})

# Wrong for per-column counts: on a DataFrame, value_counts() counts unique rows
# (pandas 1.1+; older versions raise AttributeError because the method does not exist)
# df.value_counts()

# Right: call value_counts on a Series (a column)
counts = df['fruits'].value_counts()
print(counts)
Output
apple 2
banana 1
dtype: int64
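To make the difference between the two calls concrete, here is a minimal sketch assuming pandas 1.1 or later, where DataFrame.value_counts() is available. The column names fruit and color and their values are hypothetical.

```python
import pandas as pd

# Hypothetical two-column DataFrame for illustration
df = pd.DataFrame({
    "fruit": ["apple", "banana", "apple", "apple"],
    "color": ["red", "yellow", "green", "red"],
})

# Series.value_counts(): frequency of each value in a single column
per_column = df["fruit"].value_counts()
print(per_column)

# DataFrame.value_counts() (pandas >= 1.1): frequency of each unique row,
# indexed by the combination of column values
per_row = df.value_counts()
print(per_row)

# One way to get per-column counts for every column
for name in df.columns:
    print(df[name].value_counts())
```

Here apple appears three times in the fruit column, while the row (apple, red) appears only twice, which is why the two results should not be confused.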
Quick Reference
Summary tips for using value_counts():
- Use on a pandas Series to count unique values.
- Set normalize=True to get proportions instead of counts.
- Use dropna=False to include missing values in counts.
- Use bins to group numeric data into intervals.
- Remember it returns a Series sorted by count descending by default.
Key Takeaways
Use value_counts() on a pandas Series to count unique values and their frequencies.
By default, value_counts() excludes NaN values and sorts counts descending.
Set normalize=True to get relative frequencies instead of counts.
To count the values in a DataFrame column, call value_counts() on that column (a Series); DataFrame.value_counts() counts unique rows instead.
Use dropna=False to include NaN values in the count results.