How to Use value_counts in pandas for Data Analysis
Use value_counts() on a pandas Series to count the frequency of each unique value. It returns a Series sorted by counts in descending order by default, helping you quickly see the distribution of data.
Syntax
The basic syntax of value_counts() is:
Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Here’s what each parameter means:
- normalize: If True, returns relative frequencies (proportions) instead of counts.
- sort: Whether to sort by counts (default True).
- ascending: Sort ascending if True, descending if False (default False).
- bins: Group numeric values into equal-width bins instead of counting unique values.
- dropna: Whether to exclude NaN values (default True).
```python
series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
```
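The sort, ascending, and bins parameters are easier to see on numeric data. The following is a minimal sketch; the ages Series and the variable names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
ages = pd.Series([22, 25, 25, 31, 38, 47, 47, 47])

# ascending=True lists the least frequent values first
rarest_first = ages.value_counts(ascending=True)

# sort=False skips sorting by count
unsorted = ages.value_counts(sort=False)

# bins=3 splits the numeric range into 3 equal-width intervals
# and counts how many values fall into each interval
binned = ages.value_counts(bins=3)

print(rarest_first)
print(unsorted)
print(binned)
```

With bins set, the index of the result holds intervals rather than the original values, so this option only applies to numeric data.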
Example
This example shows how to count unique values in a pandas Series and get their frequencies.
```python
import pandas as pd

# Create a sample Series
colors = pd.Series(['red', 'blue', 'red', 'green', 'blue', 'blue', 'red', None])

# Count unique values
counts = colors.value_counts()

# Count unique values including NaN
counts_with_nan = colors.value_counts(dropna=False)

# Get relative frequencies
relative_freq = colors.value_counts(normalize=True)

print('Counts without NaN:')
print(counts)
print('\nCounts including NaN:')
print(counts_with_nan)
print('\nRelative frequencies:')
print(relative_freq)
```
Output
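Counts without NaN:
blue 3
red 3
green 1
dtype: int64

Counts including NaN:
blue 3
red 3
green 1
NaN 1
dtype: int64

Relative frequencies:
blue 0.428571
red 0.428571
green 0.142857
dtype: float64
Because dropna=True is the default, the relative frequencies are computed over the 7 non-missing values (for example, 3/7 ≈ 0.428571), not over all 8 entries.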
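The interaction between normalize and dropna is easy to miss: the proportions are taken over whatever values are actually counted. As a small sketch reusing the colors Series from the example (the variable names share_of_known and share_of_all are illustrative), the two settings can be compared side by side:

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'red', 'green', 'blue', 'blue', 'red', None])

# Default dropna=True: the missing value is dropped first,
# so each share is computed out of the 7 remaining values
share_of_known = colors.value_counts(normalize=True)

# dropna=False: the missing value is counted as its own category,
# so each share is computed out of all 8 entries
share_of_all = colors.value_counts(normalize=True, dropna=False)

print(share_of_known['red'])  # 3 out of 7
print(share_of_all['red'])    # 3 out of 8
```

In both cases the proportions sum to 1; only the denominator changes.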
Common Pitfalls
Some common mistakes when using value_counts() include:
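- Assuming that value_counts() on a DataFrame counts the values in each column. DataFrame.value_counts() exists since pandas 1.1, but it counts unique rows; in older versions it is not available at all.
- Not handling NaN values properly; by default, they are excluded.
- Expecting the result to be sorted ascending by default (it sorts descending by default).
- Expecting one call to count several columns independently. Series.value_counts() works on one Series at a time, so call it once per column.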
```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'x'], 'B': [1, 2, 1]})

# Wrong: calling value_counts on DataFrame
# df.value_counts()  # This works in pandas 1.1+, but returns counts of rows, not column values

# Right: call value_counts on a single column
counts_A = df['A'].value_counts()
print(counts_A)
```
Output
x 2
y 1
Name: A, dtype: int64
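To make the difference between row counts and column counts concrete, here is a short sketch using the same DataFrame. It assumes pandas 1.1 or later, where DataFrame.value_counts() is available; the dictionary per_column is just one illustrative way to collect per-column results.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'x'], 'B': [1, 2, 1]})

# DataFrame.value_counts() (pandas 1.1+) counts unique rows,
# i.e. combinations of values across columns A and B
row_counts = df.value_counts()

# To count values column by column, call value_counts() on each Series
per_column = {col: df[col].value_counts() for col in df.columns}

print(row_counts)
print(per_column['A'])
print(per_column['B'])
```

The row-level result is indexed by (A, B) pairs, while each entry of per_column is an ordinary single-column count.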
Quick Reference
| Parameter | Description | Default |
|---|---|---|
| normalize | Return relative frequencies instead of counts | False |
| sort | Sort by counts | True |
| ascending | Sort ascending if True, descending if False | False |
| bins | Group values into equal-width bins | None |
| dropna | Exclude NaN values if True | True |
Key Takeaways
Use value_counts() on a pandas Series to count unique values and their frequencies.
By default, value_counts() excludes NaN values and sorts counts descending.
Set normalize=True to get relative frequencies instead of counts.
To count the values in a column, call value_counts() on that column's Series; calling it on a whole DataFrame counts unique rows instead.
Use dropna=False to include NaN values in the counts.