
How to Use value_counts in pandas for Data Analysis

Use value_counts() on a pandas Series to count the frequency of each unique value. It returns a Series sorted by counts in descending order by default, helping you quickly see the distribution of data.
📐

Syntax

The basic syntax of value_counts() is:

  • Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)

Here’s what each part means:

  • normalize: If True, returns relative frequencies instead of counts.
  • sort: Whether to sort by counts (default True).
  • ascending: Sort ascending if True, descending if False.
  • bins: Group values into intervals (bins) and count per interval instead of per unique value; works only with numeric data.
  • dropna: If True (the default), exclude NaN values from the counts.
python
series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
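The less obvious parameters can be sketched as follows. This is a minimal illustration with made-up sample data (the `ages` and `letters` Series are hypothetical and not part of the original example); it shows `bins` grouping numeric values into intervals and `ascending` reversing the sort order.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
ages = pd.Series([22, 25, 31, 38, 45, 52, 58, 63])

# bins=3 splits the observed range into three equal-width intervals
# and counts how many values fall into each one
binned = ages.value_counts(bins=3, sort=False)

# ascending=True puts the least frequent value first
letters = pd.Series(['a', 'b', 'b', 'c', 'c', 'c'])
rarest_first = letters.value_counts(ascending=True)

print(binned)
print(rarest_first)
```

Note that `bins` raises a TypeError on non-numeric data, so it is only useful for numeric Series.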
💻

Example

This example shows how to count unique values in a pandas Series and get their frequencies.

python
import pandas as pd

# Create a sample Series
colors = pd.Series(['red', 'blue', 'red', 'green', 'blue', 'blue', 'red', None])

# Count unique values
counts = colors.value_counts()

# Count unique values including NaN
counts_with_nan = colors.value_counts(dropna=False)

# Get relative frequencies
relative_freq = colors.value_counts(normalize=True)

print('Counts without NaN:')
print(counts)
print('\nCounts including NaN:')
print(counts_with_nan)
print('\nRelative frequencies:')
print(relative_freq)
Output
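Counts without NaN:
blue     3
red      3
green    1
dtype: int64

Counts including NaN:
blue     3
red      3
green    1
NaN      1
dtype: int64

Relative frequencies:
blue     0.428571
red      0.428571
green    0.142857
dtype: float64

The relative frequencies are computed over the 7 non-missing values (3/7 ≈ 0.429), because NaN is dropped by default. Exact formatting depends on the pandas version: tied values may appear in a different order, and pandas 2.0+ adds Name: count or Name: proportion to the last line.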
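Because normalize=True divides each count by the total of the counts that are kept, the dropna setting changes the denominator. A small sketch reusing the same colors Series (the variable names `share_of_known` and `share_of_all` are illustrative):

```python
import pandas as pd

colors = pd.Series(['red', 'blue', 'red', 'green', 'blue', 'blue', 'red', None])

# Missing value excluded (default): each share is count / 7
share_of_known = colors.value_counts(normalize=True)

# Missing value counted: each share is count / 8
share_of_all = colors.value_counts(normalize=True, dropna=False)

print(share_of_known['red'])  # 3 / 7
print(share_of_all['red'])    # 3 / 8
```

Which variant is appropriate depends on whether missing entries should count toward the total in your analysis.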
⚠️

Common Pitfalls

Some common mistakes when using value_counts() include:

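  • Calling value_counts() on a whole DataFrame when you want per-column counts. DataFrame.value_counts() exists in pandas 1.1+, but it counts unique rows (combinations of column values), not the values in each column.
  • Not handling NaN values properly; by default, they are excluded.
  • Expecting the result to be sorted ascending by default (it sorts descending by default).
  • Expecting Series.value_counts() to handle several columns at once. It works on one Series at a time, so select a single column first.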
python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'x'], 'B': [1, 2, 1]})

# Not what we want here: DataFrame.value_counts() (pandas 1.1+)
# counts unique rows, not the values of a single column
# df.value_counts()

# Right: call value_counts on a single column
counts_A = df['A'].value_counts()
print(counts_A)
Output
x    2
y    1
Name: A, dtype: int64
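To make the difference concrete, here is a hedged sketch contrasting row counts with per-column counts on the same DataFrame. It assumes pandas 1.1 or later for `DataFrame.value_counts()`; the `row_counts` and `per_column` names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': ['x', 'y', 'x'], 'B': [1, 2, 1]})

# DataFrame.value_counts() (pandas 1.1+) counts unique rows,
# so the result is indexed by (A, B) combinations
row_counts = df.value_counts()

# Per-column counts: call Series.value_counts() on each column in turn
per_column = {col: df[col].value_counts() for col in df.columns}

print(row_counts)
print(per_column['B'])
```

The dictionary comprehension is just one way to loop over columns; it keeps each column's counts as a separate Series.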
📊

Quick Reference

Parameter | Description | Default
normalize | Return relative frequencies instead of counts | False
sort | Sort by counts | True
ascending | Sort ascending if True, descending if False | False
bins | Group values into equal-width bins | None
dropna | Exclude NaN values if True | True

Key Takeaways

Use value_counts() on a pandas Series to count unique values and their frequencies.
By default, value_counts() excludes NaN values and sorts counts descending.
Set normalize=True to get relative frequencies instead of counts.
For per-column counts, call value_counts() on one Series at a time; DataFrame.value_counts() counts unique rows instead.
Use dropna=False to include NaN values in the counts.