
How to Use value_counts in pandas with Python

Use value_counts() on a pandas Series to count how often each unique value appears. By default it returns a Series sorted by count in descending order, so the most frequent values come first.

Syntax

The basic syntax of value_counts() is:

  • Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)

Explanation of parameters:

  • normalize: If True, returns relative frequencies instead of counts.
  • sort: Sorts the result by counts if True.
  • ascending: Sort order of counts; False means descending.
  • bins: If set, groups numeric values into bins and counts per bin instead of per value; an integer gives that many equal-width bins.
  • dropna: Excludes NaN values if True.
python
series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
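To make the normalize and ascending parameters concrete, here is a minimal sketch. The colors Series is made-up sample data, not part of the original article.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
colors = pd.Series(["red", "blue", "red", "green", "red", "blue"])

# normalize=True returns each value's share of the total instead of a raw count
shares = colors.value_counts(normalize=True)

# ascending=True puts the least frequent value first
rarest_first = colors.value_counts(ascending=True)

print(shares)
print(rarest_first)
```

The shares add up to 1, which makes normalize=True a convenient way to read the result as proportions.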

Example

This example shows how to count unique values in a pandas Series using value_counts(). It also shows how dropna=False includes missing values in the result.

python
import pandas as pd

# Create a Series with repeated and NaN values
data = pd.Series(['apple', 'banana', 'apple', 'orange', 'banana', 'banana', None])

# Count unique values excluding NaN
counts = data.value_counts()

# Count unique values including NaN
counts_with_nan = data.value_counts(dropna=False)

print("Counts excluding NaN:")
print(counts)
print("\nCounts including NaN:")
print(counts_with_nan)
Output
Counts excluding NaN:
banana    3
apple     2
orange    1
dtype: int64

Counts including NaN:
banana    3
apple     2
orange    1
NaN       1
dtype: int64

Common Pitfalls

Common mistakes when using value_counts() include:

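  • Assuming DataFrame.value_counts() counts a single column. It has existed since pandas 1.1, but it counts unique rows; older versions raise an AttributeError. Select a column (a Series) to count its values.
  • Forgetting that NaN values are excluded by default; pass dropna=False to count them.
  • Expecting the result to keep the original order; by default it is sorted by count, descending.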

Example of counting one column of a DataFrame:

python
import pandas as pd

df = pd.DataFrame({'fruits': ['apple', 'banana', 'apple']})

# Easy to misread: on a DataFrame, value_counts() counts unique rows
# (pandas 1.1+) and raises AttributeError on older versions.
# df.value_counts()

# Right: call value_counts on a Series (a column)
counts = df['fruits'].value_counts()
print(counts)
Output
apple     2
banana    1
dtype: int64
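Since pandas 1.1, DataFrame.value_counts() is available, and it counts unique rows rather than the values of one column. The sketch below contrasts the two calls. The two-column DataFrame is invented for illustration and assumes pandas 1.1 or later.

```python
import pandas as pd

# Hypothetical two-column data, used only to show the difference
df = pd.DataFrame({
    "fruit": ["apple", "banana", "apple", "apple"],
    "color": ["red", "yellow", "green", "red"],
})

# On the DataFrame: counts each unique (fruit, color) row (needs pandas >= 1.1)
row_counts = df.value_counts()

# On one column: counts each unique fruit, ignoring color
fruit_counts = df["fruit"].value_counts()

print(row_counts)
print(fruit_counts)
```

Here the row ("apple", "red") appears twice, while "apple" on its own appears three times, so the two calls answer different questions.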

Quick Reference

Summary tips for using value_counts():

  • Use on a pandas Series to count unique values.
  • Set normalize=True to get proportions instead of counts.
  • Use dropna=False to include missing values in counts.
  • Use bins to group numeric data into intervals.
  • Remember it returns a Series sorted by count descending by default.
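As a rough illustration of the bins tip, the sketch below groups a numeric Series into four equal-width intervals. The ages values are made up, and sort=False is used here only to keep the intervals in numeric order rather than ordered by count.

```python
import pandas as pd

# Hypothetical numeric data, chosen only for illustration
ages = pd.Series([3, 5, 8, 17, 25, 31, 42, 79])

# bins=4 splits the range 3..79 into four equal-width intervals;
# sort=False lists the intervals in numeric order instead of by count
binned = ages.value_counts(bins=4, sort=False)

print(binned)
```

The index of the result holds intervals rather than the original values, and every value in ages falls into exactly one of them.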

Key Takeaways

Use value_counts() on a pandas Series to count unique values and their frequencies.
By default, value_counts() excludes NaN values and sorts counts descending.
Set normalize=True to get relative frequencies instead of counts.
To count the values in one column, call value_counts() on that Series; DataFrame.value_counts() (pandas 1.1+) counts unique rows instead.
Use dropna=False to include NaN values in the count results.