How to Find Frequency of Values in pandas DataFrame or Series
Use the `value_counts()` method on a pandas Series or DataFrame column to find the frequency of each unique value. It returns a Series with the unique values as the index and their counts as the data.
Syntax
The basic syntax for finding the frequency of values in pandas is:
`Series.value_counts(normalize=False, sort=True, ascending=False, dropna=True)`
Use `DataFrame['column_name'].value_counts()` to get the frequencies for a specific column.
Parameters explained:
- `normalize`: If True, return relative frequencies (proportions) instead of counts.
- `sort`: Sort by frequency (default True).
- `ascending`: Sort in ascending order if True (default False).
- `dropna`: Exclude NaN values if True (default).
```python
# Syntax for a Series
series.value_counts(normalize=False, sort=True, ascending=False, dropna=True)

# Example for a DataFrame column
DataFrame['column_name'].value_counts()
```
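The `sort` and `ascending` parameters control the ordering of the result; a quick sketch with made-up data:

```python
import pandas as pd

s = pd.Series(['a', 'b', 'a', 'c', 'a', 'b'])

# Default: sorted by count, most frequent first
print(s.value_counts())
# 'a' appears 3 times, 'b' twice, 'c' once

# Flip the order with ascending=True
print(s.value_counts(ascending=True))
```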
Example
This example shows how to find the frequency of values in a pandas Series and a DataFrame column.
```python
import pandas as pd

# Create a sample DataFrame
data = {'fruits': ['apple', 'banana', 'apple', 'orange',
                   'banana', 'banana', 'apple', None]}
df = pd.DataFrame(data)

# Frequency of values in the 'fruits' column
freq = df['fruits'].value_counts()

# Frequency including NaN values
freq_with_nan = df['fruits'].value_counts(dropna=False)

# Relative frequency
rel_freq = df['fruits'].value_counts(normalize=True)

print('Frequency of fruits:')
print(freq)
print('\nFrequency including NaN:')
print(freq_with_nan)
print('\nRelative frequency:')
print(rel_freq)
```
Output
Frequency of fruits:
apple 3
banana 3
orange 1
Name: fruits, dtype: int64
Frequency including NaN:
apple 3
banana 3
orange 1
NaN 1
Name: fruits, dtype: int64
Relative frequency:
apple 0.428571
banana 0.428571
orange 0.142857
Name: fruits, dtype: float64
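Since `value_counts()` returns a Series, you may want a two-column DataFrame instead; a minimal sketch using `reset_index()` (the resulting column names vary by pandas version, so they are not assumed here):

```python
import pandas as pd

df = pd.DataFrame({'fruits': ['apple', 'banana', 'apple', 'orange',
                              'banana', 'banana', 'apple']})

# value_counts() yields a Series; reset_index() turns it into a
# DataFrame with one column of values and one column of counts
freq_df = df['fruits'].value_counts().reset_index()
print(freq_df)
```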
Common Pitfalls
Common mistakes when finding the frequency of values in pandas include:

- Forgetting to select a specific column from a DataFrame before calling `value_counts()`.
- Not handling NaN values, which are excluded by default.
- Expecting a DataFrame output instead of a Series.

Example of a wrong approach and the correct way:
```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3]})

# Wrong for per-column counts: since pandas 1.1, DataFrame.value_counts()
# counts unique rows rather than the values in one column (older versions
# raise AttributeError because the method did not exist on DataFrame).

# Correct: call value_counts() on the column
counts = df['A'].value_counts()
print(counts)
```
Output
2    2
1    1
3    1
Name: A, dtype: int64
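For context, a sketch of what `DataFrame.value_counts()` itself does in pandas 1.1 and later: it counts unique row combinations across all columns, which is why it is not a substitute for per-column counts (the data here is illustrative):

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3],
                   'B': ['x', 'x', 'x', 'y']})

# Counts unique (A, B) row combinations; the result is a Series
# with a MultiIndex built from the row values
print(df.value_counts())

# Per-column frequencies still require selecting the column first
print(df['A'].value_counts())
```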
Quick Reference
| Method | Description | Example |
|---|---|---|
| value_counts() | Counts unique values in a Series or DataFrame column | df['col'].value_counts() |
| value_counts(normalize=True) | Returns relative frequencies (proportions) | df['col'].value_counts(normalize=True) |
| value_counts(dropna=False) | Includes NaN values in counts | df['col'].value_counts(dropna=False) |
| sort_values() | Sorts the result if needed | df['col'].value_counts().sort_values(ascending=True) |
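These options can be combined; for example, relative frequencies expressed as percentages and sorted ascending (a small illustrative snippet):

```python
import pandas as pd

s = pd.Series(['a', 'b', 'a', 'a'])

# normalize=True gives proportions; multiply by 100 for percentages,
# then sort ascending with sort_values()
pct = (s.value_counts(normalize=True) * 100).sort_values(ascending=True)
print(pct)
# b    25.0
# a    75.0
```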
Key Takeaways
- Use `value_counts()` on a pandas Series or DataFrame column to get frequency counts.
- By default, `value_counts()` excludes NaN values; use `dropna=False` to include them.
- Set `normalize=True` to get relative frequencies instead of counts.
- For per-column frequencies, call `value_counts()` on a single column (a Series); `DataFrame.value_counts()` counts unique rows instead.
- You can sort the frequency results with `sort_values()` if needed.