
How to Rank Values in pandas: Simple Guide with Examples

Use the rank() method in pandas to assign ranks to values in a Series or DataFrame column. It supports different ranking methods like 'average', 'min', 'max', and options to rank ascending or descending.

Syntax

The rank() method syntax is:

  • Series.rank(method='average', ascending=True, na_option='keep', pct=False)
  • DataFrame.rank(axis=0, method='average', ascending=True, na_option='keep', pct=False)

Parameters explained:

  • method: How to assign ranks to ties. Options: 'average', 'min', 'max', 'first', 'dense'.
  • ascending: Rank in ascending order if True, descending if False.
  • na_option: How to handle NaN values: 'keep', 'top', or 'bottom'.
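  • pct: If True, ranks are returned in percentile form, as a fraction between 0 and 1, instead of as absolute ranks.
  • axis: For DataFrame, 0 ranks the values within each column (down the rows); 1 ranks the values within each row (across the columns).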
python
series.rank(method='average', ascending=True, na_option='keep', pct=False)
dataframe.rank(axis=0, method='average', ascending=True, na_option='keep', pct=False)
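
The axis parameter is easiest to see on a small DataFrame. The following is a minimal sketch; the column names, index labels, and scores are made up for illustration and do not come from the article.

```python
import pandas as pd

# Hypothetical scores for three students on two tests (illustrative data)
df = pd.DataFrame(
    {"test_a": [90, 70, 80], "test_b": [60, 95, 75]},
    index=["ann", "bob", "cy"],
)

# axis=0 (default): rank the values within each column
by_column = df.rank(axis=0)

# axis=1: rank the values within each row
by_row = df.rank(axis=1)

print(by_column)
print(by_row)
```

With axis=0, each student is compared with the other students on the same test. With axis=1, each student's two tests are compared with each other.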

Example

This example shows how to rank values in a pandas Series with different methods and ascending/descending order.

python
import pandas as pd

# Create a Series with some duplicate values
s = pd.Series([100, 200, 200, 300, 400, 100])

# Rank with default method='average' and ascending=True
rank_default = s.rank()

# Rank with method='min' and ascending=False
rank_min_desc = s.rank(method='min', ascending=False)

# Rank with method='dense'
rank_dense = s.rank(method='dense')

print('Original Series:')
print(s)
print('\nRank (average, ascending):')
print(rank_default)
print('\nRank (min, descending):')
print(rank_min_desc)
print('\nRank (dense, ascending):')
print(rank_dense)
Output
Original Series:
0    100
1    200
2    200
3    300
4    400
5    100
dtype: int64

Rank (average, ascending):
0    1.5
1    3.5
2    3.5
3    5.0
4    6.0
5    1.5
dtype: float64

Rank (min, descending):
0    5.0
1    3.0
2    3.0
3    2.0
4    1.0
5    5.0
dtype: float64

Rank (dense, ascending):
0    1.0
1    2.0
2    2.0
3    3.0
4    4.0
5    1.0
dtype: float64
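
The example above does not exercise the 'first' and 'max' methods or the pct option. The sketch below applies them to the same Series; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([100, 200, 200, 300, 400, 100])

# 'first': tied values are ranked in the order they appear in the Series
rank_first = s.rank(method="first")

# 'max': every tied value gets the highest rank in its group
rank_max = s.rank(method="max")

# pct=True: ranks are returned as a fraction between 0 and 1
rank_pct = s.rank(pct=True)

print(rank_first)
print(rank_max)
print(rank_pct)
```

Here 'first' is the only method of the three that gives every value a distinct rank, which can help when ranks are later used as unique positions.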

Common Pitfalls

Common mistakes when ranking values in pandas include:

  • Not choosing the right method for ties, which can lead to unexpected ranks.
  • Forgetting that rank() returns float ranks even if input is integers.
  • Not handling NaN values explicitly: by default they stay NaN (na_option='keep'); they can instead be ranked first ('top') or last ('bottom').
  • Confusing ascending and descending order.

Example of a common mistake and fix:

python
import pandas as pd

s = pd.Series([10, 20, 20, None, 30])

# Wrong: relying on the default na_option='keep', so the missing value gets no rank (NaN)
rank_wrong = s.rank()

# Right: put NaN at bottom
rank_right = s.rank(na_option='bottom')

print('Rank ignoring NaN handling:')
print(rank_wrong)
print('\nRank with NaN at bottom:')
print(rank_right)
Output
Rank ignoring NaN handling:
0    1.0
1    2.5
2    2.5
3    NaN
4    4.0
dtype: float64

Rank with NaN at bottom:
0    1.0
1    2.5
2    2.5
3    5.0
4    4.0
dtype: float64
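
Two of the pitfalls listed above, NaN placement and float results, can be handled together. This is one possible approach rather than the only one: it assumes a pandas version that supports the nullable 'Int64' dtype, and the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 20, None, 30])

# na_option='top' gives the missing value rank 1, ahead of all real values
rank_top = s.rank(na_option="top")

# method='min' produces whole-number ranks; casting to the nullable
# 'Int64' dtype stores them as integers and keeps the missing rank as <NA>
rank_int = s.rank(method="min").astype("Int64")

print(rank_top)
print(rank_int)
```

The integer cast only makes sense for methods that produce whole numbers ('min', 'max', 'first', 'dense'); 'average' can yield values such as 2.5.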

Quick Reference

| Parameter | Description | Common Values |
| --- | --- | --- |
| method | How to assign ranks to ties | 'average', 'min', 'max', 'first', 'dense' |
| ascending | Rank order direction | True (ascending), False (descending) |
| na_option | How to handle NaN values | 'keep', 'top', 'bottom' |
| pct | Return ranks as a fraction between 0 and 1 | True, False |
| axis | Axis to rank along (DataFrame only) | 0 (within each column), 1 (within each row) |

Key Takeaways

  • Use Series.rank() or DataFrame.rank() to assign ranks to values.
  • Choose the ranking method ('average', 'min', 'max', 'first', 'dense') based on how you want to handle ties.
  • Set ascending=False to rank from the highest value to the lowest.
  • Handle NaN values explicitly with the na_option parameter to avoid unexpected results.
  • Rank results are floats by default, even if the input values are integers.