How to Rank Values in pandas: Simple Guide with Examples
Use the rank() method in pandas to assign ranks to values in a Series or DataFrame column. It supports several tie-handling methods, such as 'average', 'min', and 'max', and can rank in ascending or descending order.
Syntax
The rank() method syntax, showing the most commonly used parameters (pass them by keyword), is:
Series.rank(method='average', ascending=True, na_option='keep', pct=False)
DataFrame.rank(axis=0, method='average', ascending=True, na_option='keep', pct=False)
Parameters explained:
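- method: How to rank tied values. 'average' gives each tied value the mean of the ranks the group spans, 'min' and 'max' give the lowest or highest of those ranks, 'first' ranks ties in the order they appear, and 'dense' works like 'min' but leaves no gaps after a tie.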
- ascending: Rank in ascending order if True, descending if False.
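- na_option: How to handle NaN values: 'keep' leaves their rank as NaN, 'top' gives them the smallest ranks, and 'bottom' gives them the largest ranks.
- pct: If True, ranks are returned in percentile form, as fractions between 0 and 1, instead of absolute positions.
- axis: For a DataFrame, 0 (the default) ranks the values within each column, moving down the rows; 1 ranks the values within each row, moving across the columns.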
```python
# Assumes `series` is a pandas Series and `dataframe` is a pandas DataFrame
series.rank(method='average', ascending=True, na_option='keep', pct=False)
dataframe.rank(axis=0, method='average', ascending=True, na_option='keep', pct=False)
```
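To make the difference between the tie-handling methods concrete, the following minimal sketch ranks the same small Series with each method and places the results side by side. The sample values and the `comparison` variable name are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the two 20s are tied
s = pd.Series([10, 20, 20, 30])

# One column of ranks per tie-handling method
comparison = pd.DataFrame({
    method: s.rank(method=method)
    for method in ['average', 'min', 'max', 'first', 'dense']
})
comparison.insert(0, 'value', s)

print(comparison)
```

Only the rows for the tied values differ between the columns, except under 'dense', where the value after the tie also moves up because no rank is skipped.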
Example
This example shows how to rank values in a pandas Series with different methods and ascending/descending order.
```python
import pandas as pd

# Create a Series with some duplicate values
s = pd.Series([100, 200, 200, 300, 400, 100])

# Rank with default method='average' and ascending=True
rank_default = s.rank()

# Rank with method='min' and ascending=False
rank_min_desc = s.rank(method='min', ascending=False)

# Rank with method='dense'
rank_dense = s.rank(method='dense')

print('Original Series:')
print(s)
print('\nRank (average, ascending):')
print(rank_default)
print('\nRank (min, descending):')
print(rank_min_desc)
print('\nRank (dense, ascending):')
print(rank_dense)
```
Output
Original Series:
0 100
1 200
2 200
3 300
4 400
5 100
dtype: int64
Rank (average, ascending):
0 1.5
1 3.5
2 3.5
3 5.0
4 6.0
5 1.5
dtype: float64
Rank (min, descending):
0 5.0
1 3.0
2 3.0
3 2.0
4 1.0
5 5.0
dtype: float64
Rank (dense, ascending):
0 1.0
1 2.0
2 2.0
3 3.0
4 4.0
5 1.0
dtype: float64
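The example above ranks a Series only. As a rough sketch of how the same call behaves on a DataFrame, the snippet below ranks along each axis and then stores the rank of one column next to the data. The table, its column names, and the `math_rank` column are hypothetical and used only for illustration.

```python
import pandas as pd

# Hypothetical scores table used only for illustration
df = pd.DataFrame(
    {'math': [90, 75, 85], 'physics': [70, 95, 85]},
    index=['ann', 'bob', 'cy'],
)

# axis=0 (default): rank the values within each column, down the rows
by_column = df.rank(axis=0, ascending=False)

# axis=1: rank the values within each row, across the columns
by_row = df.rank(axis=1, ascending=False)

# Rank a single column and keep the result alongside the original data
df['math_rank'] = df['math'].rank(method='min', ascending=False)

print(by_column)
print(by_row)
print(df)
```

With `ascending=False`, the highest score in each column (or row) receives rank 1. In the last row both subjects have the same score, so `by_row` assigns each of them 1.5 under the default 'average' method.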
Common Pitfalls
Common mistakes when ranking values in pandas include:
- Not choosing the right method for ties, which can lead to unexpected ranks.
- Forgetting that rank() returns float ranks even if the input is integers.
- Not handling NaN values properly, which can stay as NaN or be ranked at the top or bottom.
- Confusing ascending and descending order.
Example of a common mistake and fix:
```python
import pandas as pd

s = pd.Series([10, 20, 20, None, 30])

# Wrong: ignoring NaN handling, NaN stays as NaN
rank_wrong = s.rank()

# Right: put NaN at bottom
rank_right = s.rank(na_option='bottom')

print('Rank ignoring NaN handling:')
print(rank_wrong)
print('\nRank with NaN at bottom:')
print(rank_right)
```
Output
Rank ignoring NaN handling:
0 1.0
1 2.5
2 2.5
3 NaN
4 4.0
dtype: float64
Rank with NaN at bottom:
0 1.0
1 2.5
2 2.5
3 5.0
4 4.0
dtype: float64
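The float-rank pitfall can be handled in a similar way. The sketch below is one possible approach, not the only one: it assumes a tie method that always yields whole numbers ('min' here) and shows two conversions, a plain integer cast when no NaN ranks remain and pandas' nullable 'Int64' dtype when they do. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 20, None, 30])

# na_option='top' gives NaN the smallest rank instead of leaving it as NaN
rank_top = s.rank(method='min', na_option='top')

# No NaN ranks remain and 'min' yields whole numbers,
# so the float ranks can be cast to plain integers
rank_int = rank_top.astype(int)

# If NaN ranks are kept, the nullable 'Int64' dtype stores them as <NA>
rank_nullable = s.rank(method='min').astype('Int64')

print(rank_int)
print(rank_nullable)
```

Neither cast is suitable for the default 'average' method when ties are present, because ranks such as 2.5 are not whole numbers.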
Quick Reference
| Parameter | Description | Common Values |
|---|---|---|
| method | How to assign ranks to ties | 'average', 'min', 'max', 'first', 'dense' |
| ascending | Rank order direction | True (ascending), False (descending) |
| na_option | How to handle NaN values | 'keep', 'top', 'bottom' |
| pct | Return ranks in percentile form (fractions between 0 and 1) | True, False |
| axis | Axis to rank along (DataFrame only) | 0 (within each column, down the rows), 1 (within each row, across the columns) |
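The pct option is the only parameter in the table not exercised so far. As a small sketch, the snippet below reuses the Series from the main example and checks that, for the default 'average' method, the percentile ranks match the ordinary ranks divided by the number of non-NaN values. The `manual` variable is just an illustrative name.

```python
import pandas as pd

s = pd.Series([100, 200, 200, 300, 400, 100])

# Ranks expressed as fractions between 0 and 1
pct_rank = s.rank(pct=True)

# For the default method, this matches rank / number of non-NaN values
manual = s.rank() / s.count()

print(pct_rank)
print(((pct_rank - manual).abs() < 1e-12).all())
```

The largest value always receives 1.0, while the smallest receives a value above 0 that depends on the number of values and on how ties are handled.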
Key Takeaways
- Use pandas.Series.rank() or DataFrame.rank() to assign ranks to values easily.
- Choose the ranking method ('average', 'min', 'max', 'first', 'dense') based on how you want to handle ties.
- Remember to set ascending=False to rank from highest to lowest values.
- Handle NaN values explicitly with the na_option parameter to avoid unexpected results.
- Rank results are floats by default, even if input values are integers.