
How to Find the Top N Values in a pandas DataFrame or Series

Use the nlargest(n) method on a pandas Series or DataFrame column to find the top n values. For DataFrames, specify the column name with nlargest(n, 'column_name') to get rows with the highest values in that column.

Syntax

The nlargest() method returns the n largest values of a pandas Series, or the n rows of a DataFrame with the largest values in a given column, in descending order.

  • Series.nlargest(n): Returns the top n values from a Series.
  • DataFrame.nlargest(n, columns): Returns the top n rows ordered by the specified column.

Parameters:

  • n: Number of top values to return.
  • columns: Column label, or list of column labels, to order by (DataFrame only).
python
series.nlargest(n)
dataframe.nlargest(n, 'column_name')
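As a minimal sketch of the DataFrame signature above, the snippet below passes a list of columns and uses the optional keep parameter, which the syntax summary does not list. The sample data and variable names are hypothetical and chosen only to produce a tie.

```python
import pandas as pd

# Hypothetical sample data (not from the article), built to contain a tie in 'wins'.
df = pd.DataFrame({
    "team": ["A", "B", "C", "D"],
    "wins": [10, 12, 12, 9],
    "points": [30, 35, 33, 28],
})

# With a list of columns, later columns break ties in earlier ones.
top2 = df.nlargest(2, ["wins", "points"])
print(top2)

# keep="all" retains every row tied for the last place,
# so the result can contain more than n rows.
top1_all = df.nlargest(1, "wins", keep="all")
print(top1_all)
```

With the default keep="first", only the first of the tied rows would be returned for n=1.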

Example

This example shows how to find the top 3 values in a pandas Series and the top 2 rows with the highest values in a DataFrame column.

python
import pandas as pd

# Create a Series
series = pd.Series([10, 50, 30, 20, 40])

# Find top 3 values in Series
top3_series = series.nlargest(3)

# Create a DataFrame
data = {'name': ['Alice', 'Bob', 'Charlie', 'David'], 'score': [85, 92, 88, 91]}
df = pd.DataFrame(data)

# Find top 2 rows by 'score'
top2_df = df.nlargest(2, 'score')

print('Top 3 values in Series:')
print(top3_series)
print('\nTop 2 rows in DataFrame by score:')
print(top2_df)
Output
Top 3 values in Series:
1    50
4    40
2    30
dtype: int64

Top 2 rows in DataFrame by score:
    name  score
1    Bob     92
3  David     91
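The result keeps the index labels of the original data, which is why the labels in the output are not consecutive. A short sketch, reusing the Series from the example, of reading those labels back and of discarding them when they are not needed:

```python
import pandas as pd

series = pd.Series([10, 50, 30, 20, 40])
top3 = series.nlargest(3)

# Index labels point back to positions in the original Series.
labels = top3.index.tolist()
print(labels)   # [1, 4, 2]

# Plain Python list of the values, largest first.
values = top3.tolist()
print(values)   # [50, 40, 30]

# Renumber from 0 if the original labels are not needed.
renumbered = top3.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1, 2]
```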

Common Pitfalls

Common mistakes when using nlargest() include:

  • Omitting the column name when calling nlargest() on a DataFrame, which raises a TypeError.
  • Sorting the whole DataFrame with sort_values() and then slicing when only the top n rows are needed; nlargest() is generally faster on large data.
  • Confusing nlargest() with head(), which returns the first n rows in their existing order without ranking them.
python
import pandas as pd

data = {'name': ['Alice', 'Bob'], 'score': [85, 92]}
df = pd.DataFrame(data)

# Wrong: missing column name
# df.nlargest(1)  # This will raise TypeError

# Right:
top = df.nlargest(1, 'score')
print(top)
Output
   name  score
1   Bob     92
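To make the last two pitfalls concrete, here is a small sketch that reuses the four-row DataFrame from the earlier example. It compares nlargest() with the sort-then-slice approach and with head(); the variable names are illustrative only.

```python
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie', 'David'], 'score': [85, 92, 88, 91]}
df = pd.DataFrame(data)

# Top 2 rows by score, selected directly.
via_nlargest = df.nlargest(2, 'score')

# Same rows obtained by sorting everything, then slicing.
via_sort = df.sort_values('score', ascending=False).head(2)

# head() alone ignores the scores and returns the first 2 rows as stored.
first_rows = df.head(2)

print(via_nlargest['name'].tolist())  # ['Bob', 'David']
print(via_sort['name'].tolist())      # ['Bob', 'David']
print(first_rows['name'].tolist())    # ['Alice', 'Bob']
```

Both ranking approaches agree here because the scores contain no ties; with ties, the order of tied rows may differ between the two.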

Quick Reference

Method | Description | Usage Example
Series.nlargest(n) | Get top n values from a Series | series.nlargest(3)
DataFrame.nlargest(n, columns) | Get top n rows by column value | df.nlargest(2, 'score')

Key Takeaways

Use pandas' nlargest() method to efficiently find the top n values in a Series or DataFrame column.
Always pass a column name when calling nlargest() on a DataFrame; omitting it raises a TypeError.
nlargest() is generally faster and clearer than sorting and slicing for top values.