
Sorting by values in Pandas - Step-by-Step Execution

Concept Flow - Sorting by values
Start with DataFrame
Choose column(s) to sort
Call sort_values()
Sort rows by chosen column(s)
Return sorted DataFrame
End
We start with a DataFrame, pick columns to sort by, call sort_values(), and get a new DataFrame sorted by those columns.
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({'Name': ['Anna', 'Bob', 'Cara'], 'Age': [25, 30, 22]})
sorted_df = df.sort_values(by='Age')
print(sorted_df)
This code sorts the DataFrame rows by the 'Age' column in ascending order.
Execution Table
Step | Action | DataFrame state (Name, Age) | Result
1 | Create DataFrame | [('Anna', 25), ('Bob', 30), ('Cara', 22)] | Original order
2 | Call sort_values(by='Age') | Same as step 1 | Sorting triggered
3 | Sort rows by Age ascending | [('Cara', 22), ('Anna', 25), ('Bob', 30)] | Rows reordered
4 | Return sorted DataFrame | [('Cara', 22), ('Anna', 25), ('Bob', 30)] | Sorted DataFrame ready
💡 Sorting complete, DataFrame rows ordered by Age ascending
Variable Tracker
Variable | Start | After sort_values | Final
df | [('Anna', 25), ('Bob', 30), ('Cara', 22)] | [('Anna', 25), ('Bob', 30), ('Cara', 22)] | [('Anna', 25), ('Bob', 30), ('Cara', 22)]
sorted_df | N/A | [('Cara', 22), ('Anna', 25), ('Bob', 30)] | [('Cara', 22), ('Anna', 25), ('Bob', 30)]
Key Moments - 2 Insights
Why does the original DataFrame 'df' not change after sorting?
Because sort_values() returns a new sorted DataFrame and does not modify 'df' in place unless you use inplace=True. See execution_table step 3 and variable_tracker where 'df' stays the same.
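The difference between the two modes can be checked directly. This is a minimal sketch that reuses the sample data; the names df_inplace, result, names_before, names_sorted and names_inplace are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Anna', 'Bob', 'Cara'], 'Age': [25, 30, 22]})

# Default behaviour: a new DataFrame is returned and df keeps its original order.
sorted_df = df.sort_values(by='Age')
names_before = df['Name'].tolist()
names_sorted = sorted_df['Name'].tolist()

# With inplace=True the call returns None and reorders the DataFrame it is called on.
df_inplace = df.copy()  # work on a copy so df itself stays untouched
result = df_inplace.sort_values(by='Age', inplace=True)
names_inplace = df_inplace['Name'].tolist()

print(names_before)   # original order
print(names_sorted)   # sorted copy
print(result)         # None
print(names_inplace)  # sorted in place
```

Because the in-place call returns None, writing df = df.sort_values(by='Age', inplace=True) would replace df with None, so the two styles should not be combined.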
What happens if we sort by a column with duplicate values?
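Rows with the same value in the sorted column keep their original order only if the sort algorithm is stable. For a single-column sort, sort_values() defaults to kind='quicksort', which does not guarantee this; pass kind='stable' or kind='mergesort' when the order of ties matters. The sample data has no duplicate ages, so the execution_table does not show this case.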
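The sample data contains no duplicate ages, so here is a small sketch with a tie. The people DataFrame and the extra row for 'Dan' are hypothetical, added only for illustration; kind='mergesort' is passed to request a stable algorithm explicitly.

```python
import pandas as pd

# Hypothetical data with a tie: Anna and Dan are both 25.
people = pd.DataFrame({
    'Name': ['Anna', 'Bob', 'Cara', 'Dan'],
    'Age': [25, 30, 22, 25],
})

# mergesort is a stable algorithm, so tied rows keep their original relative order.
stable_sorted = people.sort_values(by='Age', kind='mergesort')
stable_names = stable_sorted['Name'].tolist()

print(stable_names)  # Anna stays ahead of Dan because she came first in the input
```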
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at step 3, which row comes first after sorting by 'Age'?
A. ('Bob', 30)
B. ('Cara', 22)
C. ('Anna', 25)
D. ('Bob', 22)
💡 Hint
Check the 'DataFrame state' column at step 3 in execution_table
According to variable_tracker, what is the value of 'df' after sorting?
A. [('Anna', 25), ('Bob', 30), ('Cara', 22)]
B. N/A
C. [('Cara', 22), ('Anna', 25), ('Bob', 30)]
D. Empty DataFrame
💡 Hint
Look at the 'df' row in variable_tracker after sort_values
If we want to sort the DataFrame by 'Age' descending, what parameter should we add?
A. ascending=True
B. by='Age_desc'
C. ascending=False
D. reverse=True
💡 Hint
Recall sort_values() has an 'ascending' parameter controlling sort order
Concept Snapshot
pandas.DataFrame.sort_values(by=column_name)
- Sorts rows by column values
- Returns a new sorted DataFrame
- Default is ascending order
- Use inplace=True to modify the original (the call then returns None)
- Can sort by multiple columns
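The last two points of the snapshot, sort direction and multiple sort columns, can be sketched as follows. The people DataFrame is hypothetical and includes a shared age so that the second sort key has a visible effect.

```python
import pandas as pd

# Hypothetical data: Dan and Anna share an age, so the second key decides their order.
people = pd.DataFrame({
    'Name': ['Dan', 'Bob', 'Cara', 'Anna'],
    'Age': [25, 30, 22, 25],
})

# Single column, descending: oldest first.
oldest_first = people.sort_values(by='Age', ascending=False)
ages_desc = oldest_first['Age'].tolist()

# Two columns: Age descending, then Name ascending to order rows with equal ages.
by_age_then_name = people.sort_values(by=['Age', 'Name'], ascending=[False, True])
ordered_names = by_age_then_name['Name'].tolist()

print(ages_desc)
print(ordered_names)
```

When by is a list, ascending can be a single bool applied to every column or a list with one bool per column, as used here.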
Full Transcript
We start with a DataFrame containing names and ages and want to sort it by the 'Age' column. Calling df.sort_values(by='Age') sorts the rows in ascending order by age, so the new DataFrame lists people from youngest to oldest. The original DataFrame 'df' remains unchanged because sort_values() returns a new DataFrame. The process has four steps: create the DataFrame, call sort_values(), sort the rows, and return the sorted DataFrame. To change the original DataFrame instead, use inplace=True. Rows with equal values are guaranteed to keep their original order only with a stable algorithm such as kind='stable' or kind='mergesort'; the default kind='quicksort' does not guarantee it.