Sorting MultiIndex in Pandas - Step-by-Step Execution

Concept Flow - Sorting MultiIndex
Create MultiIndex DataFrame
Choose sort level(s)
Apply sort_index(level=...)
DataFrame rows reordered
Sorted MultiIndex DataFrame output
We start with a DataFrame having a MultiIndex, select which index level(s) to sort by, apply sorting, and get the reordered DataFrame.
Execution Sample
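import pandas as pd
# Build a two-level MultiIndex from (letter, number) tuples, deliberately unsorted.
idx = pd.MultiIndex.from_tuples([('b', 2), ('a', 1), ('b', 1)])
df = pd.DataFrame({'val': [10, 20, 30]}, index=idx)
# Sort by the first index level (level=0); sort_index returns a new DataFrame.
sorted_df = df.sort_index(level=0)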
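This code creates a DataFrame with a two-level MultiIndex and sorts it by the first index level. Because sort_index defaults to sort_remaining=True, rows that tie on the first level are then ordered by the second level.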
Execution Table
| Step | Action | Index Before | Sort Level | Index After | DataFrame Rows Order |
|---|---|---|---|---|---|
| 1 | Create MultiIndex DataFrame | [('b',2), ('a',1), ('b',1)] | N/A | [('b',2), ('a',1), ('b',1)] | Rows: b-2, a-1, b-1 |
| 2 | Choose sort level=0 (first level) | [('b',2), ('a',1), ('b',1)] | 0 | N/A | N/A |
| 3 | Apply sort_index(level=0) | [('b',2), ('a',1), ('b',1)] | 0 | [('a',1), ('b',1), ('b',2)] | Rows reordered to a-1, b-1, b-2 |
| 4 | Output sorted DataFrame | [('a',1), ('b',1), ('b',2)] | 0 | [('a',1), ('b',1), ('b',2)] | Rows: a-1, b-1, b-2 |
💡 Sorting completes after reordering rows by the specified MultiIndex level.
Variable Tracker
| Variable | Start | After Step 1 | After Step 3 | Final |
|---|---|---|---|---|
| idx | empty | [('b',2), ('a',1), ('b',1)] | [('b',2), ('a',1), ('b',1)] | [('b',2), ('a',1), ('b',1)] |
| df.index | empty | [('b',2), ('a',1), ('b',1)] | [('b',2), ('a',1), ('b',1)] | [('b',2), ('a',1), ('b',1)] |
| sorted_df.index | N/A | N/A | N/A | [('a',1), ('b',1), ('b',2)] |
| df rows order | empty | b-2, a-1, b-1 | b-2, a-1, b-1 | b-2, a-1, b-1 |
| sorted_df rows order | N/A | N/A | N/A | a-1, b-1, b-2 |
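The values in the Variable Tracker can be checked with a short script. This is a minimal sketch that reuses the Execution Sample; the helper names original_order, sorted_order, and sorted_vals are illustrative and not part of the original code.

```python
import pandas as pd

# Same data as the Execution Sample.
idx = pd.MultiIndex.from_tuples([('b', 2), ('a', 1), ('b', 1)])
df = pd.DataFrame({'val': [10, 20, 30]}, index=idx)

# sort_index returns a new DataFrame; df itself is not modified.
sorted_df = df.sort_index(level=0)

# Illustrative helper names (assumptions, not from the original sample).
original_order = df.index.tolist()       # row order: b-2, a-1, b-1
sorted_order = sorted_df.index.tolist()  # row order: a-1, b-1, b-2
sorted_vals = sorted_df['val'].tolist()  # values travel with their rows

print(original_order)
print(sorted_order)
print(sorted_vals)
```

The 'val' column follows its index entries, so the sorted values come out as 20, 30, 10 rather than staying in their original positions.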
Key Moments - 3 Insights
Why does sorting by level=0 reorder the rows differently than the original?
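Because level=0 refers to the first index level ('a' or 'b'), sorting orders the rows alphabetically by that level, so 'a' comes before 'b'. By default (sort_remaining=True), ties within that level are then broken by the remaining levels, which is why ('b', 1) precedes ('b', 2), as shown in step 3 of the Execution Table.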
Does the original DataFrame 'df' change after sorting?
No. 'df' is unchanged; sort_index returns a new DataFrame, assigned here to 'sorted_df', with the rows reordered. The Variable Tracker shows this: the 'df' index stays the same while the 'sorted_df' index holds the sorted order.
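As a rough illustration of the difference between returning a new DataFrame and sorting in place, the sketch below contrasts the default call with the inplace=True option of sort_index. The names df_inplace and result are illustrative assumptions, and the in-place sort is applied to a copy so that df stays available for comparison.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('b', 2), ('a', 1), ('b', 1)])
df = pd.DataFrame({'val': [10, 20, 30]}, index=idx)

# Default behaviour: a new object comes back and df keeps its original order.
sorted_df = df.sort_index(level=0)
df_unchanged = df.index.tolist() == [('b', 2), ('a', 1), ('b', 1)]

# inplace=True modifies the object it is called on and returns None.
# df_inplace is a hypothetical copy used only for this comparison.
df_inplace = df.copy()
result = df_inplace.sort_index(level=0, inplace=True)

print(df_unchanged)                 # True
print(result)                       # None
print(df_inplace['val'].tolist())   # [20, 30, 10]
```

Assigning the return value, as the Execution Sample does, is usually the easier pattern to follow because both the original and sorted frames remain accessible.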
What happens if you sort by level=1 instead?
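Sorting by level=1 orders rows primarily by the second index level (the numbers), with ties broken by the remaining level under the default sort_remaining=True. For this data the result is a-1, b-1, b-2, which happens to match the level=0 result; on other data the two sorts generally differ.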
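Sorting by level=1 can be tried directly on the sample data. In the sketch below, the extra row ('c', 0) and the names idx2, df2, and the by_* variables are hypothetical additions for illustration; they show a case where sorting by level 0 and by level 1 produce different first rows.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('b', 2), ('a', 1), ('b', 1)])
df = pd.DataFrame({'val': [10, 20, 30]}, index=idx)

# On the sample data, both sorts end up in the order a-1, b-1, b-2,
# because ties on one level are broken by the other (sort_remaining=True).
by_level0 = df.sort_index(level=0)
by_level1 = df.sort_index(level=1)
same_order = by_level0.index.equals(by_level1.index)

# Hypothetical extra row ('c', 0): now the two sorts diverge.
idx2 = pd.MultiIndex.from_tuples([('b', 2), ('a', 1), ('b', 1), ('c', 0)])
df2 = pd.DataFrame({'val': [10, 20, 30, 40]}, index=idx2)
first_by_level0 = df2.sort_index(level=0).index[0]  # letter 'a' sorts first
first_by_level1 = df2.sort_index(level=1).index[0]  # number 0 sorts first

# sort_remaining=False sorts only by the requested level.
only_level1 = df.sort_index(level=1, sort_remaining=False)
level1_values = only_level1.index.get_level_values(1).tolist()

print(same_order)
print(first_by_level0, first_by_level1)
print(level1_values)
```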
Visual Quiz - 3 Questions
Test your understanding
Look at step 3 of the Execution Table. What is the first index tuple after sorting by level=0?
A. ('b', 1)
B. ('b', 2)
C. ('a', 1)
D. ('a', 2)
💡 Hint
Check the 'Index After' column at step 3 in the Execution Table.
At which step does the DataFrame row order change?
A. Step 3
B. Step 2
C. Step 1
D. Step 4
💡 Hint
Look at the 'DataFrame Rows Order' column in the Execution Table to see when the rows reorder.
If we sorted by level=1 instead, which of these would be the first row index?
A. ('b', 1)
B. ('a', 1)
C. ('b', 2)
D. ('a', 2)
💡 Hint
Look at the second element of each tuple in the Variable Tracker and consider how sorting by level=1 would order them.
Concept Snapshot
Sorting MultiIndex DataFrames:
- Use df.sort_index(level=level_number) to sort by a specific index level.
- Level 0 is the first index level, level 1 the second, etc.
- Sorting returns a new DataFrame; original stays unchanged.
- Rows reorder based on the chosen index level's values; by default (sort_remaining=True), ties are broken by the remaining levels.
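The points above extend to named levels and to sorting by several levels at once. The following is a sketch under stated assumptions: the level names 'letter' and 'number' are invented for illustration (the original index has no names), and by_name and mixed are hypothetical variable names.

```python
import pandas as pd

# Level names 'letter' and 'number' are illustrative assumptions.
idx = pd.MultiIndex.from_tuples(
    [('b', 2), ('a', 1), ('b', 1)],
    names=['letter', 'number'],
)
df = pd.DataFrame({'val': [10, 20, 30]}, index=idx)

# A level can be selected by name instead of by position.
by_name = df.sort_index(level='number')

# Several levels with a per-level direction:
# 'letter' ascending, then 'number' descending within each letter.
mixed = df.sort_index(level=['letter', 'number'], ascending=[True, False])

print(by_name['val'].tolist())  # [20, 30, 10]
print(mixed['val'].tolist())    # [20, 10, 30]
```

Selecting levels by name tends to be more readable than positional numbers once an index has more than two levels.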
Full Transcript
This visual execution shows how to sort a pandas DataFrame with a MultiIndex. We start by creating a DataFrame with a MultiIndex of tuples. Then we choose which index level to sort by, for example level 0, the first index level. Applying sort_index(level=0) rearranges the rows so that the first index level is sorted alphabetically. The original DataFrame remains unchanged, and a new sorted DataFrame is returned. The execution table traces each step, showing the index before and after sorting, and how the rows reorder. The variable tracker shows how the index variables change through the process. Key moments clarify common confusions like why the original DataFrame doesn't change and what sorting by different levels means. The quiz tests understanding of the index order after sorting and when the rows reorder. The snapshot summarizes the key points for quick reference.