
Selecting data with MultiIndex in Pandas - Step-by-Step Execution

Concept Flow - Selecting data with MultiIndex
Create MultiIndex DataFrame
Choose level(s) to select
Use .loc or .xs with keys
Extract subset DataFrame or Series
Use result for analysis or display
Start with a DataFrame that has multiple index levels. Pick which index level(s) you want to select by using .loc or .xs methods with the right keys. This extracts the data subset you want.
Execution Sample
Pandas
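import pandas as pd

# Build a two-level index from (level 0, level 1) tuples.
idx = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)])
# One 'Value' entry per index tuple.
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=idx)
# A single key is matched against level 0.
subset = df.loc['A']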
Create a MultiIndex DataFrame and select all rows where the first index level is 'A'.
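To see what the sample produces, the sketch below rebuilds the same DataFrame and inspects subset. One detail worth noticing is that the matched level is dropped from the result, so subset is indexed by the remaining level only.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=idx)

subset = df.loc['A']

# Level 0 ('A') was consumed by the selection, so only level 1 remains.
print(subset.index.tolist())     # [1, 2]
print(subset['Value'].tolist())  # [10, 20]
```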
Execution Table
| Step | Action | Index Level(s) Used | Selection Key | Resulting Data |
|---|---|---|---|---|
| 1 | Create MultiIndex from tuples | ('A',1), ('A',2), ('B',1), ('B',2) | N/A | MultiIndex with 2 levels created |
| 2 | Create DataFrame with MultiIndex | MultiIndex with 2 levels | N/A | DataFrame with 4 rows and 'Value' column |
| 3 | Select rows with .loc['A'] | Level 0 | 'A' | Rows with index ('A',1) and ('A',2) selected |
| 4 | Display subset | Level 0 | 'A' | DataFrame with 2 rows: Value 10 and 20 |
| 5 | Select row with .xs(('B',2)) | Level 0 and 1 | ('B',2) | Single row with Value 40 selected |
| 6 | Select rows with .loc[('B', slice(None))] | Level 0 and 1 | ('B', slice(None)) | Rows with index ('B',1) and ('B',2) selected |
| 7 | Exit | N/A | N/A | Selection complete |
💡 All desired selections done, no more keys to select
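Steps 5 and 6 are not part of the Execution Sample, so here is a minimal sketch of how they might be run. The variable names row and b_rows are illustrative, and the trailing ", :" in the .loc call is an addition to the form shown in the table; it selects all columns and makes it explicit that the tuple is a row key.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=idx)

# Step 5: a full key (one label per level) identifies exactly one row.
row = df.xs(('B', 2))
print(row['Value'])              # 40

# Step 6: 'B' on level 0, every label on level 1.
b_rows = df.loc[('B', slice(None)), :]
print(b_rows['Value'].tolist())  # [30, 40]
```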
Variable Tracker
| Variable | Start | After Step 3 | After Step 5 | After Step 6 | Final |
|---|---|---|---|---|---|
| idx | Empty | [('A',1), ('A',2), ('B',1), ('B',2)] | [('A',1), ('A',2), ('B',1), ('B',2)] | [('A',1), ('A',2), ('B',1), ('B',2)] | [('A',1), ('A',2), ('B',1), ('B',2)] |
| df | Empty | DataFrame with 4 rows | DataFrame with 4 rows | DataFrame with 4 rows | DataFrame with 4 rows |
| subset | N/A | DataFrame with 2 rows (index 'A') | Single row (index ('B',2)) | DataFrame with 2 rows (index 'B') | DataFrame with 2 rows (index 'B') |
Key Moments - 3 Insights
Why does df.loc['A'] select multiple rows?
Because a single key is matched against the first level of the MultiIndex, every row with 'A' at level 0 is selected, as shown in step 3 of the Execution Table. The result is indexed by the remaining level (1 and 2).
How does .xs differ from .loc when selecting MultiIndex data?
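.xs takes a key, or a tuple of keys, and returns the matching cross-section, as in step 5, where .xs(('B', 2)) returns a single row. Unlike .loc, it accepts a level argument, so a key can be matched against an inner level directly, and by default it drops the matched levels from the result. .loc accepts the same tuple keys but also supports slices such as slice(None) and can be used to assign values, whereas .xs is read-only.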
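A short sketch of where the two methods diverge. The level names 'letter' and 'number' are assumptions added for readability (the lesson's index is unnamed); level and drop_level are documented parameters of DataFrame.xs.

```python
import pandas as pd

# names= is an illustrative addition; the lesson's MultiIndex has no names.
idx = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)], names=['letter', 'number']
)
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=idx)

# .xs can match a key against an inner level without a wildcard.
inner = df.xs(1, level='number')
print(inner.index.tolist())     # ['A', 'B']
print(inner['Value'].tolist())  # [10, 30]

# drop_level=False keeps the matched level in the result's index.
kept = df.xs(1, level='number', drop_level=False)
print(kept.index.tolist())      # [('A', 1), ('B', 1)]

# .loc can also write through a MultiIndex key; .xs cannot set values.
df.loc[('B', 2), 'Value'] = 99
print(df['Value'].tolist())     # [10, 20, 30, 99]
```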
What does slice(None) mean in df.loc[('B', slice(None))]?
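slice(None) is the explicit form of a bare colon (:) and means "select every entry in this level", so this selects all rows where level 0 is 'B' regardless of level 1, as in step 6. On a DataFrame, writing df.loc[('B', slice(None)), :] makes it explicit that the tuple is a row key and that all columns are wanted.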
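The same wildcard can sit in any position of the key. The sketch below, which goes slightly beyond the lesson's steps, puts slice(None) on level 0 to pick every row whose level 1 is 2, and shows pd.IndexSlice as an alternative spelling that allows the colon syntax. The names twos and twos_alt are illustrative.

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=idx)

# Wildcard on level 0, fixed label on level 1.
twos = df.loc[(slice(None), 2), :]
print(twos['Value'].tolist())  # [20, 40]

# pd.IndexSlice builds the same key using ordinary slice notation.
twos_alt = df.loc[pd.IndexSlice[:, 2], :]
print(twos.equals(twos_alt))   # True
```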
Visual Quiz - 3 Questions
Test your understanding
Look at step 3 of the Execution Table. What rows does df.loc['A'] select?
A. Rows with index ('A',1) and ('A',2)
B. Rows with index ('B',1) and ('B',2)
C. Only row with index ('A',1)
D. Only row with index ('B',2)
💡 Hint
See the Execution Table row for step 3 under 'Resulting Data'
At which step does the selection return a single row?
A. Step 6
B. Step 3
C. Step 5
D. Step 2
💡 Hint
Check the Execution Table row for step 5 under 'Resulting Data'
If you want to select all rows where level 0 is 'B', which selection is correct?
A. df.loc['A']
B. df.loc[('B', slice(None))]
C. df.xs(('B', 2))
D. df.loc[('A', 1)]
💡 Hint
See step 6 of the Execution Table for selecting all rows with 'B' at level 0
Concept Snapshot
Selecting data with MultiIndex:
- Use .loc[key] to select all rows matching key at first index level.
- Use .loc[(key1, key2)] or .xs((key1, key2)) to select specific rows by multiple levels.
- slice(None) selects all entries in that level.
- Result is a subset DataFrame or Series for analysis.
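Whether the result is a DataFrame, a Series, or a single value depends on how much of the key is specified. A minimal sketch of the cases, using illustrative variable names:

```python
import pandas as pd

idx = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1), ('B', 2)])
df = pd.DataFrame({'Value': [10, 20, 30, 40]}, index=idx)

partial = df.loc['A']              # partial key: DataFrame of matching rows
full = df.loc[('B', 2)]            # full key: one row, returned as a Series
as_frame = df.loc[[('B', 2)]]      # list of full keys: one-row DataFrame
cell = df.loc[('B', 2), 'Value']   # full key plus column: a single value

print(type(partial).__name__)      # DataFrame
print(type(full).__name__)         # Series
print(type(as_frame).__name__)     # DataFrame
print(cell)                        # 40
```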
Full Transcript
This lesson shows how to select data from a pandas DataFrame with a MultiIndex. We start by creating a MultiIndex from tuples and build a DataFrame with it. Then, we select rows where the first index level matches a key using .loc. We also use .xs to select a single row by specifying keys for multiple levels. Using slice(None) allows selecting all entries in a level. The execution table traces each step, showing how the selection narrows down the data. Key moments clarify why .loc['A'] returns multiple rows and how .xs differs. The visual quiz tests understanding of which rows are selected at each step. The snapshot summarizes the main methods and rules for selecting MultiIndex data.