0
0
Data Analysis Pythondata~10 mins

MultiIndex (hierarchical indexing) in Data Analysis Python - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - MultiIndex (hierarchical indexing)
Create MultiIndex from tuples
Assign MultiIndex to DataFrame
Access data by outer level
Access data by inner level
Slice or filter using MultiIndex
We build a MultiIndex from tuples, assign it to a DataFrame, then access or slice data by levels.
Execution Sample
Data Analysis Python
import pandas as pd
index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1)])
df = pd.DataFrame({'Value': [10, 20, 30]}, index=index)
print(df.loc['A'])
Create a DataFrame with a MultiIndex and select rows where the first level is 'A'.
Execution Table
StepActionVariable/ExpressionResult/Value
1Create MultiIndex from tuplespd.MultiIndex.from_tuples([('A',1),('A',2),('B',1)])MultiIndex([('A', 1), ('A', 2), ('B', 1)], names=[None, None])
2Create DataFrame with MultiIndexpd.DataFrame({'Value':[10,20,30]}, index=index)DataFrame with MultiIndex: Value A 1 10 2 20 B 1 30
3Access rows with first level 'A'df.loc['A']DataFrame: Value 1 10 2 20
4Access rows with second level 1 under 'A'df.loc[('A',1)]Series: Value 10 Name: (A, 1), dtype: int64
5Slice rows where second level >= 2df.loc[pd.IndexSlice[:, 2:], :]DataFrame: Value A 2 20
6ExitNo more operationsEnd of example
💡 All steps executed to show MultiIndex creation, assignment, and access.
Variable Tracker
VariableStartAfter Step 1After Step 2After Step 3After Step 4After Step 5Final
indexNoneMultiIndex([('A', 1), ('A', 2), ('B', 1)])MultiIndex([('A', 1), ('A', 2), ('B', 1)])SameSameSameSame
dfNoneNoneDataFrame with MultiIndex and valuesSameSameSameSame
df.loc['A']NoneNoneNoneDataFrame with rows for 'A'SameSameSame
df.loc[('A',1)]NoneNoneNoneNoneSeries with value 10SameSame
df.loc[pd.IndexSlice[:, 2:], :]NoneNoneNoneNoneNoneDataFrame with rows where second level >= 2Same
Key Moments - 3 Insights
Why does df.loc['A'] return multiple rows?
Because 'A' is the first level of the MultiIndex and selecting it returns all rows where the first level is 'A', as shown in execution_table step 3.
What happens if you try df.loc['C'] which is not in the index?
It raises a KeyError because 'C' is not present in the first level of the MultiIndex, similar to how normal indexing works but with hierarchical levels.
How does slicing with pd.IndexSlice[:, 2:] work?
It selects all rows where the second level of the MultiIndex is 2 or greater, as shown in execution_table step 5.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table step 3, what rows does df.loc['A'] return?
AAll rows in the DataFrame
BRows where second level is 1
CRows where first level is 'A' (two rows)
DOnly the row ('A', 1)
💡 Hint
Check the 'Result/Value' column in step 3 of execution_table.
At which step does df.loc[('A',1)] return a Series?
AStep 2
BStep 4
CStep 5
DStep 3
💡 Hint
Look at the 'Action' and 'Result/Value' columns in execution_table for step 4.
If we change the tuple ('B', 1) to ('B', 3), which step's output changes?
AStep 5 output changes
BStep 3 output changes
CStep 4 output changes
DNo output changes
💡 Hint
Step 5 filters rows where second level >= 2, so changing ('B',1) to ('B',3) affects this.
Concept Snapshot
MultiIndex (hierarchical indexing) lets you use multiple levels of row labels.
Create with pd.MultiIndex.from_tuples() and assign as DataFrame index.
Access data by levels using df.loc[level1] or df.loc[(level1, level2)].
Use pd.IndexSlice for slicing across levels.
It helps organize complex data with multiple keys.
Full Transcript
This example shows how to create a MultiIndex from tuples and assign it to a DataFrame. We then access data by selecting the first level 'A', which returns multiple rows. Selecting a full tuple like ('A',1) returns a single row as a Series. We also slice rows where the second level is greater or equal to 2 using pd.IndexSlice. The variable tracker shows how variables change after each step. Key moments clarify common confusions about selecting levels and slicing. The visual quiz tests understanding of these steps. MultiIndex helps organize data with multiple keys for easier access and slicing.