
MultiIndex (hierarchical indexing) in Data Analysis Python

Introduction

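A MultiIndex is a pandas index with more than one level of labels. Each row (or column) is identified by a combination of keys, such as a country and a state, so hierarchical tables fit into an ordinary Series or DataFrame.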

Typical situations where a MultiIndex helps:

You have data grouped by categories and subcategories, like sales by country and city.
You want to analyze data at different levels, such as yearly and monthly data together.
You need to store and access data with multiple keys, like product type and color.
You want to perform operations like grouping or slicing on multiple index levels.
Syntax
pd.MultiIndex.from_tuples(tuples, names=None)
pd.MultiIndex.from_product(iterables, names=None)

# Example:
index = pd.MultiIndex.from_tuples([('A', 1), ('A', 2), ('B', 1)], names=['letter', 'number'])

Use from_tuples when you already have a list of tuples, one per row, where each tuple holds one label per level.

Use from_product to build every combination (the Cartesian product) of several lists of labels.
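To make the difference concrete, here is a small sketch that builds the same index both ways. The variable names (letters, numbers, idx_product, idx_tuples) are made up for this illustration.

```python
import pandas as pd

letters = ['A', 'B']
numbers = [1, 2]

# from_product generates every (letter, number) combination
idx_product = pd.MultiIndex.from_product(
    [letters, numbers], names=['letter', 'number']
)

# The same index written out by hand as explicit tuples
idx_tuples = pd.MultiIndex.from_tuples(
    [('A', 1), ('A', 2), ('B', 1), ('B', 2)],
    names=['letter', 'number'],
)

print(idx_product.equals(idx_tuples))  # True
print(len(idx_product))                # 4 = 2 letters x 2 numbers
```

from_product saves typing when every combination exists. from_tuples is the better fit when only some combinations appear in your data.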

Examples
This creates a MultiIndex with two levels: Country and State.
import pandas as pd

# Create MultiIndex from tuples
index = pd.MultiIndex.from_tuples(
    [('USA', 'NY'), ('USA', 'CA'), ('Canada', 'ON')],
    names=['Country', 'State']
)
print(index)
This creates all combinations of letters and numbers as a MultiIndex.
import pandas as pd

# Create MultiIndex from product
index = pd.MultiIndex.from_product(
    [['A', 'B'], [1, 2]],
    names=['Letter', 'Number']
)
print(index)
This builds a DataFrame whose rows are labeled by a MultiIndex with two levels: Category and Item.
import pandas as pd

# Use MultiIndex in DataFrame
index = pd.MultiIndex.from_tuples(
    [('Fruit', 'Apple'), ('Fruit', 'Banana'), ('Veg', 'Carrot')],
    names=['Category', 'Item']
)
data = [10, 20, 30]
df = pd.DataFrame(data, index=index, columns=['Quantity'])
print(df)
Sample Program

This program creates a DataFrame with a MultiIndex for country and state. It then shows how to access data by country and by specific state.

import pandas as pd

# Create MultiIndex from tuples
index = pd.MultiIndex.from_tuples(
    [('USA', 'NY'), ('USA', 'CA'), ('Canada', 'ON')],
    names=['Country', 'State']
)

# Create DataFrame with MultiIndex
data = [100, 200, 300]
df = pd.DataFrame(data, index=index, columns=['Sales'])

print("DataFrame with MultiIndex:")
print(df)

# Access data for USA
print("\nSales in USA:")
print(df.loc['USA'])

# Access data for USA, NY
print("\nSales in USA, NY:")
print(df.loc[('USA', 'NY')])
Important Notes

MultiIndex makes it easy to work with grouped or hierarchical data.
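As one example of working with grouped data, the sketch below sums the sales from the sample program per country. It assumes you want totals for the outer level; groupby(level=...) accepts any level name.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('USA', 'NY'), ('USA', 'CA'), ('Canada', 'ON')],
    names=['Country', 'State'],
)
df = pd.DataFrame({'Sales': [100, 200, 300]}, index=index)

# Group rows by the outer 'Country' level and add up each group
by_country = df.groupby(level='Country').sum()
print(by_country)
```

The result has one row per country, with the State level collapsed into the total.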

You can select data by one or more levels using loc.
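The sketch below shows a few common selection patterns on the sample data. It sorts the index first, which is an assumption made here because slicing across levels generally expects a sorted MultiIndex. The variable names are only for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [('USA', 'NY'), ('USA', 'CA'), ('Canada', 'ON')],
    names=['Country', 'State'],
)
df = pd.DataFrame({'Sales': [100, 200, 300]}, index=index).sort_index()

# Outer level only: all rows for one country (the Country level is dropped)
usa = df.loc['USA']

# Both levels: a full tuple plus a column name gives a single value
ny_sales = df.loc[('USA', 'NY'), 'Sales']

# Inner level only: xs selects by a named level
on_rows = df.xs('ON', level='State')

# Several inner labels across all countries, using IndexSlice
picked = df.loc[pd.IndexSlice[:, ['NY', 'ON']], :]

print(usa)
print(ny_sales)
print(on_rows)
print(picked)
```

Use loc with a single label or a tuple when you select from the outer level inward. Use xs or IndexSlice when you need to select on an inner level.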

MultiIndex can be used for rows or columns in a DataFrame.
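Here is a small sketch of a MultiIndex on columns instead of rows. The metric names, quarters, and numbers are invented for this illustration.

```python
import pandas as pd

# Two column levels: a metric and a quarter
columns = pd.MultiIndex.from_product(
    [['Returns', 'Sales'], ['Q1', 'Q2']],
    names=['Metric', 'Quarter'],
)
df = pd.DataFrame(
    [[5, 7, 100, 120], [9, 4, 200, 210]],
    index=['NY', 'CA'],
    columns=columns,
)

# An outer column label returns the sub-table beneath it
sales = df['Sales']
print(sales)

# A full tuple picks exactly one column
returns_q2 = df[('Returns', 'Q2')]
print(returns_q2)
```

Column selection follows the same idea as row selection: one label picks a whole group, and a full tuple picks one column.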

Summary

MultiIndex lets you label data with multiple levels.

It helps you organize complex data and access it by one level or by several levels at once.

You create a MultiIndex with pd.MultiIndex.from_tuples or pd.MultiIndex.from_product.