Creating MultiIndex DataFrames in Pandas

Introduction

A MultiIndex DataFrame labels each row with two or more index levels, such as year and region. This makes it easier to work with data that is categorized along more than one dimension.

A MultiIndex is a good fit in situations like these:
You have sales data by year and by region and want to analyze both together.
You want to group students by class and then by subject scores.
You need to store weather data by city and by month in one table.
You want to compare product prices by category and by store location.
Syntax
Pandas
pd.MultiIndex.from_tuples(tuples, names=level_names)
pd.MultiIndex.from_product([list1, list2], names=level_names)
pd.DataFrame(data, index=multiindex)

Use from_tuples when you already have each combination of labels written out as a tuple.

Use from_product to generate every combination (the Cartesian product) of two or more lists.
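
To make the difference concrete, here is a small sketch that builds the same index both ways and checks that the results match. The variable names (years, regions, idx_product, idx_tuples) are only illustrative.

```python
import pandas as pd

years = ['2023', '2024']
regions = ['East', 'West']

# from_product generates every (year, region) combination
idx_product = pd.MultiIndex.from_product([years, regions], names=['Year', 'Region'])

# from_tuples needs each combination spelled out explicitly
idx_tuples = pd.MultiIndex.from_tuples(
    [('2023', 'East'), ('2023', 'West'), ('2024', 'East'), ('2024', 'West')],
    names=['Year', 'Region'],
)

print(len(idx_product))                # 2 years x 2 regions = 4 labels
print(idx_product.equals(idx_tuples))  # True: both hold the same labels in the same order
```

from_product is the shorter option when every combination exists in the data; from_tuples is the one to reach for when some combinations are missing.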

Examples
This example creates a MultiIndex from pairs of year and region, then makes a DataFrame showing sales.
Pandas
import pandas as pd

# Create MultiIndex from tuples
index = pd.MultiIndex.from_tuples(
    [('2023', 'East'), ('2023', 'West'), ('2024', 'East'), ('2024', 'West')],
    names=['Year', 'Region']
)

# Create DataFrame with MultiIndex
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)
print(df)
This example creates all combinations of years and regions automatically, then builds the DataFrame.
Pandas
import pandas as pd

# Create MultiIndex from product of lists
index = pd.MultiIndex.from_product(
    [['2023', '2024'], ['East', 'West']],
    names=['Year', 'Region']
)

# Create DataFrame with MultiIndex
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)
print(df)
Sample Program

This program creates a MultiIndex DataFrame from a list of tuples containing year, region, and sales. It shows how to separate the index and data, then build the DataFrame.

Pandas
import pandas as pd

# Define tuples for MultiIndex
sales_data = [
    ('2023', 'East', 100),
    ('2023', 'West', 150),
    ('2024', 'East', 200),
    ('2024', 'West', 250)
]

# Create MultiIndex from tuples
index = pd.MultiIndex.from_tuples(
    [(year, region) for year, region, sales in sales_data],
    names=['Year', 'Region']
)

# Extract sales values
sales_values = [sales for year, region, sales in sales_data]

# Create DataFrame
df = pd.DataFrame({'Sales': sales_values}, index=index)

print(df)
Important Notes

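A MultiIndex lets you select rows by one or more levels. For example, df.loc['2023'] returns every region for 2023, and df.loc[('2023', 'East'), :] returns the single row for 2023 and East. The shorthand df.loc['2023', 'East'] also works here, but it can be ambiguous in general because the second item may be read as a column label.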
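
A minimal sketch of level-based selection, reusing the sales data from the examples. The variable names are illustrative, and xs is shown as one possible way to select on the inner level.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['2023', '2024'], ['East', 'West']],
    names=['Year', 'Region'],
)
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)

# Full row key as a tuple plus a column label: a single value
east_2023_sales = df.loc[('2023', 'East'), 'Sales']

# Outer level only: all regions for 2023 (the Year level is dropped)
sales_2023 = df.loc['2023']

# Inner level only: xs selects rows by a named level
east = df.xs('East', level='Region')

print(east_2023_sales)
print(sales_2023)
print(east)
```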

Always name your index levels to keep data clear.
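
As a rough illustration of why names matter, the sketch below starts from an unnamed index, adds names with set_names, and then flattens the index into columns. The variable flat is a hypothetical name used only for this example.

```python
import pandas as pd

# Without names=, the levels are unnamed
index = pd.MultiIndex.from_product([['2023', '2024'], ['East', 'West']])
print(list(index.names))  # [None, None]

# set_names returns a new index with labelled levels
index = index.set_names(['Year', 'Region'])
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)

# Named levels become readable column names after reset_index
flat = df.reset_index()
print(flat.columns.tolist())  # ['Year', 'Region', 'Sales']
```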

Summary

MultiIndex DataFrames organize data with multiple label levels.

Use pd.MultiIndex.from_tuples or pd.MultiIndex.from_product to create a MultiIndex.

MultiIndex helps analyze complex data by grouping on several categories.
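
One way to see this grouping in practice is to aggregate on a single named level. The sketch below assumes the same sales data as the examples and uses groupby with the level argument; by_year and by_region are illustrative names.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['2023', '2024'], ['East', 'West']],
    names=['Year', 'Region'],
)
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)

# Total sales per year (summing over regions)
by_year = df.groupby(level='Year')['Sales'].sum()

# Total sales per region (summing over years)
by_region = df.groupby(level='Region')['Sales'].sum()

print(by_year)    # 2023 -> 250, 2024 -> 450
print(by_region)  # East -> 300, West -> 400
```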