MultiIndex DataFrames help organize data with multiple levels of labels. This makes it easier to work with complex data that has more than one category.
Creating MultiIndex DataFrames in Pandas
Introduction
A MultiIndex is a good fit in situations like these:
You have sales data by year and by region and want to analyze both together.
You want to group students by class and then by subject scores.
You need to store weather data by city and by month in one table.
You want to compare product prices by category and by store location.
Syntax
Pandas
pd.MultiIndex.from_tuples(tuples, names=level_names)
pd.MultiIndex.from_product([list1, list2], names=level_names)
pd.DataFrame(data, index=multiindex)
Use from_tuples when you already have each combination of labels written out as a tuple.
Use from_product to generate every combination of the values in several lists.
Examples
This example creates a MultiIndex from pairs of year and region, then makes a DataFrame showing sales.
Pandas
import pandas as pd

# Create MultiIndex from tuples
index = pd.MultiIndex.from_tuples(
    [('2023', 'East'), ('2023', 'West'), ('2024', 'East'), ('2024', 'West')],
    names=['Year', 'Region']
)

# Create DataFrame with MultiIndex
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)
print(df)
This example creates all combinations of years and regions automatically, then builds the DataFrame.
Pandas
import pandas as pd

# Create MultiIndex from product of lists
index = pd.MultiIndex.from_product(
    [['2023', '2024'], ['East', 'West']],
    names=['Year', 'Region']
)

# Create DataFrame with MultiIndex
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)
print(df)
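As a quick check, the two constructors can be compared directly. The sketch below reuses the year and region labels from the examples and assumes nothing beyond standard pandas calls. The variable names (`years`, `regions`, `same_index`, `n_rows`) are chosen here for illustration only.

```python
import pandas as pd

years = ['2023', '2024']
regions = ['East', 'West']

# Spell out every (year, region) pair by hand.
from_tuples = pd.MultiIndex.from_tuples(
    [('2023', 'East'), ('2023', 'West'), ('2024', 'East'), ('2024', 'West')],
    names=['Year', 'Region'],
)

# Let pandas generate the same pairs as a Cartesian product of the lists.
from_product = pd.MultiIndex.from_product([years, regions], names=['Year', 'Region'])

# Both indexes hold the same labels in the same order.
same_index = from_tuples.equals(from_product)

# The product has one entry per combination: len(years) * len(regions).
n_rows = len(from_product)

print(same_index, n_rows)
```

When the labels form a complete grid like this, from_product saves typing. When some combinations are missing from the data, from_tuples is the more natural choice.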
Sample Program
This program creates a MultiIndex DataFrame from a list of tuples containing year, region, and sales. It shows how to separate the index and data, then build the DataFrame.
Pandas
import pandas as pd

# Define tuples for MultiIndex
sales_data = [
    ('2023', 'East', 100),
    ('2023', 'West', 150),
    ('2024', 'East', 200),
    ('2024', 'West', 250)
]

# Create MultiIndex from tuples
index = pd.MultiIndex.from_tuples(
    [(year, region) for year, region, sales in sales_data],
    names=['Year', 'Region']
)

# Extract sales values
sales_values = [sales for year, region, sales in sales_data]

# Create DataFrame
df = pd.DataFrame({'Sales': sales_values}, index=index)
print(df)
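Output
The program prints a table with four rows, indexed by Year and Region, with a single Sales column containing 100, 150, 200, and 250.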
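An alternative route to the same table, not used in the program above, is to load the records as ordinary columns and then promote some of them to index levels with `set_index`. This is only a sketch: the intermediate name `flat` is made up for illustration, and the data is the same `sales_data` list.

```python
import pandas as pd

sales_data = [
    ('2023', 'East', 100),
    ('2023', 'West', 150),
    ('2024', 'East', 200),
    ('2024', 'West', 250),
]

# Load the records as three ordinary columns first.
flat = pd.DataFrame(sales_data, columns=['Year', 'Region', 'Sales'])

# Promote Year and Region to index levels; Sales stays as the only column.
df = flat.set_index(['Year', 'Region'])
print(df)
```

This avoids splitting the tuples by hand, which may be convenient when the data already arrives in tabular form, for example from a CSV file.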
Important Notes
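A MultiIndex lets you select rows by several levels at once, like df.loc['2023', 'East']. Writing the row key as a tuple together with a column selector, as in df.loc[('2023', 'East'), 'Sales'], avoids confusion between index levels and column names.
Name your index levels (the names argument) so that later selections and groupings can refer to each level by name.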
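The notes above can be sketched with the sales DataFrame from the examples. The snippet assumes the same four rows and uses only standard pandas selectors (`loc` and `xs`). The result variable names are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['2023', '2024'], ['East', 'West']], names=['Year', 'Region']
)
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)

# Full row key as a tuple plus a column label: a single value.
east_2023_sales = df.loc[('2023', 'East'), 'Sales']

# Partial key on the first level: every region for 2023.
all_2023 = df.loc['2023']

# Cross-section on an inner level, addressed by its name.
east_all_years = df.xs('East', level='Region')

print(east_2023_sales)
print(all_2023)
print(east_all_years)
```

The `xs` call is where level names pay off: `level='Region'` stays readable, and it keeps working if the order of the levels changes later.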
Summary
MultiIndex DataFrames organize data with multiple label levels.
Use pd.MultiIndex.from_tuples or pd.MultiIndex.from_product to create a MultiIndex.
MultiIndex helps analyze complex data by grouping on several categories.
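To make the last point concrete, here is a minimal sketch of grouping on one level at a time, again assuming the sales DataFrame from the examples. `groupby(level=...)` is a standard pandas call. The result names are illustrative.

```python
import pandas as pd

index = pd.MultiIndex.from_product(
    [['2023', '2024'], ['East', 'West']], names=['Year', 'Region']
)
df = pd.DataFrame({'Sales': [100, 150, 200, 250]}, index=index)

# Total sales per year, summed across regions.
sales_by_year = df.groupby(level='Year')['Sales'].sum()

# Total sales per region, summed across years.
sales_by_region = df.groupby(level='Region')['Sales'].sum()

print(sales_by_year)
print(sales_by_region)
```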