
Setting columns as MultiIndex in Pandas

Introduction

MultiIndex columns give a DataFrame's column headers more than one level of labels, so related columns can be grouped under a shared top-level label. They are useful in situations like these:

You have data with categories and subcategories in columns, like sales by region and product.
You want to compare multiple measurements for the same items side by side.
Your data has hierarchical information that needs clear separation in columns.
You want to perform operations on grouped columns easily.
You want to display complex tables neatly with multiple column headers.
Syntax
Pandas
df.columns = pd.MultiIndex.from_tuples([('level1', 'level2'), ...])
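Pass a list of tuples, one tuple per column. Each tuple holds that column's labels from the outermost level to the innermost.
You can also build a MultiIndex with pd.MultiIndex.from_arrays() (one list per level) or pd.MultiIndex.from_product() (every combination of the given lists).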
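
As a rough illustration of the alternatives mentioned above, the sketch below builds the same four column labels in three ways. The region and quarter labels are borrowed from the Sample Program further down, and the variable names (cols_t, cols_a, cols_p) are made up for this example.

```python
import pandas as pd

# from_tuples: one tuple per column
cols_t = pd.MultiIndex.from_tuples([
    ('North', 'Q1'), ('North', 'Q2'), ('South', 'Q1'), ('South', 'Q2'),
])

# from_arrays: one list per level, aligned position by position
cols_a = pd.MultiIndex.from_arrays([
    ['North', 'North', 'South', 'South'],  # outer level
    ['Q1', 'Q2', 'Q1', 'Q2'],              # inner level
])

# from_product: every combination of the given lists, outer level first
cols_p = pd.MultiIndex.from_product([['North', 'South'], ['Q1', 'Q2']])

# All three constructors yield the same sequence of label tuples
print(list(cols_t) == list(cols_a) == list(cols_p))
```

from_product is the shortest form when every top-level label has the same set of sublabels. from_tuples is the most flexible when the groups are uneven.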
Examples
Set two columns under the same group 'Group1' with sublabels 'A' and 'B'.
Pandas
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2],
    'B': [3, 4]
})
df.columns = pd.MultiIndex.from_tuples([('Group1', 'A'), ('Group1', 'B')])
print(df)
Relabel the same two-column df from the previous example so that both columns sit under the subject 'Math', with sublabels 'Score' and 'Grade'.
Pandas
df.columns = pd.MultiIndex.from_tuples([('Math', 'Score'), ('Math', 'Grade')])
Sample Program

This example shows sales data for two products in two regions (North, South) over two quarters (Q1, Q2). We set columns as MultiIndex to group by region and quarter.

Pandas
import pandas as pd

# Create a simple DataFrame
sales = pd.DataFrame({
    'North_Q1': [100, 150],
    'North_Q2': [110, 160],
    'South_Q1': [90, 120],
    'South_Q2': [95, 130]
}, index=['Product A', 'Product B'])

# Define MultiIndex for columns
multi_cols = pd.MultiIndex.from_tuples([
    ('North', 'Q1'),
    ('North', 'Q2'),
    ('South', 'Q1'),
    ('South', 'Q2')
])

# Set the MultiIndex columns
sales.columns = multi_cols

print(sales)
Important Notes

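MultiIndex columns let you select data by level. In the sample program, sales['North'] returns every column under 'North', and sales[('North', 'Q1')] returns a single column. Prefer the tuple form over chained indexing such as sales['North']['Q1'], especially when assigning values.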
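
A minimal sketch of level-based selection, rebuilding the sales frame from the Sample Program so the block runs on its own. The use of xs() to select on the inner level is an addition for illustration and is not part of the original example.

```python
import pandas as pd

sales = pd.DataFrame(
    [[100, 110, 90, 95], [150, 160, 120, 130]],
    index=['Product A', 'Product B'],
    columns=pd.MultiIndex.from_tuples([
        ('North', 'Q1'), ('North', 'Q2'), ('South', 'Q1'), ('South', 'Q2'),
    ]),
)

# Selecting a top-level label returns a DataFrame of its sub-columns
north = sales['North']

# A full tuple selects exactly one column and returns a Series
north_q1 = sales[('North', 'Q1')]

# xs() selects on an inner level: all 'Q1' columns across regions
q1 = sales.xs('Q1', axis=1, level=1)

# Assign through .loc with the full tuple rather than chained indexing
sales.loc['Product A', ('South', 'Q2')] = 99

print(north)
print(north_q1)
print(q1)
```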

The number of tuples must equal the number of columns. If it does not, pandas raises a ValueError (length mismatch) and leaves the existing column labels unchanged.
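
The sketch below shows what a mismatch looks like in practice, assuming the two-column df from the Examples section. The extra ('Group2', 'C') tuple and the error_message variable are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

error_message = None
try:
    # Three tuples for a two-column DataFrame: the counts do not match
    df.columns = pd.MultiIndex.from_tuples([
        ('Group1', 'A'), ('Group1', 'B'), ('Group2', 'C'),
    ])
except ValueError as err:
    error_message = str(err)
    print('Rejected:', error_message)

# The failed assignment leaves the original labels in place
print(list(df.columns))
```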

You can also create MultiIndex from arrays using pd.MultiIndex.from_arrays().

Summary

MultiIndex columns organize data with multiple header levels.

Use pd.MultiIndex.from_tuples() with a list of tuples to set MultiIndex columns.

This helps group related columns and makes data easier to analyze.
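
As one hedged illustration of that last point, the sketch below totals the Sample Program's sales per region by grouping on the top column level. The transpose, group, transpose-back approach is only one of several ways to do this, and region_totals is a name chosen for this example.

```python
import pandas as pd

sales = pd.DataFrame(
    [[100, 110, 90, 95], [150, 160, 120, 130]],
    index=['Product A', 'Product B'],
    columns=pd.MultiIndex.from_product([['North', 'South'], ['Q1', 'Q2']]),
)

# Transpose so the column levels become row levels, group by the
# outer level (region), sum the quarters, then transpose back
region_totals = sales.T.groupby(level=0).sum().T

print(region_totals)
```

Grouping on level=1 instead would give totals per quarter across regions.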