Data Analysis Python, ~30 mins

MultiIndex (hierarchical indexing) in Data Analysis Python - Mini Project: Build & Apply

Working with MultiIndex (hierarchical indexing) in pandas
📖 Scenario: You work for a small store that sells fruit in several cities. You have two months of sales data for each fruit in each city, and you want to organize it so you can easily look up sales by city and fruit.
🎯 Goal: You will create a pandas DataFrame with a MultiIndex using city and fruit. Then you will select sales data for a specific city and fruit using the MultiIndex.
📋 What You'll Learn
Create a pandas DataFrame with a MultiIndex built from city and fruit
Use a list of tuples to create the MultiIndex
Select sales data for a specific city and fruit using .loc
Print the selected sales data
💡 Why This Matters
🌍 Real World
Stores and businesses often have sales data organized by multiple categories like location and product. MultiIndex helps organize and access this data easily.
💼 Career
Data analysts and scientists use MultiIndex in pandas to handle complex datasets with multiple levels of grouping, making data analysis more efficient.
1
Create the sales data dictionary
Create a dictionary called sales_data with these exact entries: ('New York', 'Apple'): [100, 120], ('New York', 'Banana'): [90, 110], ('Los Angeles', 'Apple'): [80, 95], ('Los Angeles', 'Banana'): [70, 85]. The lists represent sales for two months.
Hint

Use tuples as keys in the dictionary to represent city and fruit pairs.
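One possible way to write this step is sketched below. The entries are exactly those listed in the task; only the layout and the comments are illustrative.

```python
# Each key is a (city, fruit) tuple; each value holds sales for
# month 1 and month 2, in that order.
sales_data = {
    ('New York', 'Apple'): [100, 120],
    ('New York', 'Banana'): [90, 110],
    ('Los Angeles', 'Apple'): [80, 95],
    ('Los Angeles', 'Banana'): [70, 85],
}
```

Tuples work as dictionary keys because they are immutable and hashable, which is what lets each city and fruit pair act as a single lookup key.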

2
Create the MultiIndex and DataFrame
Create a variable called index by passing the keys of sales_data to pd.MultiIndex.from_tuples(). Then create a DataFrame called df from the values of sales_data, passing index as the index argument and ['Month 1', 'Month 2'] as the columns.
Hint

Use list(sales_data.values()) to get the data for the DataFrame.
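A minimal sketch of this step follows. It assumes pandas is installed and imported as pd, and it repeats the dictionary from step 1 so the snippet runs on its own. Wrapping the keys in list() is a choice made here for clarity, not something the task requires.

```python
import pandas as pd

sales_data = {
    ('New York', 'Apple'): [100, 120],
    ('New York', 'Banana'): [90, 110],
    ('Los Angeles', 'Apple'): [80, 95],
    ('Los Angeles', 'Banana'): [70, 85],
}

# Build a two-level index: level 0 is the city, level 1 is the fruit.
index = pd.MultiIndex.from_tuples(list(sales_data.keys()))

# Each inner list becomes one row; the rows line up with the index
# because a dict keeps its keys and values in the same insertion order.
df = pd.DataFrame(
    list(sales_data.values()),
    index=index,
    columns=['Month 1', 'Month 2'],
)
```

The resulting df has four rows, one per city and fruit pair, and two columns of monthly sales.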

3
Select sales data for New York Apples
Use df.loc with the tuple ('New York', 'Apple') to select the sales data for apples in New York. Store this in a variable called ny_apple_sales.
Hint

Use df.loc[('New York', 'Apple')] to get the row for New York apples.
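The selection can be sketched as follows, again assuming pandas is available as pd. The setup from the earlier steps is repeated only so the snippet is self-contained.

```python
import pandas as pd

sales_data = {
    ('New York', 'Apple'): [100, 120],
    ('New York', 'Banana'): [90, 110],
    ('Los Angeles', 'Apple'): [80, 95],
    ('Los Angeles', 'Banana'): [70, 85],
}
index = pd.MultiIndex.from_tuples(list(sales_data.keys()))
df = pd.DataFrame(list(sales_data.values()), index=index,
                  columns=['Month 1', 'Month 2'])

# A full (city, fruit) tuple identifies exactly one row, so .loc
# returns that row as a Series indexed by the column names.
ny_apple_sales = df.loc[('New York', 'Apple')]
```

Because the tuple supplies a value for every index level, the result is a single row (a Series) rather than a smaller DataFrame.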

4
Print the selected sales data
Print the variable ny_apple_sales to display the sales for apples in New York.
Hint

Use print(ny_apple_sales) to show the sales data.
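Putting the four steps together, the whole mini project might look like the sketch below. It is one possible solution rather than the only accepted one, and it assumes pandas is installed.

```python
import pandas as pd

# Step 1: sales keyed by (city, fruit), values are [month 1, month 2].
sales_data = {
    ('New York', 'Apple'): [100, 120],
    ('New York', 'Banana'): [90, 110],
    ('Los Angeles', 'Apple'): [80, 95],
    ('Los Angeles', 'Banana'): [70, 85],
}

# Step 2: build the MultiIndex and the DataFrame.
index = pd.MultiIndex.from_tuples(list(sales_data.keys()))
df = pd.DataFrame(list(sales_data.values()), index=index,
                  columns=['Month 1', 'Month 2'])

# Step 3: select the row for apples in New York.
ny_apple_sales = df.loc[('New York', 'Apple')]

# Step 4: print the selected Series.
print(ny_apple_sales)
```

The printed Series should show 100 for Month 1 and 120 for Month 2, followed by a line of metadata that pandas adds (the row's name and the dtype).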