
Selecting data with MultiIndex in Pandas - Mini Project: Build & Apply

Selecting data with MultiIndex
📖 Scenario: You work in a store that sells fruits in different cities. You have sales data organized by city and fruit type. You want to find sales numbers for specific cities and fruits easily.
🎯 Goal: Build a pandas DataFrame with a MultiIndex for city and fruit. Then select sales data for a specific city and fruit using MultiIndex selection.
📋 What You'll Learn
Create a pandas DataFrame with a MultiIndex from tuples of city and fruit
Add a sales column with integer values
Create a variable for the city to select
Use .loc with the MultiIndex to select sales data for the chosen city
Print the selected sales data
💡 Why This Matters
🌍 Real World
Stores and companies often have sales data organized by multiple categories like city and product. MultiIndex helps manage and analyze such data efficiently.
💼 Career
Data analysts and scientists use MultiIndex DataFrames to handle complex datasets with multiple levels of grouping, making data selection and aggregation easier.
1
Create a MultiIndex DataFrame
Import pandas as pd. Create a list of tuples called index_tuples with these exact pairs: ('New York', 'Apple'), ('New York', 'Banana'), ('Los Angeles', 'Apple'), ('Los Angeles', 'Banana'). Then create a MultiIndex called multi_index from index_tuples with names ['City', 'Fruit']. Finally, create a DataFrame called sales_data with this MultiIndex and a column 'Sales' with values [100, 150, 200, 250].
Need a hint?

Use pd.MultiIndex.from_tuples() to create the MultiIndex. Then pass it as the index when creating the DataFrame.
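A minimal sketch of this step, using the variable names and values given in the instructions. Passing the values as a dictionary keyed by 'Sales' is one of several ways to build the column and is an assumption here, not something the exercise requires.

```python
import pandas as pd

# (City, Fruit) pairs, in the order given in the exercise
index_tuples = [
    ("New York", "Apple"),
    ("New York", "Banana"),
    ("Los Angeles", "Apple"),
    ("Los Angeles", "Banana"),
]

# Build a two-level index and name the levels
multi_index = pd.MultiIndex.from_tuples(index_tuples, names=["City", "Fruit"])

# One 'Sales' value per (City, Fruit) pair
sales_data = pd.DataFrame({"Sales": [100, 150, 200, 250]}, index=multi_index)
```

Each row of sales_data is now addressed by a (City, Fruit) pair rather than by a single label.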

2
Set the city to select
Create a variable called selected_city and set it to the string 'New York'.
Need a hint?

Just assign the string 'New York' to the variable selected_city.

3
Select sales data for the chosen city
Use sales_data.loc with selected_city to select all fruit sales for that city. Store the result in a variable called city_sales.
Need a hint?

Use sales_data.loc[selected_city] to get all rows where the first level of the MultiIndex (City) matches selected_city. The result is indexed by the remaining level, Fruit.
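The selection can be sketched as follows. The block rebuilds the DataFrame from step 1 so that it runs on its own; the names and values are the ones given in the exercise.

```python
import pandas as pd

# Rebuild the step 1 DataFrame so this snippet is self-contained
index_tuples = [
    ("New York", "Apple"),
    ("New York", "Banana"),
    ("Los Angeles", "Apple"),
    ("Los Angeles", "Banana"),
]
multi_index = pd.MultiIndex.from_tuples(index_tuples, names=["City", "Fruit"])
sales_data = pd.DataFrame({"Sales": [100, 150, 200, 250]}, index=multi_index)

selected_city = "New York"

# A single label passed to .loc is matched against the first index level (City)
city_sales = sales_data.loc[selected_city]
```

Because the City level has been matched, city_sales is a smaller DataFrame with one row per fruit.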

4
Print the selected sales data
Print the variable city_sales to display the sales data for the selected city.
Need a hint?

Use print(city_sales) to show the sales for the selected city.
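Putting the four steps together, a complete version of the project might look like the sketch below. The last two statements go one step beyond the numbered steps and cover the "specific city and fruit" part of the goal: selected_fruit and apple_sales are hypothetical names introduced here for illustration, not part of the exercise.

```python
import pandas as pd

# Step 1: build the MultiIndex DataFrame
index_tuples = [
    ("New York", "Apple"),
    ("New York", "Banana"),
    ("Los Angeles", "Apple"),
    ("Los Angeles", "Banana"),
]
multi_index = pd.MultiIndex.from_tuples(index_tuples, names=["City", "Fruit"])
sales_data = pd.DataFrame({"Sales": [100, 150, 200, 250]}, index=multi_index)

# Step 2: choose the city
selected_city = "New York"

# Step 3: select every fruit row for that city
city_sales = sales_data.loc[selected_city]

# Step 4: display the result
print(city_sales)

# Extension (hypothetical names): select one city and one fruit
# by passing a (City, Fruit) tuple plus the column label to .loc
selected_fruit = "Apple"
apple_sales = sales_data.loc[(selected_city, selected_fruit), "Sales"]
print(apple_sales)
```

Passing a full (City, Fruit) tuple identifies a single row, so together with the 'Sales' column label it yields a single value instead of a DataFrame.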