Pandas · data · ~30 mins

Split-apply-combine mental model in Pandas - Mini Project: Build & Apply

📖 Scenario: Imagine you work at a small bookstore. You have sales data for different book genres and want to find out the average sales for each genre. This helps you understand which types of books sell best.
🎯 Goal: You will create a pandas DataFrame with sales data, then use the split-apply-combine method to calculate the average sales per genre.
📋 What You'll Learn
Create a pandas DataFrame named sales_data with columns 'Genre' and 'Sales' using the exact data provided.
Create a variable named group_column and set it to the string 'Genre'.
Use the split-apply-combine method by grouping sales_data by group_column and calculating the mean sales for each genre, storing the result in average_sales.
Print average_sales (a Series indexed by genre) to show the average sales per genre.
💡 Why This Matters
🌍 Real World
Bookstores and many businesses use the split-apply-combine method to analyze grouped data, like sales by category or customer segment.
💼 Career
Data analysts and data scientists often use pandas groupby operations to summarize and understand data quickly.
1
Create the sales data DataFrame
First import pandas as pd. Then create a pandas DataFrame called sales_data with two columns: 'Genre' and 'Sales'. Use this exact data: the 'Genre' values are 'Fiction', 'Non-Fiction', 'Fiction', 'Science', 'Non-Fiction', 'Science', and the corresponding 'Sales' values are 100, 150, 200, 130, 170, 120.
Need a hint?

Use pd.DataFrame with a dictionary where keys are column names and values are lists of data.
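One way the hint could look in practice is sketched below. It uses only the names and values given in this step; the layout of the dictionary is a style choice, not a requirement.

```python
import pandas as pd

# Each key becomes a column name; each list supplies that column's values.
# Row i pairs the i-th genre with the i-th sales figure.
sales_data = pd.DataFrame({
    "Genre": ["Fiction", "Non-Fiction", "Fiction", "Science", "Non-Fiction", "Science"],
    "Sales": [100, 150, 200, 130, 170, 120],
})
```

Both lists must have the same length, otherwise pd.DataFrame raises a ValueError.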

2
Set the group column
Create a variable called group_column and set it to the string 'Genre'.
Need a hint?

Just assign the string 'Genre' to the variable group_column.

3
Calculate average sales per genre
Use the split-apply-combine method by grouping sales_data by group_column and calculating the mean of 'Sales'. Store the result in a variable called average_sales.
Need a hint?

Use sales_data.groupby(group_column)['Sales'].mean() to get average sales per genre.
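To connect the one-liner in the hint to the split-apply-combine idea, the sketch below computes the same averages twice: once with groupby(...).mean(), and once with the three phases written out by hand. The names pieces and manual_average are illustrative only and are not part of the exercise.

```python
import pandas as pd

sales_data = pd.DataFrame({
    "Genre": ["Fiction", "Non-Fiction", "Fiction", "Science", "Non-Fiction", "Science"],
    "Sales": [100, 150, 200, 130, 170, 120],
})
group_column = "Genre"

# The hint's one-liner: pandas performs split, apply, and combine internally.
average_sales = sales_data.groupby(group_column)["Sales"].mean()

# The same result with each phase made explicit (illustrative names).
pieces = {}
for genre, rows in sales_data.groupby(group_column):  # split: one sub-frame per genre
    pieces[genre] = rows["Sales"].mean()              # apply: mean of that sub-frame
manual_average = pd.Series(pieces, name="Sales")      # combine: one value per genre
manual_average.index.name = group_column
```

In everyday code the one-liner is the version to prefer; the loop is here only to show what the three phases correspond to.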

4
Print the average sales
Print the average_sales variable to display the average sales per genre.
Need a hint?

Use print(average_sales) to show the result.
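Putting the four steps together, a complete run might look like the sketch below. The output in the comments follows from the data given in step 1; exact spacing can differ slightly between pandas versions.

```python
import pandas as pd

sales_data = pd.DataFrame({
    "Genre": ["Fiction", "Non-Fiction", "Fiction", "Science", "Non-Fiction", "Science"],
    "Sales": [100, 150, 200, 130, 170, 120],
})
group_column = "Genre"
average_sales = sales_data.groupby(group_column)["Sales"].mean()

print(average_sales)
# Genre
# Fiction        150.0
# Non-Fiction    160.0
# Science        125.0
# Name: Sales, dtype: float64
```

Because a single column is selected before calling mean(), average_sales is a Series with the genres as its index. Calling average_sales.reset_index() would turn it into a two-column DataFrame if that form is more convenient.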