Data Analysis Python (~30 mins)

Single and multiple column grouping in Data Analysis Python - Mini Project: Build & Apply

Single and Multiple Column Grouping
📖 Scenario: You work in a small bookstore. You have sales data for different books, including each book's genre and the month it was sold. You want to know how many books were sold per genre, and per genre within each month.
🎯 Goal: Build a program that groups sales data by one column (genre) and then by two columns (genre and month) to count the number of books sold in each group.
📋 What You'll Learn
Create a dictionary with sales data including book titles, genres, and months sold
Create a pandas DataFrame from the dictionary
Group the data by the 'Genre' column and count the number of books sold per genre
Group the data by both 'Genre' and 'Month' columns and count the number of books sold per group
Print the results of both groupings
💡 Why This Matters
🌍 Real World
Grouping data by one or more columns is common in sales analysis, customer segmentation, and many other business tasks to summarize and understand data.
💼 Career
Data analysts and data scientists often use grouping to prepare data for reports and insights that help businesses make decisions.
1
Create the sales data dictionary and DataFrame
Import pandas as pd. Create a dictionary called sales_data with exactly three keys: 'Title' set to ["Book A", "Book B", "Book C", "Book D", "Book E"], 'Genre' set to ["Fiction", "Fiction", "Non-Fiction", "Fiction", "Non-Fiction"], and 'Month' set to ["Jan", "Feb", "Jan", "Feb", "Jan"]. Then create a pandas DataFrame called df from sales_data.
Hint

Use pd.DataFrame() to create the DataFrame from the dictionary.
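
One way step 1 might look is sketched below. The names and values come from the step description; the only assumption is that pandas is installed and imported under the usual alias pd.

```python
import pandas as pd

# Each key becomes a column; the lists must all have the same length.
sales_data = {
    "Title": ["Book A", "Book B", "Book C", "Book D", "Book E"],
    "Genre": ["Fiction", "Fiction", "Non-Fiction", "Fiction", "Non-Fiction"],
    "Month": ["Jan", "Feb", "Jan", "Feb", "Jan"],
}

# Build a DataFrame with one row per book sold.
df = pd.DataFrame(sales_data)
```

Each row of df represents one sale, so counting rows in a group is the same as counting books sold.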

2
Create a grouping configuration variable
Create a variable called group_column and set it to the string 'Genre'.
Hint

Just assign the string 'Genre' to the variable group_column.
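
A minimal sketch of step 2 follows. Keeping the column name in a variable means the single-column grouping in step 3 can be pointed at a different column by changing one line.

```python
# Name of the column used for the single-column grouping in step 3.
group_column = "Genre"
```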

3
Group by one and two columns and count books sold
Use group_column to group df by one column and count the number of books sold per genre. Store the result in grouped_one. Then group df by both 'Genre' and 'Month' and count the books sold per group. Store this in grouped_two.
Hint

Use df.groupby(...).size() to count the number of rows in each group.
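
As a sketch of step 3, the block below rebuilds df and group_column from the earlier steps so it runs on its own, then applies both groupings. Passing a single column name groups by that column; passing a list of names groups by every combination of values that actually occurs in the data.

```python
import pandas as pd

# Data and configuration from steps 1 and 2.
df = pd.DataFrame({
    "Title": ["Book A", "Book B", "Book C", "Book D", "Book E"],
    "Genre": ["Fiction", "Fiction", "Non-Fiction", "Fiction", "Non-Fiction"],
    "Month": ["Jan", "Feb", "Jan", "Feb", "Jan"],
})
group_column = "Genre"

# Single-column grouping: number of rows (books sold) per genre.
grouped_one = df.groupby(group_column).size()

# Two-column grouping: number of rows per (genre, month) pair.
grouped_two = df.groupby(["Genre", "Month"]).size()
```

Both results are pandas Series. grouped_one is indexed by genre, while grouped_two has a two-level index of genre and month, so a single count can be looked up with a tuple such as grouped_two[("Fiction", "Feb")].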

4
Print the grouped results
Print grouped_one and then print grouped_two to show the number of books sold per genre, followed by the number sold per genre and month.
Hint

Use two print() statements, one for each grouped result.
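
Putting the four steps together, the whole program might look like the sketch below. It uses only the names given in the steps and assumes pandas is available as pd.

```python
import pandas as pd

# Step 1: sales data and DataFrame.
sales_data = {
    "Title": ["Book A", "Book B", "Book C", "Book D", "Book E"],
    "Genre": ["Fiction", "Fiction", "Non-Fiction", "Fiction", "Non-Fiction"],
    "Month": ["Jan", "Feb", "Jan", "Feb", "Jan"],
}
df = pd.DataFrame(sales_data)

# Step 2: grouping configuration.
group_column = "Genre"

# Step 3: count rows per genre, then per (genre, month) pair.
grouped_one = df.groupby(group_column).size()
grouped_two = df.groupby(["Genre", "Month"]).size()

# Step 4: print both results.
print(grouped_one)
print(grouped_two)
```

With this data, the first result reports 3 books for Fiction and 2 for Non-Fiction. The second reports 2 for Fiction in Feb, 1 for Fiction in Jan, and 2 for Non-Fiction in Jan. There is no Non-Fiction/Feb entry because no such sale appears in the data.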