Cross-tabulation Advanced Usage
📖 Scenario: You work as a data analyst for a retail company. You have sales data that includes the product category, the region where the product was sold, and whether the sale was made online or in-store. Your manager wants to understand the relationship between product categories, sales channels, and regions to improve marketing strategies.
🎯 Goal: Build a cross-tabulation table using pandas that shows the count of sales for each product category by sales channel and region. Then, add margins (totals) and normalize the data by row to see proportions.
📋 What You'll Learn
Create a pandas DataFrame with the exact sales data provided.
Create a variable that holds the normalization axis to pass to crosstab's normalize parameter.
Use pandas crosstab to create a cross-tabulation with product category as rows and sales channel and region as hierarchical (multi-index) columns.
Add margins (totals) to the cross-tabulation.
Normalize the cross-tabulation by rows using the normalization axis variable.
Print the final normalized cross-tabulation table.
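The steps above can be sketched roughly as follows. This is a minimal illustration, not the exercise's reference solution: the sales records, the column names (Category, Region, Channel), and the variable names (sales, normalize_axis, counts, proportions) are all assumptions made up for this example, since the exact data is not shown here.

```python
import pandas as pd

# Hypothetical sample data; the exercise's actual sales records are not shown here.
sales = pd.DataFrame({
    "Category": ["Electronics", "Clothing", "Electronics", "Toys",
                 "Clothing", "Toys", "Electronics", "Clothing"],
    "Region": ["North", "South", "South", "North",
               "North", "South", "North", "South"],
    "Channel": ["Online", "In-store", "Online", "In-store",
                "Online", "Online", "In-store", "In-store"],
})

# Assumed name for the normalization axis; "index" normalizes across each row.
normalize_axis = "index"

# Raw counts: categories as rows, (channel, region) pairs as two-level columns,
# plus an "All" row and column holding the totals.
counts = pd.crosstab(
    index=sales["Category"],
    columns=[sales["Channel"], sales["Region"]],
    margins=True,
)

# Same table expressed as row proportions. With row normalization, pandas keeps
# the "All" row (overall share of each column) and drops the "All" column.
proportions = pd.crosstab(
    index=sales["Category"],
    columns=[sales["Channel"], sales["Region"]],
    margins=True,
    normalize=normalize_axis,
)

print(counts)
print(proportions)
```

Building the count table separately from the normalized one is a choice made here for readability: the totals stay visible as whole numbers, while the second table shows how each category's sales split across channel and region combinations.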
💡 Why This Matters
🌍 Real World
Cross-tabulation helps businesses analyze relationships between multiple categorical variables, such as product sales by region and channel, to make informed decisions.
💼 Career
Data analysts and data scientists use cross-tabulation to summarize and explore data patterns, which supports marketing, sales, and operational strategies.