
Cross-tabulation with crosstab() in Data Analysis Python - Step-by-Step Execution


Concept Flow - Cross-tabulation with crosstab()

Start with DataFrame

↓

Select two columns

↓

Apply pd.crosstab()

↓

Count combinations

↓

Create cross-tab table

↓

Display result

We start with a data table, pick two columns, count how often each pair appears, and show the counts in a new table.

Execution Sample


import pandas as pd

data = {'Gender': ['M', 'F', 'F', 'M', 'F'],
        'Preference': ['Tea', 'Coffee', 'Tea', 'Coffee', 'Tea']}
df = pd.DataFrame(data)

ct = pd.crosstab(df['Gender'], df['Preference'])
print(ct)

This code counts how many males and females prefer Tea or Coffee and shows the counts in a table.
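As a quick check on the sample above, the following sketch rebuilds the same DataFrame and reads individual cells of the result with .loc. The names df and ct come from the sample; the variable total is added here for illustration only.

```python
import pandas as pd

data = {'Gender': ['M', 'F', 'F', 'M', 'F'],
        'Preference': ['Tea', 'Coffee', 'Tea', 'Coffee', 'Tea']}
df = pd.DataFrame(data)

ct = pd.crosstab(df['Gender'], df['Preference'])

# Rows are the Gender values, columns are the Preference values.
print(ct.loc['M', 'Tea'])   # males who prefer Tea
print(ct.loc['F', 'Tea'])   # females who prefer Tea

# Each input row lands in exactly one cell, so the cells sum to len(df).
total = int(ct.to_numpy().sum())
print(total, len(df))
```

Reading cells by label rather than by position keeps the check valid even if the row or column order changes.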

Execution Table

Step	Action	DataFrame State	Cross-tab Result
1	Create DataFrame with Gender and Preference columns	{'Gender': ['M', 'F', 'F', 'M', 'F'], 'Preference': ['Tea', 'Coffee', 'Tea', 'Coffee', 'Tea']}	N/A
2	Select 'Gender' and 'Preference' columns	DataFrame with two columns	N/A
3	Apply pd.crosstab() to count combinations	Same DataFrame	Counts of each Gender-Preference pair
4	Count 'M' with 'Tea' → 1	Same DataFrame	M Tea: 1
5	Count 'M' with 'Coffee' → 1	Same DataFrame	M Coffee: 1
6	Count 'F' with 'Tea' → 2	Same DataFrame	F Tea: 2
7	Count 'F' with 'Coffee' → 1	Same DataFrame	F Coffee: 1
8	Build cross-tab table	Same DataFrame	Table with columns Coffee, Tea; row F: 1, 2; row M: 1, 1
9	Print cross-tab table	Same DataFrame	Output displayed
10	End of execution	Same DataFrame	Execution stops

💡 All rows processed, cross-tabulation complete
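The pair-by-pair counting traced in the execution table can be sketched without pandas. This is only an illustration of the idea, not how pd.crosstab() is implemented internally; the names pair_counts, rows, cols, and table are hypothetical.

```python
from collections import Counter

genders = ['M', 'F', 'F', 'M', 'F']
preferences = ['Tea', 'Coffee', 'Tea', 'Coffee', 'Tea']

# Count each (gender, preference) pair, one input row at a time.
pair_counts = Counter(zip(genders, preferences))

# Lay the counts out as a grid: sorted row labels by sorted column labels.
# A Counter returns 0 for a pair it has never seen.
rows = sorted(set(genders))
cols = sorted(set(preferences))
table = {r: {c: pair_counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, table[r])
```

Comparing this grid against the output of pd.crosstab() is a simple way to confirm the counts by hand.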

Variable Tracker

Variable	Start	After Step 1	After Step 3	Final
df	None	{'Gender': ['M', 'F', 'F', 'M', 'F'], 'Preference': ['Tea', 'Coffee', 'Tea', 'Coffee', 'Tea']}	Same DataFrame	Same DataFrame
ct	None	None	Counts of Gender-Preference pairs	Cross-tab table with counts

Key Moments - 2 Insights

Why does the cross-tab table show zeros for some combinations?

Can crosstab() work with columns that have missing values?
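A small experiment can help answer the missing-values question. The sketch below uses hypothetical data in which one preference is missing and calls pd.crosstab() with default arguments only; in that case the row with the missing value is left out of the counts. The dropna parameter is deliberately not used here, since its exact behavior should be checked against the installed pandas version.

```python
import pandas as pd

# Hypothetical data: the last respondent gave no preference.
df = pd.DataFrame({
    'Gender': ['M', 'F', 'F', 'M', 'F'],
    'Preference': ['Tea', 'Coffee', 'Tea', 'Coffee', None],
})

ct = pd.crosstab(df['Gender'], df['Preference'])
counted = int(ct.to_numpy().sum())
print(ct)
print(counted, "of", len(df), "rows counted")

# One way to keep such rows visible: give the gap an explicit label first.
ct_filled = pd.crosstab(df['Gender'], df['Preference'].fillna('Unknown'))
print(ct_filled)
```

Filling the gap with a label such as 'Unknown' is one design choice among several; it makes the missing answers show up as their own column instead of silently disappearing.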

Visual Quiz - 3 Questions

Test your understanding

Look at step 6 of the execution table. What is the count of females who prefer Tea?

Concept Snapshot

pd.crosstab(index, columns) counts how often each combination of values from two columns occurs.
Input: two Series, such as two columns of a DataFrame.
Output: a table with one count per value pair.
Useful for quick frequency tables.
Value pairs that never occur together get a count of 0.
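To see a zero count in practice, here is a minimal sketch with hypothetical data in which no 'M' row has 'Coffee'. The pair still gets a cell in the table because both labels occur somewhere in the data.

```python
import pandas as pd

# Hypothetical data: no male respondent chose Coffee.
df = pd.DataFrame({
    'Gender': ['M', 'F', 'F', 'M'],
    'Preference': ['Tea', 'Coffee', 'Tea', 'Tea'],
})

ct = pd.crosstab(df['Gender'], df['Preference'])
print(ct)

# The (M, Coffee) pair never occurs, so its cell holds 0 rather than being absent.
print(ct.loc['M', 'Coffee'])
```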

Full Transcript

Cross-tabulation with pd.crosstab() takes two columns from a data table and counts how often each pair of values appears together. We start with a DataFrame containing columns such as Gender and Preference. We then select these columns and apply pd.crosstab() to count the combinations. The result is a new table showing the count for each Gender-Preference pair, for example how many males prefer Tea and how many prefer Coffee. The process counts each pair step by step and builds the table. Pairs that never occur get a count of zero. This method quickly summarizes the relationship between two categorical variables.