Recall & Review
beginner
What is the purpose of the crosstab() function in pandas?
The crosstab() function builds a table showing the frequency (count) of each combination of values across two or more categorical variables. It makes the data easy to summarize and compare.
beginner
How do you create a simple cross-tabulation between two columns col1 and col2 in a DataFrame df?
Use pd.crosstab(df['col1'], df['col2']). This counts how many times each combination of values from col1 and col2 appears, with the values of col1 as rows and the values of col2 as columns.
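As a minimal sketch of the call above, assuming a small made-up DataFrame (the values are hypothetical; only the column names col1 and col2 come from the card):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the card.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b"],
    "col2": ["x", "y", "x", "x", "y"],
})

# Rows are the distinct values of col1, columns the distinct values of col2.
counts = pd.crosstab(df["col1"], df["col2"])
print(counts)
```

Each cell holds the number of rows in df that have that particular pair of values, so the cells add up to the number of rows in df.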
intermediate
What does the normalize parameter do in crosstab()?
The normalize parameter converts counts into proportions (fractions between 0 and 1). For example, normalize='index' gives row-wise proportions, so each row sums to 1, and normalize='columns' gives column-wise proportions, so each column sums to 1. Multiply the result by 100 to get percentages.
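A short sketch of both settings, again on hypothetical sample data (the variable names row_share, col_share and row_pct are illustrative):

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b", "b", "b"],
    "col2": ["x", "y", "x", "x", "x", "y"],
})

# Row-wise proportions: each row sums to 1.
row_share = pd.crosstab(df["col1"], df["col2"], normalize="index")

# Column-wise proportions: each column sums to 1.
col_share = pd.crosstab(df["col1"], df["col2"], normalize="columns")

# normalize returns fractions; scale by 100 if percentages are wanted.
row_pct = row_share * 100

print(row_share)
print(col_share)
```

Here row "b" has three "x" values and one "y", so its row-wise shares are 0.75 and 0.25.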
intermediate
Can crosstab() handle more than two variables? How?
Yes, by passing a list of arrays or columns as the row or column argument. For example, pd.crosstab([df['col1'], df['col2']], df['col3']) creates a table with a multi-level row index, showing counts for each combination of col1 and col2 against col3.
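The three-variable call can be sketched as follows, assuming hypothetical values for col1, col2 and col3:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the card.
df = pd.DataFrame({
    "col1": ["a", "a", "b", "b"],
    "col2": ["x", "y", "x", "x"],
    "col3": ["p", "q", "p", "p"],
})

# A list as the first argument produces a two-level row index (col1, col2).
table = pd.crosstab([df["col1"], df["col2"]], df["col3"])
print(table)

# Cells are addressed with a (col1, col2) tuple for the row and a col3 value for the column.
print(table.loc[("b", "x"), "p"])
```

Passing the list as the second argument instead would put the extra levels on the columns.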
beginner
What is a real-life example where crosstab() is useful?
Imagine a survey that records each respondent's gender and favorite fruit. crosstab() can show how many males and females prefer each fruit, which helps reveal preferences by group.
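A sketch of the survey scenario, with invented responses (the survey DataFrame and its values are hypothetical):

```python
import pandas as pd

# Invented survey responses for illustration.
survey = pd.DataFrame({
    "gender": ["F", "F", "F", "M", "M", "M", "M"],
    "fruit": ["apple", "apple", "banana", "banana", "banana", "apple", "banana"],
})

# One row per gender, one column per fruit.
prefs = pd.crosstab(survey["gender"], survey["fruit"])
print(prefs)

# Read off one cell: how many female respondents chose apple.
print(prefs.loc["F", "apple"])

# Most popular fruit within each gender (column label of the largest count per row).
print(prefs.idxmax(axis=1))
```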
What does pd.crosstab(df['A'], df['B']) return?
A table of counts: crosstab() counts how often each pair of values appears in the two columns.
How do you get proportions instead of counts in crosstab()?
Use normalize='index' for row proportions or normalize='columns' for column proportions.
Which of these is a valid way to use crosstab() with three variables?
Passing a list of columns as the first argument allows cross-tabulation on multiple variables.
What type of data is best suited for crosstab()?
crosstab() works best with categorical data, where it counts how often each combination of categories occurs.
If you want to see how many customers bought product A or B by region, which function helps?
crosstab() quickly shows counts of combinations such as product and region.
Explain how to use crosstab() to analyze the relationship between two categorical columns in a DataFrame.
Think about how to count how often each pair of categories appears.
Describe how the normalize parameter changes the output of crosstab() and why it might be useful.
Consider when percentages are easier to understand than raw counts.