
Cross-tabulation with crosstab() in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Data Output
intermediate
Output of simple crosstab() with two categorical columns
Given the DataFrame below, what is the output of the crosstab() function?
Data Analysis Python
import pandas as pd

data = {'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
        'Preference': ['Coffee', 'Coffee', 'Tea', 'Coffee', 'Tea']}
df = pd.DataFrame(data)

result = pd.crosstab(df['Gender'], df['Preference'])
print(result)
A
Preference  Coffee  Tea
Gender                 
Female          1    2
Male            2    0
B
Preference  Coffee  Tea
Gender                 
Female          2    1
Male            1    2
C
Preference  Coffee  Tea
Gender                 
Female          1    3
Male            2    0
D
Preference  Coffee  Tea
Gender                 
Female          0    2
Male            2    1
💡 Hint
Count how many rows pair each Gender value with Coffee and how many pair it with Tea.
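The counting described in the hint can be tried on a separate, made-up dataset. The sketch below is only an illustration: the column names `Pet` and `Home` and their values are hypothetical. It tallies each pair by hand with `groupby` and compares the result with `pd.crosstab`.

```python
import pandas as pd

# Hypothetical data, unrelated to the challenge above.
df = pd.DataFrame({
    "Pet": ["Cat", "Dog", "Cat", "Cat", "Dog"],
    "Home": ["Flat", "House", "Flat", "House", "House"],
})

# Manual tally: count rows for every (Pet, Home) pair, then pivot
# the Home level into columns, filling missing pairs with 0.
manual = df.groupby(["Pet", "Home"]).size().unstack(fill_value=0)

# crosstab builds the same frequency table in one call.
ct = pd.crosstab(df["Pet"], df["Home"])

print(manual)
print(ct)
```

A pair that never occurs (here Dog and Flat) still gets a cell, filled with 0, in both tables.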
🧠 Conceptual
intermediate
Understanding margins parameter in crosstab()
What does setting the parameter margins=True do in the pd.crosstab() function?
A
Normalizes the crosstab values to show proportions instead of counts.
B
Adds a row and column with totals (sum) for each category and overall.
C
Filters the crosstab to only show rows with more than 5 counts.
D
Sorts the crosstab rows and columns alphabetically.
💡 Hint
Think about what 'margins' means in tables.
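To experiment with the parameter, here is a minimal sketch that builds the same table with and without `margins=True`. The `Dept` and `Remote` columns are invented for illustration. The `margins_name` argument (default `"All"`) sets the label of the extra row and column.

```python
import pandas as pd

# Hypothetical data, not taken from the challenge.
df = pd.DataFrame({
    "Dept": ["Sales", "Sales", "IT", "IT", "IT"],
    "Remote": ["Yes", "No", "Yes", "Yes", "No"],
})

# Plain frequency table.
plain = pd.crosstab(df["Dept"], df["Remote"])

# Same table built with margins=True; compare the shapes and labels.
with_margins = pd.crosstab(df["Dept"], df["Remote"], margins=True)

print(plain)
print(with_margins)
```

Comparing `plain.shape` with `with_margins.shape` shows exactly what the parameter adds.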
🔧 Debug
advanced
Identify the error in crosstab() usage
What happens when this code runs? Does it raise an error?
Data Analysis Python
import pandas as pd

ages = [23, 45, 31, 35]
genders = ['M', 'F', 'F', 'M']

# Pass plain Python lists (not Series) to crosstab
result = pd.crosstab(ages, genders, margins=True)
print(result)
A
No error, prints the crosstab table
B
TypeError: unhashable type: 'list'
C
TypeError: Cannot interpret 'ages' as a data frame column
D
ValueError: Index contains duplicate entries, cannot reshape
💡 Hint
pd.crosstab accepts lists as array-like inputs.
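As a small illustration of array-like inputs, the sketch below passes two plain lists (hypothetical `sizes` and `colors`, not the data from the challenge). Because lists carry no name, pandas labels the axes with generic names such as `row_0` and `col_0` unless `rownames` and `colnames` are supplied.

```python
import pandas as pd

# Hypothetical lists; any equal-length array-likes would do.
sizes = ["S", "M", "M", "L"]
colors = ["red", "blue", "red", "red"]

# Lists have no .name attribute, so the axis labels are generic.
unnamed = pd.crosstab(sizes, colors)

# rownames/colnames give the axes readable labels.
named = pd.crosstab(sizes, colors, rownames=["size"], colnames=["color"])

print(unnamed)
print(named)
```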
Visualization
advanced
Visualizing crosstab output with a heatmap
Which code snippet correctly creates a heatmap visualization of the crosstab result using seaborn?
Data Analysis Python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Transport': ['Car', 'Bus', 'Bus', 'Car', 'Car']}
df = pd.DataFrame(data)

ct = pd.crosstab(df['City'], df['Transport'])
A
plt.plot(ct)
plt.show()
B
sns.barplot(data=ct)
plt.show()
C
sns.heatmap(ct, annot=True)
plt.show()
D
sns.scatterplot(data=ct)
plt.show()
💡 Hint
Heatmaps are good for showing tables with numbers.
🚀 Application
expert
Using crosstab() to analyze survey data with normalization
You have a DataFrame with columns 'AgeGroup' and 'Satisfaction' from a survey. You want to see the proportion of each satisfaction level within each age group (rows sum to 1). Which crosstab() call achieves this?
Data Analysis Python
import pandas as pd

data = {'AgeGroup': ['18-25', '18-25', '26-35', '26-35', '26-35', '36-45'],
        'Satisfaction': ['High', 'Low', 'Medium', 'High', 'Low', 'High']}
df = pd.DataFrame(data)
A
pd.crosstab(df['AgeGroup'], df['Satisfaction'], margins=True, normalize='all')
B
pd.crosstab(df['AgeGroup'], df['Satisfaction'], normalize='columns')
C
pd.crosstab(df['AgeGroup'], df['Satisfaction'], normalize=True)
D
pd.crosstab(df['AgeGroup'], df['Satisfaction'], normalize='index')
💡 Hint
Normalization by 'index' means row-wise proportions.
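The three string values of `normalize` can be compared side by side. This is a rough sketch on invented data (the `Plan` and `Renewed` columns are hypothetical). It checks which sums equal 1 for each mode rather than relying on printed output.

```python
import pandas as pd

# Hypothetical data, not the survey from the challenge.
df = pd.DataFrame({
    "Plan": ["Free", "Free", "Free", "Pro", "Pro"],
    "Renewed": ["Yes", "No", "No", "Yes", "Yes"],
})

by_row = pd.crosstab(df["Plan"], df["Renewed"], normalize="index")
by_col = pd.crosstab(df["Plan"], df["Renewed"], normalize="columns")
overall = pd.crosstab(df["Plan"], df["Renewed"], normalize="all")

# Which totals equal 1 depends on the mode.
print(by_row.sum(axis=1))    # sums across each row
print(by_col.sum(axis=0))    # sums down each column
print(overall.values.sum())  # sum over the whole table
```

Passing `normalize=True` behaves like `normalize='all'`, so the proportions are taken over the grand total rather than within rows or columns.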