
Cross-tabulation advanced usage in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of multi-index cross-tabulation with margins
What is the output of this code snippet using pandas crosstab with multi-index and margins?
Pandas
import pandas as pd

data = {'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
        'AgeGroup': ['Adult', 'Adult', 'Child', 'Child', 'Adult'],
        'Preference': ['B', 'A', 'A', 'B', 'A']}
df = pd.DataFrame(data)

result = pd.crosstab(index=[df['Gender'], df['AgeGroup']], columns=df['Preference'], margins=True)
print(result)
A
{'A': {'Female': {'Adult': 2, 'Child': 1}, 'Male': {'Adult': 0, 'Child': 0}, 'All': 3}, 'B': {'Female': {'Adult': 0, 'Child': 0}, 'Male': {'Adult': 1, 'Child': 1}, 'All': 2}, 'All': {'Female': {'Adult': 2, 'Child': 1}, 'Male': {'Adult': 1, 'Child': 1}, 'All': 5}}
B
Preference       A  B  All
Gender AgeGroup           
Female Adult     2  0    2
       Child     1  0    1
Male   Adult     1  0    1
       Child     0  1    1
All              4  1    5
C
Preference       A  B  All
Gender AgeGroup           
Female Adult     2  0    2
       Child     1  0    1
Male   Adult     0  1    1
       Child     0  1    1
All              3  2    5
D
Preference       A  B  All
Gender AgeGroup           
Female Male      0  0    0
       Adult     2  0    2
       Child     1  0    1
Male   Adult     0  1    1
       Child     0  1    1
All              3  2    5
💡 Hint
Look carefully at the counts for each Gender and AgeGroup combination and the Preference columns.
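One way to check a prediction here is to count the combinations directly and confirm that the margins agree with them. The sketch below rebuilds the DataFrame from the question; the names `counts`, `body`, and `row_totals` are illustrative and not part of the original challenge.

```python
import pandas as pd

# Same data as in the question above.
df = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
    'AgeGroup': ['Adult', 'Adult', 'Child', 'Child', 'Adult'],
    'Preference': ['B', 'A', 'A', 'B', 'A'],
})

# Tally every (Gender, AgeGroup, Preference) combination that occurs.
counts = df.groupby(['Gender', 'AgeGroup', 'Preference']).size()
print(counts)

result = pd.crosstab(index=[df['Gender'], df['AgeGroup']],
                     columns=df['Preference'], margins=True)

# Strip the margins to get the body of the table.
body = result.drop(columns='All').iloc[:-1]
row_totals = result['All'].iloc[:-1]

# Each 'All' entry should equal the sum across its row,
# and the bottom-right cell should equal the number of records.
print((body.sum(axis=1) == row_totals).all())
print(result.iloc[-1]['All'] == len(df))
```

Combinations that never occur in the data do not appear in `counts`, but they still show up as zeros in the crosstab body.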
data_output
intermediate
Number of unique values in crosstab result
After running this code, what is the value of unique_values, that is, the number of distinct values in each column of the crosstab, summed over all columns?
Pandas
import pandas as pd

records = {'City': ['NY', 'LA', 'NY', 'LA', 'NY', 'LA'],
           'Product': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
           'Sales': [10, 20, 10, 30, 20, 30]}
df = pd.DataFrame(records)

ct = pd.crosstab(df['City'], df['Product'], values=df['Sales'], aggfunc='sum', dropna=False)
unique_values = ct.nunique().sum()
A
4
B
5
C
3
D
6
💡 Hint
Check the sums of sales for each City and Product combination.
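To follow the hint, it can help to compute the per-cell sums a second way and then look at how `nunique()` counts them. This is a minimal sketch using the question's data; `by_group`, `per_column`, and `total` are illustrative names, and `dropna=False` is left out because every City/Product cell is populated here.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['NY', 'LA', 'NY', 'LA', 'NY', 'LA'],
    'Product': ['X', 'X', 'Y', 'Y', 'X', 'Y'],
    'Sales': [10, 20, 10, 30, 20, 30],
})

# Crosstab that sums Sales within each (City, Product) cell.
ct = pd.crosstab(df['City'], df['Product'], values=df['Sales'], aggfunc='sum')

# The same table built with groupby + unstack, as a cross-check.
by_group = df.groupby(['City', 'Product'])['Sales'].sum().unstack()
print(ct)
print(by_group)

# nunique() works column by column: it returns one distinct-value
# count per Product column, and .sum() adds those counts together.
per_column = ct.nunique()
total = per_column.sum()
print(per_column)
print(total)
```

Because `nunique()` counts per column, a value that appears in two different columns is counted once in each. The summed result can therefore differ from the number of distinct values in the table as a whole.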
visualization
advanced
Visualizing crosstab with normalization
Which option plots the row-normalized crosstab of 'Department' vs 'Satisfaction' as an annotated heatmap, with departments on the rows, the 'coolwarm' colormap, and the color bar shown?
Pandas
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = {'Department': ['HR', 'HR', 'IT', 'IT', 'Sales', 'Sales', 'Sales'],
        'Satisfaction': ['High', 'Low', 'High', 'Low', 'High', 'Low', 'Low']}
df = pd.DataFrame(data)

ct = pd.crosstab(df['Department'], df['Satisfaction'], normalize='index')
A
sns.heatmap(ct, annot=True, cmap='coolwarm')
plt.show()
B
sns.heatmap(ct.T, annot=True, cmap='viridis')
plt.show()
C
sns.heatmap(ct, annot=False, cmap='coolwarm')
plt.show()
D
sns.heatmap(ct, annot=True, cmap='Blues', cbar=False)
plt.show()
💡 Hint
Normalization is by index, so rows sum to 1. Annotated heatmap helps see values.
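The claim that rows sum to 1 can be checked without plotting anything. The sketch below compares the three string values accepted by `normalize` on the question's data; the variable names `by_row`, `by_col`, and `overall` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Department': ['HR', 'HR', 'IT', 'IT', 'Sales', 'Sales', 'Sales'],
    'Satisfaction': ['High', 'Low', 'High', 'Low', 'High', 'Low', 'Low'],
})

# normalize='index': each row is divided by its row total.
by_row = pd.crosstab(df['Department'], df['Satisfaction'], normalize='index')

# normalize='columns': each column is divided by its column total.
by_col = pd.crosstab(df['Department'], df['Satisfaction'], normalize='columns')

# normalize='all': every cell is divided by the grand total.
overall = pd.crosstab(df['Department'], df['Satisfaction'], normalize='all')

print(by_row.sum(axis=1))        # one total per department
print(by_col.sum(axis=0))        # one total per satisfaction level
print(overall.to_numpy().sum())  # single grand total
```

Passing `by_row` to a heatmap puts departments on the vertical axis. Transposing it first swaps the axes, and the plotted fractions then sum to 1 down each column rather than across each row.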
🔧 Debug
advanced
Identify the error in crosstab with missing values
What happens when this code runs crosstab on data that contains missing values: does it raise an error, and if so, which one?
Pandas
import pandas as pd

data = {'Team': ['A', 'B', 'A', None, 'B'],
        'Result': ['Win', 'Lose', None, 'Win', 'Lose']}
df = pd.DataFrame(data)

ct = pd.crosstab(df['Team'], df['Result'])
print(ct)
A
KeyError
B
No error, prints crosstab with NaN rows/columns dropped
C
TypeError
D
ValueError
💡 Hint
Check how pandas crosstab handles missing values by default.
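A quick way to see the default handling is to compare how many records the table accounts for with and without an explicit placeholder for missing entries. This is a sketch, not the challenge's own solution; the `'Missing'` label and the names `ct_default`, `filled`, and `ct_filled` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Team': ['A', 'B', 'A', None, 'B'],
    'Result': ['Win', 'Lose', None, 'Win', 'Lose'],
})

# Default call: records with a missing Team or Result are not counted.
ct_default = pd.crosstab(df['Team'], df['Result'])
print(ct_default)
print(ct_default.to_numpy().sum(), "of", len(df), "records counted")

# Replacing missing entries with an explicit label keeps every record.
filled = df.fillna('Missing')
ct_filled = pd.crosstab(filled['Team'], filled['Result'])
print(ct_filled)
print(ct_filled.to_numpy().sum(), "of", len(df), "records counted")
```

Filling with a label is only one option. It makes the missing records visible as their own row and column instead of silently leaving them out of the counts.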
🚀 Application
expert
Calculate weighted crosstab with custom aggregation
Given this DataFrame, which option correctly computes a weighted crosstab of 'Category' vs 'Type', where each cell holds the sum of 'Weight' for that combination?
Pandas
import pandas as pd

data = {'Category': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
        'Type': ['A', 'B', 'A', 'B', 'A', 'B'],
        'Weight': [1.5, 2.0, 3.0, 1.0, 2.5, 0.5]}
df = pd.DataFrame(data)
A
pd.crosstab(index=df['Category'], columns=df['Type'], values=df['Weight'], aggfunc='mean')
B
pd.crosstab(df['Category'], df['Type'], aggfunc='sum')
C
pd.crosstab(df['Category'], df['Type'], values='Weight', aggfunc='sum')
D
pd.crosstab(df['Category'], df['Type'], values=df['Weight'], aggfunc='sum')
💡 Hint
Check the correct parameter names for values and aggregation function in pandas crosstab.
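The sketch below illustrates how `values` and `aggfunc` work together in `pd.crosstab`: `values` takes the array of numbers to aggregate, and `aggfunc` names the aggregation applied within each cell. It cross-checks the result against `DataFrame.pivot_table` and shows that pandas rejects `aggfunc` when `values` is missing. The names `weighted`, `pivot`, and `error_message` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'Type': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Weight': [1.5, 2.0, 3.0, 1.0, 2.5, 0.5],
})

# values is the Series of weights; aggfunc says how to combine them per cell.
weighted = pd.crosstab(df['Category'], df['Type'],
                       values=df['Weight'], aggfunc='sum')
print(weighted)

# pivot_table takes column names instead of Series and gives the same table.
pivot = df.pivot_table(index='Category', columns='Type',
                       values='Weight', aggfunc='sum')
print(pivot)

# aggfunc without values is rejected with a ValueError.
error_message = None
try:
    pd.crosstab(df['Category'], df['Type'], aggfunc='sum')
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

In this data each Category/Type pair occurs exactly once, so `'sum'` and `'mean'` would give the same numbers. The two only differ once a pair has more than one record.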