Complete the code to create a simple cross-tabulation of 'Gender' and 'Preference'.
import pandas as pd
data = {'Gender': ['Male', 'Female', 'Female', 'Male'], 'Preference': ['A', 'B', 'A', 'B']}
df = pd.DataFrame(data)
ct = pd.crosstab(df['Gender'], df[[1]])
print(ct)
The second argument to pd.crosstab should be the column to cross-tabulate against 'Gender'. Here, it is 'Preference'.
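For reference, a minimal sketch of the completed call, using the same four-row sample data as the exercise. The first argument supplies the row labels and the second supplies the column labels.

```python
import pandas as pd

data = {
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Preference': ['A', 'B', 'A', 'B'],
}
df = pd.DataFrame(data)

# Rows are the unique values of 'Gender', columns the unique values of 'Preference'.
# Each cell counts how many records share that (Gender, Preference) pair.
ct = pd.crosstab(df['Gender'], df['Preference'])
print(ct)
```

With this data every Gender/Preference pair occurs exactly once, so each cell of the table is 1.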
Complete the code to add margins (totals) to the cross-tabulation.
ct = pd.crosstab(df['Gender'], df['Preference'], [1]=True)
print(ct)
The margins=True argument adds row and column totals to the cross-tabulation.
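A short sketch of what the margins look like on the same sample data. By default pandas labels the extra row and column 'All'. The variable name ct_margins is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Preference': ['A', 'B', 'A', 'B'],
})

# margins=True appends an 'All' row and an 'All' column holding the totals.
ct_margins = pd.crosstab(df['Gender'], df['Preference'], margins=True)
print(ct_margins)

# The bottom-right cell is the grand total (the number of records).
print(ct_margins.loc['All', 'All'])
```

The label can be changed with the margins_name parameter if 'All' clashes with a real category.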
Fix the error in the code to normalize the cross-tabulation by columns.
ct = pd.crosstab(df['Gender'], df['Preference'], normalize=[1])
print(ct)
To normalize by columns, use normalize='columns', which makes each column sum to 1. Passing normalize=True (equivalent to normalize='all') instead divides every cell by the grand total.
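The difference between the normalize options can be checked directly. The sketch below builds one table per option on the sample data; the names by_col, by_row and by_all are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Preference': ['A', 'B', 'A', 'B'],
})

# Each column sums to 1: share of each gender within a preference.
by_col = pd.crosstab(df['Gender'], df['Preference'], normalize='columns')

# Each row sums to 1: share of each preference within a gender.
by_row = pd.crosstab(df['Gender'], df['Preference'], normalize='index')

# The whole table sums to 1: share of each cell in the grand total.
by_all = pd.crosstab(df['Gender'], df['Preference'], normalize=True)

print(by_col.sum(axis=0))   # column sums
print(by_row.sum(axis=1))   # row sums
print(by_all.values.sum())  # grand total
```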
Fill both blanks to create a cross-tabulation with aggregation of mean 'Score' by 'Gender' and 'Preference'.
data = {'Gender': ['Male', 'Female', 'Female', 'Male'], 'Preference': ['A', 'B', 'A', 'B'], 'Score': [10, 20, 15, 25]}
df = pd.DataFrame(data)
ct = pd.crosstab(df['Gender'], df['Preference'], values=df[[1]], aggfunc=[2])
print(ct)
The values argument specifies the column to aggregate ('Score'), and aggfunc specifies the aggregation function ('mean').
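A minimal sketch of the aggregated table on the sample data. Note that values and aggfunc must be passed together; supplying only one of them raises an error in pandas.

```python
import pandas as pd

df = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Preference': ['A', 'B', 'A', 'B'],
    'Score': [10, 20, 15, 25],
})

# Instead of counting records, each cell holds the mean 'Score'
# of the records in that (Gender, Preference) group.
ct_mean = pd.crosstab(
    df['Gender'],
    df['Preference'],
    values=df['Score'],
    aggfunc='mean',
)
print(ct_mean)
```

Because each group here contains a single record, the mean in each cell equals that record's score (for example, 15 for Female/A).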
Fill all three blanks to create a normalized cross-tabulation with margins, aggregating the sum of 'Score' by 'Gender' and 'Preference'.
ct = pd.crosstab(df[[1]], df[[2]], values=df['Score'], aggfunc=[3], normalize='all', margins=True)
print(ct)
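The first two blanks are the row and column variables ('Gender' and 'Preference'). The aggregation function is 'sum' to total the scores. With normalize='all', each summed cell is expressed as a fraction of the grand total, and margins=True adds the row and column totals on the same scale.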
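Putting the pieces together, the following sketch runs the completed call on the sample data. The scores total 70, so each cell is that group's score sum divided by 70. The name ct_full is illustrative, and the exact margin formatting may vary slightly between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Preference': ['A', 'B', 'A', 'B'],
    'Score': [10, 20, 15, 25],
})

# Sum 'Score' per (Gender, Preference) group, then divide by the
# grand total so the inner cells add up to 1; margins=True adds the
# 'All' row and column with the normalized totals.
ct_full = pd.crosstab(
    df['Gender'],
    df['Preference'],
    values=df['Score'],
    aggfunc='sum',
    normalize='all',
    margins=True,
)
print(ct_full)

# Female/A contributes 15 of the 70 total points.
print(ct_full.loc['Female', 'A'])
```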