What is the output of the following code?
import pandas as pd
data = {'Gender': ['Male', 'Female', 'Female', 'Male', 'Female'],
        'Preference': ['Coffee', 'Coffee', 'Tea', 'Coffee', 'Tea']}
df = pd.DataFrame(data)
result = pd.crosstab(df['Gender'], df['Preference'])
print(result)
Count how many times each gender prefers each drink.
The crosstab counts how often each combination of row value and column value occurs. Female respondents chose Tea twice and Coffee once. Male respondents chose Coffee twice and never chose Tea, so that cell is 0.
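To check the counts described above, here is a minimal sketch that rebuilds the same table and reads the individual cells. The variable name counts is chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Preference": ["Coffee", "Coffee", "Tea", "Coffee", "Tea"],
})

# Rows come from Gender, columns from Preference; each cell is a count.
counts = pd.crosstab(df["Gender"], df["Preference"])

print(counts.loc["Female", "Coffee"])  # 1
print(counts.loc["Female", "Tea"])     # 2
print(counts.loc["Male", "Coffee"])    # 2
print(counts.loc["Male", "Tea"])       # 0 (combination never occurs)
```

Note that the row and column labels are sorted, so Female appears before Male and Coffee before Tea.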
What does the margins=True parameter add to the output of pd.crosstab()?
Think about what 'margins' means in a table context.
margins=True adds an extra row and an extra column, both labeled 'All' by default, that hold the row totals, the column totals, and the grand total.
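As a small illustration of the totals, the sketch below reuses the Gender/Preference data from the first question and adds margins=True. The name with_totals is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Preference": ["Coffee", "Coffee", "Tea", "Coffee", "Tea"],
})

# margins=True appends an "All" row and an "All" column with totals.
with_totals = pd.crosstab(df["Gender"], df["Preference"], margins=True)

print(with_totals.loc["Female", "All"])  # 3 (row total for Female)
print(with_totals.loc["All", "Coffee"])  # 3 (column total for Coffee)
print(with_totals.loc["All", "All"])     # 5 (grand total, one per record)
```

The label can be changed with the margins_name parameter if 'All' clashes with a real category.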
What is the output of this code?
import pandas as pd
data = {'Team': ['A', 'A', 'B', 'B', 'A'],
        'Result': ['Win', 'Lose', 'Win', 'Lose', 'Win'],
        'Points': [3, 0, 3, 0, 3]}
df = pd.DataFrame(data)
result = pd.crosstab(df['Team'], df['Result'], values=df['Points'], aggfunc='sum', margins=True)
print(result)
Sum points for each team and result, then add totals.
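Team A has two wins (3 + 3 = 6 points) and one loss (0 points). Team B has one win (3 points) and one loss (0 points). The 'All' row and column sum these values: 0 points for Lose, 9 points for Win, and 9 points overall.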
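The arithmetic above can be confirmed cell by cell. This is a sketch that reruns the same call and reads the individual cells instead of printing the whole table.

```python
import pandas as pd

df = pd.DataFrame({
    "Team": ["A", "A", "B", "B", "A"],
    "Result": ["Win", "Lose", "Win", "Lose", "Win"],
    "Points": [3, 0, 3, 0, 3],
})

# values + aggfunc switch the cells from counts to aggregated Points.
result = pd.crosstab(df["Team"], df["Result"],
                     values=df["Points"], aggfunc="sum", margins=True)

print(result.loc["A", "Win"])    # 6 (3 + 3)
print(result.loc["B", "Win"])    # 3
print(result.loc["All", "Win"])  # 9 (6 + 3)
print(result.loc["All", "All"])  # 9 (all points in the data)
```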
What error will this code produce?
import pandas as pd
data = {'Category': ['X', 'Y', 'X'], 'Value': [10, 20, 30]}
df = pd.DataFrame(data)
result = pd.crosstab(df['Category'], df['Value'], aggfunc='sum')
print(result)
Check if 'values' parameter is provided when using 'aggfunc'.
aggfunc requires a 'values' argument so that pandas knows what to aggregate. Without it, pandas raises a ValueError with the message "aggfunc cannot be used without values."
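To see the failure for yourself, the sketch below catches whatever exception the call raises and reports its type and message, then shows one possible working call. The names caught and counts are assumptions made for this example, and dropping aggfunc is only one way to fix the call.

```python
import pandas as pd

df = pd.DataFrame({"Category": ["X", "Y", "X"], "Value": [10, 20, 30]})

caught = None
try:
    # aggfunc is given, but there is no values argument to aggregate.
    pd.crosstab(df["Category"], df["Value"], aggfunc="sum")
except Exception as exc:
    caught = exc
    print(type(exc).__name__, "-", exc)

# One possible fix: drop aggfunc and let crosstab count occurrences.
counts = pd.crosstab(df["Category"], df["Value"])
print(counts.loc["X", 10])  # 1
```

The other way round fails as well: passing values without aggfunc is also rejected, so the two parameters always go together.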
You have survey data with columns AgeGroup and FavoriteFruit. You want to find the percentage distribution of favorite fruits within each age group. Which code snippet produces this result?
Think about normalizing rows to get percentages within each age group.
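normalize='index' divides each count by its row total, so every row sums to 1 and shows the distribution of favorite fruits within that age group. The values are proportions; multiply by 100 to express them as percentages.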
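Here is a minimal sketch of row-wise normalization. The survey values below are made up purely for illustration; only the column names AgeGroup and FavoriteFruit come from the question.

```python
import pandas as pd

# Hypothetical survey data, invented for this example.
survey = pd.DataFrame({
    "AgeGroup": ["18-25", "18-25", "18-25", "18-25", "26-40", "26-40"],
    "FavoriteFruit": ["Apple", "Apple", "Apple", "Banana", "Apple", "Banana"],
})

# normalize='index' divides each count by its row (age group) total.
shares = pd.crosstab(survey["AgeGroup"], survey["FavoriteFruit"],
                     normalize="index")

# Proportions run from 0 to 1; scale by 100 to get percentages.
percent = shares * 100

print(percent.loc["18-25", "Apple"])   # 75.0
print(percent.loc["18-25", "Banana"])  # 25.0
print(percent.loc["26-40", "Apple"])   # 50.0
```

By contrast, normalize='columns' would divide by column totals, and normalize='all' by the grand total, which answers different questions than the one asked here.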