Recall & Review
beginner
What is the purpose of the pivot_table() function in pandas?
The pivot_table() function summarizes and aggregates data in a DataFrame by creating a new table that groups data by one or more keys and applies aggregation functions such as sum, mean, or count.
beginner
Which parameters are commonly used in pivot_table()?
- data: The DataFrame to summarize (passed explicitly to pd.pivot_table(); implicit when calling df.pivot_table()).
- index: Column(s) to group by on rows.
- columns: Column(s) to group by on columns.
- values: Column(s) to aggregate.
- aggfunc: Aggregation function such as sum, mean, or count.
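As a minimal sketch of how these parameters fit together, the call below names each one explicitly. The store, day, and units columns and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical records, invented for this example.
df = pd.DataFrame({
    "store": ["N", "N", "S", "S"],
    "day": ["Mon", "Tue", "Mon", "Mon"],
    "units": [10, 5, 7, 3],
})

summary = pd.pivot_table(
    data=df,          # the DataFrame to summarize
    index="store",    # one row per store
    columns="day",    # one column per day
    values="units",   # the numbers being aggregated
    aggfunc="sum",    # how rows that land in the same cell are combined
)
print(summary)
```

Store S has two Monday rows, so their units are added together. It has no Tuesday rows, so that cell is left empty (NaN).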
intermediate
How does pivot_table() handle missing values by default?
By default, pivot_table() leaves NaN in any cell whose index/column combination has no data. Use the fill_value parameter to replace those missing values with a specific value such as 0.
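The effect of fill_value can be seen by building the same table twice. This is a small sketch; the city, year, and visits data are invented for illustration.

```python
import pandas as pd

# Hypothetical data: Rome has no row for 2024.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome"],
    "year": [2023, 2024, 2023],
    "visits": [4, 6, 9],
})

# Default behavior: the missing Rome/2024 combination becomes NaN.
default = df.pivot_table(index="city", columns="year", values="visits", aggfunc="sum")

# With fill_value=0 the same cell is reported as 0 instead.
filled = df.pivot_table(
    index="city", columns="year", values="visits", aggfunc="sum", fill_value=0
)

print(default)
print(filled)
```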
intermediate
What is the difference between pivot() and pivot_table()?
pivot() reshapes data without aggregation and requires unique index/column pairs. pivot_table() allows aggregation and can handle duplicate entries by applying aggregation functions.
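A short sketch of this difference, using made-up date, item, and qty columns in which the pair (d1, tea) appears twice:

```python
import pandas as pd

# Hypothetical data with a duplicate index/column pair: ("d1", "tea").
df = pd.DataFrame({
    "date": ["d1", "d1", "d2"],
    "item": ["tea", "tea", "tea"],
    "qty": [1, 2, 5],
})

# pivot() cannot decide which of the two duplicate values to keep,
# so it raises a ValueError.
try:
    df.pivot(index="date", columns="item", values="qty")
    pivot_error = None
except ValueError as exc:
    pivot_error = exc
print("pivot() failed:", pivot_error)

# pivot_table() combines the duplicates with the chosen aggregation.
table = df.pivot_table(index="date", columns="item", values="qty", aggfunc="sum")
print(table)
```

When every index/column pair is known to be unique, pivot() is the simpler choice. When duplicates are possible, pivot_table() with an explicit aggfunc avoids the error.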
beginner
Write a simple example of using pivot_table() to find the average sales per product category.
Example:
<pre>import pandas as pd

# Four sales records across two categories
data = {'Category': ['A', 'A', 'B', 'B'], 'Sales': [100, 150, 200, 250]}
df = pd.DataFrame(data)
# Average the Sales values within each Category
pivot = df.pivot_table(index='Category', values='Sales', aggfunc='mean')
print(pivot)</pre>
What does the aggfunc parameter in pivot_table() specify?
The aggfunc parameter tells pivot_table() how to combine data, for example with sum, mean, or count.
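aggfunc also accepts a list of functions, which produces one block of columns per function. The sketch below assumes a small made-up table of team scores.

```python
import pandas as pd

# Hypothetical scores, invented for this example.
df = pd.DataFrame({
    "team": ["a", "a", "b"],
    "score": [10, 20, 30],
})

# A list of aggregations yields a column MultiIndex:
# the function name on top, the aggregated column below it.
stats = df.pivot_table(index="team", values="score", aggfunc=["sum", "mean", "count"])
print(stats)

# Individual results are addressed with (function, column) pairs.
print(stats.loc["a", ("mean", "score")])
```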
Which parameter in pivot_table() controls the rows of the new table?
The index parameter sets which column(s) become the rows in the pivot table.
If your data has duplicate entries for the same index and column, which function should you use to summarize it?
pivot_table() can handle duplicates by aggregating them, unlike pivot().
What will pivot_table() fill missing values with by default?
By default, missing values are shown as NaN unless fill_value is set.
Which of these is NOT a valid aggregation function for aggfunc?
sort is not an aggregation function; it is used for ordering data.
Explain how you would use pivot_table() to summarize sales data by region and product.
Think about grouping rows by region and columns by product, then summarizing sales.
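One possible answer, sketched under the assumption that the data has region, product, and sales columns (the names and figures here are hypothetical):

```python
import pandas as pd

# Hypothetical sales records, invented for this example.
sales = pd.DataFrame({
    "region": ["North", "North", "North", "South"],
    "product": ["desk", "desk", "lamp", "lamp"],
    "sales": [200, 100, 50, 80],
})

# Rows are regions, columns are products, and each cell holds total sales.
# fill_value=0 reports region/product pairs with no sales as 0 rather than NaN.
by_region_product = sales.pivot_table(
    index="region",
    columns="product",
    values="sales",
    aggfunc="sum",
    fill_value=0,
)
print(by_region_product)
```

Summing is only one choice here; swapping aggfunc to mean or count answers a different question about the same grouping.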
Describe the difference between pivot() and pivot_table() and when to use each.
Consider whether your data has duplicates or needs aggregation.