
pivot_table() for summarization in Pandas - Step-by-Step Execution


Concept Flow - pivot_table() for summarization

Start with DataFrame

↓

Choose index (rows)

↓

Choose columns (optional)

↓

Choose values to summarize

↓

Choose aggregation function

↓

pivot_table() creates summary table

↓

Use or display summarized data

pivot_table() takes a DataFrame, groups its rows by the values in the index and columns keys, and applies an aggregation function such as sum or mean to each group.

Execution Sample


import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA'],
        'Year': [2020, 2020, 2021, 2021],
        'Sales': [100, 200, 150, 250]}

df = pd.DataFrame(data)

summary = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

This code summarizes sales by city and year, showing total sales for each city-year pair.
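To make the grouping explicit, here is a minimal sketch that rebuilds the same table with groupby() and unstack(). The name by_hand is introduced only for this illustration; the data and the pivot_table() call are the ones from the sample above.

```python
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA'],
        'Year': [2020, 2020, 2021, 2021],
        'Sales': [100, 200, 150, 250]}
df = pd.DataFrame(data)

# The pivot table from the execution sample.
summary = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# The same summary built by hand (by_hand is an illustrative name):
# group on both keys, sum Sales, then move Year from the row index into the columns.
by_hand = df.groupby(['City', 'Year'])['Sales'].sum().unstack('Year')

print(summary)
print(by_hand)
```

Both tables should hold the same four totals, which suggests reading pivot_table() as a group-then-reshape shortcut.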

Execution Table

Step	Action	DataFrame State	Result
1	Create DataFrame	{'City': ['NY', 'LA', 'NY', 'LA'], 'Year': [2020, 2020, 2021, 2021], 'Sales': [100, 200, 150, 250]}	DataFrame with 4 rows and 3 columns
2	Call pivot_table with index='City', columns='Year', values='Sales', aggfunc='sum'	Same DataFrame	Grouped sales by City and Year, summed values
3	Group rows by City: NY and LA	Groups: NY (rows 0,2), LA (rows 1,3)	Two groups formed
4	Split by Year to form columns: 2020 and 2021	Distinct Year values: 2020, 2021	Two columns, one per year
5	Sum Sales for NY in 2020	Rows with City=NY and Year=2020	100
6	Sum Sales for NY in 2021	Rows with City=NY and Year=2021	150
7	Sum Sales for LA in 2020	Rows with City=LA and Year=2020	200
8	Sum Sales for LA in 2021	Rows with City=LA and Year=2021	250
9	Create pivot table with sums	Aggregated sums	pivot_table with City as index, Year as columns, Sales sums as values
10	Display pivot table	Final summarized table	Rows by City, columns by Year: LA = 200 (2020), 250 (2021); NY = 100 (2020), 150 (2021)
11	End	No further action	Execution complete

💡 All groups processed and summarized, pivot_table created successfully
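The steps in the table can be traced by hand. The sketch below is only an illustration of the idea, not how pandas implements pivot_table() internally; the names cells and matches are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA'],
                   'Year': [2020, 2020, 2021, 2021],
                   'Sales': [100, 200, 150, 250]})

# Steps 3-4: the distinct City and Year values become row and column labels.
cities = sorted(df['City'].unique())
years = sorted(int(y) for y in df['Year'].unique())

# Steps 5-8: sum Sales over the rows matching each (city, year) pair.
cells = {}
for city in cities:
    for year in years:
        mask = (df['City'] == city) & (df['Year'] == year)
        cells[(city, year)] = int(df.loc[mask, 'Sales'].sum())

# Step 9: compare the hand-computed cells with what pivot_table() returns.
summary = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')
matches = all(summary.loc[city, year] == total for (city, year), total in cells.items())

print(cells)
print(matches)
```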

Variable Tracker

Variable	Start	After Step 2	After Step 9	Final
df	undefined	DataFrame with 4 rows and 3 columns	Same DataFrame	Same DataFrame
summary	undefined	undefined	pivot_table DataFrame with summarized sales	pivot_table DataFrame with summarized sales

Key Moments - 3 Insights

Why do we need to specify 'index' and 'columns' in pivot_table?

What happens if we don't specify an aggregation function?

Why is the result a new DataFrame and not modifying the original?
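The second and third questions can be checked with a short experiment. The sketch below uses hypothetical data with two NY rows for 2020, so that sum and mean give different results; the names before, default and total are illustrative.

```python
import pandas as pd

# Hypothetical data: NY appears twice in 2020, so sum and mean differ.
df = pd.DataFrame({'City': ['NY', 'NY', 'LA'],
                   'Year': [2020, 2020, 2020],
                   'Sales': [100, 300, 200]})
before = df.copy()

# No aggfunc given: pivot_table() falls back to its default, the mean.
default = df.pivot_table(index='City', columns='Year', values='Sales')

# Explicit sum for comparison.
total = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

print(default)            # NY in 2020 is the mean of 100 and 300
print(total)              # NY in 2020 is the sum of 100 and 300
print(df.equals(before))  # the original DataFrame is left untouched
```

Without aggfunc, duplicate index/column pairs are averaged, and in both calls the result is a new DataFrame while df keeps its original rows.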

Visual Quiz - 3 Questions

Test your understanding

Look at the Execution Table at step 5. What is the sum of Sales for NY in 2020?

A. 100

B. 150

C. 200

D. 250

Concept Snapshot

pivot_table() summarizes data by grouping rows (index) and columns,
then applies an aggregation function (aggfunc) like sum or mean.
Syntax: df.pivot_table(index='row_col', columns='col_col', values='val_col', aggfunc='sum')
Returns a new DataFrame with summarized values.
Useful for quick data summaries and cross-tabulations.
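For cross-tabulation style summaries, two further pivot_table() parameters are often handy: fill_value, which replaces missing combinations, and margins, which adds 'All' totals. They are not covered in the walkthrough above, and the data below is hypothetical (LA has no 2021 row), so treat this as a sketch.

```python
import pandas as pd

# Hypothetical data with a gap: there is no LA row for 2021.
df = pd.DataFrame({'City': ['NY', 'LA', 'NY'],
                   'Year': [2020, 2020, 2021],
                   'Sales': [100, 200, 150]})

summary = df.pivot_table(index='City', columns='Year', values='Sales',
                         aggfunc='sum',
                         fill_value=0,   # show 0 instead of NaN for the missing LA/2021 cell
                         margins=True)   # add an 'All' row and column with totals

print(summary)
```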

Full Transcript

We start with a DataFrame containing sales data by city and year. Using pivot_table(), we choose 'City' as rows (index), 'Year' as columns, and 'Sales' as the values to summarize. The aggregation function 'sum' adds up sales for each city-year pair. Step by step, the data is grouped by city and year, the sales in each group are summed, and a new summary table is created. The original DataFrame remains unchanged. This method makes it quick to see total sales per city and year in a clear table format.