Pivot with aggregation functions in Pandas - Step-by-Step Execution

Concept Flow - Pivot with aggregation functions

Start with DataFrame

↓

Choose index, columns, values

↓

Apply aggregation function

↓

Create pivot table

↓

View summarized data

We start with a DataFrame, choose which of its columns supply the row index, the column headers, and the values, then apply an aggregation function to summarize the data in a pivot table.

Execution Sample

import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2021],
        'Sales': [100, 200, 150, 250, 300]}

df = pd.DataFrame(data)

pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

This code creates a pivot table that sums sales for each city by year.
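
To check the result, it can help to print the table and look up a single cell. The sketch below repeats the sample data so that it runs on its own; the name ny_2021 is introduced here for illustration only.

```python
import pandas as pd

data = {'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
        'Year': [2020, 2020, 2021, 2021, 2021],
        'Sales': [100, 200, 150, 250, 300]}
df = pd.DataFrame(data)

pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# Show the whole summary table: one row per City, one column per Year.
print(pivot)

# Look up one cell by row label (City) and column label (Year).
ny_2021 = pivot.loc['NY', 2021]
print(ny_2021)  # 450, because NY has two 2021 rows: 150 + 300
```

Rows are sorted by City and columns by Year, so LA appears before NY even though NY comes first in the data.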

Execution Table

Step	Action	DataFrame State	Pivot Table State
1	Create DataFrame from data dictionary	Columns: City, Year, Sales. Rows: 0: NY, 2020, 100; 1: LA, 2020, 200; 2: NY, 2021, 150; 3: LA, 2021, 250; 4: NY, 2021, 300	N/A
2	Call pivot_table with index='City', columns='Year', values='Sales', aggfunc='sum'	Same as step 1	Rows: City. Columns: Year (2020, 2021). LA: 200, 250; NY: 100, 450
3	Pivot table created with sums of Sales by City and Year	Same as step 1	Same as step 2

💡 The pivot table is created once the aggregation has summed Sales for each City and Year group.

Variable Tracker

Variable	Start	After Step 1	After Step 2	Final
df	Not defined	DataFrame with 5 rows and 3 columns	Same	Same
pivot	Not defined	Not defined	Pivot table with summed sales	Same

Key Moments - 2 Insights

Why do we use aggfunc='sum' in pivot_table?
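
One way to see why: NY has two rows for 2021 (150 and 300), so pivot_table has to combine them into a single cell. If aggfunc is omitted, pandas uses its default, 'mean', which averages the two values instead of adding them. A minimal sketch comparing both, with by_mean and by_sum as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
                   'Year': [2020, 2020, 2021, 2021, 2021],
                   'Sales': [100, 200, 150, 250, 300]})

# No aggfunc given: pivot_table falls back to its default, 'mean'.
by_mean = df.pivot_table(index='City', columns='Year', values='Sales')

# Explicit aggfunc='sum': duplicate City/Year rows are added together.
by_sum = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

mean_ny_2021 = by_mean.loc['NY', 2021]
sum_ny_2021 = by_sum.loc['NY', 2021]
print(mean_ny_2021)  # 225.0, the average of 150 and 300
print(sum_ny_2021)   # 450, the total of 150 and 300
```

Cells backed by a single row, such as LA in 2020, hold the same number either way; only the cells with duplicates differ.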

What happens if we don't specify values in pivot_table?
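
As a rough illustration: when values is omitted, pandas aggregates every remaining column that is not used as index or columns, and the result keeps each value column's name as an extra top level of column labels. The Units column below is hypothetical, added only so that there is more than one column left to aggregate.

```python
import pandas as pd

df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
                   'Year': [2020, 2020, 2021, 2021, 2021],
                   'Sales': [100, 200, 150, 250, 300],
                   'Units': [1, 2, 3, 4, 5]})  # hypothetical extra column

# No values argument: both Sales and Units are aggregated.
no_values = df.pivot_table(index='City', columns='Year', aggfunc='sum')
print(no_values)

# The columns now have two levels: the value column name, then Year.
value_columns = sorted(set(no_values.columns.get_level_values(0)))
print(value_columns)

# Cells are addressed with a (value column, Year) pair.
print(no_values.loc['NY', ('Sales', 2021)])
```

Passing values='Sales' explicitly keeps the table to the one measure of interest and gives a single level of column labels.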

Visual Quiz - 3 Questions

Test your understanding

Look at step 2 of the Execution Table. What is the sum of sales for NY in 2021?

A. 450

B. 300

C. 150

D. 100

Concept Snapshot

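df.pivot_table(index, columns, values, aggfunc)
- index: column(s) whose values become the row labels
- columns: column(s) whose unique values become the column headers
- values: column(s) holding the data to aggregate
- aggfunc: function used to summarize each group (sum, mean, etc.); defaults to mean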
Creates a summary table by grouping and aggregating data.
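
The aggfunc argument also accepts a list of functions. A small sketch, assuming the same sample data as the Execution Sample; the name multi is illustrative. Each function name becomes a top level of column labels above Year.

```python
import pandas as pd

df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
                   'Year': [2020, 2020, 2021, 2021, 2021],
                   'Sales': [100, 200, 150, 250, 300]})

# Ask for two summaries at once: total and average Sales per City and Year.
multi = df.pivot_table(index='City', columns='Year', values='Sales',
                       aggfunc=['sum', 'mean'])
print(multi)

# Cells are addressed with a (function name, Year) pair.
print(multi.loc['NY', ('sum', 2021)])   # 450
print(multi.loc['NY', ('mean', 2021)])  # 225.0
```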

Full Transcript

We start with a DataFrame containing sales data by city and year. Using pandas pivot_table, we select 'City' as index, 'Year' as columns, and 'Sales' as values. We apply the sum aggregation function to add sales for each city-year pair. The pivot table shows total sales per city for each year. This process groups data and summarizes it in a clear table format.
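
The grouping and summarizing described above can also be written out by hand. The sketch below is not the approach used in the Execution Sample: it swaps in groupby and unstack to show that the same numbers come from grouping on City and Year, summing Sales, and then moving Year into the columns. The name grouped is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA', 'NY'],
                   'Year': [2020, 2020, 2021, 2021, 2021],
                   'Sales': [100, 200, 150, 250, 300]})

pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# Step 1: group rows by both keys and sum Sales within each group.
# Step 2: move the Year level from the row index to the columns.
grouped = df.groupby(['City', 'Year'])['Sales'].sum().unstack('Year')

print(pivot)
print(grouped)
```

Both tables hold the same totals per city and year; pivot_table simply packages the two steps into one call.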