Pivot tables with pivot_table() in Data Analysis Python - Step-by-Step Execution

Concept Flow - Pivot tables with pivot_table()

Start with DataFrame

↓

Choose index (rows)

↓

Choose columns (optional)

↓

Choose values to aggregate

↓

Choose aggregation function

↓

pivot_table() creates summary table

↓

Use or display pivot table

Pivot tables summarize data by grouping rows and columns and applying aggregation functions.

Execution Sample

import pandas as pd

# Sample data: one sales figure per city and year
data = {'City': ['NY', 'LA', 'NY', 'LA'],
        'Year': [2020, 2020, 2021, 2021],
        'Sales': [100, 200, 150, 250]}

df = pd.DataFrame(data)
# Rows are cities, columns are years, cells hold the summed sales
pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')
print(pivot)

This code creates a pivot table showing total sales by city and year.
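
Once the pivot table exists, it can be used like any other DataFrame. The following is a minimal sketch, reusing the sample data above, of a few common ways to read it; the variable names ny_2020, sales_2021, and flat are made up for illustration.

```python
import pandas as pd

# Same sample data as in the Execution Sample.
df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA'],
                   'Year': [2020, 2020, 2021, 2021],
                   'Sales': [100, 200, 150, 250]})
pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# Look up one cell: row label first, then column label.
ny_2020 = pivot.loc['NY', 2020]

# Take one whole column: the sales of every city in 2021.
sales_2021 = pivot[2021]

# Turn the City index back into an ordinary column for a flat table.
flat = pivot.reset_index()

print(ny_2020)
print(sales_2021)
print(flat)
```

Note that the year labels are integers here, so the lookup uses 2020 rather than '2020'.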

Execution Table

Step	Action	DataFrame State	Pivot Table State	Explanation
1	Create DataFrame df	{'City': ['NY', 'LA', 'NY', 'LA'], 'Year': [2020, 2020, 2021, 2021], 'Sales': [100, 200, 150, 250]}	None	Initial data with city, year, and sales columns.
2	Call pivot_table()	df unchanged	Empty	Start creating pivot table with index='City', columns='Year', values='Sales', aggfunc='sum'.
3	Group data by City and Year	df unchanged	Groups: NY-2020, LA-2020, NY-2021, LA-2021	Data grouped by city and year combinations.
4	Aggregate Sales with sum	df unchanged	{NY: {2020: 100, 2021: 150}, LA: {2020: 200, 2021: 250}}	Sum sales for each city-year group.
5	Build pivot table DataFrame	df unchanged	Rows = City, columns = Year: LA → 200 (2020), 250 (2021); NY → 100 (2020), 150 (2021)	Pivot table shows sales by city (rows) and year (columns).
6	Print pivot table	df unchanged	Same as step 5	Output the pivot table to console.
7	End	df unchanged	Pivot table ready	Pivot table creation complete.

💡 All data grouped and aggregated; pivot table created successfully.
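
Steps 3 to 5 of the table can be reproduced by hand, which may help make the grouping visible. The sketch below uses groupby() and unstack() as a stand-in for those steps; it is an illustration of the idea and does not claim to show how pivot_table() works internally. The names grouped and manual are made up for this example.

```python
import pandas as pd

# Same sample data as in the Execution Sample.
df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA'],
                   'Year': [2020, 2020, 2021, 2021],
                   'Sales': [100, 200, 150, 250]})

# Steps 3 and 4: group by both keys and sum Sales within each group.
grouped = df.groupby(['City', 'Year'])['Sales'].sum()

# Step 5: move the Year level from the row index to the columns.
manual = grouped.unstack('Year')

# The hand-built table holds the same values as the pivot_table() result.
pivot = df.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')
print(manual.to_dict() == pivot.to_dict())
```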

Variable Tracker

Variable	Start	After pivot_table call	Final
df	Empty	{'City': ['NY', 'LA', 'NY', 'LA'], 'Year': [2020, 2020, 2021, 2021], 'Sales': [100, 200, 150, 250]}	Same as after call
pivot	Not defined	Year 2020 2021 City LA 200 250 NY 100 150	Same as after call

Key Moments - 3 Insights

Why do we specify 'index' and 'columns' in pivot_table?

What happens if multiple rows have the same index and column values?

Why do we use aggfunc='sum'?
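
The second and third questions can be explored with a small experiment. The sketch below uses hypothetical data (not the sample above) in which NY has two rows for 2020, so pivot_table() has to combine them. With aggfunc='sum' the values are added; when aggfunc is left out, pandas uses its default, the mean.

```python
import pandas as pd

# Hypothetical data for illustration: NY appears twice for 2020.
df_dup = pd.DataFrame({'City': ['NY', 'NY', 'LA'],
                       'Year': [2020, 2020, 2020],
                       'Sales': [100, 50, 200]})

# aggfunc='sum' adds the duplicate rows together: NY 2020 becomes 150.
total = df_dup.pivot_table(index='City', columns='Year', values='Sales', aggfunc='sum')

# Without aggfunc, the default 'mean' is used: NY 2020 becomes 75.0.
average = df_dup.pivot_table(index='City', columns='Year', values='Sales')

print(total)
print(average)
```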

Visual Quiz - 3 Questions

Test your understanding

Look at the Execution Table at step 4. What is the sum of sales for city 'LA' in year 2021?

A. 150

B. 200

C. 250

D. 100

Concept Snapshot

pivot_table() creates a summary table from data.
Use index= for rows, columns= for columns.
values= selects data to aggregate.
aggfunc= defines how to combine data (sum, mean, etc.).
Result is a DataFrame showing grouped summaries.
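
As a quick check of the points above, the sketch below reuses the sample data but swaps the roles of index= and columns= and switches aggfunc= to 'mean'. The name by_year is made up for illustration.

```python
import pandas as pd

# Same sample data as in the Execution Sample.
df = pd.DataFrame({'City': ['NY', 'LA', 'NY', 'LA'],
                   'Year': [2020, 2020, 2021, 2021],
                   'Sales': [100, 200, 150, 250]})

# Years become the rows, cities become the columns,
# and each cell holds the mean of Sales for that pair.
by_year = df.pivot_table(index='Year', columns='City', values='Sales', aggfunc='mean')

print(type(by_year).__name__)  # the result is a regular DataFrame
print(by_year)
```

Because each city-year pair has only one row in this data, the mean equals the original sales figure; the aggregation only changes the numbers when several rows share the same pair.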

Full Transcript

We start with a DataFrame containing city, year, and sales data. Using pivot_table(), we choose 'City' as the index (rows), 'Year' as columns, and 'Sales' as values to summarize. The function groups data by city and year, then sums sales for each group. The result is a new table showing total sales per city for each year. This pivot table helps us quickly compare sales across cities and years. Key points include specifying index and columns to organize data, and using aggfunc to control aggregation. The process ends with a clear summary table ready for analysis.