0
0
Pandasdata~10 mins

Long to wide format conversion in Pandas - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - Long to wide format conversion
Start with long format DataFrame
Choose index columns
Choose columns to spread
Choose values to fill
Apply pivot or pivot_table
Get wide format DataFrame
End
Convert a long table with repeated identifiers into a wide table by spreading values across new columns.
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({
    'Name': ['Anna', 'Anna', 'Bob', 'Bob'],
    'Year': [2020, 2021, 2020, 2021],
    'Score': [85, 88, 90, 92]
})

wide_df = df.pivot(index='Name', columns='Year', values='Score')
This code converts a long DataFrame of scores by year into a wide format with years as columns.
Execution Table
StepActionDataFrame StateResult
1Create long DataFrameName, Year, Score columns with 4 rowsDataFrame with repeated Names and Years
2Select index='Name', columns='Year', values='Score'Prepare to pivotReady to reshape
3Apply pivot()Reshape dataWide DataFrame with Names as index, Years as columns, Scores as values
4Resulting DataFrameIndex: Anna, Bob; Columns: 2020, 2021Anna: 85, 88; Bob: 90, 92
5EndWide format achievedConversion complete
💡 Pivot completed, long format converted to wide format with unique index-column pairs
Variable Tracker
VariableStartAfter pivotFinal
dfLong format DataFrame with 4 rowsUnchangedUnchanged
wide_dfNot definedWide DataFrame with 2 rows and 2 columnsWide DataFrame ready for use
Key Moments - 2 Insights
Why do we need to specify index, columns, and values in pivot?
Because pivot needs to know which column to use as row labels (index), which to spread across columns, and which contains the values to fill. See execution_table step 2 and 3.
What happens if there are duplicate index-column pairs in the long data?
pivot() will raise an error because it expects unique pairs. You would need pivot_table() with an aggregation function instead. This is not shown here but is important to remember.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table step 4, what value is at row 'Bob' and column 2021?
A90
B92
C88
D85
💡 Hint
Check the 'Result' column in step 4 of execution_table for Bob's 2021 score.
At which step does the DataFrame change from long to wide format?
AStep 1
BStep 2
CStep 3
DStep 5
💡 Hint
Look at the 'Action' and 'Result' columns in execution_table to find when pivot() is applied.
If the 'Year' column had duplicate entries for the same 'Name', what would happen when pivot() is called?
AIt would raise an error due to duplicate index-column pairs
BIt would ignore duplicates automatically
CIt would work fine, creating multiple columns
DIt would fill duplicates with NaN
💡 Hint
Refer to key_moments about pivot() behavior with duplicates.
Concept Snapshot
Long to wide format conversion:
Use pandas pivot() to reshape data.
Specify index (rows), columns (new columns), and values (cell values).
Data must have unique index-column pairs.
Result is a wide DataFrame with spread columns.
Use pivot_table() if duplicates exist.
Full Transcript
This visual execution shows how to convert a long format DataFrame into a wide format using pandas pivot(). We start with a DataFrame where each row has a Name, Year, and Score. By choosing 'Name' as the index, 'Year' as the columns, and 'Score' as the values, pivot() reshapes the data so each Name has one row and each Year becomes a column with the corresponding Score. The execution table traces each step, showing the DataFrame state before and after pivoting. Key moments clarify why specifying index, columns, and values is necessary and what happens if duplicates exist. The quiz tests understanding of the resulting table and pivot behavior. This method is useful to reorganize data for easier analysis or visualization.