
Stack and unstack in Data Analysis Python - Step-by-Step Execution

Concept Flow - Stack and unstack
Start with DataFrame
Apply stack()
DataFrame becomes Series with MultiIndex
Apply unstack()
Series with MultiIndex becomes DataFrame
End
stack() moves the column labels into the innermost level of the row index, producing a Series with a MultiIndex. unstack() reverses this, moving the innermost index level back into columns.
Execution Sample
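import pandas as pd

# Small wide-format DataFrame: two rows (0, 1) and two columns (A, B)
df = pd.DataFrame({
    'A': [1, 2],
    'B': [3, 4]
})
# Move the column labels into the index: Series with a (row, column) MultiIndex
stacked = df.stack()
# Move the inner index level back into columns: a DataFrame again
unstacked = stacked.unstack()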
This code creates a simple DataFrame, stacks it to convert columns into rows, then unstacks it back to its original form.
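To see what each step produces, the sample can be extended with a few inspection calls. This is a minimal sketch using the same df, stacked, and unstacked names as above; the printed values assume the default integer row labels 0 and 1.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
stacked = df.stack()
unstacked = stacked.unstack()

# The stacked result is a Series keyed by (row label, column label) pairs.
print(type(stacked).__name__)   # Series
print(stacked.index.tolist())   # [(0, 'A'), (0, 'B'), (1, 'A'), (1, 'B')]
print(stacked.tolist())         # [1, 3, 2, 4]

# Unstacking rebuilds the two-column table.
print(unstacked.columns.tolist())  # ['A', 'B']
print(unstacked.values.tolist())   # [[1, 3], [2, 4]]
```

Depending on the installed pandas version, stack() may emit a FutureWarning about its newer implementation; the result for this small example is the same either way.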
Execution Table
Step | Action | Data Structure Type | Shape/Index | Sample Data
1 | Create DataFrame | DataFrame | (2 rows, 2 columns) | {(0, 'A'): 1, (0, 'B'): 3, (1, 'A'): 2, (1, 'B'): 4}
2 | Apply stack() | Series with MultiIndex | (4 elements) | {(0, 'A'): 1, (0, 'B'): 3, (1, 'A'): 2, (1, 'B'): 4}
3 | Apply unstack() | DataFrame | (2 rows, 2 columns) | {(0, 'A'): 1, (0, 'B'): 3, (1, 'A'): 2, (1, 'B'): 4}
4 | End | DataFrame | (2 rows, 2 columns) | Same as Step 1
💡 unstack() reverses stack(), restoring the original DataFrame's shape and data.
Variable Tracker
Variable | Start | After stack() | After unstack()
df | DataFrame with 2 rows, 2 columns | Unchanged | Unchanged
stacked | Not defined | Series with MultiIndex, 4 elements | Unchanged
unstacked | Not defined | Not defined | DataFrame same as df
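The "Unchanged" entries in the tracker reflect that stack() and unstack() return new objects rather than modifying their input. A short sketch of that check follows; the before variable is a hypothetical name introduced here only to hold a copy for comparison.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
before = df.copy()  # snapshot of df taken for comparison (illustrative name)

stacked = df.stack()
unstacked = stacked.unstack()

# df itself is untouched by either call.
print(df.equals(before))   # True
print(stacked is df)       # False

# Shapes at each stage: 2D table, 1D Series, 2D table again.
print(df.shape, stacked.shape, unstacked.shape)  # (2, 2) (4,) (2, 2)
```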
Key Moments - 2 Insights
Why does stack() result in a Series with MultiIndex instead of a DataFrame?
For a DataFrame with a single level of column labels, stack() moves those labels into the index, leaving no columns behind. The result is therefore a Series indexed by the original row index plus the stacked column labels, which together form a MultiIndex (see Execution Table, Step 2).
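The two-level index can be inspected and used for lookups directly. The sketch below assumes the same two-by-two df as the execution sample.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
stacked = df.stack()

# The index has two levels: original row labels, then original column labels.
print(isinstance(stacked.index, pd.MultiIndex))  # True
print(stacked.index.nlevels)                     # 2

# Select a single cell by its (row, column) pair.
print(stacked.loc[(1, 'B')])                     # 4

# Select everything that came from row 0.
print(stacked.loc[0].tolist())                   # [1, 3]
```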
Does unstack() always restore the original DataFrame exactly?
Not always. unstack() restores the original only when the Series still has the index produced by stack(). If some (row, column) entries are missing, the gaps are filled with NaN; if a different index level is unstacked, the result has a different layout (see Execution Table, Step 3).
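Two ways the round trip can differ from the original are sketched below: unstacking the outer level instead of the inner one, and unstacking a Series that has lost an entry. The names transposed, partial, and restored are illustrative and not part of the original sample.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
stacked = df.stack()

# Unstacking level 0 (the original row labels) yields the transpose of df.
transposed = stacked.unstack(level=0)
print(transposed.index.tolist())    # ['A', 'B']
print(transposed.columns.tolist())  # [0, 1]
print(transposed.values.tolist())   # [[1, 2], [3, 4]]

# Remove the last entry, (1, 'B'), so one (row, column) pair is missing.
partial = stacked.iloc[:-1]
restored = partial.unstack()

# The missing cell comes back as NaN.
print(pd.isna(restored.loc[1, 'B']))       # True
print(int(restored.isna().sum().sum()))    # 1
```

Because NaN is a floating-point value, a column that receives one can no longer keep an integer dtype, so the restored frame is not identical to df even in the cells that survived.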
Visual Quiz - 3 Questions
Test your understanding
According to the Execution Table, what is the data structure type after applying stack()?
A. DataFrame
B. Series with MultiIndex
C. List
D. Dictionary
💡 Hint
Check Step 2 in the Execution Table under 'Data Structure Type'.
At which step does the data shape change from (2 rows, 2 columns) to a Series with 4 elements?
A. Step 2
B. Step 1
C. Step 3
D. Step 4
💡 Hint
Look at the 'Shape/Index' column in the Execution Table for Step 2.
If the stacked Series were missing some (row, column) entries, how would unstack() affect the final DataFrame?
A. It would produce an error
B. It would remove rows with missing values
C. It would fill the missing entries with NaN in the DataFrame
D. It would ignore the missing entries and produce the same DataFrame
💡 Hint
unstack() places NaN in every cell for which the Series has no entry.
Concept Snapshot
Stack and unstack:
- stack() converts columns into rows, creating a Series with MultiIndex.
- unstack() reverses stack, turning MultiIndex Series back into a DataFrame.
- Useful for reshaping data between wide and long formats.
- stack() moves column labels into the row index; unstack() moves an index level back into columns.
- Any (row, column) pair missing from the stacked Series becomes NaN after unstack().
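The wide-to-long use mentioned in the snapshot can be sketched as follows. The axis names 'row' and 'column', the value column name 'value', and the variables wide, long_df, and back are all assumptions chosen for illustration.

```python
import pandas as pd

wide = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Wide -> long: stack, name the two index levels, then flatten to columns.
long_df = (
    wide.stack()
        .rename_axis(['row', 'column'])
        .reset_index(name='value')
)
print(long_df.columns.tolist())  # ['row', 'column', 'value']
print(len(long_df))              # 4

# Long -> wide: rebuild the MultiIndex and unstack the inner level.
back = long_df.set_index(['row', 'column'])['value'].unstack()
print(back.values.tolist())      # [[1, 3], [2, 4]]
```

The long form has one row per cell of the wide table, which is the layout many grouping and plotting tools expect.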
Full Transcript
stack() and unstack() are pandas methods for reshaping data. Starting with a DataFrame, stack() moves the columns into the row index, resulting in a Series with a MultiIndex that combines the original row and column labels. This changes the shape from a 2D table to a 1D Series with hierarchical indexing. unstack() reverses this process, taking a MultiIndex Series and expanding it back into a DataFrame with the columns restored. The execution trace shows creating a DataFrame with two columns and two rows, stacking it to get a Series with four elements indexed by row and column labels, then unstacking it back to the original DataFrame shape. The key points are that stack() changes both the data structure type and the shape, and that unstack() restores the original DataFrame only if the index still matches. Any (row, column) pair missing from the stacked Series appears as NaN in the unstacked DataFrame. This process is useful for converting data between wide and long formats in data analysis.