
Memory usage analysis in Pandas - Step-by-Step Execution

Concept Flow - Memory usage analysis
Load DataFrame
Call memory_usage()
Calculate memory per column
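Include object contents if deep=True
Return memory usage info
Use info for optimization
This flow shows how pandas calculates the memory usage of the index and each DataFrame column. With deep=True it also measures the contents of object data such as strings. Summing the result gives the total usage.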
Execution Sample
Pandas
import pandas as pd

df = pd.DataFrame({
    'A': range(3),
    'B': ['x', 'y', 'z']
})

mem = df.memory_usage(deep=True)
This code creates a DataFrame and calculates the memory usage of the index and each column, including the contents of object (string) data.
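As a minimal sketch of how the result can be inspected, the snippet below prints the per-column Series and derives the total by summing it. The variable name `total` is an illustrative choice, not part of the original sample.

```python
import pandas as pd

df = pd.DataFrame({
    'A': range(3),
    'B': ['x', 'y', 'z']
})

# memory_usage returns a Series of byte counts: one entry for the index
# (labelled 'Index') and one per column
mem = df.memory_usage(deep=True)

# The total is not returned separately; summing the Series gives it
total = mem.sum()

print(mem)
print("total bytes:", total)
```

Exact byte counts depend on the pandas and Python versions in use, so the printed numbers may differ from the 128, 24, and 177 bytes used in this walkthrough.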
Execution Table
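| Step | Action | Column | Memory Usage (bytes) | Total Memory (bytes) |
|------|--------|--------|----------------------|----------------------|
| 1 | Calculate memory for column | Index | 128 | 128 |
| 2 | Calculate memory for column | A | 24 | 152 |
| 3 | Calculate memory for column | B | 177 | 329 |
| 4 | Sum all columns memory | All | 329 | 329 |
| 5 | Return memory usage Series and total | - | - | - |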
💡 All columns processed and total memory usage calculated
Variable Tracker
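| Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| memory_per_column | {} | {'Index': 128} | {'Index': 128, 'A': 24} | {'Index': 128, 'A': 24, 'B': 177} | {'Index': 128, 'A': 24, 'B': 177} |
| total_memory | 0 | 0 | 0 | 0 | 329 |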
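The tracker can be mirrored in a short sketch, assuming the same two-column DataFrame. The names `memory_per_column` and `total_memory` come from the tracker. The loop only illustrates the bookkeeping and is not how pandas computes the values internally.

```python
import pandas as pd

df = pd.DataFrame({'A': range(3), 'B': ['x', 'y', 'z']})
mem = df.memory_usage(deep=True)

memory_per_column = {}
total_memory = 0

# Record each entry in order: the index first, then each column
for name, nbytes in mem.items():
    memory_per_column[name] = int(nbytes)
    print(f"after {name}: {memory_per_column}")

# The total is only filled in at the end, as in the tracker's Final column
total_memory = sum(memory_per_column.values())
print("total_memory:", total_memory)
```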
Key Moments - 2 Insights
Why does 'Index' appear in the memory usage output?
The index is not a column, but it is part of the DataFrame and uses memory, so pandas includes it in the memory_usage output by default (index=True), as shown in steps 1 and 4.
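To see the index's contribution in isolation, the sketch below compares the default output with a call that passes `index=False`. The variable names `with_index` and `without_index` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': range(3), 'B': ['x', 'y', 'z']})

# Default behaviour: the first entry, labelled 'Index', reports the index
with_index = df.memory_usage(deep=True)

# index=False leaves the index out and reports columns only
without_index = df.memory_usage(index=False, deep=True)

print(with_index)
print(without_index)

# The difference between the two totals is the index's own memory
print("index bytes:", with_index.sum() - without_index.sum())
```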
What does the 'deep=True' parameter do?
It makes pandas calculate the actual memory used by object types like strings, not just pointers, which is why column 'B' has 177 bytes in step 3.
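A small comparison makes the effect of `deep` visible. This sketch assumes column 'B' is stored with the object dtype, which is set explicitly here because newer pandas versions may infer a dedicated string dtype instead. The names `shallow`, `deep`, and `comparison` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': range(3),
    # Force object dtype so 'B' holds pointers to Python string objects
    'B': pd.Series(['x', 'y', 'z'], dtype=object),
})

shallow = df.memory_usage(deep=False)  # counts only the pointers for 'B'
deep = df.memory_usage(deep=True)      # also counts the string objects

comparison = pd.DataFrame({'shallow': shallow, 'deep': deep})
print(comparison)
```

The numeric column 'A' reports the same size either way, while the object column 'B' grows once the strings themselves are counted.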
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, what is the memory usage of column 'A' at step 2?
A. 128 bytes
B. 24 bytes
C. 99 bytes
D. 251 bytes
💡 Hint
Check the 'Memory Usage (bytes)' column for step 2 in the Execution Table.
At which step is the total memory usage calculated?
A. Step 4
B. Step 3
C. Step 2
D. Step 5
💡 Hint
Look for the row where 'Sum all columns memory' is the action in the Execution Table.
If deep=False were used, how would the memory for column 'B' change?
A. It would be larger
B. It would be the same
C. It would be smaller
D. Column 'B' would be excluded
💡 Hint
Recall that deep=True counts the actual string memory; without it, only pointers are counted, so the reported memory is smaller.
Concept Snapshot
pandas.DataFrame.memory_usage(index=True, deep=False)
- Returns a Series with the bytes used by each column, plus the index by default
- deep=True counts actual object memory (e.g., strings)
- Useful to find heavy columns
- Helps optimize DataFrame memory
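One way to act on these numbers, sketched under assumptions: the DataFrame below is hypothetical (1,000 rows with a repetitive string column), and the two conversions shown, integer downcasting and the category dtype, are common options rather than the only ones.

```python
import pandas as pd

# Hypothetical data: small integers and a string column with few distinct values
df = pd.DataFrame({
    'A': range(1000),
    'B': pd.Series(['x', 'y', 'z', 'x'] * 250, dtype=object),
})

before = df.memory_usage(deep=True)

optimized = df.copy()
# Downcast to the smallest integer type that can hold the values
optimized['A'] = pd.to_numeric(optimized['A'], downcast='integer')
# Store repeated strings once and refer to them by small integer codes
optimized['B'] = optimized['B'].astype('category')

after = optimized.memory_usage(deep=True)

print(pd.DataFrame({'before': before, 'after': after}))
```

Category conversion pays off when a column has few distinct values relative to its length. For mostly unique strings it can use more memory, so it is worth measuring before and after.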
Full Transcript
This visual trace shows how pandas calculates the memory usage of a DataFrame. First, a DataFrame with columns 'A' and 'B' is created. Then, calling memory_usage(deep=True) calculates the memory for the index and each column. The index uses 128 bytes, column 'A' uses 24 bytes, and column 'B' uses 177 bytes because it contains strings. Summing these values gives a total memory usage of 329 bytes. The deep=True parameter matters because it counts the actual string memory, not just pointers. The result shows which parts of the DataFrame use the most memory and can guide optimization.