
Memory usage analysis in Pandas

Introduction

Memory usage analysis tells you how much memory (RAM) a DataFrame occupies. This matters because pandas holds data in memory, so a dataset that is too large can slow your program down or make it crash.

Typical situations where it helps:
When working with large datasets, to check whether your computer can handle them.
Before saving or sharing data, to get a rough sense of how large it is (the size in memory and the size of a file on disk can differ).
When optimizing your code to use less memory and run faster.
When comparing memory use before and after changing data types or cleaning data.
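As a minimal sketch of the before-and-after comparison mentioned in the last item, the snippet below measures a DataFrame, converts a repetitive text column to the category dtype, and measures again. The data and the variable names (before, after) are made up for illustration.

```python
import pandas as pd

# Hypothetical data: a text column with only a few distinct values, repeated many times
df = pd.DataFrame({"city": ["Chicago", "New York", "Chicago", "Boston"] * 1000})

# Total memory before the change, counting the string contents
before = df.memory_usage(deep=True).sum()

# Store each distinct city once and keep small integer codes per row
df["city"] = df["city"].astype("category")

# Total memory after the change
after = df.memory_usage(deep=True).sum()

print(f"before: {before} bytes, after: {after} bytes")
```

The saving depends on how repetitive the column is; a column where nearly every value is unique may not shrink at all.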
Syntax
Pandas
DataFrame.memory_usage(index=True, deep=False)

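index: If True (the default), the result includes the memory used by the index.

deep: If True, pandas inspects object columns and counts the memory of the values they hold (for example, the strings themselves). If False (the default), only the references to those objects are counted, so the result can be much lower than the real usage.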
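To make the effect of deep concrete, here is a small sketch that measures the same object column both ways. The column is created with dtype=object explicitly, because the default dtype for strings can vary between pandas versions; the names shallow and deep_size are illustrative.

```python
import pandas as pd

# Force object dtype so the column holds references to Python str objects
df = pd.DataFrame({"name": pd.Series(["Alice", "Bob", "Charlie"], dtype=object)})

# deep=False counts only the references stored in the column
shallow = df.memory_usage(index=False, deep=False)["name"]

# deep=True also counts the string objects those references point to
deep_size = df.memory_usage(index=False, deep=True)["name"]

print(f"shallow: {shallow} bytes, deep: {deep_size} bytes")
```

For numeric columns the two values are the same, since the numbers are stored directly in the column.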

Examples
Shows memory used by each column and the index (default index=True).
Pandas
df.memory_usage()
Shows memory used by columns only, excluding the index.
Pandas
df.memory_usage(index=False)
Reports memory usage that includes the actual size of the values in object columns, such as strings.
Pandas
df.memory_usage(deep=True)
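The calls above each return a pandas Series with one entry per column. The sketch below, using a made-up two-column DataFrame, shows how the index appears as an extra entry labelled 'Index' and disappears when index=False is passed.

```python
import pandas as pd

# Hypothetical two-column DataFrame used only for illustration
df = pd.DataFrame({"a": [1, 2, 3], "b": [0.1, 0.2, 0.3]})

with_index = df.memory_usage()                # index=True is the default
without_index = df.memory_usage(index=False)  # columns only

print(with_index.index.tolist())     # 'Index' followed by the column names
print(without_index.index.tolist())  # column names only
```

Because the result is an ordinary Series, you can select a single column's usage by name, for example with_index["a"].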
Sample Program

This code creates a small table with names, ages, and cities. It then prints the memory used by the index and by each column, counting the actual size of the strings. Finally, it sums these values to show the total memory used.

Pandas
import pandas as pd

data = {'name': ['Alice', 'Bob', 'Charlie'],
        'age': [25, 30, 35],
        'city': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Memory usage in bytes for the index and each column; deep=True counts string contents
mem_usage = df.memory_usage(deep=True)
print(mem_usage)

# Total memory used by DataFrame
total_mem = mem_usage.sum()
print(f"Total memory usage: {total_mem} bytes")
Important Notes

Use deep=True for object columns, such as columns of strings. Without it, pandas counts only the references to the objects, not the objects themselves.

Memory usage is reported in bytes. For large DataFrames, divide by 1024**2 to express it in mebibytes (MiB).
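A quick sketch of converting the byte count into a more readable unit. The DataFrame here is a made-up example, and total_mib is just an illustrative name.

```python
import pandas as pd

# Hypothetical DataFrame with one million integers
df = pd.DataFrame({"value": range(1_000_000)})

# Total usage in bytes (index plus all columns)
total_bytes = df.memory_usage(deep=True).sum()

# 1 MiB = 1024 * 1024 bytes
total_mib = total_bytes / 1024**2

print(f"{total_bytes} bytes is about {total_mib:.2f} MiB")
```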

Checking memory helps you decide if you need to reduce data size or change data types.
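One way to act on that decision for numeric columns is to store them in a smaller integer type. The sketch below assumes a made-up age column whose values fit in a small integer, and uses pd.to_numeric with downcast="integer" to pick the smallest integer dtype that can hold them.

```python
import pandas as pd

# Hypothetical data: ages are small numbers, but are stored as 64-bit integers by default
df = pd.DataFrame({"age": [25, 30, 35] * 1000})

before = df["age"].memory_usage(index=False)

# Downcast to the smallest integer dtype that can hold every value
df["age"] = pd.to_numeric(df["age"], downcast="integer")

after = df["age"].memory_usage(index=False)

print(f"dtype is now {df['age'].dtype}: {before} bytes -> {after} bytes")
```

Downcasting is only safe if future values will also fit in the smaller type, so check the expected range of the data first.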

Summary

Memory usage analysis tells you how much memory your data takes.

Use memory_usage() to see memory per column and index.

Use deep=True for accurate memory figures on object columns such as strings.