Memory usage analysis in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Memory usage of DataFrame columns

What does the following code print? The result lists the memory usage, in bytes, of the index and of each column.

Pandas
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': np.arange(1000, dtype='int64'),
    'B': np.random.rand(1000),
    'C': ['text']*1000
})
mem_usage = df.memory_usage(deep=True)
print(mem_usage)
A.
Index      128
A         8000
B         8000
C        24000
dtype: int64
B.
Index     128
A        8000
B        8000
C        4000
dtype: int64
C.
Index      128
A         8000
B         8000
C        32000
dtype: int64
D.
Index      128
A         1000
B         8000
C        24000
dtype: int64
💡 Hint

With deep=True, the figure for an object column includes the memory of the Python string objects themselves, not just the pointers stored in the array.
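The hint can be checked with a minimal sketch that isolates column 'C'. It assumes an object-dtype column (set explicitly here, since some pandas versions default to a dedicated string dtype) and a CPython interpreter. The size of one `str` object varies between Python versions, so the printed figures need not match any listed option exactly.

```python
import sys
import pandas as pd

# Assumption: explicit object dtype, so the column holds plain Python str objects.
col = pd.Series(["text"] * 1000, dtype=object)

# deep=False counts only the array of pointers to the string objects.
shallow = col.memory_usage(index=False, deep=False)

# deep=True adds the size of the object behind each pointer.
deep = col.memory_usage(index=False, deep=True)

# Size of a single "text" string as reported by the interpreter.
per_string = sys.getsizeof("text")

print(shallow, deep, shallow + 1000 * per_string)
```

The last two printed numbers should agree: the deep figure is the pointer array plus one `sys.getsizeof` result per row, even though every row refers to the same string.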

Data Output
intermediate
Total memory usage of DataFrame

What is the total memory usage in bytes of the DataFrame below, including index and deep memory?

Pandas
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'X': np.random.randint(0, 100, 5000),
    'Y': ['a']*5000
})
total_mem = df.memory_usage(deep=True).sum()
print(total_mem)
A. 90000
B. 50000
C. 100128
D. 40000
💡 Hint

Consider index memory plus each column's memory with deep=True.
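One way to see where the total comes from is to print the per-part breakdown next to the sum. This is a sketch under two assumptions that are not in the original code: 'X' is cast to int64 so its size does not depend on the platform's default integer, and 'Y' is given object dtype explicitly. Actual byte counts depend on the Python and pandas versions in use.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    # Assumption: force int64 so each value takes 8 bytes on every platform.
    "X": np.random.randint(0, 100, 5000).astype("int64"),
    # Assumption: explicit object dtype, i.e. plain Python str objects.
    "Y": pd.Series(["a"] * 5000, dtype=object),
})

parts = df.memory_usage(deep=True)           # one entry each for Index, X, Y
total_deep = parts.sum()
total_shallow = df.memory_usage(deep=False).sum()

print(parts)
print(total_shallow, total_deep)
```

The gap between the shallow and deep totals comes entirely from column 'Y', because numeric columns report the same size either way.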

🔧 Debug
advanced
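Why might memory not be released after dropping a column?

Given the code below, the total reported by memory_usage falls once column 'B' is dropped, because only the remaining columns are counted. Why might the memory held by the Python process still not decrease?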

Pandas
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': np.arange(10000),
    'B': ['text']*10000
})
print(df.memory_usage(deep=True).sum())
df.drop('B', axis=1, inplace=True)
print(df.memory_usage(deep=True).sum())
A. Because drop with inplace=True does not free memory immediately due to pandas' internal caching.
B. Because the 'B' column is still referenced elsewhere in the code, so memory is not freed.
C. Because memory_usage(deep=True) always returns the same value regardless of columns.
D. Because drop does not remove columns, it only hides them temporarily.
💡 Hint

Think about how pandas manages memory internally and when garbage collection happens.
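As a quick check, the sketch below compares the total reported by `memory_usage` before and after the drop. It assumes an object-dtype string column and uses the non-inplace form of `drop`, which is a substitution for the `inplace=True` call in the problem. The reported total counts only the columns still in the frame, so it is expected to shrink. Whether the interpreter returns the freed memory to the operating system is a separate matter that this call does not measure.

```python
import gc
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": np.arange(10000),
    # Assumption: explicit object dtype, i.e. plain Python str objects.
    "B": pd.Series(["text"] * 10000, dtype=object),
})

before = df.memory_usage(deep=True).sum()

# Rebind the name so the old frame, and with it column 'B', loses this reference.
df = df.drop(columns="B")
gc.collect()  # optional: ask the collector to reclaim unreachable objects now

after = df.memory_usage(deep=True).sum()
print(before, after)
```

If another variable still points at the original frame or at column 'B', the underlying data stays alive no matter what `memory_usage` reports for `df`.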

🧠 Conceptual
advanced
Effect of categorical dtype on memory usage

Which statement best describes the effect of converting a string column to category dtype on memory usage?

A. It converts strings to integers but increases memory due to overhead.
B. It increases memory usage because categories store extra metadata.
C. It has no effect on memory usage but speeds up computations.
D. It decreases memory usage by storing unique values once and using integer codes.
💡 Hint

Think about how categories store repeated values efficiently.
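A small illustration of the idea in the hint, using a made-up column (`colours` is a hypothetical name) with three distinct values repeated many times. The sketch assumes an object-dtype starting point and compares deep memory usage before and after the conversion.

```python
import pandas as pd

# Hypothetical data: 30,000 rows but only three distinct strings.
colours = pd.Series(["red", "green", "blue"] * 10_000, dtype=object)
as_category = colours.astype("category")

object_bytes = colours.memory_usage(index=False, deep=True)
category_bytes = as_category.memory_usage(index=False, deep=True)
print(object_bytes, category_bytes)

# Each row is now a small integer code pointing into the list of categories.
print(as_category.cat.codes.dtype, list(as_category.cat.categories))
```

The saving relies on repetition. For a column where nearly every value is unique, the categories table is about as large as the original data, so the conversion may not help.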

🚀 Application
expert
Optimize memory usage of a large DataFrame

You have a DataFrame with 1 million rows and columns of various types. Which approach will most effectively reduce memory usage without losing data?

A. Convert all integer columns to int8 and all floats to float16 regardless of value range.
B. Convert object columns with few unique values to category and downcast numeric columns to smallest suitable subtype.
C. Drop all columns with missing values to reduce memory usage.
D. Convert all columns to string type to unify data and compress memory.
💡 Hint

Consider safe downcasting and categorical conversion for repeated strings.
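The approach in the hint might look roughly like the sketch below. The helper `shrink`, its `max_unique_ratio` threshold, and the sample columns are all hypothetical and chosen for illustration. Float columns are left untouched here as a conservative choice, since a narrower float type can lose precision.

```python
import numpy as np
import pandas as pd


def shrink(df: pd.DataFrame, max_unique_ratio: float = 0.5) -> pd.DataFrame:
    """Return a copy of df with smaller dtypes where values are preserved."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if pd.api.types.is_integer_dtype(col.dtype):
            # Picks the smallest integer subtype that still holds every value.
            out[name] = pd.to_numeric(col, downcast="integer")
        elif pd.api.types.is_object_dtype(col.dtype):
            # Convert only when values repeat often enough to pay off.
            if len(col) and col.nunique() / len(col) <= max_unique_ratio:
                out[name] = col.astype("category")
    return out


n = 10_000
df = pd.DataFrame({
    "id": np.arange(n, dtype="int64"),
    "city": pd.Series(["Paris", "Lima"] * (n // 2), dtype=object),
    "price": np.linspace(0.0, 1.0, n),
})

small = shrink(df)
before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(before, after)
print(small.dtypes)
```

Comparing `before` and `after` with `memory_usage(deep=True)` on a real frame is a reasonable way to confirm that a conversion actually pays off before adopting it.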