Challenge - 5 Problems

Memory Mastery

❓ Predict Output

intermediate

Memory usage of DataFrame columns

What is the output of the following code showing memory usage of each column in bytes?

Pandas

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': np.arange(1000, dtype='int64'),
    'B': np.random.rand(1000),
    'C': ['text']*1000
})
mem_usage = df.memory_usage(deep=True)
print(mem_usage)

A.
Index    128
A      8000
B      8000
C     24000
dtype: int64

B.
Index    128
A      8000
B      8000
C     4000
dtype: int64

C.
Index    128
A      8000
B      8000
C     32000
dtype: int64

D.
Index    128
A      1000
B      8000
C     24000
dtype: int64
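
To check an answer like this empirically, one option is to put the shallow and deep reports side by side. The sketch below is illustrative and not part of the original challenge. It forces column C to object dtype (an assumption, since newer pandas versions may infer a dedicated string dtype for a list of strings), and the exact byte counts for the index and for object columns depend on the Python and pandas versions, so no expected output is shown.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': np.arange(1000, dtype='int64'),
    'B': np.random.rand(1000),
    # Assumption: force object dtype so each cell is a pointer to a Python str
    'C': pd.Series(['text'] * 1000, dtype=object),
})

# Shallow: counts only the array buffers (8 bytes per int64, float64, or pointer)
shallow = df.memory_usage()

# Deep: additionally counts the Python objects that object columns point to
deep = df.memory_usage(deep=True)

report = pd.DataFrame({'shallow': shallow, 'deep': deep})
print(report)
```

For the numeric columns the two reports agree, because a fixed-width dtype has nothing beyond its buffer to count. Only the object column grows under deep=True.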

❓ Data Output

intermediate

Total memory usage of DataFrame

What is the total memory usage in bytes of the DataFrame below, including the index and the contents of object columns (deep=True)?

Pandas

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'X': np.random.randint(0, 100, 5000),
    'Y': ['a']*5000
})
total_mem = df.memory_usage(deep=True).sum()
print(total_mem)

A. 90000

B. 50000

C. 100128

D. 40000
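
A total like this can also be rebuilt by hand as a cross-check. The sketch below is an illustration under stated assumptions, not the challenge's reference solution: it forces column Y to object dtype, uses the hypothetical names `index_bytes`, `x_bytes`, and `y_bytes`, and relies on CPython, where pandas sums the size of the Python object behind every cell. The integer width of X and the size of a one-character str both vary by platform and Python version, so the numbers are computed rather than hard-coded.

```python
import sys

import numpy as np
import pandas as pd

n = 5000
df = pd.DataFrame({
    'X': np.random.randint(0, 100, n),
    # Assumption: force object dtype so deep=True has Python objects to count
    'Y': pd.Series(['a'] * n, dtype=object),
})

# Index: whatever pandas reports for the default RangeIndex
index_bytes = df.index.memory_usage()

# Numeric column: number of elements times the integer item size
x_bytes = df['X'].to_numpy().nbytes

# Object column: one pointer per row plus the size of the str each row points to
y_bytes = n * np.dtype(object).itemsize + n * sys.getsizeof('a')

by_hand = index_bytes + x_bytes + y_bytes
reported = df.memory_usage(deep=True).sum()
print(by_hand, reported)
```

Note that every row of Y points at the same str object, yet the deep count adds its size once per row, so the reported figure can overstate the real footprint of repeated values.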

🔧 Debug

advanced

Why does memory usage not decrease after dropping a column?

Given the code below, why does the memory usage of df not decrease after dropping column 'B'?

Pandas

import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': np.arange(10000),
    'B': ['text']*10000
})
print(df.memory_usage(deep=True).sum())
df.drop('B', axis=1, inplace=True)
print(df.memory_usage(deep=True).sum())

A. Because drop with inplace=True does not free memory immediately due to pandas internal caching.

B. Because the 'B' column is still referenced elsewhere in the code, so memory is not freed.

C. Because memory_usage(deep=True) always returns the same value regardless of columns.

D. Because drop does not remove columns, it only hides them temporarily.
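
When investigating a question like this, it helps to separate two things: what memory_usage reports for the DataFrame at the moment of the call, and whether anything else still holds the dropped data. The following sketch is only an illustration. The extra reference `col_b` is hypothetical and does not appear in the challenge code, and column B is forced to object dtype as an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': np.arange(10000),
    # Assumption: object dtype, so each cell points to a Python str
    'B': pd.Series(['text'] * 10000, dtype=object),
})

# Hypothetical extra reference to the column, taken before the drop
col_b = df['B']

before = df.memory_usage(deep=True).sum()
df.drop('B', axis=1, inplace=True)
after = df.memory_usage(deep=True).sum()

# What the DataFrame itself reports before and after the drop
print(before, after)

# The DataFrame no longer lists the column ...
print(list(df.columns))

# ... but the separate reference still holds all of its rows
print(len(col_b), col_b.memory_usage(index=False, deep=True))
```

Comparing the two printed totals shows what the DataFrame accounts for, while the last line shows data that stays alive for as long as `col_b` exists.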

🧠 Conceptual

advanced

Effect of categorical dtype on memory usage

Which statement best describes the effect of converting a string column to category dtype on memory usage?

A. It converts strings to integers but increases memory due to overhead.

B. It increases memory usage because categories store extra metadata.

C. It has no effect on memory usage but speeds up computations.

D. It decreases memory usage by storing unique values once and using integer codes.
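
The statements above can be tested directly by measuring a column before and after conversion. This is a minimal sketch with made-up data (three colour names repeated many times) and an object-dtype starting point chosen as an assumption. The outcome depends on how many distinct values the column has, so a column where nearly every value is unique may behave differently.

```python
import pandas as pd

# Made-up data: 30,000 rows drawn from only three distinct strings
s_obj = pd.Series(['red', 'green', 'blue'] * 10000, dtype=object)
s_cat = s_obj.astype('category')

obj_bytes = s_obj.memory_usage(index=False, deep=True)
cat_bytes = s_cat.memory_usage(index=False, deep=True)
print(obj_bytes, cat_bytes)

# A categorical keeps the distinct values in one place ...
print(list(s_cat.cat.categories))

# ... and stores a small integer code per row instead of a string
print(s_cat.cat.codes.dtype)
```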

🚀 Application

expert

Optimize memory usage of a large DataFrame

You have a DataFrame with 1 million rows and columns of various types. Which approach will most effectively reduce memory usage without losing data?

A. Convert all integer columns to int8 and all floats to float16 regardless of value range.

B. Convert object columns with few unique values to category and downcast numeric columns to smallest suitable subtype.

C. Drop all columns with missing values to reduce memory usage.

D. Convert all columns to string type to unify data and compress memory.
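
To experiment with these strategies, a small helper can apply value-preserving conversions and report the saving. The function name `shrink`, the 0.5 uniqueness threshold, and the sample columns below are all hypothetical choices made for illustration. The sketch deliberately leaves float columns untouched, because narrowing floats can change values, and it only narrows integers through pd.to_numeric, which picks a type wide enough for the data.

```python
import numpy as np
import pandas as pd


def shrink(df, max_unique_ratio=0.5):
    """Return a copy of df with value-preserving, smaller dtypes where possible."""
    out = df.copy()
    for col in out.columns:
        s = out[col]
        if pd.api.types.is_integer_dtype(s):
            # Picks the smallest signed integer type that still fits every value
            out[col] = pd.to_numeric(s, downcast='integer')
        elif s.dtype == object and len(s) > 0 and s.nunique() / len(s) < max_unique_ratio:
            # Few distinct values: store them once and keep integer codes per row
            out[col] = s.astype('category')
        # Floats and everything else are left as they are in this sketch
    return out


# Made-up sample data
df = pd.DataFrame({
    'count': np.random.randint(0, 100, 1000).astype('int64'),
    'price': np.random.rand(1000),
    'city': pd.Series(['Oslo', 'Lima', 'Pune', 'Kyiv'] * 250, dtype=object),
})

small = shrink(df)
print(df.memory_usage(deep=True).sum(), small.memory_usage(deep=True).sum())
print(small.dtypes)
```

Checking that `(small['count'] == df['count']).all()` holds after the conversion is a quick way to confirm that no values were altered.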
