Data Analysis Python (~20 mins)

Memory-efficient operations in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output (intermediate)
Output of memory-efficient data filtering
What is the output of this code that filters a large list using a generator expression?
Data Analysis Python
data = range(1_000_000)
filtered = (x for x in data if x % 100_000 == 0)
result = list(filtered)
print(result)
A. [0, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000]
B. [0, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000]
C. [100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000]
D. SyntaxError
💡 Hint
Remember that range stops before the stop value and 0 is included in range(1_000_000).
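To check the hint on a smaller scale, here is the same filtering pattern shrunk to range(10) with a step of 3 (sizes chosen purely for illustration):

```python
# Same pattern as the problem: range stops before its stop value,
# and 0 satisfies 0 % 3 == 0, so it is included in the result.
data = range(10)
filtered = (x for x in data if x % 3 == 0)
print(list(filtered))  # → [0, 3, 6, 9]
```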
Data Output (intermediate)
Memory usage difference between list and generator
Which option correctly describes the memory usage difference when creating a list vs a generator for numbers 0 to 999,999?
Data Analysis Python
import sys
list_data = list(range(1_000_000))
gen_data = (x for x in range(1_000_000))
print(sys.getsizeof(list_data))
print(sys.getsizeof(gen_data))
A. Both use the same amount of memory
B. Generator uses more memory than list
C. List uses more memory than generator
D. Both cause MemoryError
💡 Hint
Think about when values are stored in memory for lists vs generators.
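You can see the gap yourself with a smaller range (exact byte counts vary by Python version and platform):

```python
import sys

# A list materializes every element up front; a generator stores only
# its frame and iteration state, so its size is small and constant.
list_data = list(range(100_000))
gen_data = (x for x in range(100_000))
print(sys.getsizeof(list_data))  # hundreds of kilobytes
print(sys.getsizeof(gen_data))   # a few hundred bytes at most, regardless of range size
```

Note that sys.getsizeof reports only the container's own size, not the sizes of the int objects a list refers to, so the true gap is even larger than the printed numbers suggest.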
🔧 Debug (advanced)
Identify the error in memory-efficient data processing
What happens when this code tries to sum values produced by the generator?
Data Analysis Python
data = (int(x) for x in ['1', '2', 'three', '4'])
total = sum(data)
print(total)
A. ValueError
B. No error, output is 10
C. SyntaxError
D. TypeError
💡 Hint
Check the conversion of strings to integers.
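The failing conversion can be isolated with a try/except. Note that because the data is a generator, nothing is converted when it is defined; the error surfaces only when sum() pulls the bad element:

```python
data = (int(x) for x in ['1', '2', 'three', '4'])
try:
    total = sum(data)  # consumes the generator element by element...
except ValueError as e:
    print(e)           # ...until int('three') raises ValueError
```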
🚀 Application (advanced)
Choosing memory-efficient data aggregation method
You have a huge CSV file with millions of rows. Which method is most memory-efficient to calculate the average of a numeric column without loading all data at once?
A. Convert CSV to JSON and then parse all data into memory
B. Load entire CSV into a pandas DataFrame and use df['col'].mean()
C. Use list comprehension to create a list of all values, then average
D. Read the file line by line, sum values and count rows, then compute average
💡 Hint
Think about how to avoid loading all data into memory.
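A minimal sketch of the line-by-line approach using only the stdlib csv module (the function name and column argument are placeholders, not part of the problem):

```python
import csv

def streaming_mean(path, column):
    """Average a numeric column while holding only one row in memory at a time."""
    total = 0.0
    count = 0
    with open(path, newline='') as f:
        # csv.DictReader is an iterator over the open file,
        # so memory use stays flat no matter how many rows it has.
        for row in csv.DictReader(f):
            total += float(row[column])
            count += 1
    return total / count if count else 0.0
```

Only two numbers (the running sum and the row count) live in memory at any point, which is why this scales to files far larger than RAM.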
🧠 Conceptual (expert)
Understanding memory-efficient streaming with pandas
Which pandas function allows processing large CSV files in chunks to reduce memory usage?
A. pd.read_csv with chunksize parameter
B. pd.read_csv with header=None
C. pd.read_csv with skiprows parameter
D. pd.read_csv with nrows parameter
💡 Hint
Look for a parameter that reads the file in parts.
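A minimal sketch of chunked reading (assuming pandas is available; the in-memory CSV text and tiny chunk size are just to show the mechanics):

```python
import io
import pandas as pd

csv_text = "col\n1\n2\n3\n4\n5\n"

total = 0.0
count = 0
# Passing chunksize makes read_csv return an iterator of DataFrames
# instead of loading the whole file; each chunk is processed then discarded.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=2):
    total += chunk["col"].sum()
    count += len(chunk)
print(total / count)  # → 3.0
```

The same pattern works on a real file path; only one chunk's worth of rows occupies memory at any moment.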