import numpy as np

arr_int = np.array([1, 2, 3], dtype=np.int8)
arr_float = np.array([1.0, 2.0, 3.0], dtype=np.float64)
sum_arr = arr_int + arr_float
print(sum_arr)
print(sum_arr.dtype)
When adding arrays of different dtypes, NumPy upcasts to the more general dtype to avoid data loss. Here, int8 is upcast to float64, so the result is float64.
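You can inspect NumPy's promotion rules directly with np.result_type, without building any arrays. A quick sketch:

```python
import numpy as np

# np.result_type reports the dtype NumPy would promote to
print(np.result_type(np.int8, np.float64))  # float64: float wins over int
print(np.result_type(np.int8, np.int64))    # int64: wider integer wins
print(np.result_type(np.uint8, np.int8))    # int16: smallest type covering both ranges
```

The last case shows that promotion picks the smallest dtype that can represent both operands' full ranges, which is why mixing signed and unsigned 8-bit integers yields a 16-bit result.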
import numpy as np

arr_small = np.ones(1_000_000, dtype=np.int8)
arr_large = np.ones(1_000_000, dtype=np.int64)
print(arr_small.nbytes)
print(arr_large.nbytes)
int8 uses 1 byte per element, so 1 million elements use 1,000,000 bytes. int64 uses 8 bytes per element, so 1 million elements use 8,000,000 bytes.
import numpy as np
import time

arr1 = np.ones(1_000_000, dtype=np.float32)
arr2 = np.ones(1_000_000, dtype=np.float64)

# time.perf_counter is the preferred timer for benchmarking
start = time.perf_counter()
for _ in range(100):
    result = arr1 + arr2
end = time.perf_counter()
print(end - start)
Adding float32 and float64 arrays causes the float32 array to be upcast to float64 each time, which uses more memory and CPU cycles, slowing down the operation.
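To avoid the repeated upcast, convert once up front so both operands share a dtype. A minimal sketch:

```python
import numpy as np

arr1 = np.ones(1_000_000, dtype=np.float32)
arr2 = np.ones(1_000_000, dtype=np.float32)  # matched dtype: no promotion needed

result = arr1 + arr2
print(result.dtype)  # float32 throughout; half the memory traffic of float64
```

If one array starts as float64, a single arr2.astype(np.float32) before the loop pays the conversion cost once instead of on every addition.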
Using smaller dtypes means less memory per element, so more data fits in CPU cache. This reduces memory access time and speeds up vectorized operations.
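As an illustration of the memory-footprint point, downcasting with astype shrinks the data that has to move through the cache; the 8x figure here follows from the byte sizes, assuming the values actually fit the smaller dtype:

```python
import numpy as np

# values 0..255 fit in uint8, so the downcast is lossless here
arr = np.arange(1_000_000, dtype=np.float64) % 256
compact = arr.astype(np.uint8)

print(arr.nbytes)      # 8,000,000 bytes
print(compact.nbytes)  # 1,000,000 bytes: 8x less data through the cache
```

Always check the value range before downcasting; astype silently wraps values that overflow the target dtype.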
Values from 0 to 1000 fit comfortably in an unsigned 16-bit integer (range 0 to 65,535). Using np.uint16 takes 2 bytes per element, saving memory compared with int32 or float64, and unlike int8 (range -128 to 127) it avoids overflow and data loss.
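A short sketch of that choice, verifying both the memory saving and that the conversion is lossless:

```python
import numpy as np

values = np.arange(0, 1001)         # integer dtype, typically int64 on Linux/macOS
compact = values.astype(np.uint16)  # 0..1000 fits well within 0..65535

print(compact.nbytes)                   # 2 bytes per element
print(np.array_equal(values, compact))  # lossless: every value round-trips
```

The same check (compare the converted array against the original) is a cheap safeguard whenever you downcast real data.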