Challenge - 5 Problems
Sparse Data Mastery
❓ Predict Output
Difficulty: intermediate
Output of sparse matrix multiplication
What is the output of this code that multiplies two sparse matrices using SciPy?
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix([[0, 0, 1], [1, 0, 0], [0, 0, 0]])
B = csr_matrix([[0, 2, 0], [0, 0, 0], [3, 0, 0]])
C = A.dot(B)  # sparse matrix product
print(C.toarray())
💡 Hint
Remember that sparse matrix multiplication follows normal matrix multiplication rules but only non-zero elements are stored.
Explanation
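Matrix A has a 1 at positions (0, 2) and (1, 0). Matrix B has a 3 at (2, 0) and a 2 at (0, 1). In the product C = A·B, C[0, 0] = A[0, 2] * B[2, 0] = 1 * 3 = 3 and C[1, 1] = A[1, 0] * B[0, 1] = 1 * 2 = 2. Every other entry is zero, so the printed array has rows [3 0 0], [0 2 0], and [0 0 0].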
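The arithmetic above can be checked with a short sketch that compares the sparse product against a plain dense NumPy product. The names `expected` and `stored_in_C` are illustrative and not part of the quiz.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = csr_matrix([[0, 0, 1], [1, 0, 0], [0, 0, 0]])
B = csr_matrix([[0, 2, 0], [0, 0, 0], [3, 0, 0]])

# Sparse product; the result is itself a sparse matrix.
C = A.dot(B)

# Dense reference computed with ordinary NumPy matrix multiplication.
expected = A.toarray() @ B.toarray()

# Number of entries actually stored in the sparse result.
stored_in_C = C.nnz

print(C.toarray())
print("stored entries in C:", stored_in_C)
```

Only the two non-zero results are stored in `C`; the remaining seven positions exist only implicitly as zeros.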
❓ Data Output
Difficulty: intermediate
Number of non-zero elements in sparse matrix
Given this sparse matrix, how many non-zero elements does it contain?
from scipy.sparse import coo_matrix

row = [0, 3, 1, 0]
col = [0, 3, 1, 2]
data = [4, 5, 7, 9]
# data[k] is placed at position (row[k], col[k])
matrix = coo_matrix((data, (row, col)), shape=(4, 4))
print(matrix.nnz)
💡 Hint
The nnz attribute counts the stored entries, which includes any explicitly stored zeros.
Explanation
The data list holds 4 values at 4 distinct (row, col) positions, and none of them is zero, so matrix.nnz is 4.
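A minimal sketch of the difference between stored entries and truly non-zero values follows. The `with_zero` matrix is a hypothetical variant added for illustration; it is not part of the quiz.

```python
from scipy.sparse import coo_matrix

row = [0, 3, 1, 0]
col = [0, 3, 1, 2]
data = [4, 5, 7, 9]
matrix = coo_matrix((data, (row, col)), shape=(4, 4))
stored = matrix.nnz  # number of stored entries

# Hypothetical variant: same positions, but one stored value is an explicit zero.
with_zero = coo_matrix(([4, 0, 7, 9], (row, col)), shape=(4, 4))
stored_with_zero = with_zero.nnz          # the explicit zero is still counted
true_nonzero = with_zero.count_nonzero()  # counts only values different from 0

print(stored, stored_with_zero, true_nonzero)
```

When the distinction matters, `count_nonzero()` is the safer call, since `nnz` reflects storage rather than values.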
🔧 Debug
Difficulty: advanced
Identify error in sparse matrix conversion
What error does this code raise when converting a dense NumPy array containing NaN to an integer-typed sparse CSR matrix?
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[1, 0, np.nan], [0, 2, 3]])  # float array because of NaN
sparse = csr_matrix(dense, dtype=int)
print(sparse.toarray())
💡 Hint
NaN is a floating-point value and has no integer representation, so it cannot be stored meaningfully in an integer-dtype sparse matrix.
Explanation
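The code explicitly requests dtype=int, so the stored values must be cast from float to integer, and NaN has no integer representation. The intended answer is ValueError, which is what Python raises for a scalar conversion such as int(float("nan")). Be aware that NumPy's array cast (astype) does not raise on NaN; it produces an undefined integer value and, in recent NumPy versions, a RuntimeWarning. Depending on the installed NumPy and SciPy versions, this snippet may therefore run without raising an exception, so checking for NaN before converting is safer than relying on an error.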
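The sketch below shows the scalar conversion that does raise ValueError, followed by one defensive pattern: detect NaN first, then decide how to handle it. Replacing NaN with 0 is an assumption made purely for illustration; the right policy depends on the data.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([[1, 0, np.nan], [0, 2, 3]])

# Scalar conversion: Python's int() refuses NaN with a ValueError.
try:
    int(float("nan"))
    scalar_error = None
except ValueError as exc:
    scalar_error = exc

# Defensive pattern: check for NaN explicitly before any integer cast.
has_nan = bool(np.isnan(dense).any())

# Assumption for this sketch only: treat missing values as 0.
cleaned = np.nan_to_num(dense, nan=0.0) if has_nan else dense

# The cast is now well defined because no NaN remains.
sparse = csr_matrix(cleaned.astype(int))

print(type(scalar_error).__name__, has_nan)
print(sparse.toarray())
```

Treating NaN as 0 makes it indistinguishable from an implicit zero in the sparse matrix, which is why the choice is flagged as an assumption.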
🚀 Application
Difficulty: advanced
Choosing sparse format for efficient row slicing
You have a large sparse dataset and need to frequently access rows quickly. Which sparse matrix format is best suited for this?
💡 Hint
Think about which format stores data optimized for row access.
Explanation
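CSR (Compressed Sparse Row) stores each row's non-zero values contiguously and keeps an index pointer array marking where every row starts and ends. Extracting a row or a block of rows is therefore fast compared with the column-oriented CSC format or the coordinate (COO) format, which is intended mainly for construction.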
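To make the row-oriented layout concrete, here is a small sketch that reads one row directly from the CSR arrays `indptr`, `indices`, and `data`. The helper `row_entries` and the example matrix are hypothetical and exist only for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

dense = np.array([
    [0, 0, 5, 0],
    [1, 0, 0, 2],
    [0, 0, 0, 0],
    [0, 3, 0, 0],
])
X = csr_matrix(dense)

def row_entries(m, i):
    """Return (column indices, values) of row i using the CSR arrays."""
    # indptr[i]:indptr[i + 1] is the slice of indices/data belonging to row i.
    start, end = m.indptr[i], m.indptr[i + 1]
    return m.indices[start:end], m.data[start:end]

cols, vals = row_entries(X, 1)

# Ordinary slicing of a block of rows also works on CSR matrices.
row_block = X[1:3]

print(X.indptr, cols, vals)
print(row_block.toarray())
```

Because a row is one contiguous slice of `indices` and `data`, looking it up needs no scan over the rest of the matrix.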
🧠 Conceptual
Difficulty: expert
Impact of sparsity on machine learning model training
How does high sparsity in input data typically affect training of linear models like logistic regression?
💡 Hint
Consider how sparse data reduces computation and storage needs.
Explanation
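When sparse data is stored in a sparse format, the solver only touches the non-zero entries, so training time and memory scale with the number of non-zero values rather than with the full samples × features size. The number of features itself is unchanged, and accuracy is not necessarily harmed.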
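As a rough sketch of why this helps, the example below compares the memory of a CSR matrix with its dense equivalent and computes the linear scores z = Xw that a model such as logistic regression needs on every pass. The matrix size, the 1% density, and the random weights `w` are arbitrary assumptions chosen for illustration.

```python
import numpy as np
from scipy.sparse import random as sparse_random

# Assumed toy problem: 1000 samples, 500 features, about 1% non-zero entries.
X = sparse_random(1000, 500, density=0.01, format="csr", random_state=0)
X_dense = X.toarray()

rng = np.random.default_rng(0)
w = rng.normal(size=500)  # stand-in for model weights

# Memory held by the CSR arrays versus the full dense array.
sparse_bytes = X.data.nbytes + X.indices.nbytes + X.indptr.nbytes
dense_bytes = X_dense.nbytes

# Linear scores: the sparse product only visits stored entries.
z_sparse = X @ w
z_dense = X_dense @ w

print("sparse bytes:", sparse_bytes, "dense bytes:", dense_bytes)
print("same scores:", np.allclose(z_sparse, z_dense))
```

Both products give the same scores; the sparse version simply does the work over far fewer stored values.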