Imagine you have a large matrix mostly filled with zeros. Why does using a sparse matrix save memory compared to a regular dense matrix?
Think about what data is actually stored in a sparse matrix.
Sparse matrices save memory by storing only the non-zero values and their locations, instead of storing every element including zeros.
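To make "values and their locations" concrete, here is a minimal sketch that builds a small CSR matrix and looks at the three arrays SciPy keeps for it. The 3x4 matrix and the variable names are made up for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative 3x4 matrix; the values are arbitrary.
dense = np.array([
    [0, 0, 5, 0],
    [0, 0, 0, 0],
    [7, 0, 0, 9],
])

sparse = csr_matrix(dense)

# The non-zero values, in row-major order.
values = sparse.data.tolist()
# The column index of each stored value.
columns = sparse.indices.tolist()
# Row boundaries: row i owns values[row_pointers[i]:row_pointers[i + 1]].
row_pointers = sparse.indptr.tolist()

print(values)        # [5, 7, 9]
print(columns)       # [2, 0, 3]
print(row_pointers)  # [0, 1, 1, 3]
```

The nine zeros never appear in any of the three arrays; the empty middle row shows up only as two equal neighbouring entries in the row pointers.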
What is the output of this code showing memory usage in bytes?
import numpy as np
from scipy.sparse import csr_matrix

size = 10000
matrix_dense = np.zeros((size, size))
matrix_dense[0, 0] = 1
matrix_sparse = csr_matrix(matrix_dense)

print(matrix_dense.nbytes)
print(matrix_sparse.data.nbytes + matrix_sparse.indptr.nbytes + matrix_sparse.indices.nbytes)
Check how many bytes the dense matrix uses and compare to the sparse matrix components.
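The dense matrix stores all 10000 x 10000 elements as 8-byte floats, so it uses 800,000,000 bytes. The sparse matrix stores only the single non-zero value plus its index arrays. With 32-bit indices (typical in SciPy for a matrix this size) that is 8 bytes of data, 4 bytes of column indices, and 40,004 bytes of row pointers: 40,016 bytes in total.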
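The byte counts can be worked out by hand without allocating the 800 MB array. The sketch below assumes 8-byte float values and 4-byte indices; the function names dense_bytes and csr_bytes are hypothetical helpers for this illustration, not part of NumPy or SciPy.

```python
def dense_bytes(rows, cols, value_size=8):
    """Bytes for a dense array: every element is stored."""
    return rows * cols * value_size


def csr_bytes(rows, nnz, value_size=8, index_size=4):
    """Bytes for CSR, assuming 4-byte indices (an assumption, not a guarantee)."""
    data = nnz * value_size          # one value per non-zero
    indices = nnz * index_size       # one column index per non-zero
    indptr = (rows + 1) * index_size # one row pointer per row, plus one
    return data + indices + indptr


dense_total = dense_bytes(10000, 10000)
csr_total = csr_bytes(10000, 1)

print(dense_total)  # 800000000
print(csr_total)    # 40016
```

Almost all of the CSR total comes from the row pointers, which grow with the number of rows rather than with the number of non-zero values.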
Given a 5x5 matrix with only 3 non-zero elements, how many elements does the sparse matrix store?
from scipy.sparse import csr_matrix
import numpy as np

matrix = np.zeros((5, 5))
matrix[0, 1] = 10
matrix[2, 3] = 20
matrix[4, 4] = 30

sparse = csr_matrix(matrix)
print(len(sparse.data))
Count how many non-zero values are in the matrix.
The sparse matrix stores only the non-zero values. Here, there are exactly 3 non-zero elements.
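As a small cross-check, the stored count can also be read from the matrix's nnz attribute and compared with NumPy's count of non-zero entries in the dense array. The density variable is an addition for illustration only.

```python
import numpy as np
from scipy.sparse import csr_matrix

matrix = np.zeros((5, 5))
matrix[0, 1] = 10
matrix[2, 3] = 20
matrix[4, 4] = 30

sparse = csr_matrix(matrix)

# Number of entries the sparse matrix actually stores.
stored = sparse.nnz
# Number of non-zero entries in the original dense array.
nonzero_in_dense = int(np.count_nonzero(matrix))
# Fraction of the 25 cells that hold a stored value.
density = stored / (matrix.shape[0] * matrix.shape[1])

print(stored, nonzero_in_dense, density)
```

Here both counts agree, and only 3 of the 25 cells are stored.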
What error does this code raise and why?
from scipy.sparse import csr_matrix

matrix = [[0, 0], [0, 0]]
sparse = csr_matrix(matrix)
print(sparse.data[0])
Check what happens when the matrix has no non-zero elements.
The matrix has no non-zero elements, so sparse.data is an empty array. Indexing it with sparse.data[0] raises an IndexError.
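One way to avoid the error is to check how many values are stored before indexing. This is a minimal sketch; using None as the fallback value is an assumption made for illustration.

```python
from scipy.sparse import csr_matrix

sparse = csr_matrix([[0, 0], [0, 0]])

# Only index into data when at least one value is stored.
if sparse.nnz > 0:
    first_value = sparse.data[0]
else:
    first_value = None  # nothing stored, so there is no first value

print(first_value)
```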
You have a large sparse matrix with many rows but few non-zero elements per row. Which sparse format is best to save memory and why?
Think about which format stores data efficiently for many rows with few non-zero elements each.
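CSR (Compressed Sparse Row) stores only the non-zero values, their column indices, and one pointer per row marking where that row's values begin. Because it keeps one pointer per row instead of a row index for every non-zero value, it is more compact than coordinate (COO) storage once rows hold more than about one non-zero element on average, and it makes row access fast.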
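The comparison with coordinate (COO) storage can be checked directly. The sketch below builds an illustrative matrix with 1000 rows and 3 non-zero values per row; the sizes and the stored_bytes helper are assumptions for this example, and exact byte counts depend on the index type SciPy picks.

```python
import numpy as np
from scipy.sparse import coo_matrix

n_rows, n_cols, per_row = 1000, 1000, 3

# Three non-zero values in each row, placed in columns 0, 1 and 2.
rows = np.repeat(np.arange(n_rows), per_row)
cols = np.tile(np.arange(per_row), n_rows)
data = np.ones(n_rows * per_row)

coo = coo_matrix((data, (rows, cols)), shape=(n_rows, n_cols))
csr = coo.tocsr()


def stored_bytes(*arrays):
    """Hypothetical helper: total bytes across the given NumPy arrays."""
    return sum(a.nbytes for a in arrays)


coo_total = stored_bytes(coo.data, coo.row, coo.col)
csr_total = stored_bytes(csr.data, csr.indices, csr.indptr)

print(coo_total, csr_total)
```

COO keeps a row index for each of the 3000 values, while CSR keeps 1001 row pointers, so CSR comes out smaller here. With fewer than about one non-zero value per row the balance tips the other way.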