CSR format (Compressed Sparse Row) in SciPy - Step-by-Step Execution

Concept Flow - CSR format (Compressed Sparse Row)
1. Start with a sparse matrix
2. Extract the non-zero values
3. Record the column indices of the non-zeros
4. Build the row pointer array
5. Store as CSR format: data, indices, indptr
CSR format stores a sparse matrix by saving only non-zero values, their column positions, and pointers to row starts.
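The flow above can be sketched by hand in plain Python. This is only an illustration of the idea, not how SciPy builds the arrays internally; the helper name dense_to_csr_arrays is hypothetical.

```python
import numpy as np

def dense_to_csr_arrays(dense):
    """Build the three CSR arrays by scanning a 2-D array row by row."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)    # keep the non-zero value
                indices.append(col)   # remember which column it came from
        indptr.append(len(data))      # position where the next row starts
    return np.array(data), np.array(indices), np.array(indptr)

matrix = np.array([[0, 0, 1], [2, 0, 0], [0, 3, 0]])
data, indices, indptr = dense_to_csr_arrays(matrix)
print(data, indices, indptr)
```

Each pass over a row appends one entry to indptr, which is why indptr ends up one element longer than the number of rows.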
Execution Sample
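import numpy as np
from scipy.sparse import csr_matrix

# Dense 3x3 matrix with three non-zero entries
matrix = np.array([[0,0,1],[2,0,0],[0,3,0]])
csr = csr_matrix(matrix)  # convert the dense array to CSR
print(csr.data)     # non-zero values: [1 2 3]
print(csr.indices)  # column index of each value: [2 0 1]
print(csr.indptr)   # row start pointers: [0 1 2 3]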
This code converts a dense matrix to CSR format and prints its internal arrays.
Execution Table
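| Step | Action | Data Array | Indices Array | Indptr Array |
|------|--------|------------|---------------|--------------|
| 1 | Start with matrix [[0,0,1],[2,0,0],[0,3,0]] | | | |
| 2 | Extract non-zero values | [1, 2, 3] | | |
| 3 | Record column indices of non-zeros | [1, 2, 3] | [2, 0, 1] | |
| 4 | Build row pointer array | [1, 2, 3] | [2, 0, 1] | [0, 1, 2, 3] |
| 5 | Store as CSR format | [1, 2, 3] | [2, 0, 1] | [0, 1, 2, 3] |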
💡 After step 5, the three CSR arrays are complete and fully represent the sparse matrix.
Variable Tracker
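| Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| data | [] | [1, 2, 3] | [1, 2, 3] | [1, 2, 3] | [1, 2, 3] |
| indices | [] | [] | [2, 0, 1] | [2, 0, 1] | [2, 0, 1] |
| indptr | [0] | [0] | [0] | [0, 1, 2, 3] | [0, 1, 2, 3] |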
Key Moments - 2 Insights
Why is the 'indices' array [2, 0, 1] rather than sorted?
The 'indices' array records the column positions of the non-zero values in the order they appear row by row, so it lines up element for element with 'data'. It is not sorted across the whole array (see step 3 of the Execution Table).
What does the 'indptr' array represent?
'indptr' shows where each row's data starts in the 'data' array. For example, indptr[1]=1 means row 1's data starts at data[1], and in general row i's entries occupy data[indptr[i]:indptr[i+1]] (see step 4 of the Execution Table).
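To make the role of 'indptr' concrete, here is a minimal sketch that reads one row directly from the three arrays. The helper row_entries is a hypothetical name introduced for illustration; it is not a SciPy function.

```python
import numpy as np
from scipy.sparse import csr_matrix

csr = csr_matrix(np.array([[0, 0, 1], [2, 0, 0], [0, 3, 0]]))

def row_entries(m, i):
    """Return (column indices, values) stored for row i of a CSR matrix."""
    start, end = m.indptr[i], m.indptr[i + 1]   # slice bounds for row i
    return m.indices[start:end], m.data[start:end]

cols, vals = row_entries(csr, 1)
print(cols, vals)  # row 1 holds the value 2 in column 0
```

Because the slice bounds come straight from two neighbouring indptr entries, looking up a row does not require scanning the other rows.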
Visual Quiz - 3 Questions
Test your understanding
In the Execution Table at step 3, what is the value of indices[1]?
A. 2
B. 1
C. 0
D. 3
💡 Hint
Check the 'indices' array column in execution_table row for step 3.
At which step does the 'indptr' array get its final form?
A. Step 2
B. Step 4
C. Step 3
D. Step 5
💡 Hint
Look at the 'indptr' column in execution_table and see when it changes from [0] to [0,1,2,3].
If the matrix had an extra zero row at the end, how would 'indptr' change?
A. [0, 1, 2, 3, 3]
B. [0, 1, 2, 3]
C. [0, 1, 2, 3, 4]
D. [0, 1, 2, 4]
💡 Hint
The last number in 'indptr' equals the total number of stored non-zero elements; an all-zero row adds no new data but still adds one pointer.
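One way to check your answer to the last question is to append an all-zero row and inspect 'indptr' yourself. This is a small sketch; the names padded, csr_padded and row_counts are illustrative and not part of the original example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Same matrix as before, with an extra all-zero row at the end
padded = np.array([[0, 0, 1], [2, 0, 0], [0, 3, 0], [0, 0, 0]])
csr_padded = csr_matrix(padded)

print(csr_padded.indptr)

# Differences between neighbouring pointers give the stored entries per row
row_counts = np.diff(csr_padded.indptr)
print(row_counts)
```

A row with no stored entries shows up as two equal neighbouring pointers, so its count is zero.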
Concept Snapshot
CSR format stores sparse matrices efficiently by:
- data: non-zero values in row order
- indices: column indices of these values
- indptr: pointers to row starts in data (length is number of rows plus one)
Use csr_matrix() from scipy.sparse to convert a dense array to CSR
Access arrays via .data, .indices, .indptr
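The conversion also works in the other direction. As a sketch, csr_matrix accepts a (data, indices, indptr) tuple, so the three arrays from this example can be turned back into a matrix and checked against the original; the variable names rebuilt and dense are illustrative.

```python
import numpy as np
from scipy.sparse import csr_matrix

data = np.array([1, 2, 3])
indices = np.array([2, 0, 1])
indptr = np.array([0, 1, 2, 3])

# Build the CSR matrix directly from its three arrays
rebuilt = csr_matrix((data, indices, indptr), shape=(3, 3))

# Expand back to a dense array to confirm the round trip
dense = rebuilt.toarray()
print(dense)
```

Passing shape explicitly avoids relying on SciPy to infer the number of columns from the largest column index.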
Full Transcript
CSR format compresses sparse matrices by storing only the non-zero values, their column positions, and row start pointers. The example matrix [[0,0,1],[2,0,0],[0,3,0]] has non-zero values [1,2,3], and their column indices are [2,0,1] respectively. The indptr array [0,1,2,3] marks where each row's data starts in the data array. This structure saves memory when most entries are zero and supports fast row slicing and matrix-vector products. The Execution Table shows step by step how these arrays are built up. The key points are that indices follows the row-wise order of the non-zero elements and that the length of indptr is the number of rows plus one. The quizzes test your understanding of these arrays and how they change.