
CSR format (Compressed Sparse Row) in SciPy

Introduction

CSR (Compressed Sparse Row) format stores big tables with many zeros in a small space by keeping only the non-zero values and their positions. This uses less memory and makes many operations on these tables faster.

Use CSR format in these situations:
When you have a large matrix with mostly zero values.
When you want to save memory while storing sparse data.
When you need to do fast math operations on sparse matrices.
When working with graphs or networks represented as adjacency matrices.
When loading or saving sparse data efficiently.
Syntax
SciPy
from scipy.sparse import csr_matrix

csr_matrix(data, shape=(rows, cols))

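data can be a dense array, a list of lists, another sparse matrix, or a (values, (rows, cols)) tuple in coordinate format.

The shape parameter sets the size of the matrix. It can be omitted when the size can be inferred from data, for example from a dense array.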

Examples
This converts a normal list of lists into a CSR sparse matrix.
SciPy
from scipy.sparse import csr_matrix

# Create CSR from dense matrix
dense = [[0, 0, 1], [1, 0, 0], [0, 2, 0]]
sparse = csr_matrix(dense)
This builds a CSR matrix from separate arrays of values and their positions.
SciPy
from scipy.sparse import csr_matrix

# Create CSR from data, row indices, and column indices
values = [3, 4, 5]
rows = [0, 1, 2]
cols = [2, 0, 1]
sparse = csr_matrix((values, (rows, cols)), shape=(3, 3))
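To check what the coordinate-style constructor produced, the result can be converted back to a dense array. This is a small sketch that reuses the same values, rows, and cols as above; the variable name dense is only for illustration.

```python
from scipy.sparse import csr_matrix

values = [3, 4, 5]
rows = [0, 1, 2]
cols = [2, 0, 1]
sparse = csr_matrix((values, (rows, cols)), shape=(3, 3))

# Each triple (rows[k], cols[k], values[k]) becomes one stored entry
dense = sparse.toarray()
print(dense)
print('stored entries:', sparse.nnz)
```

Every position that is not listed in rows and cols is treated as zero.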
Sample Program

This program shows how to convert a dense matrix to CSR format and back. It prints the internal CSR arrays that store only the non-zero values and their positions.

SciPy
from scipy.sparse import csr_matrix
import numpy as np

# Create a dense matrix with many zeros
matrix = np.array([
    [0, 0, 1, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 3],
    [0, 4, 0, 0]
])

# Convert to CSR format
csr = csr_matrix(matrix)

# Print the CSR data arrays
print('data:', csr.data)
print('indices:', csr.indices)
print('indptr:', csr.indptr)

# Convert back to dense to check
print('dense matrix from CSR:\n', csr.toarray())
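Output
data: [1 2 3 4]
indices: [2 0 3 1]
indptr: [0 1 2 3 4]
dense matrix from CSR:
 [[0 0 1 0]
 [2 0 0 0]
 [0 0 0 3]
 [0 4 0 0]]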
Important Notes

CSR format stores three arrays: data (the non-zero values, row by row), indices (the column index of each value), and indptr (row pointers: the entries of row i are data[indptr[i]:indptr[i+1]]).
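One way to see how the three arrays fit together is to rebuild the dense matrix from them by hand. The sketch below reuses the matrix from the sample program; the name rebuilt and the explicit loops are only for illustration, since toarray() does the same job directly.

```python
import numpy as np
from scipy.sparse import csr_matrix

matrix = np.array([
    [0, 0, 1, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 3],
    [0, 4, 0, 0],
])
csr = csr_matrix(matrix)

# Rebuild the dense matrix using only data, indices, and indptr
rebuilt = np.zeros(csr.shape, dtype=csr.dtype)
for i in range(csr.shape[0]):
    # indptr[i] and indptr[i + 1] bound the stored entries of row i
    start, end = csr.indptr[i], csr.indptr[i + 1]
    for col, value in zip(csr.indices[start:end], csr.data[start:end]):
        rebuilt[i, col] = value

print(rebuilt)
```

The length of indptr is always the number of rows plus one, and its last entry equals the number of stored values.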

It is efficient for row slicing and matrix-vector multiplication. Column slicing and adding new non-zero entries are comparatively slow.
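Both operations can be tried on the matrix from the sample program. This is a minimal sketch; the vector x and the names top_rows and y are made up for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix

csr = csr_matrix(np.array([
    [0, 0, 1, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 3],
    [0, 4, 0, 0],
]))

# Row slicing: take the first two rows; the result is still sparse
top_rows = csr[:2]
print(top_rows.toarray())

# Matrix-vector multiplication with a dense vector
x = np.array([1, 2, 3, 4])
y = csr @ x
print(y)
```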

To see the full matrix of a csr_matrix, use toarray(), which returns a NumPy array, or todense(), which returns a numpy.matrix.
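The two conversion methods hold the same values but return different types, which can matter for later code. A short sketch, assuming a csr_matrix built from the first example's list of lists:

```python
import numpy as np
from scipy.sparse import csr_matrix

csr = csr_matrix([[0, 0, 1], [1, 0, 0], [0, 2, 0]])

as_array = csr.toarray()    # regular NumPy array
as_matrix = csr.todense()   # numpy.matrix, always two-dimensional

print(type(as_array).__name__)
print(type(as_matrix).__name__)
```

For most NumPy code, toarray() is the safer choice because it returns an ordinary array.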

Summary

CSR format saves space by storing only non-zero values and their positions.

It is useful for large sparse matrices common in data science and machine learning.

SciPy provides easy tools to convert between dense and CSR formats.
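The space saving can be checked by comparing how many bytes the dense array and the three CSR arrays use. The sketch below builds a hypothetical 1000 x 1000 matrix with at most 1000 non-zero entries; the sizes, the random seed, and the variable names are assumptions made for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical test matrix: 1000 x 1000 with at most 1000 non-zero entries
rng = np.random.default_rng(0)
dense = np.zeros((1000, 1000))
rows = rng.integers(0, 1000, size=1000)
cols = rng.integers(0, 1000, size=1000)
dense[rows, cols] = 1.0

csr = csr_matrix(dense)

# Memory used by the dense array versus the three CSR arrays
dense_bytes = dense.nbytes
csr_bytes = csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes

print('dense bytes:', dense_bytes)
print('CSR bytes:', csr_bytes)
```

The exact CSR size depends on the index type SciPy picks, but it grows with the number of non-zero values rather than with the full size of the matrix.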