
CSC format (Compressed Sparse Column) in SciPy

Introduction

CSC format helps store big sparse matrices efficiently by saving only non-zero values and their positions. This saves memory and speeds up calculations.

When you have a large matrix mostly filled with zeros and want to save memory.
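When you need fast column slicing or column-wise access on a sparse matrix. (Changing which entries are non-zero is expensive in CSC; build the matrix in a format such as LIL or DOK first.)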
When performing matrix operations like multiplication on sparse data.
When working with data like text features or graphs that are sparse.
When you want to convert data into a format that many scientific libraries understand.
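A rough way to see the memory saving described above is to compare the bytes used by a dense array with the bytes used by the three arrays behind a CSC matrix. This is a minimal sketch; the 1000 x 1000 size and the three values are made up for illustration, and the exact sparse byte count depends on the index type SciPy chooses on your system.

```python
import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical data: a 1000 x 1000 matrix with only 3 non-zero entries
dense = np.zeros((1000, 1000), dtype=np.float64)
dense[0, 0] = 1.5
dense[500, 20] = 2.5
dense[999, 999] = 3.5

sparse = csc_matrix(dense)

# Memory used by the dense array (every zero is stored)
dense_bytes = dense.nbytes

# Memory used by the arrays that back the CSC matrix
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print(sparse.nnz)    # number of stored values: 3
print(dense_bytes)   # 8000000
print(sparse_bytes)  # far smaller than dense_bytes
```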
Syntax
SciPy
from scipy.sparse import csc_matrix

csc = csc_matrix((data, (row_indices, col_indices)), shape=(rows, cols))

data is a list or array of non-zero values.

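row_indices and col_indices specify the positions of these values: data[i] is placed at row row_indices[i] and column col_indices[i].

shape is a (rows, cols) tuple giving the size of the matrix.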

Examples
This creates a 3x3 sparse matrix with values 4 at (0,0), 5 at (1,2), and 7 at (2,2).
SciPy
from scipy.sparse import csc_matrix

data = [4, 5, 7]
rows = [0, 1, 2]
cols = [0, 2, 2]
csc = csc_matrix((data, (rows, cols)), shape=(3, 3))
print(csc.toarray())
Convert a dense NumPy array to CSC format and print the non-zero values. They come out in column order (here 2, 3, 1), not in the row order you see in the array.
SciPy
import numpy as np
from scipy.sparse import csc_matrix

arr = np.array([[0, 0, 1], [2, 0, 0], [0, 3, 0]])
csc = csc_matrix(arr)
print(csc.data)
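To see how the column-by-column storage works for the array above, you can also print the two index arrays that sit next to data. This sketch uses the indices and indptr attributes of csc_matrix; the variable names j, start, end, col_values and col_rows are only for illustration.

```python
import numpy as np
from scipy.sparse import csc_matrix

arr = np.array([[0, 0, 1], [2, 0, 0], [0, 3, 0]])
csc = csc_matrix(arr)

print(csc.data)     # non-zero values, column by column: [2 3 1]
print(csc.indices)  # row index of each stored value: [1 2 0]
print(csc.indptr)   # column j lives in data[indptr[j]:indptr[j + 1]]: [0 1 2 3]

# Read the entries of column 1 by hand from the three arrays
j = 1
start, end = csc.indptr[j], csc.indptr[j + 1]
col_values = csc.data[start:end]     # values stored in column 1
col_rows = csc.indices[start:end]    # rows where those values sit
print(col_values.tolist(), col_rows.tolist())  # [3] [2]
```

Each column needs only a start and an end position, which is why reading a whole column is cheap in this format.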
Sample Program

This program creates a 3x3 sparse matrix with four non-zero values placed at specific row and column positions. It then prints the full matrix as a normal 2D array.

SciPy
from scipy.sparse import csc_matrix

# Define non-zero values and their positions
data = [10, 20, 30, 40]
row_indices = [0, 2, 2, 0]
col_indices = [0, 0, 1, 2]

# Create a 3x3 sparse matrix in CSC format
matrix = csc_matrix((data, (row_indices, col_indices)), shape=(3, 3))

# Print the dense form to see the full matrix
print(matrix.toarray())
Output
[[10  0 40]
 [ 0  0  0]
 [20 30  0]]
Important Notes

CSC format stores data column by column, which makes column slicing and other column-wise operations fast.
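As a small illustration of column access, the sketch below reuses the matrix from the sample program and slices out one column and then two columns. The names first_col and two_cols are chosen here for the example.

```python
from scipy.sparse import csc_matrix

data = [10, 20, 30, 40]
row_indices = [0, 2, 2, 0]
col_indices = [0, 0, 1, 2]
matrix = csc_matrix((data, (row_indices, col_indices)), shape=(3, 3))

# Slice a single column; the result is still a sparse matrix of shape (3, 1)
first_col = matrix[:, 0]
print(first_col.shape)                       # (3, 1)
print(first_col.toarray().ravel().tolist())  # [10, 0, 20]

# Pick several columns at once with a list of column indices
two_cols = matrix[:, [0, 2]]
print(two_cols.toarray().tolist())           # [[10, 40], [0, 0], [20, 0]]
```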

To access rows efficiently, consider CSR format instead.

You can convert between sparse formats easily using the tocsc() and tocsr() methods.
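A short sketch of a round trip between the two formats, again using the sample program's values. The format attribute reports which format a sparse matrix is in; the names as_csr, back and same are illustrative.

```python
import numpy as np
from scipy.sparse import csc_matrix

matrix = csc_matrix(([10, 20, 30, 40], ([0, 2, 2, 0], [0, 0, 1, 2])), shape=(3, 3))

# Convert to CSR for row-oriented work, then back to CSC
as_csr = matrix.tocsr()
back = as_csr.tocsc()

print(as_csr.format)  # csr
print(back.format)    # csc

# The stored values are unchanged by the conversion
same = np.array_equal(matrix.toarray(), back.toarray())
print(same)           # True
```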

Summary

CSC format stores only non-zero values and their row positions, organized by columns.

It is memory efficient for large sparse matrices and speeds up column-based operations.

Use scipy.sparse.csc_matrix to create and work with CSC matrices in Python.