Matrix factorization helps us break a big table of numbers into smaller, simpler tables. This makes it easier to find hidden patterns and make predictions.
Matrix factorization basics in machine learning with Python
Given a matrix R (m × n), find two matrices P (m × k) and Q (k × n) such that R ≈ P × Q, where k is the number of latent features (smaller than m and n).
The number k controls how much detail you keep.
Matrix multiplication of P and Q reconstructs an approximation of R.
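A minimal NumPy sketch of the shapes involved (random values, just to illustrate the dimensions):

```python
import numpy as np

m, n, k = 5, 3, 2          # matrix dimensions and number of latent features

P = np.random.rand(m, k)   # m x k factor
Q = np.random.rand(k, n)   # k x n factor

R_approx = P @ Q           # the product has the original m x n shape
print(R_approx.shape)      # (5, 3)
```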
Example: given

R = [[5, 3, 0],
     [4, 0, 0],
     [1, 1, 0],
     [0, 0, 5],
     [0, 0, 4]]

find P (5 × 2) and Q (2 × 3) such that R ≈ P × Q. One possible factorization:

P = [[1.2, 0.5],
     [1.0, 0.3],
     [0.5, 0.7],
     [0.0, 1.5],
     [0.1, 1.3]]

Q = [[4.0, 2.5, 0.1],
     [0.2, 0.3, 4.5]]
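You can multiply these factors back together to check how well they approximate R. A quick sketch; the reconstruction only roughly matches the known (nonzero) entries:

```python
import numpy as np

# Known ratings matrix (zeros mark missing values)
R = np.array([[5, 3, 0], [4, 0, 0], [1, 1, 0], [0, 0, 5], [0, 0, 4]], dtype=float)

# The example factors from the text
P = np.array([[1.2, 0.5], [1.0, 0.3], [0.5, 0.7], [0.0, 1.5], [0.1, 1.3]])
Q = np.array([[4.0, 2.5, 0.1], [0.2, 0.3, 4.5]])

R_hat = P @ Q
print(np.round(R_hat, 2))  # known entries such as R[0,0]=5 come out near 4.9
```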
This code shows a simple matrix factorization using gradient descent to approximate a user-item rating matrix. It updates two smaller matrices to reconstruct the original matrix.
```python
import numpy as np

# Original matrix with missing values as zeros
R = np.array([
    [5, 3, 0],
    [4, 0, 0],
    [1, 1, 0],
    [0, 0, 5],
    [0, 0, 4]
], dtype=float)

# Number of latent features
k = 2

# Initialize P and Q with random values
np.random.seed(42)
P = np.random.rand(R.shape[0], k)
Q = np.random.rand(k, R.shape[1])

# Set learning rate and iterations
alpha = 0.01
iterations = 5000

# Matrix factorization using gradient descent
for _ in range(iterations):
    for i in range(R.shape[0]):
        for j in range(R.shape[1]):
            if R[i, j] > 0:  # Only update for known ratings
                eij = R[i, j] - np.dot(P[i, :], Q[:, j])
                for r in range(k):
                    p_old = P[i, r]  # keep the old value so both updates use it
                    P[i, r] += alpha * 2 * eij * Q[r, j]
                    Q[r, j] += alpha * 2 * eij * p_old

# Reconstruct the matrix
R_hat = np.dot(P, Q)

# Print original and reconstructed matrices
print("Original matrix R:")
print(R)
print("\nReconstructed matrix R_hat (approximation):")
print(np.round(R_hat, 2))
```
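Once the matrix is reconstructed, the entries where R was zero serve as predictions for the missing ratings. A sketch of reading them out; the `R_hat` values below are hypothetical placeholders standing in for the trained `P @ Q`:

```python
import numpy as np

# R as in the example; zeros mark missing ratings
R = np.array([[5, 3, 0], [4, 0, 0], [1, 1, 0], [0, 0, 5], [0, 0, 4]], dtype=float)

# Hypothetical reconstruction (in practice this comes from the trained P @ Q)
R_hat = np.array([[4.98, 3.01, 2.40],
                  [3.99, 2.40, 1.90],
                  [1.01, 0.99, 1.20],
                  [4.10, 2.60, 4.99],
                  [3.30, 2.10, 4.00]])

# Predictions for missing entries are simply R_hat where R is zero
rows, cols = np.where(R == 0)
for i, j in zip(rows, cols):
    print(f"Predicted rating for user {i}, item {j}: {R_hat[i, j]:.2f}")
```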
Matrix factorization can handle missing entries, but it needs enough known values to learn from: the sparser the matrix, the harder it is to recover reliable factors.
Choosing the right number of latent features (k) is important: too small loses detail, too large may overfit.
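One way to see the effect of k is to compare reconstruction error on the known entries for a few values. A sketch, wrapping the gradient-descent loop above in a hypothetical helper `factorize`:

```python
import numpy as np

# Known ratings; zeros mark missing values
R = np.array([[5, 3, 0], [4, 0, 0], [1, 1, 0], [0, 0, 5], [0, 0, 4]], dtype=float)
mask = R > 0

def factorize(R, k, alpha=0.01, iterations=2000, seed=42):
    """Hypothetical helper: the same gradient-descent loop, for a given k."""
    rng = np.random.default_rng(seed)
    P = rng.random((R.shape[0], k))
    Q = rng.random((k, R.shape[1]))
    for _ in range(iterations):
        for i in range(R.shape[0]):
            for j in range(R.shape[1]):
                if R[i, j] > 0:
                    eij = R[i, j] - P[i, :] @ Q[:, j]
                    p_old = P[i, :].copy()  # update both factors from old values
                    P[i, :] += alpha * 2 * eij * Q[:, j]
                    Q[:, j] += alpha * 2 * eij * p_old
    return P, Q

# Larger k generally fits the known entries more closely
for k in (1, 2, 3):
    P, Q = factorize(R, k)
    rmse = np.sqrt(np.mean((R - P @ Q)[mask] ** 2))
    print(f"k={k}: RMSE on known entries = {rmse:.4f}")
```

Note that a lower error on the known entries does not guarantee better predictions for the missing ones; a large k can simply memorize the training data.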
Gradient descent updates P and Q to reduce the difference between the original and reconstructed matrix.
Matrix factorization breaks a big matrix into smaller ones to find hidden patterns.
It helps predict missing values and understand relationships in data.
Simple algorithms like gradient descent can find these smaller matrices step by step.