
Matrix factorization basics in ML Python - Model Pipeline Trace


This pipeline shows how matrix factorization breaks a big table of numbers into two smaller tables. This helps us find hidden patterns and make predictions, like guessing missing ratings in a movie rating table.

Data Flow - 5 Stages
Stage 1: Input data
  Input:       1000 rows x 500 columns
  Description: Original user-item rating matrix with some missing values
  Output:      1000 rows x 500 columns
  Example:     User 1 rated Movie 1 as 4, Movie 2 as missing, Movie 3 as 5
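A rating matrix like this can be sketched in NumPy, with NaN standing in for missing ratings. The shapes match the pipeline (1000 users x 500 movies); the 1-5 rating scale and the roughly 80% missing-rate are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 1000 x 500 user-item rating matrix:
# ratings on a 1-5 scale, with ~80% of entries missing (NaN).
n_users, n_items = 1000, 500
R = rng.integers(1, 6, size=(n_users, n_items)).astype(float)
missing = rng.random((n_users, n_items)) < 0.8  # True where rating is unknown
R[missing] = np.nan

print(R.shape)             # (1000, 500)
print(np.isnan(R).mean())  # fraction of missing entries, around 0.8
```

Storing the unknowns as NaN makes it easy later to compute the loss only on the ratings we actually observed.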
Stage 2: Initialize factor matrices
  Input:       1000 rows x 500 columns
  Description: Create two smaller matrices: user features (1000 x 10) and item features (500 x 10)
  Output:      User features 1000 rows x 10 columns, item features 500 rows x 10 columns
  Example:     User 1 features: [0.1, 0.3, ..., 0.05], Movie 1 features: [0.2, 0.4, ..., 0.1]
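Stage 2 amounts to drawing two small random matrices. A minimal sketch — the 0.1 scale for the initial values is an assumption, chosen so the numbers resemble the example features above and early predictions start near zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 1000, 500, 10  # k = number of latent features

# Small random starting values; each row is one user's (or item's) features
U = rng.normal(scale=0.1, size=(n_users, k))  # user features (1000 x 10)
V = rng.normal(scale=0.1, size=(n_items, k))  # item features (500 x 10)

print(U.shape, V.shape)  # (1000, 10) (500, 10)
```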
Stage 3: Matrix multiplication
  Input:       User features (1000 x 10), item features (500 x 10)
  Description: Multiply user and item feature matrices to approximate the original ratings
  Output:      1000 rows x 500 columns
  Example:     Predicted rating for User 1 and Movie 1: 3.8
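The multiplication step is a single matrix product: every predicted rating is the dot product of one user's feature row with one item's feature row. A toy-sized sketch (2 users, 3 items, 4 made-up features instead of 1000 x 500 x 10):

```python
import numpy as np

# Hypothetical small feature matrices
U = np.array([[0.1, 0.3, 0.2, 0.05],
              [0.4, 0.1, 0.3, 0.2]])    # user features (2 x 4)
V = np.array([[0.2, 0.4, 0.1, 0.1],
              [0.3, 0.2, 0.4, 0.05],
              [0.1, 0.5, 0.2, 0.3]])    # item features (3 x 4)

R_hat = U @ V.T          # predictions for every (user, item) pair, shape (2, 3)

# A single prediction is just one dot product
print(float(U[0] @ V[0]))   # equals R_hat[0, 0]
```

Note that with tiny random features the first predictions sit near zero, far from a real rating like 3.8 — that gap is exactly what training will shrink.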
Stage 4: Loss calculation
  Input:       Original and predicted ratings (1000 x 500)
  Description: Calculate the difference only on known ratings to measure error
  Output:      Single loss value (scalar)
  Example:     Loss = 0.25
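The key detail in the loss step is the mask: missing entries must not contribute to the error. A minimal sketch with made-up numbers (mean squared error is assumed here; the trace does not name the exact loss function):

```python
import numpy as np

R = np.array([[4.0, np.nan],
              [5.0, 3.0]])        # original ratings (NaN = unknown)
R_hat = np.array([[3.8, 2.0],
                  [4.9, 3.1]])    # predictions from the factor matrices

known = ~np.isnan(R)              # score only the entries we actually observed
loss = float(np.mean((R[known] - R_hat[known]) ** 2))
print(loss)                       # 0.02
```

The NaN entry is skipped entirely — the model is never penalized for its guess at a rating nobody provided.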
Stage 5: Update features
  Input:       User features (1000 x 10), item features (500 x 10), loss scalar
  Description: Adjust user and item features to reduce the loss using gradient descent
  Output:      Updated user features (1000 x 10), updated item features (500 x 10)
  Example:     User 1 features updated to [0.12, 0.28, ..., 0.06]
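A single gradient-descent update can be sketched as follows. For the masked mean-squared error, the gradient with respect to the user features is proportional to the masked error matrix times the item features (and symmetrically for items); the learning rate and matrix sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
R = np.array([[4.0, np.nan, 5.0],
              [np.nan, 3.0, 2.0]])       # tiny stand-in ratings matrix
known = ~np.isnan(R)

k = 2
U = rng.normal(scale=0.1, size=(2, k))   # user features
V = rng.normal(scale=0.1, size=(3, k))   # item features

def masked_loss(U, V):
    E = np.where(known, U @ V.T - R, 0.0)    # error, zero where rating unknown
    return float((E[known] ** 2).mean())

lr = 0.05
before = masked_loss(U, V)

E = np.where(known, U @ V.T - R, 0.0)
U_new = U - lr * 2 * (E @ V) / known.sum()   # gradient step for user features
V_new = V - lr * 2 * (E.T @ U) / known.sum() # gradient step for item features

after = masked_loss(U_new, V_new)
# after < before: one step nudges both factor matrices toward the known ratings
```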
Training Trace - Epoch by Epoch

Loss
1.2 |**************
1.0 |**********
0.8 |*******
0.6 |*****
0.4 |***
0.2 |**
0.0 +----------------
     1  5 10 15 20 Epochs
Epoch | Loss ↓ | Accuracy ↑ | Observation
------+--------+------------+------------------------------------------------------
    1 |  1.20  |    N/A     | Initial loss is high because features are random
    5 |  0.75  |    N/A     | Loss decreases as features start to capture patterns
   10 |  0.45  |    N/A     | Loss continues to decrease steadily
   15 |  0.30  |    N/A     | Model is learning well, loss is much lower
   20 |  0.25  |    N/A     | Loss stabilizes, model converges
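The epoch-by-epoch trace can be reproduced in miniature. This sketch trains a small factorization with plain gradient descent on synthetic low-rank data — the sizes, learning rate, and epoch count are assumptions; only the loss-falls-then-plateaus shape should match the table:

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 50, 30, 5

# Synthetic rank-k ratings with about half the entries missing
U_true = rng.normal(size=(n_users, k))
V_true = rng.normal(size=(n_items, k))
R = U_true @ V_true.T
R[rng.random(R.shape) < 0.5] = np.nan
known = ~np.isnan(R)

# Random initial factors, then plain gradient descent on the masked error
U = rng.normal(scale=0.1, size=(n_users, k))
V = rng.normal(scale=0.1, size=(n_items, k))
lr = 0.001
losses = []
for epoch in range(500):
    E = np.where(known, U @ V.T - R, 0.0)      # error on known entries only
    losses.append(float((E[known] ** 2).mean()))
    U, V = U - lr * 2 * (E @ V), V - lr * 2 * (E.T @ U)

print(losses[0], losses[-1])   # loss drops sharply early, then flattens out
```

The first epochs barely move because the tiny random features produce near-zero predictions everywhere; once the factors grow toward the right scale, the loss falls quickly and then levels off, just like the trace above.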
Prediction Trace - 3 Layers
Layer 1: Input user and item features
Layer 2: Dot product of user and item features
Layer 3: Compare predicted rating to known rating
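The three layers map directly onto a few lines of code. A sketch for a single (user, item) pair, with hypothetical feature values:

```python
import numpy as np

# Layer 1: input user and item features (4 hypothetical latent dimensions)
u = np.array([0.12, 0.28, 0.20, 0.06])   # user features
v = np.array([0.20, 0.40, 0.30, 0.10])   # item features

# Layer 2: dot product gives the predicted rating
pred = float(u @ v)

# Layer 3: compare the prediction to the known rating
actual = 4.0
error = actual - pred
print(pred, error)
```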
Model Quiz - 3 Questions
Test your understanding
What happens to the loss value as training progresses?
A. It stays the same
B. It increases steadily
C. It decreases steadily
D. It jumps randomly
Key Insight
Matrix factorization learns smaller user and item features that, when multiplied, recreate the original data closely. This helps predict missing values by capturing hidden patterns.