ML Python · ~12 mins

Gaussian Mixture Models in ML Python - Model Pipeline Trace

Model Pipeline - Gaussian Mixture Models

This pipeline uses Gaussian Mixture Models (GMM) to find groups in data by assuming each group follows a bell-shaped (Gaussian) distribution. It learns the position, shape, and weight of each Gaussian so that, together, they best explain the data.
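One way to picture this assumption: the data is drawn from a mixture of bell curves. The sketch below samples 300 points from three 2-D Gaussians; the means, covariances, and mixing weights are made-up illustration values, not parameters from this page's dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters for three Gaussian components
means = np.array([[-0.8, 0.5], [0.1, -0.2], [1.5, 1.0]])
covs = np.array([[[0.5, 0.0], [0.0, 0.3]]] * 3)
weights = np.array([0.3, 0.4, 0.3])  # mixing proportions, sum to 1

# Sample 300 points: first pick a component, then draw from its Gaussian
n = 300
components = rng.choice(3, size=n, p=weights)
X = np.array([rng.multivariate_normal(means[k], covs[k]) for k in components])
print(X.shape)  # (300, 2)
```

Fitting a GMM is the inverse problem: given only `X`, recover the means, covariances, and weights.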

Data Flow - 6 Stages
Stage 1: Data In
What happens: Raw data points with two features
Input: 300 rows x 2 columns -> Output: 300 rows x 2 columns
Example: [[5.1, 3.5], [4.9, 3.0], [6.7, 3.1]]
Stage 2: Preprocessing
What happens: Standardize features to zero mean and unit variance
Input: 300 rows x 2 columns -> Output: 300 rows x 2 columns
Example: [[0.12, -0.45], [-0.34, -1.02], [1.23, 0.15]]
Stage 3: Feature Engineering
What happens: No additional features added; the standardized features are used as-is
Input: 300 rows x 2 columns -> Output: 300 rows x 2 columns
Example: [[0.12, -0.45], [-0.34, -1.02], [1.23, 0.15]]
Stage 4: Model Trains
What happens: Fit a GMM with 3 components using Expectation-Maximization (EM)
Input: 300 rows x 2 columns -> Output: parameters for 3 Gaussian components
Example: Means: [[-0.8, 0.5], [0.1, -0.2], [1.5, 1.0]]; Covariances: [[[0.5,0],[0,0.3]], ...]
Stage 5: Metrics Improve
What happens: Log-likelihood increases each iteration until convergence is reached
Input: model parameters -> Output: final log-likelihood -420.5
Example: Log-likelihood per iteration: [-500, -460, -430, -420.5]
Stage 6: Prediction
What happens: Compute the probability that a new sample belongs to each Gaussian component
Input: 1 row x 2 columns -> Output: 1 row x 3 columns (probabilities sum to 1)
Example: [0.05, 0.90, 0.05]
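The six stages above can be sketched end to end with scikit-learn. The toy data, random seed, and cluster centers below are illustrative assumptions, not the page's actual dataset:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stage 1: toy stand-in for the 300 x 2 input (three assumed clusters)
X = np.vstack([
    rng.normal([-2, 1], 0.5, size=(100, 2)),
    rng.normal([0, -1], 0.5, size=(100, 2)),
    rng.normal([3, 2], 0.5, size=(100, 2)),
])

# Stage 2: standardize to zero mean, unit variance
X_std = StandardScaler().fit_transform(X)

# Stage 4: fit a 3-component GMM via Expectation-Maximization
gmm = GaussianMixture(n_components=3, random_state=0).fit(X_std)

# Stage 6: per-component membership probabilities for one sample
probs = gmm.predict_proba(X_std[:1])
print(probs.round(2))  # 1 row x 3 columns, rows sum to 1
```

`predict_proba` gives the soft assignments; `gmm.predict` collapses them to a single hard cluster label per row.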
Training Trace - Epoch by Epoch
Log-likelihood per epoch (closer to 0 = better fit):
Epoch 1: -500.0 |************
Epoch 2: -460.0 |*********
Epoch 3: -430.0 |******
Epoch 4: -420.5 |*****
Epoch | Log-likelihood ↑ | Accuracy ↑ | Observation
1     | -500.0           | N/A        | Initial log-likelihood before EM steps
2     | -460.0           | N/A        | Log-likelihood improved after first EM iteration
3     | -430.0           | N/A        | Model parameters better fit the data clusters
4     | -420.5           | N/A        | Convergence reached; log-likelihood stabilizes

(Accuracy is N/A: clustering is unsupervised, so there are no labels to score against.)
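An epoch-by-epoch trace like the one above can be reproduced by running EM one step at a time. One way to do this in scikit-learn (an assumption about approach, not the page's original code) is `warm_start=True` with `max_iter=1`, so each `fit` call performs a single EM iteration; the toy data and seed are illustrative:

```python
import warnings
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy two-cluster data standing in for the standardized features
X = np.vstack([rng.normal(-2, 0.5, (150, 2)), rng.normal(2, 0.5, (150, 2))])

# warm_start=True reuses the previous parameters, so with max_iter=1 each
# call to fit() advances EM by exactly one iteration
gmm = GaussianMixture(n_components=2, max_iter=1, warm_start=True,
                      random_state=0)

log_likelihoods = []
for _ in range(5):
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")  # single-step fits warn about convergence
        gmm.fit(X)
    # score() returns the mean per-sample log-likelihood; scale to the total
    log_likelihoods.append(gmm.score(X) * len(X))

print([round(ll, 1) for ll in log_likelihoods])
```

EM guarantees the log-likelihood never decreases between iterations, which is exactly the monotone curve shown in the trace.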
Prediction Trace - 4 Layers
Layer 1: Input sample
Layer 2: Calculate Gaussian probabilities
Layer 3: Normalize probabilities
Layer 4: Assign cluster
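The four prediction layers map onto a few lines of NumPy/SciPy. The means, covariances, and equal weights below are assumed for illustration (they echo the example parameters from the training stage):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical fitted parameters for 3 components
means = [np.array([-0.8, 0.5]), np.array([0.1, -0.2]), np.array([1.5, 1.0])]
covs = [np.diag([0.5, 0.3])] * 3
weights = np.array([1 / 3, 1 / 3, 1 / 3])  # assumed equal mixing weights

x = np.array([0.0, -0.3])  # Layer 1: input sample

# Layer 2: weight * Gaussian density of x under each component
densities = np.array([w * multivariate_normal(m, c).pdf(x)
                      for w, m, c in zip(weights, means, covs)])

# Layer 3: normalize so the membership probabilities sum to 1
posteriors = densities / densities.sum()

# Layer 4: hard assignment = most probable component
cluster = int(np.argmax(posteriors))
print(posteriors.round(3), cluster)
```

This is Bayes' rule in action: the posterior for each component is its weighted density divided by the total density, which is why the output probabilities always sum to 1.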
Model Quiz - 3 Questions
Test your understanding
What does the Gaussian Mixture Model assume about the data?
A. Data is made of several bell-shaped groups
B. Data is perfectly linear
C. Data has no structure
D. Data is only one cluster
Key Insight
Gaussian Mixture Models find hidden groups by fitting bell-shaped curves to data. They use probabilities to softly assign points to clusters, allowing flexible and realistic grouping.