Recall & Review
beginner
What is a Gaussian Mixture Model (GMM)?
A Gaussian Mixture Model is a way to represent data as a mix of several bell-shaped curves (Gaussians). Each curve represents a group or cluster in the data.
intermediate
How does GMM differ from K-Means clustering?
GMM assumes data points come from a mix of Gaussian distributions and can assign probabilities to clusters, while K-Means assigns each point to exactly one cluster without probabilities.
intermediate
What is the role of the Expectation-Maximization (EM) algorithm in GMM?
EM finds good parameters for the Gaussian curves by repeating two steps until the fit stops improving: estimating the probability that each point belongs to each curve (Expectation), then updating each curve's mean, covariance, and weight to better fit those probabilities (Maximization).
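The two steps can be sketched for one-dimensional data in plain NumPy. This is a toy illustration, not scikit-learn's implementation: the function name `em_gmm_1d`, the quantile-based initialisation, the fixed iteration count, and the small variance floor are all assumptions made for the example.

```python
import numpy as np

def em_gmm_1d(x, n_components=2, n_iter=50):
    """Toy EM loop for a 1-D Gaussian mixture (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Assumed initialisation: spread the starting means over the data quantiles.
    means = np.quantile(x, np.linspace(0.1, 0.9, n_components))
    variances = np.full(n_components, x.var())
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # Expectation: probability that each point belongs to each component.
        diff = x[:, None] - means[None, :]
        dens = np.exp(-0.5 * diff**2 / variances) / np.sqrt(2 * np.pi * variances)
        weighted = weights * dens
        resp = weighted / weighted.sum(axis=1, keepdims=True)

        # Maximization: re-estimate weight, mean, and variance from those probabilities.
        nk = resp.sum(axis=0)
        weights = nk / len(x)
        means = (resp * x[:, None]).sum(axis=0) / nk
        # Assumed small floor keeps a variance from collapsing to zero.
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6

    return weights, means, variances

x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
weights, means, variances = em_gmm_1d(x)
print(weights, means, variances)
```

On this small dataset the two means should settle near the centers of the low and high groups.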
beginner
What are the main parameters of a Gaussian component in GMM?
Each Gaussian has a mean (center), covariance (shape and spread), and a weight (how much it contributes to the overall mix).
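In scikit-learn, these three parameters are exposed as attributes on a fitted `GaussianMixture`. A minimal sketch, assuming a small made-up one-dimensional dataset:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical toy data: two groups, around 2 and around 11.
X = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

print("weights:", gmm.weights_)                   # one weight per component, summing to 1
print("means:", gmm.means_.ravel())               # one center per component
print("covariances:", gmm.covariances_.ravel())   # one matrix per component (1x1 here)
```

With the default `covariance_type="full"`, `covariances_` holds one full matrix per component, so its shape is `(n_components, n_features, n_features)`.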
beginner
Why is GMM considered a soft clustering method?
Because it assigns probabilities to each data point for belonging to each cluster, instead of a hard yes/no assignment.
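The difference between soft and hard assignment is visible in scikit-learn through `predict_proba` and `predict`. The sketch below assumes synthetic data drawn from two overlapping normal distributions; the centers, spread, and sample sizes are made up for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Assumed synthetic data: two overlapping 1-D groups centered at 0 and 4.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(4.0, 1.0, 200)]).reshape(-1, 1)
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

queries = np.array([[0.0], [2.0], [4.0]])
probs = gmm.predict_proba(queries)   # soft: one probability per component, per point
hard = gmm.predict(queries)          # hard: index of the most probable component

print(probs.round(3))
print(hard)
```

The point at 2.0 lies between the two groups, so its probabilities should be far less one-sided than those of the points at 0.0 and 4.0.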
What does each component in a Gaussian Mixture Model represent?
A. A decision tree node
B. A Gaussian distribution representing a cluster
C. A single data point
D. A linear regression line
Each component is a Gaussian distribution that models one cluster in the data.
Which algorithm is commonly used to estimate parameters in GMM?
A. Expectation-Maximization
B. Gradient Descent
C. K-Nearest Neighbors
D. Support Vector Machine
Expectation-Maximization (EM) is used to find the best parameters for the Gaussian components.
What does the covariance matrix in a Gaussian component describe?
A. The center of the cluster
B. The probability of the cluster
C. The number of clusters
D. The shape and spread of the cluster
Covariance describes how data spreads and the shape of the Gaussian curve.
In GMM, what does a higher weight for a Gaussian component mean?
A. It contributes less to the overall model
B. It has fewer data points
C. It contributes more to the overall model
D. It has a smaller spread
A higher weight means the component has more influence in the mixture.
Why might GMM be preferred over K-Means for clustering?
A. GMM can model clusters with different shapes and sizes
B. GMM is faster to compute
C. GMM only works with binary data
D. GMM does not require parameter tuning
GMM can model clusters with different shapes because it uses covariance matrices, unlike K-Means which assumes spherical clusters.
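The effect of the covariance matrix can be illustrated by fitting the same data with `covariance_type="full"` and `covariance_type="spherical"` and comparing the average log-likelihood returned by `score`. The elongated synthetic clusters below are an assumption chosen to make the difference visible.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Assumed synthetic data: two elongated, tilted clusters.
rng = np.random.default_rng(0)
cov = np.array([[4.0, 3.5], [3.5, 4.0]])
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], cov, size=300),
    rng.multivariate_normal([12.0, 0.0], cov, size=300),
])

# "full" lets each component take any elliptical shape;
# "spherical" restricts each component to a single variance (a circle).
full = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
spherical = GaussianMixture(n_components=2, covariance_type="spherical", random_state=0).fit(X)

print("full:", full.score(X))
print("spherical:", spherical.score(X))
```

A higher score means the model explains the data better, so the full-covariance model should come out ahead on data shaped like this.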
Explain how the Expectation-Maximization algorithm works in Gaussian Mixture Models.
Think about guessing cluster membership and then improving the guess.
Describe the difference between hard clustering and soft clustering with examples.
Consider how certain or uncertain the cluster assignment is.
Practice
1. What is the main idea behind a Gaussian Mixture Model (GMM)?
easy
A. It assumes data is made of several bell-shaped groups mixed together.
B. It uses decision trees to split data into groups.
C. It finds the single best line to fit the data points.
D. It clusters data by measuring distances only.
Solution
Step 1: Understand GMM concept
GMM assumes data comes from multiple groups, each shaped like a bell curve (Gaussian).
Step 2: Compare with other methods
Unlike decision trees or distance-only methods, GMM models overlapping groups with probabilities.
Final Answer:
It assumes data is made of several bell-shaped groups mixed together. -> Option A
Quick Check:
GMM = mixture of Gaussians [OK]
Hint: Remember GMM = mix of bell curves for groups [OK]
Common Mistakes:
Confusing GMM with decision trees
Thinking GMM finds one line only
Assuming GMM uses only distances
2. Which Python library provides a built-in Gaussian Mixture Model class?
easy
A. matplotlib
B. pandas
C. scikit-learn
D. tensorflow
Solution
Step 1: Identify libraries for ML models
scikit-learn is a popular library with many ML models including GMM.
Step 2: Check other libraries' purpose
matplotlib is for plotting, pandas for data handling, tensorflow for deep learning, not GMM specifically.
Final Answer:
scikit-learn -> Option C
Quick Check:
GMM in scikit-learn [OK]
Hint: GMM class is in scikit-learn, not plotting or deep learning libs [OK]
Common Mistakes:
Choosing matplotlib for modeling
Confusing pandas with ML models
Picking tensorflow for GMM
3. What will the following Python code output?
from sklearn.mixture import GaussianMixture
import numpy as np
X = np.array([[1], [2], [3], [10], [11], [12]])
gmm = GaussianMixture(n_components=2, random_state=0)
gmm.fit(X)
labels = gmm.predict(X)
print(labels.tolist())
medium
A. [1, 0, 1, 0, 1, 0]
B. [0, 0, 0, 1, 1, 1]
C. [0, 1, 0, 1, 0, 1]
D. [1, 1, 1, 0, 0, 0]
Solution
Step 1: Understand data and model
Data has two clear groups: near 1-3 and near 10-12. GMM with 2 components fits these groups.
Step 2: Predict labels
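GMM assigns the first three points to one component and the last three to the other. Which component is numbered 0 is arbitrary and depends on initialization; the answer key assumes the low-valued group receives label 0.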
Final Answer:
[0, 0, 0, 1, 1, 1] -> Option B
Quick Check:
Groups split as low and high values [OK]
Hint: GMM labels cluster points close together [OK]
Common Mistakes:
Mixing label order (0 vs 1)
Assuming alternating labels
Ignoring clear group separation
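Because the component numbering is arbitrary, code that depends on a specific label order can relabel the components by their means. A minimal sketch, assuming the same data as in the question; the variable names `order`, `rank`, and `stable_labels` are made up for the example.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

X = np.array([[1], [2], [3], [10], [11], [12]])
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gmm.predict(X)

# Relabel so that label 0 always refers to the component with the smallest mean.
order = np.argsort(gmm.means_.ravel())   # component indices sorted by mean
rank = np.empty_like(order)
rank[order] = np.arange(len(order))      # maps old label -> new label
stable_labels = rank[labels]

print(stable_labels.tolist())
```

After relabeling, the low-valued group is always 0 and the high-valued group is always 1, regardless of how the fit numbered them.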
4. Identify the error in this GMM code snippet:
from sklearn.mixture import GaussianMixture
X = [[1, 2], [3, 4], [5, 6]]
gmm = GaussianMixture(n_components=2)
gmm.fit(X)
labels = gmm.predict(X)
print(labels)
medium
A. GaussianMixture requires a random_state parameter.
B. n_components must be 3 or more for this data.
C. fit() method should be called after predict().
D. X should be a NumPy array, not a list of lists.
Solution
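Step 1: Check data format for GMM
scikit-learn estimators accept any array-like input, so GaussianMixture converts a list of lists to a NumPy array internally and this snippet runs as written. Passing a NumPy array explicitly is the conventional practice, which is why option D is the intended answer, but it is a recommendation rather than a strict requirement.
Step 2: Verify other parameters and method order
n_components=2 is valid for three samples, random_state is optional, and fit() must be called before predict().
Final Answer:
X should be a NumPy array, not a list of lists. -> Option D
Quick Check:
NumPy arrays are the conventional input for GMM [OK]
Hint: Prefer NumPy arrays for GMM input data [OK]
Common Mistakes:
Assuming a list of lists raises an error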
Wrong order of fit and predict
Thinking random_state is mandatory
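scikit-learn documents the input `X` as array-like, so one way to check how the input format behaves is to fit the same model on a list of lists and on the equivalent NumPy array. The sketch below uses a slightly larger, made-up dataset than the question so that both components have several points.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical data: the same values as a plain list and as a NumPy array.
X_list = [[1, 2], [3, 4], [5, 6], [20, 21], [22, 23], [24, 25]]
X_array = np.asarray(X_list, dtype=float)

# Same seed for both fits, so only the input type differs.
labels_from_list = GaussianMixture(n_components=2, random_state=0).fit(X_list).predict(X_list)
labels_from_array = GaussianMixture(n_components=2, random_state=0).fit(X_array).predict(X_array)

print(labels_from_list)
print(labels_from_array)
```

Both calls should produce the same labels, since the list is converted to an array during input validation.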
5. You have a dataset with overlapping groups of different sizes and shapes. Which advantage of Gaussian Mixture Models makes them suitable here?
hard
A. They can model overlapping groups with different shapes using probabilities.
B. They always create groups of equal size.
C. They only work for groups that are perfectly separated.
D. They require groups to be circular and same size.
Solution
Step 1: Understand group overlap and shape
Real data groups often overlap and differ in shape and size.
Step 2: Match GMM strengths
GMM uses probabilities to model overlapping groups with different shapes, unlike simpler methods.
Final Answer:
They can model overlapping groups with different shapes using probabilities. -> Option A
Quick Check:
GMM handles overlap and shape variation [OK]
Hint: GMM models overlap and shape differences well [OK]