Multi-label classification in ML Python

Introduction
Multi-label classification helps us find all the correct answers when one thing can belong to many groups at the same time.
Tagging photos where one photo can have cats, dogs, and cars all together.
Detecting emotions in a sentence where multiple feelings like happy and surprised can appear.
Classifying news articles that can belong to sports, politics, and health categories simultaneously.
Identifying diseases in medical images where a patient might have more than one condition.
Recommending products that fit multiple interests of a customer at once.
Syntax
ML Python
model = SomeMultiLabelModel()
model.fit(X_train, Y_train)
predictions = model.predict(X_test)
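Y_train and predictions are binary arrays with one row per example and one column per label, where 1 means the label applies and 0 means it does not. A single row can contain several 1s.
When training neural networks, use a loss that scores each label independently, such as binary cross-entropy.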
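The 0/1 label arrays described above are often built from raw tag lists. Here is a minimal sketch using scikit-learn's MultiLabelBinarizer. The tag sets are made-up illustrative data, not part of the lesson's dataset.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical tag sets for three photos (illustrative data only)
tags = [{"cat", "dog"}, {"car"}, {"cat", "car", "dog"}]

# Turn each tag set into a row of 0s and 1s, one column per label
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)  # shape: (n_samples, n_labels)

print(mlb.classes_)  # which label each column stands for
print(Y)             # binary indicator matrix, ready to use as Y_train
```

Keeping the fitted binarizer around is useful because mlb.inverse_transform can turn predicted 0/1 rows back into tag names.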
Examples
Using scikit-learn's MultiOutputClassifier to handle multi-label classification with logistic regression.
ML Python
from sklearn.multioutput import MultiOutputClassifier
from sklearn.linear_model import LogisticRegression

model = MultiOutputClassifier(LogisticRegression())
model.fit(X_train, Y_train)
predictions = model.predict(X_test)
A simple neural network with sigmoid activation for multi-label outputs and binary cross-entropy loss.
ML Python
import tensorflow as tf

model = tf.keras.Sequential([
  tf.keras.layers.Dense(64, activation='relu'),
  tf.keras.layers.Dense(num_labels, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, Y_train, epochs=5)
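# predict() returns a probability per label, so threshold to get 0/1 labels
probabilities = model.predict(X_test)
predictions = (probabilities > 0.5).astype(int)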
Sample Model
This example shows how to train and test a multi-label classifier using decision trees. It prints the accuracy for each label and the overall Hamming loss, which is the fraction of individual label predictions that are wrong. The data is random, so expect scores near chance level (accuracy around 0.5).
ML Python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, hamming_loss

# Create sample data: 100 samples, 5 features
# (seeded so that the results are reproducible)
rng = np.random.default_rng(42)
X = rng.random((100, 5))

# Create multi-label targets: 3 labels per sample
# Each label is 0 or 1 randomly
Y = rng.integers(0, 2, size=(100, 3))

# Split data
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

# Create multi-label model
model = MultiOutputClassifier(DecisionTreeClassifier(random_state=42))

# Train model
model.fit(X_train, Y_train)

# Predict
Y_pred = model.predict(X_test)

# Calculate accuracy per label
acc = [accuracy_score(Y_test[:, i], Y_pred[:, i]) for i in range(Y.shape[1])]

# Calculate Hamming loss (fraction of wrong labels)
hloss = hamming_loss(Y_test, Y_pred)

print(f"Accuracy per label: {acc}")
print(f"Hamming loss: {hloss:.3f}")
Important Notes
Multi-label classification is different from multi-class classification, where each example has exactly one label.
Use sigmoid activation and binary cross-entropy loss in neural networks for multi-label tasks.
Metrics like Hamming loss and accuracy per label help understand multi-label model performance.
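To make the sigmoid and binary cross-entropy note more concrete, here is a small NumPy sketch of what happens at the output layer. The logits and targets are invented for illustration, and the helper functions are hand-written stand-ins, not the Keras implementation.

```python
import numpy as np

def sigmoid(z):
    # Squash each raw score into (0, 1) independently of the other labels
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    # Clip to avoid log(0), then average the per-label losses
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    per_label = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return per_label.mean()

# Hypothetical raw outputs (logits) for 2 examples and 3 labels
logits = np.array([[2.0, -1.0, 0.5],
                   [-0.5, 1.5, -2.0]])
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])

probs = sigmoid(logits)
loss = binary_cross_entropy(y_true, probs)

print(probs.round(3))
print(f"Binary cross-entropy: {loss:.3f}")
```

Because every label gets its own probability and its own loss term, the model can learn to switch several labels on for the same example.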
Summary
Multi-label classification finds all correct labels for each example, not just one.
It is useful when items belong to multiple groups at once, like tagging photos or emotions.
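Use models that make one independent yes/no decision per label (such as MultiOutputClassifier or a sigmoid output layer) and metrics that count partial matches (such as Hamming loss or per-label accuracy).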

Practice

1. What is the main difference between multi-label classification and multi-class classification?
easy
A. Multi-label classification uses regression, multi-class uses classification.
B. Multi-label classification assigns only one label, multi-class assigns multiple labels.
C. Multi-label classification is used only for images, multi-class for text.
D. Multi-label classification assigns multiple labels to one example, multi-class assigns only one.

Solution

  1. Step 1: Understand multi-label classification

    Multi-label classification means each example can have more than one correct label at the same time.
  2. Step 2: Compare with multi-class classification

    Multi-class classification means each example can have only one label from many possible classes.
  3. Final Answer:

    Multi-label classification assigns multiple labels to one example, multi-class assigns only one. -> Option D
  4. Quick Check:

    Multi-label = multiple labels, multi-class = single label [OK]
Hint: Remember: multi-label means many labels per example [OK]
Common Mistakes:
  • Confusing multi-label with multi-class
  • Thinking multi-label assigns only one label
  • Mixing up classification with regression
  • Assuming multi-label is only for images
2. Which of the following is a correct way to represent labels for multi-label classification in Python?
easy
A. labels = [0, 1, 2]
B. labels = [[1, 0, 1], [0, 1, 0]]
C. labels = 'cat,dog,bird'
D. labels = 3

Solution

  1. Step 1: Understand label representation for multi-label

    Multi-label classification uses a list or array where each position represents a label, with 1 or 0 indicating presence or absence.
  2. Step 2: Check options for correct format

    labels = [[1, 0, 1], [0, 1, 0]] shows a list of lists with 1s and 0s, correctly representing multiple labels per example.
  3. Final Answer:

    labels = [[1, 0, 1], [0, 1, 0]] -> Option B
  4. Quick Check:

    Multi-label uses binary vectors per example [OK]
Hint: Use binary lists to show multiple labels [OK]
Common Mistakes:
  • Using a single integer for labels
  • Using a string instead of list
  • Using a flat list without nested structure
  • Confusing multi-class label format with multi-label
3. Given this Python code snippet for multi-label classification predictions:
import numpy as np
preds = np.array([[0.8, 0.1, 0.6], [0.3, 0.7, 0.2]])
threshold = 0.5
binary_preds = (preds > threshold).astype(int)
print(binary_preds)

What is the output?
medium
A. [[1 1 1] [0 0 0]]
B. [[0 1 0] [1 0 1]]
C. [[1 0 1] [0 1 0]]
D. [[0 0 0] [1 1 1]]

Solution

  1. Step 1: Apply threshold to predictions

    Compare each value in preds with 0.5: values > 0.5 become 1, else 0.
  2. Step 2: Convert boolean to int and print

    First row: 0.8>0.5=1, 0.1>0.5=0, 0.6>0.5=1; Second row: 0.3>0.5=0, 0.7>0.5=1, 0.2>0.5=0.
  3. Final Answer:

    [[1 0 1] [0 1 0]] -> Option C
  4. Quick Check:

    Thresholding preds > 0.5 = binary labels [OK]
Hint: Compare each prediction to threshold for binary output [OK]
Common Mistakes:
  • Confusing > with >=
  • Not converting boolean to int
  • Mixing rows and columns in output
  • Using wrong threshold value
4. You trained a multi-label model but it always predicts only one label per example. What is the most likely cause?
medium
A. Using softmax activation instead of sigmoid in the output layer
B. Using sigmoid activation instead of softmax in the output layer
C. Using binary cross-entropy loss
D. Using a threshold of 0.1 for predictions

Solution

  1. Step 1: Understand output activations for multi-label

    Multi-label models use sigmoid activation to allow independent probabilities per label.
  2. Step 2: Identify problem with softmax

    Softmax forces the probabilities to sum to 1, so at most one label can rise above a 0.5 threshold, which limits the model to a single predicted label.
  3. Final Answer:

    Using softmax activation instead of sigmoid in the output layer -> Option A
  4. Quick Check:

    Softmax limits to one label, sigmoid allows many [OK]
Hint: Use sigmoid for multi-label, softmax for single-label [OK]
Common Mistakes:
  • Confusing softmax and sigmoid activations
  • Ignoring loss function compatibility
  • Setting threshold too low or high
  • Assuming threshold fixes activation issues
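The softmax problem from question 4 can be checked with a few lines of NumPy. This is only a sketch: the logits are made up, and sigmoid and softmax are written by hand here rather than taken from a deep learning library.

```python
import numpy as np

def sigmoid(z):
    # Independent probability for each label
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Probabilities that compete with each other and sum to 1
    exp = np.exp(z - z.max())  # subtract the max for numerical stability
    return exp / exp.sum()

# Hypothetical raw scores for one example where two labels look likely
logits = np.array([3.0, 2.5, -1.0])

sig = sigmoid(logits)
soft = softmax(logits)

print("sigmoid:", sig.round(3), "labels above 0.5:", int((sig > 0.5).sum()))
print("softmax:", soft.round(3), "labels above 0.5:", int((soft > 0.5).sum()))
```

With the same scores, sigmoid keeps both strong labels above the threshold, while softmax lets only one of them through.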
5. You have a dataset where each image can have multiple tags like 'beach', 'sunset', and 'people'. You want to build a multi-label classifier. Which metric is best to evaluate your model's performance?
hard
A. Precision, Recall, and F1-score calculated per label and averaged
B. Accuracy (percentage of exact matches of all labels)
C. Mean Squared Error
D. Confusion matrix for single-label classification

Solution

  1. Step 1: Understand evaluation needs for multi-label

    Exact match accuracy is too strict because all labels must match perfectly, which is rare.
  2. Step 2: Choose suitable metrics

    Precision, Recall, and F1-score per label, then averaged, give a balanced view of performance on each label.
  3. Final Answer:

    Precision, Recall, and F1-score calculated per label and averaged -> Option A
  4. Quick Check:

    Use per-label metrics averaged for multi-label evaluation [OK]
Hint: Use per-label precision/recall for multi-label metrics [OK]
Common Mistakes:
  • Using strict accuracy that ignores partial matches
  • Using regression metrics like MSE
  • Using single-label confusion matrix
  • Ignoring label imbalance in metrics
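The metrics from question 5 can be compared side by side with scikit-learn. The sketch below uses a tiny invented set of image tags (not real data) to show how strict exact-match accuracy differs from averaged F1 and Hamming loss.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, hamming_loss

# Hypothetical ground truth and predictions for 4 images and 3 tags
# (columns: beach, sunset, people)
Y_true = np.array([[1, 1, 0],
                   [0, 1, 1],
                   [1, 0, 0],
                   [0, 0, 1]])
Y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 0, 1],
                   [0, 0, 1]])

# For multi-label input, accuracy_score counts a row only if every label matches
exact_match = accuracy_score(Y_true, Y_pred)
# Macro: compute F1 per label, then take the plain average
macro_f1 = f1_score(Y_true, Y_pred, average="macro")
# Micro: pool all label decisions together before computing F1
micro_f1 = f1_score(Y_true, Y_pred, average="micro")
# Fraction of individual label decisions that are wrong
hloss = hamming_loss(Y_true, Y_pred)

print(f"Exact-match accuracy: {exact_match:.3f}")
print(f"Macro F1: {macro_f1:.3f}")
print(f"Micro F1: {micro_f1:.3f}")
print(f"Hamming loss: {hloss:.3f}")
```

Only 2 of the 12 label decisions are wrong here, yet exact-match accuracy drops to half because each mistake spoils a whole row. Macro averaging treats every label equally, which tends to be the safer choice when some tags are rare.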