Multi-label classification in ML Python - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is multi-label classification?
Multi-label classification is a type of machine learning task where each example can belong to more than one class or category at the same time. For example, a photo can have both 'cat' and 'dog' labels.
beginner
How does multi-label classification differ from multi-class classification?
In multi-class classification, each example belongs to only one class out of many. In multi-label classification, each example can belong to multiple classes simultaneously.
intermediate
Name a common method to handle multi-label classification.
One common method, often called binary relevance, is to train a separate binary classifier for each label. The model then predicts yes/no for each label independently, which ignores any correlation between labels.
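The one-classifier-per-label idea can be sketched as follows. This is a minimal illustration, assuming scikit-learn is installed; the toy features `X`, the label matrix `Y`, and the choice of `LogisticRegression` are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (made up for illustration): 2 features, 3 labels per example.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]] * 5, dtype=float)
Y = np.column_stack([X[:, 0], X[:, 1], X[:, 0] * X[:, 1]]).astype(int)

# Fit one independent binary classifier per label column.
models = [LogisticRegression().fit(X, Y[:, j]) for j in range(Y.shape[1])]

# Each model answers yes/no for its own label; stack the answers column-wise.
Y_pred = np.column_stack([m.predict(X) for m in models])
print(Y_pred.shape)
```

scikit-learn also ships wrappers such as `OneVsRestClassifier` and `MultiOutputClassifier` that package this loop.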
intermediate
What metric can be used to evaluate multi-label classification models?
Common choices are Hamming loss, micro- and macro-averaged F1-score, and the Jaccard index. Unlike plain single-label accuracy, they give partial credit when only some of an example's labels are predicted correctly.
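To make these metrics concrete, here is a small NumPy sketch that computes them by hand. The arrays `y_true` and `y_pred` are made-up toy data, and the Jaccard step assumes every example has at least one true or predicted label (otherwise the union would be zero).

```python
import numpy as np

# Toy ground truth and predictions: 3 examples, 3 labels (illustrative only).
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 1]])

# Hamming loss: fraction of individual label slots that are wrong.
hamming = np.mean(y_true != y_pred)

# Jaccard index: |intersection| / |union| of label sets, averaged over examples.
inter = np.logical_and(y_true, y_pred).sum(axis=1)
union = np.logical_or(y_true, y_pred).sum(axis=1)
jaccard = np.mean(inter / union)

# Per-label counts of true positives, false positives, false negatives.
tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)

# Micro F1 pools the counts over all labels; macro F1 averages per-label F1.
micro_f1 = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())
macro_f1 = np.mean(2 * tp / (2 * tp + fp + fn))

print(f"{hamming:.3f} {jaccard:.3f} {micro_f1:.3f} {macro_f1:.3f}")
# prints: 0.333 0.611 0.667 0.556
```

In practice, `hamming_loss`, `jaccard_score`, and `f1_score` from `sklearn.metrics` cover the same ground.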
beginner
Why is multi-label classification important in real life?
Because many real-world problems involve items that belong to multiple categories, like tagging photos, music genres, or medical diagnoses, multi-label classification helps models understand and predict these complex cases.
In multi-label classification, an example can have:
A. Multiple labels at the same time
B. Only one label
C. No labels
D. Labels that are mutually exclusive
Which metric is suitable for evaluating multi-label classification?
A. Accuracy for single label
B. Mean Squared Error
C. Hamming Loss
D. BLEU score
A simple way to build a multi-label classifier is to:
A. Train one multi-class classifier
B. Train one binary classifier per label
C. Use clustering algorithms
D. Ignore label dependencies
Which of these is NOT true about multi-label classification?
A. Labels are mutually exclusive
B. Each example can belong to multiple classes
C. It is used in tagging images
D. It requires special evaluation metrics
Multi-label classification is useful when:
A. There are no categories
B. Items belong to exactly one category
C. Categories are hierarchical only
D. Items can belong to several categories
Explain what multi-label classification is and how it differs from multi-class classification.
Think about how many labels an example can have.
Describe one common method to build a multi-label classification model and name a metric to evaluate it.
Consider how to predict each label separately.

      Practice

      1. What is the main difference between multi-label classification and multi-class classification?
      easy
      A. Multi-label classification uses regression, multi-class uses classification.
      B. Multi-label classification assigns only one label, multi-class assigns multiple labels.
      C. Multi-label classification is used only for images, multi-class for text.
      D. Multi-label classification assigns multiple labels to one example, multi-class assigns only one.

      Solution

      1. Step 1: Understand multi-label classification

        Multi-label classification means each example can have more than one correct label at the same time.
      2. Step 2: Compare with multi-class classification

        Multi-class classification means each example can have only one label from many possible classes.
      3. Final Answer:

        Multi-label classification assigns multiple labels to one example, multi-class assigns only one. -> Option D
      4. Quick Check:

        Multi-label = multiple labels, multi-class = single label
      Hint: Remember: multi-label means many labels per example
      Common Mistakes:
      • Confusing multi-label with multi-class
      • Thinking multi-label assigns only one label
      • Mixing up classification with regression
      • Assuming multi-label is only for images
      2. Which of the following is a correct way to represent labels for multi-label classification in Python?
      easy
      A. labels = [0, 1, 2]
      B. labels = [[1, 0, 1], [0, 1, 0]]
      C. labels = 'cat,dog,bird'
      D. labels = 3

      Solution

      1. Step 1: Understand label representation for multi-label

        Multi-label targets are usually stored as one binary indicator vector per example: each position corresponds to a label, with 1 marking presence and 0 marking absence.
      2. Step 2: Check options for correct format

        labels = [[1, 0, 1], [0, 1, 0]] shows a list of lists with 1s and 0s, correctly representing multiple labels per example.
      3. Final Answer:

        labels = [[1, 0, 1], [0, 1, 0]] -> Option B
      4. Quick Check:

        Multi-label uses binary vectors per example
      Hint: Use binary lists to show multiple labels
      Common Mistakes:
      • Using a single integer for labels
      • Using a string instead of list
      • Using a flat list without nested structure
      • Confusing multi-class label format with multi-label
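The binary-vector format from question 2 can be built from raw tag lists as in the sketch below. The class names and `samples` are hypothetical, chosen only to illustrate the encoding.

```python
import numpy as np

# Hypothetical label vocabulary and per-example tag lists.
classes = ["bird", "cat", "dog"]
samples = [["cat", "dog"], ["bird"], []]

# Map each class name to its column position.
index = {name: i for i, name in enumerate(classes)}

# One row per example, one column per label; 1 marks a present label.
Y = np.zeros((len(samples), len(classes)), dtype=int)
for row, tags in enumerate(samples):
    for tag in tags:
        Y[row, index[tag]] = 1

print(Y.tolist())
# prints: [[0, 1, 1], [1, 0, 0], [0, 0, 0]]
```

The last row shows that an example with no tags is simply a row of zeros. scikit-learn's `MultiLabelBinarizer` performs the same conversion.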
      3. Given this Python code snippet for multi-label classification predictions:
      import numpy as np
      preds = np.array([[0.8, 0.1, 0.6], [0.3, 0.7, 0.2]])
      threshold = 0.5
      binary_preds = (preds > threshold).astype(int)
      print(binary_preds)

      What is the output?
      medium
      A. [[1 1 1] [0 0 0]]
      B. [[0 1 0] [1 0 1]]
      C. [[1 0 1] [0 1 0]]
      D. [[0 0 0] [1 1 1]]

      Solution

      1. Step 1: Apply threshold to predictions

        Compare each value in preds with 0.5: values > 0.5 become 1, else 0.
      2. Step 2: Convert boolean to int and print

        First row: 0.8 > 0.5 gives 1, 0.1 > 0.5 gives 0, 0.6 > 0.5 gives 1. Second row: 0.3 > 0.5 gives 0, 0.7 > 0.5 gives 1, 0.2 > 0.5 gives 0.
      3. Final Answer:

        [[1 0 1] [0 1 0]] -> Option C
      4. Quick Check:

        Thresholding preds > 0.5 = binary labels
      Hint: Compare each prediction to threshold for binary output
      Common Mistakes:
      • Confusing > with >=
      • Not converting boolean to int
      • Mixing rows and columns in output
      • Using wrong threshold value
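The first mistake in the list, confusing `>` with `>=`, only matters when a score lands exactly on the threshold. A small sketch with made-up scores shows the difference:

```python
import numpy as np

# Made-up scores, two of which sit exactly on the threshold.
preds = np.array([[0.5, 0.9], [0.4, 0.5]])
threshold = 0.5

strict = (preds > threshold).astype(int)      # 0.5 is NOT counted as positive
inclusive = (preds >= threshold).astype(int)  # 0.5 IS counted as positive

print(strict.tolist())     # prints: [[0, 1], [0, 0]]
print(inclusive.tolist())  # prints: [[1, 1], [0, 1]]
```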
      4. You trained a multi-label model but it always predicts only one label per example. What is the most likely cause?
      medium
      A. Using softmax activation instead of sigmoid in the output layer
      B. Using sigmoid activation instead of softmax in the output layer
      C. Using binary cross-entropy loss
      D. Using a threshold of 0.1 for predictions

      Solution

      1. Step 1: Understand output activations for multi-label

        Multi-label models use sigmoid activation to allow independent probabilities per label.
      2. Step 2: Identify problem with softmax

        Softmax forces the label probabilities to sum to 1, so the labels compete with each other and at most one of them can exceed a 0.5 threshold. That prevents several labels from being predicted together.
      3. Final Answer:

        Using softmax activation instead of sigmoid in the output layer -> Option A
      4. Quick Check:

        Softmax limits to one label, sigmoid allows many
      Hint: Use sigmoid for multi-label, softmax for single-label
      Common Mistakes:
      • Confusing softmax and sigmoid activations
      • Ignoring loss function compatibility
      • Setting threshold too low or high
      • Assuming threshold fixes activation issues
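The effect described in question 4 can be reproduced with plain NumPy. The `logits` below are arbitrary values chosen for illustration, not the output of any real model.

```python
import numpy as np

# Arbitrary raw scores for three labels of one example.
logits = np.array([2.0, 1.5, -1.0])

# Softmax: scores compete and are normalized to sum to 1.
softmax = np.exp(logits - logits.max())
softmax /= softmax.sum()

# Sigmoid: each score is squashed independently into (0, 1).
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print((softmax > 0.5).astype(int).tolist())  # prints: [1, 0, 0]
print((sigmoid > 0.5).astype(int).tolist())  # prints: [1, 1, 0]
```

With the same logits, softmax lets only the first label pass the 0.5 threshold, while sigmoid keeps the first two.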
      5. You have a dataset where each image can have multiple tags like 'beach', 'sunset', and 'people'. You want to build a multi-label classifier. Which metric is best to evaluate your model's performance?
      hard
      A. Precision, Recall, and F1-score calculated per label and averaged
      B. Accuracy (percentage of exact matches of all labels)
      C. Mean Squared Error
      D. Confusion matrix for single-label classification

      Solution

      1. Step 1: Understand evaluation needs for multi-label

        Exact-match accuracy counts a prediction as correct only when every label matches, so a prediction with one wrong tag out of many scores the same as a completely wrong one.
      2. Step 2: Choose suitable metrics

        Precision, Recall, and F1-score per label, then averaged, give a balanced view of performance on each label.
      3. Final Answer:

        Precision, Recall, and F1-score calculated per label and averaged -> Option A
      4. Quick Check:

        Use per-label metrics averaged for multi-label evaluation
      Hint: Use per-label precision/recall for multi-label metrics
      Common Mistakes:
      • Using strict accuracy that ignores partial matches
      • Using regression metrics like MSE
      • Using single-label confusion matrix
      • Ignoring label imbalance in metrics
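The gap between exact-match accuracy and per-label metrics in question 5 can be seen in a short NumPy sketch. The tag columns and both arrays are invented toy data, and the code assumes every label is predicted at least once and occurs at least once, so no division by zero arises.

```python
import numpy as np

# Columns: beach, sunset, people (toy data for illustration).
y_true = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]])
y_pred = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1]])

# Exact match: an example counts only if all of its labels are right.
exact_match = np.mean(np.all(y_true == y_pred, axis=1))

# Per-label true positives, false positives, false negatives.
tp = np.sum((y_true == 1) & (y_pred == 1), axis=0)
fp = np.sum((y_true == 0) & (y_pred == 1), axis=0)
fn = np.sum((y_true == 1) & (y_pred == 0), axis=0)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"exact match: {exact_match:.2f}")  # prints: exact match: 0.50
print(f"macro F1:    {f1.mean():.3f}")    # prints: macro F1:    0.867
```

Two of the four examples miss a single tag, which halves exact-match accuracy while the macro-averaged F1 stays high.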