
Threshold tuning in ML Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding the effect of threshold on classification

In a binary classification model, what typically happens to precision and recall when you increase the decision threshold from 0.5 to 0.8?

A. Both precision and recall increase
B. Precision decreases and recall increases
C. Precision increases and recall decreases
D. Both precision and recall decrease
💡 Hint

Think about how raising the threshold affects which predictions are labeled positive.
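One way to explore this question is to score a small, made-up set of probabilities at both thresholds. The sketch below is only an illustration: the toy data and the helper name `precision_recall` are assumptions, not part of the exercise, and the pattern it shows is the typical one rather than a guarantee for every dataset.

```python
def precision_recall(y_true, probs, threshold):
    """Return (precision, recall) for labels obtained with `prob >= threshold`."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    # Guard against division by zero when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy data, chosen only for illustration.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.1, 0.55, 0.6, 0.7, 0.75, 0.85, 0.3, 0.95]

for threshold in (0.5, 0.8):
    precision, recall = precision_recall(y_true, probs, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

Running it prints one line per threshold, so the two metrics can be compared side by side.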

Predict Output
intermediate
Output of threshold tuning code snippet

What is the output of the following Python code, which applies a decision threshold to model probabilities?

import numpy as np
probs = np.array([0.2, 0.6, 0.8, 0.4, 0.9])
threshold = 0.7
preds = (probs >= threshold).astype(int)
print(preds.tolist())
A. [0, 1, 0, 0, 1]
B. [0, 0, 1, 0, 1]
C. [0, 1, 1, 0, 1]
D. [1, 1, 1, 0, 1]
💡 Hint

Check which probabilities are greater than or equal to 0.7.

Model Choice
advanced
Choosing threshold for imbalanced data

You have a highly imbalanced dataset with very few positive cases. Which threshold tuning strategy is best to maximize recall while keeping false positives reasonable?

A. Randomly select threshold to avoid bias
B. Set a high threshold to reduce false positives, ignoring recall
C. Use threshold 0.5 without tuning since it is default
D. Set a low threshold to catch most positives, then use precision-recall curve to find balance
💡 Hint

Recall is about catching positives; think about how threshold affects it.
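As a rough illustration of working with a precision-recall curve, the sketch below uses scikit-learn's `precision_recall_curve` on a made-up imbalanced dataset and keeps the most precise threshold that still meets a recall target. The data, the `target_recall` value, and the selection rule are all assumptions for the example, not a prescribed method.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Hypothetical imbalanced toy data: 3 positives among 12 samples.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
probs = np.array([0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.45, 0.6, 0.4, 0.7, 0.9])

target_recall = 0.9  # assumed requirement, chosen for illustration

precision, recall, thresholds = precision_recall_curve(y_true, probs)

# precision and recall have one more entry than thresholds; drop the final
# (precision=1, recall=0) point so the arrays line up with thresholds.
precision, recall = precision[:-1], recall[:-1]

# Among thresholds that meet the recall target, keep the most precise one.
candidates = np.where(recall >= target_recall)[0]
best = candidates[np.argmax(precision[candidates])]
chosen_threshold = thresholds[best]

print(f"threshold={chosen_threshold:.2f}, "
      f"precision={precision[best]:.2f}, recall={recall[best]:.2f}")
```

The recall target encodes how many positives may be missed; tightening or loosening it moves the chosen threshold along the curve.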

Metrics
advanced
Effect of threshold on F1 score

A model has these confusion matrix values at threshold 0.5: TP=40, FP=10, FN=20, TN=130. When the threshold is increased to 0.7, the values become TP=30, FP=5, FN=30, TN=135. What happens to the F1 score?

A. F1 score decreases
B. F1 score increases
C. F1 score stays the same
D. Cannot determine without precision and recall
💡 Hint

Calculate precision and recall for both thresholds, then compute F1.
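The calculation in the hint can be sketched as a small helper. The function name `f1_from_counts` is a hypothetical choice for this example; note that TN does not appear because F1 depends only on TP, FP, and FN.

```python
def f1_from_counts(tp, fp, fn):
    """Compute F1 from confusion-matrix counts via precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Counts taken from the question above.
f1_at_05 = f1_from_counts(tp=40, fp=10, fn=20)  # threshold 0.5
f1_at_07 = f1_from_counts(tp=30, fp=5, fn=30)   # threshold 0.7

print(round(f1_at_05, 3), round(f1_at_07, 3))
```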

🔧 Debug
expert
Debugging threshold tuning code with unexpected output

Consider this Python code snippet for threshold tuning. It produces unexpected predictions. What is the cause?

import numpy as np
probs = np.array([0.3, 0.7, 0.5, 0.9])
threshold = 0.5
preds = (probs > threshold).astype(int)
print(preds.tolist())
A. Using '>' excludes probabilities equal to threshold, causing some positives to be missed
B. The threshold variable is not used correctly; should be a list
C. astype(int) converts floats incorrectly causing wrong predictions
D. Numpy array probs is not sorted, causing wrong output
💡 Hint

Check how the comparison operator affects predictions equal to threshold.
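To see how the comparison operator treats a probability that sits exactly on the threshold, the two variants can be run side by side. This is a minimal sketch reusing the values from the snippet above; the variable names `strict` and `inclusive` are illustrative.

```python
import numpy as np

probs = np.array([0.3, 0.7, 0.5, 0.9])
threshold = 0.5

# Strict comparison: a probability equal to the threshold is labeled 0.
strict = (probs > threshold).astype(int)

# Inclusive comparison: a probability equal to the threshold is labeled 1.
inclusive = (probs >= threshold).astype(int)

print(strict.tolist())
print(inclusive.tolist())
```

The two results differ only at the element equal to 0.5, so the choice of operator should match whichever convention the rest of the pipeline assumes.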

Practice

1. What is the main purpose of threshold tuning in machine learning classification?
easy
A. To find the best cutoff probability to decide between classes
B. To increase the size of the training dataset
C. To reduce the number of features used in the model
D. To speed up the training process

Solution

  1. Step 1: Understand threshold tuning concept

    Threshold tuning is about choosing a cutoff value for predicted probabilities to decide class labels.
  2. Step 2: Identify the main goal

    The goal is to find the cutoff that best separates positive and negative classes for better decisions.
  3. Final Answer:

    To find the best cutoff probability to decide between classes -> Option A
  4. Quick Check:

    Threshold tuning = best cutoff choice [OK]
Hint: Threshold tuning picks the cutoff to decide yes/no [OK]
Common Mistakes:
  • Confusing threshold tuning with feature selection
  • Thinking threshold tuning changes training data size
  • Assuming threshold tuning speeds up training
2. Which of the following is the correct way to apply a threshold of 0.7 to predicted probabilities probs in Python to get binary predictions?
easy
A. preds = (probs > 0.7).astype(int)
B. preds = probs > 0.7
C. preds = int(probs > 0.7)
D. preds = probs >= 0.7

Solution

  1. Step 1: Understand threshold application

    We compare each probability to 0.7 to get True/False, then convert to 0/1 integers.
  2. Step 2: Check correct syntax

    Using (probs > 0.7).astype(int) converts boolean array to integer array correctly.
  3. Final Answer:

    preds = (probs > 0.7).astype(int) -> Option A
  4. Quick Check:

    Threshold applied with boolean then int cast [OK]
Hint: Use boolean comparison then convert to int for binary labels [OK]
Common Mistakes:
  • Forgetting to convert boolean to int
  • Using int() on entire array instead of element-wise
  • Overlooking that >= also labels probabilities exactly equal to 0.7 as positive, while > does not
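The difference between the options in question 2 can be checked directly. This sketch assumes `probs` is a NumPy array; the values are made up for illustration, and `int_cast_failed` is a hypothetical flag used only to record the outcome.

```python
import numpy as np

probs = np.array([0.2, 0.75, 0.7, 0.9])  # hypothetical probabilities

# Comparison alone (options B and D) yields a boolean array.
as_bool = probs > 0.7

# Comparison followed by astype(int) (option A) yields 0/1 integer labels.
as_int = (probs > 0.7).astype(int)

print(as_bool.dtype, as_bool.tolist())
print(as_int.dtype, as_int.tolist())

# Option C calls int() on the whole array, which fails for multi-element arrays.
int_cast_failed = False
try:
    int(probs > 0.7)
except (TypeError, ValueError) as exc:
    int_cast_failed = True
    print("int() on an array failed:", type(exc).__name__)
```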
3. Given the following code, what will be the printed F1 score after threshold tuning?
from sklearn.metrics import f1_score
probs = [0.2, 0.8, 0.6, 0.4]
true_labels = [0, 1, 1, 0]
threshold = 0.5
preds = [1 if p > threshold else 0 for p in probs]
f1 = f1_score(true_labels, preds)
print(f"{f1:.2f}")
medium
A. 0.80
B. 0.67
C. 1.00
D. 0.50

Solution

  1. Step 1: Calculate predictions with threshold 0.5

    probs > 0.5 gives preds = [0, 1, 1, 0]
  2. Step 2: Compute F1 score for preds vs true_labels

    True positives = 2, false positives = 0, false negatives = 0, so F1 = 2*TP/(2*TP+FP+FN) = 2*2/(4+0+0) = 1.0, since preds and true_labels are identical.
  3. Final Answer:

    1.00 -> Option C
  4. Quick Check:

    Perfect match means F1 = 1.00 [OK]
Hint: Check predicted labels carefully before scoring [OK]
Common Mistakes:
  • Miscomputing predictions from threshold
  • Confusing precision and recall in F1 calculation
  • Rounding errors in final score
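The arithmetic in the solution to question 3 can be reproduced without scikit-learn by counting TP, FP, and FN by hand. This is a sketch for checking the steps, not a replacement for `f1_score`.

```python
probs = [0.2, 0.8, 0.6, 0.4]
true_labels = [0, 1, 1, 0]
threshold = 0.5

# Step 1: apply the threshold.
preds = [1 if p > threshold else 0 for p in probs]

# Step 2: count the confusion-matrix entries that F1 depends on.
tp = sum(1 for y, p in zip(true_labels, preds) if y == 1 and p == 1)
fp = sum(1 for y, p in zip(true_labels, preds) if y == 0 and p == 1)
fn = sum(1 for y, p in zip(true_labels, preds) if y == 1 and p == 0)

# F1 = 2*TP / (2*TP + FP + FN)
f1 = 2 * tp / (2 * tp + fp + fn)

print(preds, tp, fp, fn, f"{f1:.2f}")  # [0, 1, 1, 0] 2 0 0 1.00
```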
4. The following code tries to tune the threshold but raises an error. What is the error?
import numpy as np
probs = np.array([0.1, 0.4, 0.6, 0.9])
true_labels = [0, 0, 1, 1]
thresholds = [0.3, 0.5, 0.7]
best_f1 = 0
for t in thresholds:
    preds = (probs > t)
    f1 = f1_score(true_labels, preds)
    if f1 > best_f1:
        best_f1 = f1
print(best_f1)
medium
A. Thresholds list is empty
B. Missing import of f1_score
C. preds is boolean, should be integers
D. Loop variable t is not used

Solution

  1. Step 1: Check code for missing imports

    The code uses f1_score but does not import it from sklearn.metrics.
  2. Step 2: Identify error cause

    Without importing f1_score, Python will raise a NameError when calling f1_score.
  3. Final Answer:

    Missing import of f1_score -> Option B
  4. Quick Check:

    Always import functions before use [OK]
Hint: Check if all functions are imported before use [OK]
Common Mistakes:
  • Assuming boolean preds cause error (they don't)
  • Ignoring missing import errors
  • Thinking loop variable is unused
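A corrected version of the loop from question 4 might look like the sketch below. Tracking `best_threshold` alongside `best_f1` is an addition for illustration, as is storing `probs` in a NumPy array so that `probs > t` is evaluated element-wise.

```python
import numpy as np
from sklearn.metrics import f1_score  # the import the snippet needs

probs = np.array([0.1, 0.4, 0.6, 0.9])  # NumPy array: `probs > t` is element-wise
true_labels = [0, 0, 1, 1]
thresholds = [0.3, 0.5, 0.7]

best_f1, best_threshold = 0.0, None
for t in thresholds:
    preds = (probs > t).astype(int)  # boolean mask converted to 0/1 labels
    f1 = f1_score(true_labels, preds)
    if f1 > best_f1:
        best_f1, best_threshold = f1, t

print(best_threshold, best_f1)
```

Keeping the winning threshold, not just the score, is usually what matters in practice, since the threshold is what gets applied to new predictions.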
5. You have a model predicting probabilities for a rare disease. You want to tune the threshold to catch as many sick patients as possible but avoid too many false alarms. Which approach best balances this trade-off?
hard
A. Choose threshold maximizing recall only
B. Choose threshold minimizing accuracy
C. Choose threshold maximizing precision only
D. Choose threshold maximizing F1 score

Solution

  1. Step 1: Understand the trade-off

    High recall catches more sick patients but may increase false alarms; precision reduces false alarms but may miss sick patients.
  2. Step 2: Identify best metric for balance

    F1 score is the harmonic mean of precision and recall, so it is high only when both are high, making it the best of the listed options for tuning the threshold under this trade-off.
  3. Final Answer:

    Choose threshold maximizing F1 score -> Option D
  4. Quick Check:

    F1 balances recall and precision [OK]
Hint: Use F1 score to balance recall and precision [OK]
Common Mistakes:
  • Maximizing recall ignores false alarms
  • Maximizing precision ignores missed cases
  • Minimizing accuracy is not meaningful
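The contrast between maximizing recall alone and maximizing F1 can be illustrated on a small made-up dataset. Everything in this sketch is an assumption for demonstration: the labels, the probabilities, the threshold grid, and the helper name `scores`.

```python
def scores(y_true, probs, threshold):
    """Return (precision, recall, f1) for labels obtained with `prob >= threshold`."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical rare-disease data: 3 sick patients among 10.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
probs = [0.05, 0.1, 0.2, 0.3, 0.35, 0.5, 0.65, 0.4, 0.7, 0.9]
thresholds = [0.0, 0.2, 0.4, 0.6, 0.8]

# max() returns the first threshold among ties.
best_by_recall = max(thresholds, key=lambda t: scores(y_true, probs, t)[1])
best_by_f1 = max(thresholds, key=lambda t: scores(y_true, probs, t)[2])

for t in thresholds:
    p, r, f = scores(y_true, probs, t)
    print(f"t={t}: precision={p:.2f}, recall={r:.2f}, f1={f:.2f}")
print("best by recall:", best_by_recall, "| best by F1:", best_by_f1)
```

On this toy data, the recall-only criterion is satisfied by a threshold of 0.0, which flags every patient, while the F1 criterion settles on a threshold that still catches all sick patients with far fewer false alarms.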