Privacy in machine learning is about protecting the personal data a model is trained on. Privacy metrics measure how well the model or system keeps that data safe. Common metrics include differential privacy guarantees, which bound how much information about any single individual can leak from the model, and the membership inference attack success rate, which shows how easily an attacker can tell whether a person's data was used to train the model. Lower values of both mean better privacy.
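The membership inference success rate described above can be sketched in a few lines. This is a minimal, illustrative scoring function, not a real attack; the data and names are assumptions for the example.

```python
# Minimal sketch: scoring a membership inference attack.
# Labels and guesses here are illustrative, not from a real model.

def attack_success_rate(members, guesses):
    """Fraction of records whose membership status the attacker guessed right."""
    correct = sum(m == g for m, g in zip(members, guesses))
    return correct / len(members)

members = [1, 1, 0, 0, 1, 0]   # 1 = record was in the training set
guesses = [1, 0, 0, 0, 1, 1]   # attacker's membership guesses
rate = attack_success_rate(members, guesses)
# On a balanced set, a rate near 0.5 means the attacker is no better than chance.
```

On a set with equal numbers of members and non-members, 0.5 is the chance baseline, which is why "near random guessing" is the target for a private model.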
For privacy attacks such as membership inference, a confusion matrix shows how well an attacker guesses whether a record was in the training set:
|                            | Actual: In Training | Actual: Not In Training |
|----------------------------|---------------------|-------------------------|
| Attacker guesses "in"      | True Positive (TP)  | False Positive (FP)     |
| Attacker guesses "not in"  | False Negative (FN) | True Negative (TN)      |
Privacy risk is highest when the attacker's TP count is high and FP count is low: the attacker identifies true members without raising many false alarms. A well-protected model keeps the attacker's guesses near chance, so TPs and FPs occur at roughly the rates random guessing would produce.
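Tallying that attacker confusion matrix is straightforward. The function and data below are an illustrative sketch under the same labeling convention as the table (1 = in training).

```python
# Sketch: counting the attacker's confusion matrix from membership labels.
# Function name and data are illustrative.

def attack_confusion(members, guesses):
    """Return (TP, FP, FN, TN) for the attacker's membership guesses."""
    tp = sum(1 for m, g in zip(members, guesses) if m == 1 and g == 1)
    fp = sum(1 for m, g in zip(members, guesses) if m == 0 and g == 1)
    fn = sum(1 for m, g in zip(members, guesses) if m == 1 and g == 0)
    tn = sum(1 for m, g in zip(members, guesses) if m == 0 and g == 0)
    return tp, fp, fn, tn

members = [1, 1, 0, 0, 1, 0]
guesses = [1, 0, 0, 0, 1, 1]
tp, fp, fn, tn = attack_confusion(members, guesses)  # (2, 1, 1, 2)
```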
In privacy attacks, precision is how often the attacker's "member" guess is correct (true membership), and recall is the fraction of actual training members the attacker finds.
High precision, low recall: the attacker flags few records but is usually right about them, so the individuals it does identify are reliably exposed. This is still a serious leak for those people.
High recall, low precision: the attacker flags many members but also many non-members; each individual guess carries less certainty, yet the broad coverage keeps privacy risk high.
For privacy, we want the attacker's precision and recall near the random-guessing baseline, meaning the attacker cannot reliably find training data.
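The attacker's precision and recall follow directly from the confusion counts. A minimal sketch, with illustrative counts:

```python
# Sketch: attacker precision and recall from confusion counts (illustrative).

def attack_precision_recall(tp, fp, fn):
    """Precision and recall of the attacker's 'member' guesses."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# An attacker with TP=2, FP=1, FN=1 on a balanced evaluation set:
precision, recall = attack_precision_recall(2, 1, 1)
# Both come out ~0.67, only modestly above the 0.5 chance baseline;
# values near 1.0 would signal a serious privacy failure.
```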
- Good: differential privacy epsilon close to 0 (strong privacy); attacker precision and recall near random guessing (e.g., 0.5 on a balanced evaluation set); low membership inference attack success.
- Bad: High epsilon (weak privacy), attacker precision and recall close to 1 (attacker can easily identify training data), high leakage of sensitive info.
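Epsilon's role is easiest to see in the classic Laplace mechanism, where the noise scale is sensitivity/epsilon: shrinking epsilon means more noise and stronger privacy. This is a textbook sketch with illustrative numbers, not a production DP implementation.

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Return true_value plus Laplace(0, sensitivity/epsilon) noise (pure eps-DP)."""
    scale = sensitivity / epsilon
    u = random.random() - 0.5                 # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling of a Laplace(0, scale) variate.
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise

random.seed(0)
count = 42                                    # true result of a count query
strong = laplace_mechanism(count, sensitivity=1.0, epsilon=0.1)   # heavily noised
weak = laplace_mechanism(count, sensitivity=1.0, epsilon=10.0)    # close to 42
```

With epsilon = 0.1 the expected absolute noise is 10; with epsilon = 10 it is 0.1, which is why a large epsilon offers little real protection.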
- Ignoring privacy metrics: Focusing only on accuracy can hide privacy risks.
- Data leakage: If training data leaks into test sets, privacy metrics may be misleading.
- Overfitting: Models that memorize training data increase privacy risk but may show good accuracy.
- Misinterpreting epsilon: forgetting that smaller epsilon means stronger privacy.
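The overfitting pitfall can be made concrete with a toy loss-threshold attack: a model that memorizes its training data gives members unusually low loss, so a simple threshold separates them from non-members. All numbers here are illustrative.

```python
def loss_threshold_attack(losses, threshold):
    """Guess 'member' (1) when the model's loss on a record is below threshold."""
    return [1 if loss < threshold else 0 for loss in losses]

# An overfit model: near-zero loss on memorized training records,
# much higher loss on unseen records (illustrative numbers).
member_losses = [0.01, 0.02, 0.03]
nonmember_losses = [0.90, 1.10, 0.80]
guesses = loss_threshold_attack(member_losses + nonmember_losses, threshold=0.5)
# guesses == [1, 1, 1, 0, 0, 0]: the attacker separates members perfectly,
# even though the model's accuracy on its training data looks excellent.
```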
This example concerns fraud detection rather than privacy, but it shows why metric choice matters. When fraud cases are rare, high accuracy can be misleading: a recall of 12% means the model catches only 12% of frauds, which is poor. Privacy is analogous: a model can score well on accuracy while still leaking private data, so always check privacy metrics, not just accuracy.
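The fraud arithmetic can be checked directly. The counts below are assumed for illustration: 100 frauds in 1,000 transactions, with a model that catches only 12 of them and raises no false alarms.

```python
# Illustrative: rare fraud makes accuracy look good despite 12% recall.
actual = [1] * 100 + [0] * 900                # 100 frauds in 1,000 transactions
predicted = [1] * 12 + [0] * 88 + [0] * 900   # model catches only 12 frauds

accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)
recall = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1) / sum(actual)
# accuracy == 0.912 while recall == 0.12: high accuracy hides a weak detector.
```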