
Fairness in face recognition in Computer Vision - Model Metrics & Evaluation

Metrics & Evaluation - Fairness in face recognition
Which metrics matter for fairness in face recognition, and why

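In face recognition, fairness means the model works equally well for all demographic groups, such as people with different skin tones, ages, or genders. To check it, we compute the False Positive Rate (FPR, the share of wrong identities that are accepted) and the False Negative Rate (FNR, the share of genuine users who are rejected) separately for each group. If one group has clearly higher error rates than the others, the model is unfair to that group. We can also compare the Equal Error Rate (EER), which is the error rate at the threshold where FPR and FNR are equal, and Demographic Parity, which compares how often each group receives a positive decision.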

Confusion matrix example for two groups
Group A confusion matrix:
  TP = 90  FP = 10
  FN = 5   TN = 95

Group B confusion matrix:
  TP = 70  FP = 30
  FN = 20  TN = 80

Total samples per group = 200

Calculations for Group A:
  Precision = 90 / (90 + 10) = 0.9
  Recall = 90 / (90 + 5) = 0.947
  FPR = 10 / (10 + 95) = 0.095
  FNR = 5 / (5 + 90) = 0.053

Calculations for Group B:
  Precision = 70 / (70 + 30) = 0.7
  Recall = 70 / (70 + 20) = 0.778
  FPR = 30 / (30 + 80) = 0.273
  FNR = 20 / (20 + 70) = 0.222

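Notice that Group B has lower precision (0.7 vs 0.9), lower recall (0.778 vs 0.947), and a higher FPR (0.273 vs 0.095) than Group A. The model makes more mistakes of both kinds for Group B, which shows unfairness.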
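
The same calculations can be sketched in Python. This is a minimal example that reuses the counts from the two confusion matrices above. The function name group_metrics and the dictionary layout are illustrative choices, not part of any library.

```python
def group_metrics(tp, fp, fn, tn):
    """Compute per-group metrics from confusion matrix counts."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "fpr": fp / (fp + tn),   # false positive rate
        "fnr": fn / (fn + tp),   # false negative rate = 1 - recall
    }

# Counts copied from the Group A and Group B confusion matrices.
groups = {
    "A": {"tp": 90, "fp": 10, "fn": 5, "tn": 95},
    "B": {"tp": 70, "fp": 30, "fn": 20, "tn": 80},
}

results = {name: group_metrics(**counts) for name, counts in groups.items()}

for name, metrics in results.items():
    rounded = {key: round(value, 3) for key, value in metrics.items()}
    print(f"Group {name}: {rounded}")
```

Computing every metric with one function for each group makes it harder to compare numbers that were calculated in different ways.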
    
Precision vs Recall tradeoff with fairness examples

Imagine a face recognition system for unlocking phones. If it has high precision but low recall for a group, it means it rarely mistakes others for that person (good), but often fails to recognize the real user (bad). This frustrates users in that group.

On the other hand, if recall is high but precision is low, the system might unlock for wrong people in that group, risking security.

Fairness means balancing these so no group suffers more false rejections or false acceptances than others.
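
To see how this tradeoff depends on the match threshold, here is a small sketch. The similarity scores are made up for illustration, and error_rates is a hypothetical helper. It assumes the system accepts a face when its similarity score is at or above the threshold.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) at a given threshold."""
    # A genuine user is falsely rejected when the score falls below the threshold.
    false_rejects = sum(score < threshold for score in genuine_scores)
    # An impostor is falsely accepted when the score reaches the threshold.
    false_accepts = sum(score >= threshold for score in impostor_scores)
    return (
        false_rejects / len(genuine_scores),
        false_accepts / len(impostor_scores),
    )

# Hypothetical similarity scores for one demographic group.
genuine = [0.91, 0.85, 0.78, 0.66, 0.59]
impostor = [0.62, 0.48, 0.41, 0.30, 0.22]

for threshold in (0.5, 0.6, 0.7):
    frr, far = error_rates(genuine, impostor, threshold)
    print(f"threshold={threshold}: false rejects={frr:.1f}, false accepts={far:.1f}")
```

Raising the threshold lowers false acceptances but raises false rejections. Running the same sweep separately for each group shows whether one shared threshold gives every group a similar balance.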

What "good" vs "bad" metric values look like for fairness

Good fairness: Similar precision, recall, FPR, and FNR across all groups. For example, all groups have recall around 0.9 and FPR around 0.05.

Bad fairness: One group has recall 0.95 but another 0.6, or one group's FPR is 0.01 but another's is 0.3. This means the model is biased and treats groups unequally.
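
One simple way to turn "similar across groups" into a check is to measure the largest gap between any two groups. The sketch below uses the "bad fairness" numbers from the example above. The tolerance of 0.05 is an assumed value for illustration, not a standard, and max_gap is a hypothetical helper.

```python
def max_gap(per_group):
    """Largest difference in a metric between any two groups."""
    values = list(per_group.values())
    return max(values) - min(values)

# Values taken from the "bad fairness" example.
recall = {"group_1": 0.95, "group_2": 0.60}
fpr = {"group_1": 0.01, "group_2": 0.30}

TOLERANCE = 0.05  # assumed acceptable gap; choose this per application

for name, per_group in (("recall", recall), ("fpr", fpr)):
    gap = max_gap(per_group)
    status = "ok" if gap <= TOLERANCE else "disparity"
    print(f"{name}: gap={gap:.2f} -> {status}")
```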

Common pitfalls in fairness metrics
  • Ignoring group differences: Reporting only overall accuracy hides poor results for individual groups.
  • Data imbalance: If some groups have few samples, their metrics are noisy and the overall numbers are dominated by the larger groups.
  • Overfitting to majority group: The model may perform well on large groups but poorly on minorities.
  • Using accuracy alone: Accuracy merges false acceptances and false rejections into one number, so it can stay high while a small group suffers many errors of one kind.
Self-check question

Your face recognition model has 98% overall accuracy but only 50% recall for a minority group. Is it good for production? Why or why not?

Answer: No, it is not good. Even though overall accuracy is high, the low recall for the minority group means many real users in that group are not recognized. This is unfair and harms user experience for that group.
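
The sketch below shows how both numbers in the question can be true at once. The counts are invented for illustration: a large majority group with few errors and a small minority group where half of the genuine users are rejected.

```python
# Hypothetical confusion matrix counts, chosen only to illustrate the effect.
counts = {
    "majority": {"tp": 4775, "fp": 75, "fn": 75, "tn": 4775},
    "minority": {"tp": 50, "fp": 0, "fn": 50, "tn": 200},
}

def accuracy(c):
    """Share of all decisions that were correct."""
    return (c["tp"] + c["tn"]) / sum(c.values())

def recall(c):
    """Share of genuine users who were recognized."""
    return c["tp"] / (c["tp"] + c["fn"])

# Pool the counts of both groups to get the overall numbers.
total = {key: sum(c[key] for c in counts.values()) for key in ("tp", "fp", "fn", "tn")}

overall_accuracy = accuracy(total)
minority_recall = recall(counts["minority"])

print(f"overall accuracy: {overall_accuracy:.2f}")
print(f"minority recall:  {minority_recall:.2f}")
```

Because the minority group contributes only 300 of the 10,000 samples, its errors barely move the overall accuracy.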

Key Result
Fairness in face recognition requires similar precision, recall, and error rates across all demographic groups to ensure equal treatment.

Practice

1.

What does fairness in face recognition mainly aim to achieve?

easy
A. More complex model architecture
B. Faster processing speed
C. Higher resolution images
D. Equal accuracy for all demographic groups

Solution

  1. Step 1: Understand fairness goal

    Fairness means the model should work equally well for all groups, not just some.
  2. Step 2: Identify fairness metric

    Accuracy or error rates should be similar across different demographic groups.
  3. Final Answer:

    Equal accuracy for all demographic groups -> Option D
  4. Quick Check:

    Fairness = Equal accuracy [OK]
Hint: Fairness means equal results for everyone [OK]
Common Mistakes:
  • Thinking fairness means faster models
  • Confusing fairness with image quality
  • Assuming complex models are always fair
2.

Which of the following is the correct way to check fairness in a face recognition model?

metrics = {'group_A': 0.92, 'group_B': 0.85}
# What should we compare?
easy
A. Only check metrics['group_A']
B. Compare metrics['group_A'] and metrics['group_B'] for equality
C. Ignore metrics and check model size
D. Compare metrics['group_A'] with a random number

Solution

  1. Step 1: Identify fairness check

    Fairness requires comparing performance metrics across groups.
  2. Step 2: Apply comparison

    Compare accuracy or error rates between group_A and group_B to find bias.
  3. Final Answer:

    Compare metrics['group_A'] and metrics['group_B'] for equality -> Option B
  4. Quick Check:

    Fairness check = Compare group metrics [OK]
Hint: Compare group metrics to check fairness [OK]
Common Mistakes:
  • Checking only one group
  • Ignoring metrics and focusing on model size
  • Comparing to unrelated values
3.

Consider this Python code snippet evaluating fairness metrics:

group_accuracies = {'A': 0.90, 'B': 0.75, 'C': 0.88}
threshold = 0.80
biased_groups = [g for g, acc in group_accuracies.items() if acc < threshold]
print(biased_groups)

What is the output?

medium
A. ['B']
B. ['A', 'B']
C. ['C']
D. []

Solution

  1. Step 1: Understand the code logic

    The code collects groups with accuracy less than 0.80 into biased_groups.
  2. Step 2: Check each group's accuracy

    Group A: 0.90 > 0.80 (not biased), B: 0.75 < 0.80 (biased), C: 0.88 > 0.80 (not biased)
  3. Final Answer:

    ['B'] -> Option A
  4. Quick Check:

    Only group B accuracy < threshold [OK]
Hint: Filter groups with accuracy below threshold [OK]
Common Mistakes:
  • Including groups with accuracy above threshold
  • Misreading comparison operator
  • Confusing list comprehension output
4.

Find the error in this fairness evaluation code snippet:

metrics = {'group1': 0.85, 'group2': 0.80}
threshold = 0.82
biased = [g for g, v in metrics if v < threshold]
print(biased)
medium
A. Missing .items() when iterating over dictionary
B. Wrong comparison operator
C. Threshold value is too high
D. Print statement syntax error

Solution

  1. Step 1: Identify dictionary iteration error

    Iterating over a dictionary directly gives only its keys, not key-value pairs. Here Python tries to unpack each key string into g and v, which raises a ValueError.
  2. Step 2: Fix iteration to use .items()

    Use metrics.items() to get (key, value) pairs for comparison.
  3. Final Answer:

    Missing .items() when iterating over dictionary -> Option A
  4. Quick Check:

    Dictionary iteration needs .items() [OK]
Hint: Use .items() to get key-value pairs from dict [OK]
Common Mistakes:
  • Iterating dict keys instead of items
  • Changing threshold unnecessarily
  • Assuming print syntax is wrong
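
For reference, a corrected version of the snippet from question 4 could look like this. Only the iteration changes. The values and threshold are the ones from the question.

```python
metrics = {'group1': 0.85, 'group2': 0.80}
threshold = 0.82

# .items() yields (key, value) pairs, so g is the group name and v its metric.
biased = [g for g, v in metrics.items() if v < threshold]
print(biased)  # ['group2']
```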
5.

You have a face recognition model with accuracy 0.95 on group X and 0.70 on group Y. Which approach best improves fairness?

hard
A. Ignore group Y and focus on group X
B. Increase model complexity without changing data
C. Collect more balanced training data including group Y
D. Reduce accuracy on group X to match group Y

Solution

  1. Step 1: Identify fairness problem

    Model performs worse on group Y, showing bias.
  2. Step 2: Choose best fairness improvement

    Balanced data helps the model learn features for all groups equally well.
  3. Step 3: Evaluate other options

    Increasing complexity alone may not fix bias; ignoring group Y is unfair; reducing group X accuracy closes the gap only by making the model worse for group X.
  4. Final Answer:

    Collect more balanced training data including group Y -> Option C
  5. Quick Check:

    Balanced data improves fairness [OK]
Hint: Balance training data to reduce bias [OK]
Common Mistakes:
  • Thinking model complexity fixes bias alone
  • Ignoring underperforming groups
  • Lowering accuracy on better groups
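
Before collecting more data, it helps to measure how unbalanced the training set is. The sketch below counts samples per group on a made-up label list. It also computes inverse-frequency sample weights, which are a different technique from collecting new data and are sometimes used as a stopgap. All names and numbers here are assumptions for illustration.

```python
from collections import Counter

# Hypothetical group label for each training image.
group_labels = ["X"] * 8 + ["Y"] * 2

group_counts = Counter(group_labels)
total = len(group_labels)
n_groups = len(group_counts)

# Inverse-frequency weights: groups with fewer samples get larger weights.
weights = {
    group: total / (n_groups * count)
    for group, count in group_counts.items()
}

for group, count in group_counts.items():
    share = count / total
    print(f"group {group}: {count} samples ({share:.0%}), weight {weights[group]}")
```

Reweighting cannot add faces the model has never seen, so it does not replace collecting balanced data that includes group Y.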