Privacy considerations in Computer Vision - Model Metrics & Evaluation
In privacy-focused computer vision, traditional accuracy metrics are not enough. We need metrics that measure how well the model protects sensitive data. Examples include differential privacy guarantees, membership inference attack success rates, and data anonymization effectiveness. These metrics help us understand if the model leaks private information or if it respects user privacy.
For privacy, a confusion matrix is less relevant. Instead, consider a table showing attack success rates on private data:
+----------------------+------------------+
| Attack Type          | Success Rate (%) |
+----------------------+------------------+
| Membership Inference | 5                |
| Model Inversion      | 3                |
| Attribute Inference  | 7                |
+----------------------+------------------+
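Lower success rates mean better privacy protection. Read each rate against its chance baseline: if membership inference is scored as accuracy on a balanced set of members and non-members, random guessing already reaches 50%, so what matters is the margin above that baseline.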
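One way to measure a membership inference success rate is a simple confidence-threshold attack: guess "member" whenever the model's confidence on an image is at or above a threshold. The sketch below is a minimal illustration under stated assumptions, not a full attack. The function name, the confidence scores, and the threshold are all hypothetical. It reports the attack's accuracy alongside its advantage (true positive rate minus false positive rate), which is zero for random guessing.

```python
def membership_inference_report(member_scores, nonmember_scores, threshold):
    """Score a threshold attack that guesses 'member' when confidence >= threshold."""
    # Members correctly flagged as members.
    true_pos = sum(score >= threshold for score in member_scores)
    # Non-members wrongly flagged as members.
    false_pos = sum(score >= threshold for score in nonmember_scores)
    true_neg = len(nonmember_scores) - false_pos

    tpr = true_pos / len(member_scores)
    fpr = false_pos / len(nonmember_scores)
    total = len(member_scores) + len(nonmember_scores)
    return {
        "accuracy": (true_pos + true_neg) / total,
        "tpr": tpr,
        "fpr": fpr,
        "advantage": tpr - fpr,  # 0.0 means no better than guessing
    }


# Hypothetical model confidences on training images and on unseen images.
member_scores = [0.99, 0.97, 0.95, 0.80, 0.60]
nonmember_scores = [0.98, 0.85, 0.70, 0.55, 0.40]

report = membership_inference_report(member_scores, nonmember_scores, threshold=0.9)
print(report)
```

Reporting the advantage next to the raw accuracy keeps the number meaningful even when the member and non-member sets are not the same size.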
In privacy, there is a tradeoff between model utility (accuracy) and privacy protection. For example, adding noise to images can reduce model accuracy but improve privacy by hiding sensitive details.
Example:
- High accuracy, low privacy: Model recognizes faces well but leaks identity information.
- High privacy, low accuracy: Model blurs faces to protect identity but struggles to detect objects.
Finding the right balance depends on the application needs.
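The noise side of this tradeoff can be sketched with a toy example. The code below assumes a grayscale image stored as a list of rows of 0-255 integers and adds Gaussian noise of increasing strength, then measures how far the pixels moved. The helper names and the synthetic image are hypothetical, and pixel distortion is only a rough stand-in for the accuracy a real model would lose.

```python
import random


def add_gaussian_noise(image, sigma, seed=0):
    """Return a noisy copy of a grayscale image, clipped to the 0-255 range."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    return [
        [min(255, max(0, round(pixel + rng.gauss(0, sigma)))) for pixel in row]
        for row in image
    ]


def mean_abs_error(original, noisy):
    """Average absolute pixel difference between two same-sized images."""
    total = sum(
        abs(a - b)
        for row_a, row_b in zip(original, noisy)
        for a, b in zip(row_a, row_b)
    )
    count = sum(len(row) for row in original)
    return total / count


# Synthetic 16x16 test pattern standing in for a real photo.
image = [[(x * 16 + y * 8) % 256 for x in range(16)] for y in range(16)]

for sigma in (0, 10, 40):
    noisy = add_gaussian_noise(image, sigma)
    print(f"sigma={sigma:>2}  mean abs error={mean_abs_error(image, noisy):.1f}")
```

A larger `sigma` hides more detail but also distorts the input more, which is the balance the application has to choose.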
Good privacy metrics:
- Membership inference attack success rate < 10%
- Differential privacy epsilon < 1 (strong privacy)
- Minimal data leakage detected
Bad privacy metrics:
- Attack success rates > 50%
- High epsilon values (e.g., > 10) indicating weak privacy
- Evidence of sensitive data reconstruction
Common pitfalls:
- Ignoring privacy metrics: Focusing only on accuracy can hide privacy risks.
- Data leakage: Training data accidentally exposed in model outputs.
- Overfitting: Model memorizes training images, increasing privacy risk.
- False sense of security: Using weak privacy guarantees or incomplete tests.
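The good and bad thresholds listed above can be turned into a quick screening check. This is a sketch only: the function name and the "borderline" label for values between the two lists are assumptions, and the cutoffs are the rule-of-thumb numbers from this page rather than a formal standard.

```python
def assess_privacy(mia_success_rate, epsilon):
    """Screen a model using this page's rule-of-thumb thresholds.

    mia_success_rate: membership inference success rate as a fraction (0.0-1.0).
    epsilon: differential privacy budget reported for training.
    """
    # Either condition alone is enough to count as "bad".
    if mia_success_rate > 0.5 or epsilon > 10:
        return "bad"
    # Both conditions must hold to count as "good".
    if mia_success_rate < 0.1 and epsilon < 1:
        return "good"
    # Hypothetical label for anything in between.
    return "borderline"


print(assess_privacy(mia_success_rate=0.05, epsilon=0.5))
print(assess_privacy(mia_success_rate=0.60, epsilon=0.5))
print(assess_privacy(mia_success_rate=0.30, epsilon=3.0))
```

A check like this is only a first filter; it does not replace running the attacks and looking for reconstructed training data.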
Your computer vision model has 95% accuracy but a membership inference attack success rate of 60%. Is it good for privacy? Why or why not?
Answer: No, it is not good for privacy. A 60% attack success rate means attackers can often tell if a person's data was used to train the model. This leaks sensitive information despite high accuracy.
Practice
Solution
Step 1: Understand privacy protection in images
Blurring faces hides personal identity, which protects privacy.
Step 2: Compare other options
Improving quality, reducing size, or artistic effects do not relate to privacy.
Final Answer:
To protect people's privacy by hiding their identity -> Option D
Quick Check:
Blurring faces = privacy protection [OK]
Common mistakes:
- Thinking blurring improves image quality
- Confusing file size reduction with privacy
- Assuming artistic effects protect privacy
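To see why obscuring a face region hides identity, here is a small sketch that removes fine detail inside a bounding box. It uses pixelation (block averaging) rather than a Gaussian blur so that it needs only the standard library. The image is assumed to be grayscale, stored as a list of rows, with the box fully inside the image. The function name and parameters are hypothetical.

```python
def pixelate_region(image, x, y, w, h, block=4):
    """Return a copy of a grayscale image with the box (x, y, w, h) pixelated."""
    out = [row[:] for row in image]  # copy so the input is left untouched
    for block_y in range(y, y + h, block):
        for block_x in range(x, x + w, block):
            ys = range(block_y, min(block_y + block, y + h))
            xs = range(block_x, min(block_x + block, x + w))
            # Replace every pixel in the block with the block's mean value.
            mean = sum(image[j][i] for j in ys for i in xs) // (len(ys) * len(xs))
            for j in ys:
                for i in xs:
                    out[j][i] = mean
    return out


# Synthetic 8x8 gradient standing in for a photo; the box plays the role of a face.
image = [[x + 8 * y for x in range(8)] for y in range(8)]
anonymized = pixelate_region(image, x=2, y=2, w=4, h=4, block=4)

for row in anonymized:
    print(row)
```

Inside the box every pixel collapses to one value, so the detail that distinguished one face from another is gone, while pixels outside the box are unchanged.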
Solution
Step 1: Identify proper metadata removal method
PIL's Image.save() with exif=None removes metadata correctly.
Step 2: Evaluate other options
cv2.imread/write does not remove metadata; renaming or editing text is invalid.
Final Answer:
Use PIL's Image.save() with 'exif' parameter set to None -> Option A
Quick Check:
Remove metadata = PIL save with exif=None [OK]
Common mistakes:
- Assuming cv2.imwrite removes metadata
- Thinking that renaming the file extension removes metadata
- Editing the image file as text, which corrupts it
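To show what "removing metadata" means at the file level, the sketch below strips EXIF data from a JPEG byte stream using only the standard library. This is a different technique from the PIL call named in the answer: it walks the JPEG segments and drops any APP1 segment that starts with the `Exif` identifier. The function name is hypothetical, the demo bytes are a synthetic skeleton rather than a real photo, and other metadata such as XMP is left in place.

```python
def strip_exif_segments(jpeg_bytes):
    """Return a copy of a JPEG byte stream without its EXIF (APP1) segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    i = 2
    while i + 4 <= len(jpeg_bytes):
        if jpeg_bytes[i] != 0xFF:
            raise ValueError("malformed JPEG: expected a marker")
        marker = jpeg_bytes[i + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg_bytes[i:]
            return bytes(out)
        # The 2-byte length counts itself plus the payload, not the marker.
        length = int.from_bytes(jpeg_bytes[i + 2:i + 4], "big")
        segment = jpeg_bytes[i:i + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        i += 2 + length
    raise ValueError("malformed JPEG: no start-of-scan marker found")


# Synthetic JPEG skeleton: SOI, a JFIF header, an EXIF segment, then scan data.
app0 = b"\xff\xe0" + (7).to_bytes(2, "big") + b"JFIF\x00"
app1 = b"\xff\xe1" + (15).to_bytes(2, "big") + b"Exif\x00\x00" + b"GPSDATA"
scan = b"\xff\xda" + b"\x00\x02" + b"\x12\x34" + b"\xff\xd9"
original = b"\xff\xd8" + app0 + app1 + scan

cleaned = strip_exif_segments(original)
print(b"GPSDATA" in original, b"GPSDATA" in cleaned)
```

Because the scan data is copied untouched, the pixels are not re-encoded; only the metadata segment disappears.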
import cv2
image = cv2.imread('group_photo.jpg')
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    face_region = image[y:y+h, x:x+w]
    blurred_face = cv2.GaussianBlur(face_region, (99, 99), 30)
    image[y:y+h, x:x+w] = blurred_face
cv2.imwrite('blurred_photo.jpg', image)
What will be the result of running this code?
Solution
Step 1: Trace the code execution
cv2.imread loads a color image. However, detectMultiScale requires a grayscale image input, so passing a color image will cause an error or incorrect detection.
Step 2: Correct usage
The image should be converted to grayscale before calling detectMultiScale, e.g., gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).
Final Answer:
The code will raise an error because detectMultiScale requires a grayscale image -> Option C
Quick Check:
detectMultiScale requires grayscale input [OK]
Common mistakes:
- Thinking detectMultiScale works directly on color images
- Assuming no error on color input
- Believing the blur applies to the whole image
Solution
Step 1: Identify privacy and legal requirements
Consent is needed; without it, faces must be anonymized.
Step 2: Evaluate options for compliance
Blurring faces anonymizes identities; using images as is or adding noise does not protect privacy properly.
Final Answer:
Blur all faces in the dataset before using it for training -> Option A
Quick Check:
No consent = anonymize faces by blurring [OK]
Common mistakes:
- Assuming public availability means consent
- Thinking noise addition protects identity
- Removing images outright, which may lose valuable data unnecessarily
Solution
Step 1: Understand privacy law requirements
Explicit consent is required to use personal images legally.
Step 2: Combine consent and anonymization
Blurring faces in public datasets protects privacy while allowing training.
Step 3: Evaluate other options
Using images without consent or deleting after training does not ensure compliance; avoiding face recognition limits functionality.
Final Answer:
Collect images only with explicit consent and blur faces in public datasets -> Option B
Quick Check:
Consent + blur = privacy compliance and functionality [OK]
Common mistakes:
- Thinking encryption replaces consent
- Assuming deleting data after training is enough
- Assuming face recognition must be avoided entirely
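The consent-plus-blur rule from the answer above can be sketched as a preprocessing plan that decides, per image, what must happen before training. The record fields (`image_id`, `source`, `explicit_consent`), the action labels, and the choice to exclude non-public images that lack consent are all assumptions made for illustration, not part of any real dataset format.

```python
def plan_preprocessing(records):
    """Map each image id to the action required before it can be used for training."""
    plan = {}
    for record in records:
        if record.get("explicit_consent", False):
            # The person agreed to this use, so the image can be used unchanged.
            action = "use_as_is"
        elif record.get("source") == "public_dataset":
            # No consent on file: keep the image but anonymize the faces.
            action = "blur_faces"
        else:
            # Assumption: anything else without consent is left out entirely.
            action = "exclude"
        plan[record["image_id"]] = action
    return plan


# Hypothetical dataset manifest.
records = [
    {"image_id": "img_001", "source": "user_upload", "explicit_consent": True},
    {"image_id": "img_002", "source": "public_dataset", "explicit_consent": False},
    {"image_id": "img_003", "source": "web_scrape", "explicit_consent": False},
]

plan = plan_preprocessing(records)
print(plan)
```

Keeping the decision in one place makes it easy to audit which images were blurred and why.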
