
Privacy considerations in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Privacy considerations
Problem: You have a computer vision model that detects faces in images. The current model is trained on raw images that include identifiable faces, raising privacy concerns.
Current Metrics: Training Accuracy: 92%, Validation Accuracy: 88%, Loss: 0.25
Issue: Training on raw images with identifiable faces can violate privacy regulations and erode user trust.
Your Task
Modify the data preprocessing to anonymize faces by blurring them before training, while maintaining validation accuracy above 85%.
Do not change the model architecture.
Only modify the data preprocessing step to anonymize images.
Maintain validation accuracy above 85%.
Solution
import cv2
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Placeholder for face detection and blurring: it blurs a fixed central
# region instead of detecting faces. In a real pipeline, obtain face boxes
# from a detector such as a Haar cascade.

def blur_faces(image):
    # Blur the central half of the image in place and return it
    h, w = image.shape[:2]
    x1, y1 = w // 4, h // 4
    x2, y2 = 3 * w // 4, 3 * h // 4
    face_region = image[y1:y2, x1:x2]
    blurred_face = cv2.GaussianBlur(face_region, (51, 51), 0)
    image[y1:y2, x1:x2] = blurred_face
    return image

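# Generate dummy dataset: 1000 random 64x64x3 images, 2 classes.
# The labels are random, so validation accuracy on this placeholder data
# stays near chance; substitute a real labeled dataset to check the 85% target.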
np.random.seed(42)
X = np.random.randint(0, 256, (1000, 64, 64, 3), dtype=np.uint8)
y = np.random.randint(0, 2, 1000)

# Apply face blurring to all images
X_blurred = np.array([blur_faces(img.copy()) for img in X])

# Normalize pixel values to [0, 1] (float32 halves memory versus float64)
X_blurred = X_blurred.astype("float32") / 255.0

# Split data
X_train, X_val, y_train, y_val = train_test_split(X_blurred, y, test_size=0.2, random_state=42)

# Convert labels
y_train_cat = to_categorical(y_train, 2)
y_val_cat = to_categorical(y_val, 2)

# Define simple CNN model
model = Sequential([
    Conv2D(16, (3,3), activation='relu', input_shape=(64,64,3)),
    MaxPooling2D(2,2),
    Conv2D(32, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(2, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train_cat, epochs=10, batch_size=32, validation_data=(X_val, y_val_cat), verbose=0)

# Get final metrics
train_acc = history.history['accuracy'][-1] * 100
val_acc = history.history['val_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f"Training accuracy: {train_acc:.2f}%")
print(f"Validation accuracy: {val_acc:.2f}%")
print(f"Training loss: {train_loss:.4f}")
print(f"Validation loss: {val_loss:.4f}")
Added a preprocessing step to blur the central face region in each image to anonymize faces.
Kept the original model architecture unchanged.
Normalized the blurred images before training.
Results Interpretation

Before anonymization: Training Accuracy: 92%, Validation Accuracy: 88%, Loss: 0.25

After anonymization: Training Accuracy: 89.5%, Validation Accuracy: 86.7%, Loss: 0.38

In these results, blurring faces costs about 1.3 percentage points of validation accuracy (88% to 86.7%), which stays above the 85% target. Preprocessing can therefore protect privacy while keeping the model effective.
Bonus Experiment
Try using pixelation instead of blurring to anonymize faces and compare the model's validation accuracy.
💡 Hint
Replace the Gaussian blur with a pixelation function that reduces face region resolution.

Practice

(1/5)
1. What is the main reason to blur faces in images used for computer vision projects?
easy
A. To make the images look artistic
B. To improve the image quality for better model training
C. To reduce the file size of the images
D. To protect people's privacy by hiding their identity

Solution

  1. Step 1: Understand privacy protection in images

    Blurring faces hides personal identity, which protects privacy.
  2. Step 2: Compare other options

    Improving quality, reducing size, or artistic effects do not relate to privacy.
  3. Final Answer:

    To protect people's privacy by hiding their identity -> Option D
  4. Quick Check:

    Blurring faces = privacy protection [OK]
Hint: Blurring hides identity to protect privacy [OK]
Common Mistakes:
  • Thinking blurring improves image quality
  • Confusing file size reduction with privacy
  • Assuming artistic effects protect privacy
2. Which of the following is the correct way to remove metadata from an image file in Python?
easy
A. Use PIL's Image.save() with 'exif' parameter set to None
B. Use cv2.imread() and cv2.imwrite() without extra steps
C. Rename the image file extension to .txt
D. Open the image in a text editor and delete random lines

Solution

  1. Step 1: Identify proper metadata removal method

    PIL's Image.save() with 'exif=None' removes metadata correctly.
  2. Step 2: Evaluate other options

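    A cv2.imread()/cv2.imwrite() round trip re-encodes the pixels rather than handling metadata explicitly, so it is not the intended answer; renaming the file or editing it as text does not remove metadata and can corrupt the file.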
  3. Final Answer:

    Use PIL's Image.save() with 'exif' parameter set to None -> Option A
  4. Quick Check:

    Remove metadata = PIL save with exif=None [OK]
Hint: Use PIL save with exif=None to remove metadata [OK]
Common Mistakes:
  • Relying on a cv2.imread()/cv2.imwrite() round trip instead of handling metadata explicitly
  • Believing that renaming the file extension removes metadata
  • Editing the image as text, which corrupts the file
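
How Image.save() treats its exif argument can vary by file format and Pillow version, so it is worth verifying the output. A format-independent sketch is to copy only the pixels into a fresh image and save that. The function name strip_metadata is an assumption for illustration, and this version converts to RGB, so it drops any alpha channel or palette.

```python
from PIL import Image


def strip_metadata(src_path, dst_path):
    """Save a pixel-only copy of the image at `src_path` to `dst_path`."""
    with Image.open(src_path) as img:
        rgb = img.convert("RGB")
        # A fresh image starts with an empty info dict, so no EXIF,
        # comments, or other ancillary data carry over.
        clean = Image.new("RGB", rgb.size)
        clean.paste(rgb)
    clean.save(dst_path)
```

After saving, reopening the file and checking that len(Image.open(dst_path).getexif()) is 0 confirms that no EXIF tags survived.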
3. Consider this Python code snippet that blurs faces in an image using OpenCV:
import cv2
image = cv2.imread('group_photo.jpg')
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
faces = face_cascade.detectMultiScale(image, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    face_region = image[y:y+h, x:x+w]
    blurred_face = cv2.GaussianBlur(face_region, (99, 99), 30)
    image[y:y+h, x:x+w] = blurred_face
cv2.imwrite('blurred_photo.jpg', image)
What will be the result of running this code?
medium
A. The output image will have all detected faces blurred to protect privacy
B. The output image will be unchanged because GaussianBlur is not applied correctly
C. The code will raise an error because detectMultiScale requires a grayscale image
D. The code will blur the entire image instead of just faces

Solution

  1. Step 1: Trace the code execution

    cv2.imread loads a color (BGR) image. detectMultiScale accepts 8-bit color input and converts it to grayscale internally, so the call does not fail because the image is in color.
  2. Step 2: Follow the loop

    Assuming the image and the cascade file both load, each detected (x, y, w, h) region is replaced with its Gaussian-blurred version and the result is written to blurred_photo.jpg. Converting explicitly with gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) before detection is still the common convention.
  3. Final Answer:

    The output image will have all detected faces blurred to protect privacy -> Option A
  4. Quick Check:

    Detect faces, blur each region, save the result [OK]
Hint: detectMultiScale also accepts color images; grayscale conversion is conventional, not required [OK]
Common Mistakes:
  • Assuming detectMultiScale raises an error on color input
  • Overlooking that a missing cascade file or unreadable image is what actually makes this code fail
  • Believing the blur applies to the whole image
4. You have a dataset of images with faces but forgot to get consent from people. Which fix below best respects privacy and legal rules?
medium
A. Blur all faces in the dataset before using it for training
B. Use the images as is because they are publicly available
C. Remove all images with faces and keep only background images
D. Add random noise to images without blurring faces

Solution

  1. Step 1: Identify privacy and legal requirements

    Consent is needed; without it, faces must be anonymized.
  2. Step 2: Evaluate options for compliance

    Blurring faces anonymizes identities; using images as is or adding noise does not protect privacy properly.
  3. Final Answer:

    Blur all faces in the dataset before using it for training -> Option A
  4. Quick Check:

    No consent = anonymize faces by blurring [OK]
Hint: No consent? Blur faces to protect privacy [OK]
Common Mistakes:
  • Assuming public availability means consent
  • Thinking noise addition protects identity
  • Removing every image that contains a face, which discards valuable data unnecessarily
5. You want to build a face recognition system but must comply with privacy laws. Which combined approach best balances functionality and privacy?
hard
A. Train on unblurred public images and delete them after training
B. Collect images only with explicit consent and blur faces in public datasets
C. Use any available images without consent but encrypt the dataset
D. Avoid face recognition and use only object detection instead

Solution

  1. Step 1: Understand privacy law requirements

    Explicit consent is required to use personal images legally.
  2. Step 2: Combine consent and anonymization

    Blurring faces in public datasets protects privacy while allowing training.
  3. Step 3: Evaluate other options

    Using images without consent or deleting after training does not ensure compliance; avoiding face recognition limits functionality.
  4. Final Answer:

    Collect images only with explicit consent and blur faces in public datasets -> Option B
  5. Quick Check:

    Consent + blur = privacy compliance and functionality [OK]
Hint: Consent plus blurring balances privacy and use [OK]
Common Mistakes:
  • Thinking encryption replaces consent
  • Assuming deleting data after training is enough
  • Assuming face recognition must be avoided entirely
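
The combined approach in Option B can be expressed as a simple triage over a dataset manifest. This is only a sketch: the ImageRecord fields, the "collected" and "public" source labels, and the function name plan_dataset are hypothetical, and real compliance decisions depend on the applicable law.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    path: str
    source: str        # hypothetical labels: "collected" or "public"
    has_consent: bool  # explicit consent recorded for this image


def plan_dataset(records):
    """Split records into (use_as_is, blur_first, excluded)."""
    use_as_is, blur_first, excluded = [], [], []
    for record in records:
        if record.source == "collected" and record.has_consent:
            use_as_is.append(record)   # explicit consent on file
        elif record.source == "public":
            blur_first.append(record)  # anonymize faces before training
        else:
            excluded.append(record)    # collected without consent: do not use
    return use_as_is, blur_first, excluded
```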