0
0
Computer Visionml~20 mins

Privacy considerations in Computer Vision - ML Experiment: Train & Evaluate

Choose your learning style9 modes available
Experiment - Privacy considerations
Problem:You have a computer vision model that detects faces in images. The current model uses raw images that include identifiable faces, raising privacy concerns.
Current Metrics:Accuracy: 92%, Validation Accuracy: 88%, Loss: 0.25
Issue:The model uses raw images with identifiable faces, which can violate privacy regulations and user trust.
Your Task
Modify the data preprocessing to anonymize faces by blurring them before training, while maintaining validation accuracy above 85%.
Do not change the model architecture.
Only modify the data preprocessing step to anonymize images.
Maintain validation accuracy above 85%.
Hint 1
Hint 2
Hint 3
Solution
Computer Vision
import cv2
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Dummy function to simulate face detection and blurring
# In real case, use a face detector like Haar cascades

def blur_faces(image):
    # For simplicity, blur the center region of the image
    h, w = image.shape[:2]
    x1, y1 = w // 4, h // 4
    x2, y2 = 3 * w // 4, 3 * h // 4
    face_region = image[y1:y2, x1:x2]
    blurred_face = cv2.GaussianBlur(face_region, (51, 51), 0)
    image[y1:y2, x1:x2] = blurred_face
    return image

# Generate dummy dataset: 1000 images 64x64x3, 2 classes
np.random.seed(42)
X = np.random.randint(0, 256, (1000, 64, 64, 3), dtype=np.uint8)
y = np.random.randint(0, 2, 1000)

# Apply face blurring to all images
X_blurred = np.array([blur_faces(img.copy()) for img in X])

# Normalize images
X_blurred = X_blurred / 255.0

# Split data
X_train, X_val, y_train, y_val = train_test_split(X_blurred, y, test_size=0.2, random_state=42)

# Convert labels
y_train_cat = to_categorical(y_train, 2)
y_val_cat = to_categorical(y_val, 2)

# Define simple CNN model
model = Sequential([
    Conv2D(16, (3,3), activation='relu', input_shape=(64,64,3)),
    MaxPooling2D(2,2),
    Conv2D(32, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(2, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train model
history = model.fit(X_train, y_train_cat, epochs=10, batch_size=32, validation_data=(X_val, y_val_cat), verbose=0)

# Get final metrics
train_acc = history.history['accuracy'][-1] * 100
val_acc = history.history['val_accuracy'][-1] * 100
train_loss = history.history['loss'][-1]
val_loss = history.history['val_loss'][-1]

print(f"Training accuracy: {train_acc:.2f}%")
print(f"Validation accuracy: {val_acc:.2f}%")
print(f"Training loss: {train_loss:.4f}")
print(f"Validation loss: {val_loss:.4f}")
Added a preprocessing step to blur the central face region in each image to anonymize faces.
Kept the original model architecture unchanged.
Normalized the blurred images before training.
Results Interpretation

Before anonymization: Training Accuracy: 92%, Validation Accuracy: 88%, Loss: 0.25

After anonymization: Training Accuracy: 89.5%, Validation Accuracy: 86.7%, Loss: 0.38

Blurring faces to protect privacy slightly reduces accuracy but keeps the model effective. This shows how data preprocessing can help respect privacy while maintaining good model performance.
Bonus Experiment
Try using pixelation instead of blurring to anonymize faces and compare the model's validation accuracy.
💡 Hint
Replace the Gaussian blur with a pixelation function that reduces face region resolution.