Fairness metrics help us check whether a machine learning model treats all groups of people equally. They quantify how much the model's outcomes differ between groups, which tells us whether the model is biased or fair.
Fairness metrics in machine learning with Python
# Example: calculate the demographic parity difference
def demographic_parity_difference(y_true, y_pred, sensitive_attr):
    # y_true is not needed for demographic parity; it is kept so all
    # metric functions share the same signature
    groups = set(sensitive_attr)
    rates = {}
    for group in groups:
        # Indices of the samples that belong to this group
        idx = [i for i, val in enumerate(sensitive_attr) if val == group]
        # Share of positive predictions within the group
        positive_rate = sum(y_pred[i] for i in idx) / len(idx)
        rates[group] = positive_rate
    return max(rates.values()) - min(rates.values())
Fairness metrics often compare model outcomes across different groups defined by sensitive attributes like gender or race.
Common fairness metrics include demographic parity, equal opportunity, and equalized odds.
# Demographic parity difference
# Measures the difference in positive prediction rates between groups
def demographic_parity_difference(y_true, y_pred, sensitive_attr):
    groups = set(sensitive_attr)
    rates = {}
    for group in groups:
        idx = [i for i, val in enumerate(sensitive_attr) if val == group]
        positive_rate = sum(y_pred[i] for i in idx) / len(idx)
        rates[group] = positive_rate
    return max(rates.values()) - min(rates.values())
# Equal opportunity difference
# Measures the difference in true positive rates between groups
def equal_opportunity_difference(y_true, y_pred, sensitive_attr):
    groups = set(sensitive_attr)
    tpr = {}
    for group in groups:
        idx = [i for i, val in enumerate(sensitive_attr) if val == group]
        tp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 1)
        fn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 1)
        tpr[group] = tp / (tp + fn) if (tp + fn) > 0 else 0
    return max(tpr.values()) - min(tpr.values())
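Equalized odds is mentioned above but is not part of the example program that follows. Here is a minimal sketch, assuming the common definition that takes the larger of the true positive rate gap and the false positive rate gap; the helper name equalized_odds_difference is ours, not from the original code.

# Equalized odds difference (illustrative sketch)
# Takes the larger of the true positive rate gap and the false positive rate gap
def equalized_odds_difference(y_true, y_pred, sensitive_attr):
    groups = set(sensitive_attr)
    tpr, fpr = {}, {}
    for group in groups:
        idx = [i for i, val in enumerate(sensitive_attr) if val == group]
        tp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 1)
        fn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 1)
        fp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 0)
        tn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 0)
        tpr[group] = tp / (tp + fn) if (tp + fn) > 0 else 0
        fpr[group] = fp / (fp + tn) if (fp + tn) > 0 else 0
    tpr_gap = max(tpr.values()) - min(tpr.values())
    fpr_gap = max(fpr.values()) - min(fpr.values())
    return max(tpr_gap, fpr_gap)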
The following program calculates the demographic parity and equal opportunity differences on a small example dataset. It shows how much the model's positive prediction rate and true positive rate differ between two groups.
def demographic_parity_difference(y_true, y_pred, sensitive_attr):
    groups = set(sensitive_attr)
    rates = {}
    for group in groups:
        idx = [i for i, val in enumerate(sensitive_attr) if val == group]
        positive_rate = sum(y_pred[i] for i in idx) / len(idx)
        rates[group] = positive_rate
    return max(rates.values()) - min(rates.values())

def equal_opportunity_difference(y_true, y_pred, sensitive_attr):
    groups = set(sensitive_attr)
    tpr = {}
    for group in groups:
        idx = [i for i, val in enumerate(sensitive_attr) if val == group]
        tp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 1)
        fn = sum(1 for i in idx if y_pred[i] == 0 and y_true[i] == 1)
        tpr[group] = tp / (tp + fn) if (tp + fn) > 0 else 0
    return max(tpr.values()) - min(tpr.values())

# Sample data
# y_true: actual labels (1 = positive, 0 = negative)
# y_pred: model predictions
# sensitive_attr: group membership (e.g., 'A' or 'B')
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive_attr = ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']

dp_diff = demographic_parity_difference(y_true, y_pred, sensitive_attr)
eo_diff = equal_opportunity_difference(y_true, y_pred, sensitive_attr)

print(f"Demographic Parity Difference: {dp_diff:.2f}")
print(f"Equal Opportunity Difference: {eo_diff:.2f}")
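On this sample data, group A receives positive predictions for 1 of 4 samples (25%) while group B receives them for 3 of 4 (75%), so the demographic parity difference is 0.50; the true positive rates are 0.50 for A and 1.00 for B, so the equal opportunity difference is also 0.50. A value of 0 on either metric would mean the two groups are treated identically by that measure.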
Fairness metric values depend on which sensitive attribute you choose, such as gender or age; the same predictions can show a large gap along one attribute and a smaller gap along another, as the snippet below illustrates.
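A small illustration of this point, reusing demographic_parity_difference from the program above; both attribute columns here are made-up examples, not real data.

# Hypothetical attribute columns for the same predictions
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
gender = ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']
age_group = ['young', 'young', 'young', 'young', 'young', 'old', 'old', 'old']

print(f"{demographic_parity_difference(y_true, y_pred, gender):.2f}")     # 0.50
print(f"{demographic_parity_difference(y_true, y_pred, age_group):.2f}")  # 0.27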
Perfect fairness is hard to achieve; different fairness criteria can conflict with one another, and you often have to balance fairness against accuracy.
Use fairness metrics to find and reduce bias in your models early.
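If you would rather not maintain these helpers yourself, established libraries provide the same metrics. Below is a minimal sketch assuming the Fairlearn package is installed (pip install fairlearn); its fairlearn.metrics module exposes demographic_parity_difference and equalized_odds_difference, with sensitive_features playing the role of sensitive_attr above.

# Minimal sketch using Fairlearn (assumes: pip install fairlearn)
from fairlearn.metrics import demographic_parity_difference, equalized_odds_difference

y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sensitive_attr = ['A', 'A', 'B', 'B', 'A', 'A', 'B', 'B']

# sensitive_features is Fairlearn's name for the group-membership column
dp = demographic_parity_difference(y_true, y_pred, sensitive_features=sensitive_attr)
eo = equalized_odds_difference(y_true, y_pred, sensitive_features=sensitive_attr)
print(f"Demographic Parity Difference: {dp:.2f}")
print(f"Equalized Odds Difference: {eo:.2f}")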
Fairness metrics measure whether a model treats groups equally.
Common metrics compare positive prediction rates or true positive rates across groups.
Checking fairness helps build trustworthy and ethical AI systems.