
Regulatory compliance (GDPR, AI Act) in MLOps - Commands & Configuration

Introduction
Regulatory compliance means following laws that protect people's data and privacy. For AI and machine learning, this means making sure models and data handling meet rules like GDPR and the AI Act. This helps avoid legal trouble and builds trust with users.
Typical situations where this applies:

  • When you collect personal data for training machine learning models and need to protect user privacy.
  • When deploying AI models in Europe where GDPR and AI Act rules apply.
  • When you want to document how your AI model uses data to show compliance during audits.
  • When you need to control who can access sensitive data and AI model outputs.
  • When you want to automate checks that your AI system respects data protection laws.
Config File - mlflow_tracking_config.py
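import os

import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Point the client at a tracking server that enforces TLS and access control
mlflow.set_tracking_uri("https://mlflow.example.com")

# MLflow has no built-in switch for artifact encryption or access logging.
# Both are configured on the artifact store and on the server or proxy in front of it.
# Assumption: the artifact store is S3, so uploads can request encryption at rest.
os.environ["MLFLOW_S3_UPLOAD_EXTRA_ARGS"] = '{"ServerSideEncryption": "aws:kms"}'


def log_model_with_compliance(model, data_description):
    """Log a model in its own run, tagged for compliance filtering."""
    with mlflow.start_run():
        # Tags are set inside the run they describe
        mlflow.set_tag("compliance", "GDPR")
        mlflow.set_tag("data_privacy", "enabled")
        mlflow.log_param("data_description", data_description)
        mlflow.sklearn.log_model(model, "model")
    print("Model logged with compliance tags and data description")


if __name__ == "__main__":
    # Train a small model on non-personal sample data and log it
    X, y = load_iris(return_X_y=True)
    model = LogisticRegression(max_iter=100).fit(X, y)
    log_model_with_compliance(model, "Iris dataset with no personal data")

This Python script sets up MLflow tracking to support GDPR and AI Act record-keeping. It does not make a system compliant on its own.

  • Tracking URI: Points to an MLflow server that should sit behind TLS and access control.
  • Artifact encryption: Configured on the artifact store rather than in MLflow. The example assumes an S3 artifact store and requests server-side encryption through MLFLOW_S3_UPLOAD_EXTRA_ARGS.
  • Access logging: Recorded by the server, reverse proxy, or object store in front of MLflow, so audits can show who accessed model data.
  • Tags: Label each run with compliance info for easy filtering.
  • Logging function: Logs the model and describes the data used, helping trace data lineage.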
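The data-lineage idea behind the logging function can be made more concrete by recording a fingerprint of the training data next to its description. The following is a minimal sketch under stated assumptions, not part of the original script: dataset_fingerprint and compliance_record are hypothetical helpers, and the field names data_sha256, data_row_count, and contains_personal_data are illustrative rather than required by GDPR or by MLflow.

```python
import hashlib
import json


def dataset_fingerprint(rows):
    """Return a SHA-256 hex digest of the rows, independent of row order."""
    # Serialize each row deterministically, then sort so row order does not matter
    serialized = sorted(json.dumps(row, sort_keys=True) for row in rows)
    digest = hashlib.sha256()
    for line in serialized:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def compliance_record(rows, description, contains_personal_data):
    """Build a flat dict of strings that could be attached to a run as tags."""
    return {
        "compliance": "GDPR",
        "data_description": description,
        "data_sha256": dataset_fingerprint(rows),
        "data_row_count": str(len(rows)),
        "contains_personal_data": str(contains_personal_data).lower(),
    }


# Two toy rows (hypothetical data, not taken from a real dataset)
rows = [
    {"sepal_length": 5.1, "species": "setosa"},
    {"sepal_length": 7.0, "species": "versicolor"},
]
record = compliance_record(rows, "Iris sample with no personal data", False)
print(record["data_row_count"], record["contains_personal_data"])
```

Inside an active run, mlflow.set_tags(record) would attach these values, so an auditor could later check that the logged fingerprint matches the dataset that was actually approved for training.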
Commands
Run the Python script to configure MLflow tracking with compliance settings and log a model with data usage description.
Terminal
python mlflow_tracking_config.py
Expected Output
Model logged with compliance tags and data description
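Start the MLflow UI to inspect logged models, parameters, and compliance tags. Run without options, mlflow ui serves the local ./mlruns directory; runs logged to a remote tracking server such as https://mlflow.example.com are viewed by opening that server's URL in a browser.
Terminal
mlflow ui
Expected Output
Server startup log lines, whose exact format depends on the MLflow version. By default the UI is served at http://127.0.0.1:5000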
--host - Bind the UI to a specific IP address for secure access
--port - Specify the port number for the UI
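Query the MLflow server API to find all runs tagged with GDPR compliance for audit purposes. The runs/search endpoint expects a POST request with a JSON body that lists the experiments to search. Experiment ID "0" (the default experiment) is an assumption here, any authentication headers your server requires are omitted, and the response is abridged to the relevant fields.
Terminal
curl -X POST https://mlflow.example.com/api/2.0/mlflow/runs/search -H "Content-Type: application/json" -d '{"experiment_ids": ["0"], "filter": "tags.compliance = \"GDPR\""}'
Expected Output
{"runs": [{"info": {"run_id": "1234abcd"}, "data": {"tags": [{"key": "compliance", "value": "GDPR"}, {"key": "data_privacy", "value": "enabled"}]}}]}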
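The same audit query can be prepared from Python, which helps when the check runs on a schedule. MLflow's REST API documents runs/search as a POST endpoint that takes experiment_ids and a filter string. The sketch below only builds the request and does not send it; build_runs_search_request is a hypothetical helper, the experiment ID "0" is an assumption, and tag values containing single quotes are not escaped.

```python
import json
import urllib.request


def build_runs_search_request(base_url, experiment_ids, tag_filters):
    """Build (but do not send) a POST request for the runs/search endpoint."""
    # Combine tag conditions with AND, e.g. tags.compliance = 'GDPR'
    clauses = [
        f"tags.{key} = '{value}'" for key, value in sorted(tag_filters.items())
    ]
    payload = {
        "experiment_ids": list(experiment_ids),
        "filter": " AND ".join(clauses),
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/2.0/mlflow/runs/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_runs_search_request(
    "https://mlflow.example.com",
    ["0"],  # assumption: the runs live in the default experiment
    {"compliance": "GDPR", "data_privacy": "enabled"},
)
print(request.get_method(), request.full_url)
print(json.loads(request.data)["filter"])
```

Passing the request to urllib.request.urlopen would send it; a real deployment would also add whatever authentication header the tracking server requires.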
Key Concept

If you remember nothing else from this pattern, remember: always track and document data usage and model metadata to prove compliance with data protection laws.

Code Example
MLOps
import mlflow
import mlflow.sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris

# Load sample data
iris = load_iris()
X, y = iris.data, iris.target

# Train a simple model
model = LogisticRegression(max_iter=100)
model.fit(X, y)

# Point the client at the tracking server
mlflow.set_tracking_uri("https://mlflow.example.com")

# Log tags, data description, and model inside a single run.
# Calling mlflow.set_tag() before start_run() would open an implicit run
# and make the later start_run() call fail.
with mlflow.start_run():
    mlflow.set_tag("compliance", "GDPR")
    mlflow.set_tag("data_privacy", "enabled")
    mlflow.log_param("data_description", "Iris dataset with no personal data")
    mlflow.sklearn.log_model(model, "model")
    print("Model logged with compliance tags and data description")
Common Mistakes
Not tagging MLflow runs with compliance-related metadata.
Without tags, it's hard to filter and prove which models follow regulations during audits.
Always add clear tags like 'compliance' and 'data_privacy' when logging models.
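Storing model artifacts without encryption or access logs.
This risks unauthorized access to sensitive data and can breach GDPR's security-of-processing obligations (Article 32).
Enable encryption at rest on the artifact store and access logging on the server or proxy in front of MLflow; the MLflow client does not expose a single switch for either.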
Not describing the data used for training in the model logs.
Auditors need to see what data was used to ensure it complies with consent and privacy rules.
Log parameters or tags that describe the data source and privacy status.
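These mistakes can be caught automatically before an audit. The sketch below checks run records for the tags and parameter used in this tutorial and reports what is missing. It is an illustration under assumptions: missing_compliance_fields and audit_runs are hypothetical helpers, and the run records are plain dicts (for example, flattened from a runs/search response), not MLflow objects.

```python
REQUIRED_TAGS = ("compliance", "data_privacy")  # tags used in this tutorial
REQUIRED_PARAMS = ("data_description",)         # parameter used in this tutorial


def missing_compliance_fields(tags, params):
    """Return the names of required tags and params that are absent or empty."""
    missing = [f"tag:{name}" for name in REQUIRED_TAGS if not tags.get(name)]
    missing += [f"param:{name}" for name in REQUIRED_PARAMS if not params.get(name)]
    return missing


def audit_runs(runs):
    """Map run_id to its missing fields, listing only runs that fail the check."""
    report = {}
    for run in runs:
        missing = missing_compliance_fields(
            run.get("tags", {}), run.get("params", {})
        )
        if missing:
            report[run["run_id"]] = missing
    return report


# Hypothetical run records: run-a is complete, run-b lacks a tag and a param
runs = [
    {
        "run_id": "run-a",
        "tags": {"compliance": "GDPR", "data_privacy": "enabled"},
        "params": {"data_description": "Iris dataset with no personal data"},
    },
    {"run_id": "run-b", "tags": {"compliance": "GDPR"}, "params": {}},
]
print(audit_runs(runs))
```

A check like this could run in CI or on a schedule and fail when the report is non-empty, so untagged runs are noticed long before an auditor asks for them.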
Summary
Configure MLflow tracking to use secure servers with encryption and access logs.
Tag model runs with compliance metadata to easily find and audit them later.
Log detailed data descriptions alongside models to prove lawful data use.