Data privacy and compliance (GDPR) in HLD - System Design Exercise

Design: Data Privacy and Compliance System (GDPR)
In scope: Personal data lifecycle management, consent management, audit logging, breach notification, data access and deletion APIs.
Out of scope: Detailed legal advice, non-GDPR regional laws, physical security of data centers.
Functional Requirements
FR1: Store and process personal data with user consent
FR2: Allow users to access, modify, and delete their personal data
FR3: Log all data processing activities for audit purposes
FR4: Support data portability by exporting user data in a common format
FR5: Implement data minimization and purpose limitation principles
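FR6: Notify the supervisory authority within 72 hours of becoming aware of a personal data breach, and notify affected users without undue delay when the breach poses a high risk to them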
FR7: Ensure data is encrypted at rest and in transit
FR8: Support role-based access control for data access
FR9: Automatically delete or anonymize data after retention period
FR10: Provide mechanisms to handle user consent withdrawal
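As an illustration of FR4, the sketch below exports one user's data as JSON. JSON is assumed here as the "common format", and the function name and bundle layout are hypothetical rather than prescribed by the exercise.

```python
import json
from datetime import datetime, timezone


def export_user_data(user: dict, records: list[dict]) -> str:
    """Serialize one user's profile and data records as a JSON document.

    The bundle layout (exported_at / user / records) is an assumption
    made for this sketch.
    """
    bundle = {
        "exported_at": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "records": records,
    }
    # sort_keys keeps the output stable, which makes exports easy to diff
    return json.dumps(bundle, indent=2, sort_keys=True)
```

A caller would pass in the rows already loaded for that user and return the resulting string as a file download.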
Non-Functional Requirements
NFR1: Handle up to 1 million active users with personal data
NFR2: API response latency under 300ms for data access requests
NFR3: System availability of 99.9% (at most about 8.77 hours of downtime per year)
NFR4: Compliance with GDPR legal requirements and audit standards
NFR5: Breach notification to the supervisory authority within 72 hours of becoming aware of the breach
NFR6: Secure storage and transmission using industry standards
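The downtime figure in NFR3 follows directly from the availability target. A small check, assuming a year of 365.25 days (8,766 hours):

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging over leap years (assumption)


def downtime_budget_hours(availability_pct: float) -> float:
    """Maximum yearly downtime allowed by an availability target."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


print(round(downtime_budget_hours(99.9), 2))  # prints 8.77
```

The same function shows how quickly the budget shrinks: a 99.99% target leaves under an hour per year.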
Think Before You Design
Key Components
Consent management service
Personal data storage with encryption
Audit logging system
User data access and deletion API
Data breach detection and notification module
Role-based access control (RBAC) system
Data anonymization and retention scheduler
Secure communication layer (TLS/HTTPS)
Design Patterns
Event sourcing for audit logs
CQRS for separating read and write operations
Encryption at rest and in transit
Token-based authentication and authorization
Data masking and anonymization
Data retention and scheduled deletion
Consent versioning and revocation
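To make the event-sourcing pattern for audit logs more concrete, here is a minimal sketch of an append-only log in which each entry stores a hash of the previous one, so later edits become detectable. The hash chain is an addition for illustration, not something the exercise requires, and the class and field names are hypothetical.

```python
import hashlib
import json


class AuditLog:
    """Append-only event list where each entry commits to the previous one."""

    GENESIS = "0" * 64  # placeholder hash used before the first entry

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(prev_hash: str, event: dict) -> str:
        # Canonical JSON (sorted keys) so the same event always hashes the same
        payload = json.dumps(event, sort_keys=True) + prev_hash
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, event: dict) -> dict:
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        entry = {
            "event": dict(event),
            "prev_hash": prev_hash,
            "hash": self._digest(prev_hash, event),
        }
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry breaks it."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["event"]):
                return False
            prev_hash = entry["hash"]
        return True
```

In a real deployment the entries would live in durable storage rather than a Python list; the chain only proves tampering after the fact and does not prevent it.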
Reference Architecture
                    +-----------------------+
                    |  User Interface (UI)  |
                    +-----------+-----------+
                                |
                                v
                    +-----------------------+
                    | API Gateway / Backend |
                    +-----------+-----------+
                                |
        +-----------------------+-----------------------+
        |                       |                       |
+-------v-------+       +-------v-------+       +-------v-------+
| Consent Mgmt  |       | Personal Data |       | Audit Logging |
| Service       |       | Storage       |       | Service       |
+-------+-------+       +-------+-------+       +-------+-------+
        |                       |                       |
        |                       |                       |
        |                       |                       |
+-------v-------+       +-------v-------+       +-------v-------+
| RBAC System   |       | Data Retention|       | Breach Notif. |
|               |       | & Anonymizer  |       | Module        |
+---------------+       +---------------+       +---------------+
Components
User Interface (UI)
Web/Mobile frontend
Allow users to manage consent, access, modify, and delete personal data
API Gateway / Backend
RESTful API server (e.g., Node.js, Spring Boot)
Handle client requests, enforce authentication and authorization, route to services
Consent Management Service
Microservice with database
Store and manage user consents with versioning and revocation support
Personal Data Storage
Encrypted relational database (e.g., PostgreSQL with disk-level or column-level encryption)
Store personal data securely with encryption at rest
Audit Logging Service
Append-only log storage (e.g., Kafka topics or write-once object storage, with Elasticsearch as a search index)
Record all data processing activities for compliance audits
Role-Based Access Control (RBAC) System
Authorization service
Control access to personal data based on user roles and permissions
Data Retention and Anonymization Scheduler
Background job system (e.g., cron jobs, message queues)
Automatically delete or anonymize data after retention period
Data Breach Notification Module
Monitoring and alerting system
Detect breaches, notify the supervisory authority within 72 hours of becoming aware, and notify affected users without undue delay
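The Consent Management Service's "versioning and revocation support" can be sketched as an append-only ledger: every grant or withdrawal adds a new record, and the latest record decides the current state. This is a minimal in-memory sketch; the class names and the "granted"/"withdrawn" status values are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    consent_type: str
    status: str          # "granted" or "withdrawn" (assumed vocabulary)
    version: int         # increases with every change for this user and type
    timestamp: datetime


class ConsentLedger:
    """In-memory stand-in for the consent database; records are never updated."""

    def __init__(self) -> None:
        self._records: list[ConsentRecord] = []

    def history(self, user_id: str, consent_type: str) -> list[ConsentRecord]:
        return [r for r in self._records
                if r.user_id == user_id and r.consent_type == consent_type]

    def _append(self, user_id: str, consent_type: str, status: str) -> ConsentRecord:
        version = len(self.history(user_id, consent_type)) + 1
        record = ConsentRecord(user_id, consent_type, status, version,
                               datetime.now(timezone.utc))
        self._records.append(record)
        return record

    def grant(self, user_id: str, consent_type: str) -> ConsentRecord:
        return self._append(user_id, consent_type, "granted")

    def withdraw(self, user_id: str, consent_type: str) -> ConsentRecord:
        return self._append(user_id, consent_type, "withdrawn")

    def has_consent(self, user_id: str, consent_type: str) -> bool:
        # The most recent record wins; no record at all means no consent
        past = self.history(user_id, consent_type)
        return bool(past) and past[-1].status == "granted"
```

Keeping the full history, rather than overwriting a single status column, lets the service show what the user had agreed to at any past point in time.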
Request Flow
1. User accesses UI to provide or withdraw consent.
2. UI sends request to API Gateway, which authenticates user and checks RBAC.
3. Consent Management Service records consent changes with timestamps.
4. User requests personal data access or deletion via UI.
5. API Gateway routes request to Personal Data Storage after RBAC validation.
6. Audit Logging Service records all data access and modification events.
7. Data Retention Scheduler runs periodic jobs to anonymize or delete expired data.
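8. Breach Notification Module monitors system logs and triggers alerts when it detects a suspected breach.
9. If a breach is confirmed, the supervisory authority is notified within 72 hours of the team becoming aware of it, and affected users are notified without undue delay.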
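Steps 2, 5, and 6 of the flow (RBAC check, data access, audit record) can be sketched as a single handler. The role names, the permission strings, and the dictionary standing in for Personal Data Storage are all hypothetical; the point is only that every request is audited, whether it is allowed or denied.

```python
from datetime import datetime, timezone

# Hypothetical role-to-permission mapping
ROLE_PERMISSIONS = {
    "data_subject": {"read_own", "delete_own"},
    "support_agent": {"read_any"},
    "dpo": {"read_any", "delete_any"},
}

audit_log: list[dict] = []  # stand-in for the Audit Logging Service


def is_allowed(role: str, action: str, actor_id: str, subject_id: str) -> bool:
    perms = ROLE_PERMISSIONS.get(role, set())
    if f"{action}_any" in perms:
        return True
    # "_own" permissions apply only when actors touch their own data
    return f"{action}_own" in perms and actor_id == subject_id


def handle_request(store: dict, actor_id: str, role: str,
                   action: str, subject_id: str):
    allowed = is_allowed(role, action, actor_id, subject_id)
    # Record the attempt before acting, so denied requests are audited too
    audit_log.append({
        "actor_id": actor_id,
        "action_type": action,
        "resource": f"personal_data/{subject_id}",
        "outcome": "allowed" if allowed else "denied",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    if not allowed:
        raise PermissionError(f"{role} may not {action} data of {subject_id}")
    if action == "read":
        return store.get(subject_id)
    if action == "delete":
        return store.pop(subject_id, None)
    raise ValueError(f"unsupported action: {action}")
```

A user reading their own record succeeds, while the same user reading someone else's record raises PermissionError and still leaves a "denied" entry in the log.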
Database Schema
Entities:
- User: user_id (PK), name, email, etc.
- Consent: consent_id (PK), user_id (FK), consent_type, status, timestamp, version
- PersonalData: data_id (PK), user_id (FK), data_type, encrypted_data, created_at, updated_at
- AuditLog: log_id (PK), user_id (FK), action_type, resource, timestamp, details
- Role: role_id (PK), role_name
- UserRole: user_id (FK), role_id (FK)
Relationships:
- One User has many Consents
- One User has many PersonalData entries
- One User has many AuditLogs
- Many-to-many between User and Role via UserRole
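The entities above could be expressed as DDL roughly as follows. This sketch uses SQLite through Python's standard sqlite3 module purely so it runs anywhere. The table names, column types, and ON DELETE rules are assumptions: personal data and consents are deleted together with the user, while audit rows are kept with their user_id cleared. Whether audit records may be retained after erasure is a policy question this sketch does not settle.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE consent (
    consent_id   INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
    consent_type TEXT NOT NULL,
    status       TEXT NOT NULL,
    timestamp    TEXT NOT NULL,
    version      INTEGER NOT NULL
);
CREATE TABLE personal_data (
    data_id        INTEGER PRIMARY KEY,
    user_id        INTEGER NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
    data_type      TEXT NOT NULL,
    encrypted_data BLOB NOT NULL,
    created_at     TEXT NOT NULL,
    updated_at     TEXT NOT NULL
);
CREATE TABLE audit_log (
    log_id      INTEGER PRIMARY KEY,
    user_id     INTEGER REFERENCES users(user_id) ON DELETE SET NULL,
    action_type TEXT NOT NULL,
    resource    TEXT NOT NULL,
    timestamp   TEXT NOT NULL,
    details     TEXT
);
CREATE TABLE role (role_id INTEGER PRIMARY KEY, role_name TEXT NOT NULL UNIQUE);
CREATE TABLE user_role (
    user_id INTEGER NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
    role_id INTEGER NOT NULL REFERENCES role(role_id),
    PRIMARY KEY (user_id, role_id)
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript(SCHEMA)
    return conn


def erase_user(conn: sqlite3.Connection, user_id: int) -> None:
    """Deletion request: removing the user row cascades to dependent tables."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM users WHERE user_id = ?", (user_id,))


def purge_expired(conn: sqlite3.Connection, cutoff_iso: str) -> int:
    """Retention job: delete personal data created before the cutoff (ISO 8601 text)."""
    with conn:
        cur = conn.execute("DELETE FROM personal_data WHERE created_at < ?", (cutoff_iso,))
    return cur.rowcount
```

The retention query compares timestamps as text, which only works if every timestamp is stored in the same ISO 8601 format and time zone.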
Scaling Discussion
Bottlenecks
Database performance under high read/write load for personal data and audit logs
Latency in consent management and data access APIs
Storage growth due to audit logs and personal data retention
Timely detection and notification of data breaches
Managing encryption and decryption overhead
Solutions
Use database sharding and read replicas to distribute load
Implement caching for frequently accessed consent and user metadata
Archive old audit logs to cold storage and use log aggregation tools
Deploy real-time monitoring and alerting with automated workflows
Use hardware acceleration or optimized libraries for encryption tasks
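The caching suggestion above can be sketched as a small time-to-live cache in front of the consent lookup. The class name, the loader callback, and the 60-second default are assumptions. Note the explicit invalidate method: a withdrawal should take effect immediately rather than after the cached entry expires.

```python
import time
from typing import Callable


class ConsentCache:
    """TTL cache for consent lookups, keyed by (user_id, consent_type)."""

    def __init__(self, loader: Callable[[str, str], bool],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # falls through to the consent service or database
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so tests can control time
        self._entries: dict[tuple[str, str], tuple[bool, float]] = {}

    def has_consent(self, user_id: str, consent_type: str) -> bool:
        key = (user_id, consent_type)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now < cached[1]:
            return cached[0]       # fresh hit, no backend call
        value = self._loader(user_id, consent_type)
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, user_id: str, consent_type: str) -> None:
        """Call on every consent change so stale grants are never served."""
        self._entries.pop((user_id, consent_type), None)
```

With several API instances, each holding its own cache, invalidation would need to be broadcast (for example over a message queue), which is outside this sketch.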
Interview Tips
Time: In a 45-minute interview, spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and data flow, 10 minutes discussing scaling and compliance challenges, and 5 minutes summarizing and answering questions.
Emphasize GDPR principles: consent, data minimization, user rights
Explain secure data storage and encryption strategies
Describe audit logging and its importance for compliance
Discuss how to handle data breach detection and notification
Highlight role-based access control for data security
Address scalability concerns with realistic solutions
Show awareness of legal and operational constraints