Correlation IDs in Microservices - System Design Exercise

Design: Correlation ID Tracking in Microservices

Design the mechanism to generate, propagate, and log correlation IDs across microservices. Out of scope: the detailed implementation of each microservice's business logic.

Functional Requirements

FR1: Assign a unique correlation ID to each client request entering the system

FR2: Propagate the correlation ID through all microservices involved in processing the request

FR3: Log the correlation ID with all service logs related to the request

FR4: Allow tracing and debugging of requests across multiple services using the correlation ID

FR5: Support asynchronous communication patterns (e.g., message queues) with correlation ID propagation

FR6: Ensure minimal performance overhead when generating and passing correlation IDs

Non-Functional Requirements

NFR1: System must handle 10,000 concurrent requests with correlation tracking

NFR2: Correlation ID propagation should add no more than 5 ms of latency per service hop

NFR3: Logging with correlation IDs must not degrade overall system availability below 99.9%

NFR4: Correlation IDs must be globally unique and collision-resistant

NFR5: Support both synchronous HTTP calls and asynchronous messaging
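To put NFR4 in numbers, the birthday bound gives a rough estimate of how likely two randomly generated IDs are to collide. The sketch below assumes version-4 UUIDs, which carry 122 random bits; the volume of one trillion IDs is a hypothetical figure chosen only for illustration.

```python
import math

# Assumption: IDs are version-4 UUIDs, which have 122 random bits
# (6 of the 128 bits are fixed by the version and variant fields).
UUID4_RANDOM_BITS = 122


def collision_probability(num_ids: int, random_bits: int = UUID4_RANDOM_BITS) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2^bits))."""
    space = 2 ** random_bits
    exponent = -(num_ids * (num_ids - 1)) / (2 * space)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical lifetime volume: one trillion correlation IDs.
    print(f"{collision_probability(10**12):.2e}")
```

Under these assumptions, even a trillion IDs yield a collision probability on the order of 10^-13, which is why random UUIDs are usually treated as collision-resistant without any central coordination.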

Key Components

API Gateway or Edge Service for initial correlation ID generation

Middleware or Interceptors in each microservice to propagate and log correlation IDs

Logging system that supports structured logs with correlation ID fields

Message brokers or queues with headers to carry correlation IDs

Tracing tools or distributed tracing systems (optional)
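As an illustration of the structured-logging component, here is a minimal sketch using Python's standard `logging` module. The `JsonFormatter` class, the `build_logger` helper, the field names, and the `orders-service` name are all hypothetical; a real deployment would more likely use whatever JSON formatter its logging stack already provides.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "service": record.name,
            "message": record.getMessage(),
            # Set via `extra=`; None when the caller did not supply one.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


def build_logger(service_name: str, stream) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_order_received(stream, correlation_id: str) -> dict:
    """Emit one log line and return it parsed, to show the resulting fields."""
    logger = build_logger("orders-service", stream)
    logger.info("order received", extra={"correlation_id": correlation_id})
    return json.loads(stream.getvalue().splitlines()[-1])
```

Because `correlation_id` is a dedicated field rather than text embedded in the message, the log store can index and filter on it directly.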

Design Patterns

Request Context Propagation

Distributed Tracing

Correlation ID Injection via Middleware

Log Enrichment with Correlation IDs

Message Header Propagation
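Request Context Propagation is the pattern that keeps the ID available deep in the call stack without threading it through every function signature. One way to sketch it in Python is with `contextvars`, which gives each asyncio task its own copy of the context. The function names below (`handle_request`, `charge_payment`) are hypothetical stand-ins for real handlers.

```python
import asyncio
import contextvars

# Holds the correlation ID for whichever request the current task is serving.
correlation_id_var = contextvars.ContextVar("correlation_id", default="unset")


async def charge_payment() -> str:
    """Hypothetical downstream call: reads the ID without receiving it as an argument."""
    await asyncio.sleep(0)  # yield so concurrent requests interleave
    return correlation_id_var.get()


async def handle_request(correlation_id: str) -> str:
    """Hypothetical request handler: stores the ID once at the entry point."""
    correlation_id_var.set(correlation_id)
    return await charge_payment()


async def run_concurrent(ids: list) -> list:
    """Serve several requests at once; gather runs each one in its own task."""
    return await asyncio.gather(*(handle_request(i) for i in ids))


if __name__ == "__main__":
    print(asyncio.run(run_concurrent(["req-1", "req-2"])))
```

Each task started by `asyncio.gather` works on a copy of the context, so an ID set for one request does not leak into another request that happens to run concurrently.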

Reference Architecture

Client
  |
  v
API Gateway (generates correlation ID)
  |
  v
Microservice A <--> Microservice B <--> Microservice C
  |                   |                   |
  v                   v                   v
Logging System (logs with correlation ID)

Microservice C (publishes messages with correlation ID)
  |
  v
Message Queue (with correlation ID in headers)
  |
  v
Microservice D (consumes messages with correlation ID)

Components

API Gateway (Nginx / Envoy / custom gateway): Generates unique correlation IDs for incoming requests and injects them into request headers.

Microservice Middleware (HTTP middleware / gRPC interceptors): Extracts, propagates, and logs correlation IDs for each request or message.

Logging System (ELK Stack / Fluentd / Cloud Logging): Stores structured logs enriched with correlation IDs for traceability.

Message Broker (Kafka / RabbitMQ / AWS SQS): Carries correlation IDs in message headers for asynchronous propagation.
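The asynchronous path can be sketched without a real broker. The example below uses an in-memory `queue.Queue` as a stand-in; the `Message` type, the `publish` and `consume` helpers, and the `X-Correlation-ID` header name are assumptions made for illustration, not the API of Kafka, RabbitMQ, or SQS.

```python
import queue
from dataclasses import dataclass, field

# Assumed header name; the exercise does not prescribe one.
CORRELATION_HEADER = "X-Correlation-ID"


@dataclass
class Message:
    """Hypothetical message envelope: payload plus metadata headers."""
    body: dict
    headers: dict = field(default_factory=dict)


def publish(broker: queue.Queue, body: dict, correlation_id: str) -> None:
    """Producer side: attach the current request's ID to the outgoing message."""
    broker.put(Message(body=body, headers={CORRELATION_HEADER: correlation_id}))


def consume(broker: queue.Queue) -> tuple:
    """Consumer side: recover the ID before doing any work or logging."""
    message = broker.get_nowait()
    correlation_id = message.headers.get(CORRELATION_HEADER)
    if correlation_id is None:
        # Fail loudly so a missing header is noticed rather than silently untraceable.
        raise ValueError("message arrived without a correlation ID header")
    return correlation_id, message.body
```

Keeping the ID in headers rather than in the message body means consumers can read it without understanding the payload schema. Real brokers expose this differently (Kafka has record headers, RabbitMQ has message properties and a headers table, SQS has message attributes), so the envelope above would map onto whichever mechanism the chosen broker offers.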

Request Flow

1. Client sends request to API Gateway

2. API Gateway generates a unique correlation ID if the request does not already carry one, and adds it to the request headers

3. Request forwarded to Microservice A with correlation ID in headers

4. Microservice A middleware extracts correlation ID, logs it, and forwards request to Microservice B

5. Microservice B repeats extraction, logging, and propagation to Microservice C

6. If Microservice C sends an asynchronous message, it includes the correlation ID in message headers

7. Microservice D consumes the message, extracts correlation ID, and logs processing with the same ID

8. All logs from services include the correlation ID, enabling tracing of the full request path
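The synchronous part of this flow (steps 1 through 5) can be sketched as plain functions operating on header dictionaries. This is a simplified model under stated assumptions: the `X-Correlation-ID` header name, the `gateway`, `service_hop`, and `trace_request` functions, and the service names are all hypothetical, and real middleware would hook into the web framework instead.

```python
import uuid

# Assumed header name; the exercise does not prescribe one.
CORRELATION_HEADER = "X-Correlation-ID"


def gateway(headers: dict) -> dict:
    """Step 2: keep the client's ID if present, otherwise generate a new one."""
    forwarded = dict(headers)
    if not forwarded.get(CORRELATION_HEADER):
        forwarded[CORRELATION_HEADER] = str(uuid.uuid4())
    return forwarded


def service_hop(name: str, headers: dict, log: list) -> dict:
    """Steps 4-5: extract the ID, log it, and return headers for the next call."""
    correlation_id = headers[CORRELATION_HEADER]
    log.append((name, correlation_id))
    return {CORRELATION_HEADER: correlation_id}


def trace_request(client_headers: dict) -> list:
    """Run a request through the gateway and three services; return the log entries."""
    log = []
    headers = gateway(client_headers)
    for service in ("service-a", "service-b", "service-c"):
        headers = service_hop(service, headers, log)
    return log
```

The important property is that only the entry point ever generates an ID. Every later hop copies the value it received, so all log entries for one request share a single identifier.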

Database Schema

No database schema is required for the correlation IDs themselves, since they are transient metadata carried in headers and logs. The log store's schema, however, should include an indexed 'correlation_id' field for fast querying.
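To make the indexed field concrete, here is a small sketch using an in-memory SQLite database as a stand-in for the log store. The table name, column names, and helper functions are hypothetical; a system such as Elasticsearch would express the same idea as a keyword field in its index mapping.

```python
import sqlite3


def create_log_store() -> sqlite3.Connection:
    """Create an in-memory log table with an index on correlation_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE logs (
            id INTEGER PRIMARY KEY,
            logged_at TEXT NOT NULL,       -- ISO 8601 timestamp
            service TEXT NOT NULL,
            correlation_id TEXT NOT NULL,
            message TEXT NOT NULL
        )
        """
    )
    # The index is what keeps "all logs for request X" fast as the table grows.
    conn.execute("CREATE INDEX idx_logs_correlation_id ON logs (correlation_id)")
    return conn


def record_log(conn, logged_at: str, service: str, correlation_id: str, message: str) -> None:
    """Insert one log entry."""
    conn.execute(
        "INSERT INTO logs (logged_at, service, correlation_id, message) VALUES (?, ?, ?, ?)",
        (logged_at, service, correlation_id, message),
    )


def logs_for_request(conn, correlation_id: str) -> list:
    """Return (service, message) pairs for one request, in time order."""
    rows = conn.execute(
        "SELECT service, message FROM logs WHERE correlation_id = ? ORDER BY logged_at, id",
        (correlation_id,),
    )
    return rows.fetchall()
```

A single query by `correlation_id` then reconstructs the path of one request across every service that logged it.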

Scaling Discussion

Bottlenecks

High volume of logs with correlation IDs can overwhelm logging infrastructure

Propagation overhead may increase latency if correlation ID handling is inefficient

Message brokers may drop correlation ID headers if not configured properly

Inconsistent correlation ID formats can cause tracing failures

Solutions

Use efficient, asynchronous logging with batching and compression to handle log volume

Implement lightweight middleware for correlation ID handling to minimize latency

Configure message brokers to preserve headers and validate correlation ID presence

Standardize correlation ID format (e.g., UUID v4) and validate on entry points
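Validation at the entry point might look like the sketch below, which assumes the standardized format is a canonical, hyphenated UUID v4 string. The function names are hypothetical, and replacing an invalid ID with a fresh one is only one possible policy; rejecting the request is another.

```python
import uuid


def is_valid_correlation_id(value) -> bool:
    """Return True only for a canonical, hyphenated UUID v4 string."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, URN prefixes, and unhyphenated hex,
    # so compare against the canonical form to enforce a single format.
    return parsed.version == 4 and str(parsed) == value.lower()


def accept_or_replace(value) -> str:
    """Assumed policy: keep a valid incoming ID, otherwise generate a new one."""
    if value is not None and is_valid_correlation_id(value):
        return value
    return str(uuid.uuid4())
```

Enforcing one format at the edge means downstream services and log queries never have to handle variants of the same identifier.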

Interview Tips

Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and data flow, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.

Explain the importance of correlation IDs for debugging and tracing in microservices

Describe how and where correlation IDs are generated and propagated

Discuss synchronous and asynchronous propagation challenges

Highlight integration with logging and monitoring systems

Address performance and scalability considerations

Mention industry best practices and standards for correlation IDs, such as the W3C Trace Context traceparent header used by distributed tracing systems

Practice

1. What is the primary purpose of a Correlation ID in microservices? (easy)

A. To balance load between servers

B. To encrypt data between services

C. To track a single request across multiple services for easier debugging

D. To store user session information
