
Correlation IDs in Microservices - System Design Guide

Problem Statement
When multiple microservices handle parts of the same user request, it becomes nearly impossible to trace the full journey of that request across services. Without a shared identifier, debugging failures or performance issues requires sifting through disconnected logs, causing delays and errors in root cause analysis.
Solution
Correlation IDs assign a unique identifier to each user request at the entry point and pass this ID through all downstream services. Each service logs this ID alongside its own logs, enabling developers to trace the entire request flow end-to-end by filtering logs with the same correlation ID.
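The filtering step can be illustrated with a minimal sketch. The `LogEntry` type, the service names, and the sample log lines are all hypothetical; in practice the filtering would more likely happen in a log aggregation tool than in application code.

```python
from dataclasses import dataclass


@dataclass
class LogEntry:
    timestamp: float      # seconds since an arbitrary reference point
    service: str
    correlation_id: str
    message: str


def trace_request(entries, correlation_id):
    """Return the log entries for one request, ordered by time."""
    matching = [e for e in entries if e.correlation_id == correlation_id]
    return sorted(matching, key=lambda e: e.timestamp)


# Hypothetical, interleaved logs from three services handling two requests.
logs = [
    LogEntry(1.00, "gateway", "req-1", "received order request"),
    LogEntry(1.02, "gateway", "req-2", "received order request"),
    LogEntry(1.10, "payment", "req-2", "charge approved"),
    LogEntry(1.05, "inventory", "req-1", "stock reserved"),
    LogEntry(1.20, "payment", "req-1", "charge declined"),
]

trace = trace_request(logs, "req-1")
for entry in trace:
    print(f"{entry.timestamp:.2f} {entry.service}: {entry.message}")
```

Filtering on `req-1` leaves only the three entries for that request, in the order they happened, even though the services wrote their logs independently.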
Architecture
[Diagram: Client/User -> API Gateway -> Services A, B, C]

This diagram shows a client request entering through an API Gateway where a correlation ID is assigned. The ID flows through multiple services (A, B, C), allowing logs from all services to be linked by the same correlation ID.
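The flow in the diagram might look roughly like this when the ID travels in a request header. This is an in-process sketch, not real network code: the services are plain functions, a dict stands in for HTTP headers, the call chain gateway -> A -> B -> C is one possible reading of the diagram, and the `record` helper and `seen` list are illustrative assumptions. `X-Correlation-ID` is a common naming convention, not a standard.

```python
import logging
import uuid

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("demo")

HEADER = "X-Correlation-ID"  # conventional name, assumed here
seen = []  # (service, correlation_id) pairs, kept so the flow can be inspected


def record(service, headers):
    """Log the correlation ID that this service received."""
    cid = headers.get(HEADER)
    seen.append((service, cid))
    log.info("[%s] correlation_id=%s", service, cid)


def service_c(headers):
    record("service-c", headers)


def service_b(headers):
    record("service-b", headers)
    service_c({HEADER: headers[HEADER]})  # forward the same ID


def service_a(headers):
    record("service-a", headers)
    service_b({HEADER: headers[HEADER]})  # forward the same ID


def api_gateway(client_headers):
    headers = dict(client_headers)
    # Entry point: keep a caller-supplied ID, otherwise assign a new one.
    headers.setdefault(HEADER, str(uuid.uuid4()))
    record("api-gateway", headers)
    service_a({HEADER: headers[HEADER]})
    return headers[HEADER]


assigned = api_gateway({})
```

Every hop logs the ID the gateway assigned, so all four log lines can be linked.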

Trade-offs
✓ Pros
Enables end-to-end tracing of requests across distributed services.
Simplifies debugging by linking logs from multiple services with a single ID.
Improves monitoring and alerting by correlating related events.
Supports distributed tracing tools and observability platforms.
✗ Cons
Requires all services to propagate the correlation ID consistently.
Adds slight overhead in request headers and logging.
Needs careful handling to avoid ID loss or duplication in asynchronous flows.
Use when your system has multiple microservices handling parts of the same user request. The need grows with scale: as a rough guide, with more than 10 services or over 1000 requests per second, piecing logs together without a shared ID becomes impractical.
Avoid if your system is a single monolith or has very simple request flows where tracing across services is unnecessary.
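The asynchronous-flow concern from the cons list can be sketched with Python's `contextvars` module, which gives each `asyncio` task its own copy of a context variable. The handler and step names below are hypothetical. This only covers concurrency inside one process; when work crosses a message queue or another service, the ID still has to travel explicitly in the message metadata or headers.

```python
import asyncio
import contextvars

# Each asyncio task sees its own value for this variable.
correlation_id = contextvars.ContextVar("correlation_id", default=None)
observed = []  # (step name, correlation ID seen by that step)


async def downstream_step(name):
    await asyncio.sleep(0.01)  # yield control so the two requests interleave
    observed.append((name, correlation_id.get()))


async def handle_request(cid):
    correlation_id.set(cid)  # visible only within this task's context
    await downstream_step(f"handler-for-{cid}")


async def main():
    # gather() wraps each coroutine in its own task with a copied context
    await asyncio.gather(handle_request("req-1"), handle_request("req-2"))


asyncio.run(main())
print(sorted(observed))
```

Although both requests run concurrently, neither overwrites the other's ID, which is the kind of mix-up a plain global variable would cause.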
Real World Examples
Uber
Uber uses correlation IDs to trace ride requests across dozens of microservices, enabling quick diagnosis of delays or failures in the booking process.
Netflix
Netflix propagates correlation IDs through its streaming and recommendation services to monitor user sessions and troubleshoot playback issues.
Amazon
Amazon uses correlation IDs to track orders as they move through inventory, payment, and shipping microservices, ensuring smooth order fulfillment.
Code Example
The before code logs requests without any shared identifier, making it hard to trace. The after code assigns a unique correlation ID at the entry point and passes it explicitly to downstream services, which log it to enable tracing.
### Before: No correlation ID propagation
import logging

def service_a(request):
    logging.info(f"Processing request {request}")
    # calls service_b without passing any ID
    service_b()

def service_b():
    logging.info("Processing in service B")


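### After: Correlation ID propagation
import logging
import uuid

# The root logger defaults to WARNING, so info() calls are dropped without this
logging.basicConfig(level=logging.INFO)

class RequestContext:
    """Minimal request object; correlation_id stays None until one is assigned."""
    correlation_id = None

def service_a(request):
    # Assign a correlation ID at the entry point if the request has none.
    # Checking for None (not hasattr) also covers the class-level default above.
    if getattr(request, 'correlation_id', None) is None:
        request.correlation_id = str(uuid.uuid4())
    logging.info(f"Processing request {request} with correlation_id={request.correlation_id}")
    # Pass the ID explicitly so service B logs the same value
    service_b(request.correlation_id)

def service_b(correlation_id):
    logging.info(f"Processing in service B with correlation_id={correlation_id}")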
Alternatives
Distributed Tracing
Distributed tracing extends correlation IDs by adding timing and metadata at each service hop to build a detailed trace.
Use when: you need detailed performance insights and latency breakdowns beyond simple request correlation.
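To make the difference concrete, here is a hand-rolled sketch of what "timing and metadata at each hop" could mean. The `span` helper and its fields are assumptions for illustration and do not mirror the API of any particular tracing library; real tracing tools also handle sampling, export, and cross-process propagation.

```python
import time
import uuid
from contextlib import contextmanager

spans = []  # finished spans, appended in the order they complete


@contextmanager
def span(trace_id, service, parent=None):
    """Time one hop and record it under the shared trace ID."""
    span_id = uuid.uuid4().hex[:8]
    start = time.perf_counter()
    try:
        yield span_id
    finally:
        spans.append({
            "trace_id": trace_id,
            "span_id": span_id,
            "parent": parent,  # links this hop to the one that called it
            "service": service,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })


trace_id = str(uuid.uuid4())  # plays the role of the correlation ID
with span(trace_id, "api-gateway") as gateway_span:
    with span(trace_id, "service-a", parent=gateway_span) as a_span:
        time.sleep(0.01)  # stand-in for real work
        with span(trace_id, "service-b", parent=a_span):
            time.sleep(0.02)

for s in spans:
    print(f"{s['service']:<12} {s['duration_ms']:.1f} ms")
```

A plain correlation ID would tell you these three hops belong together; the per-hop durations and parent links additionally show where the time went.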
Logging Context Propagation
Logging context propagation automatically attaches context like user ID or session ID to logs without a unique correlation ID.
Use when: you want to enrich logs with context but do not require full request tracing across services.
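One way this could be done in Python is a `logging.Filter` that copies values from a context variable onto every log record. The field names (`user_id`, `session_id`), the logger name, and the sample values are hypothetical.

```python
import contextvars
import io
import logging

# Holds per-request context; the default is an empty dict that is never mutated.
request_context = contextvars.ContextVar("request_context", default={})


class ContextFilter(logging.Filter):
    """Attach the current request context to every log record."""

    def filter(self, record):
        ctx = request_context.get()
        record.user_id = ctx.get("user_id", "-")
        record.session_id = ctx.get("session_id", "-")
        return True  # never drop the record, only enrich it


stream = io.StringIO()  # captured here so the result is easy to inspect
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("user=%(user_id)s session=%(session_id)s %(message)s")
)
handler.addFilter(ContextFilter())

logger = logging.getLogger("context_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

request_context.set({"user_id": "u-42", "session_id": "s-7"})
logger.info("profile updated")

output = stream.getvalue()
print(output, end="")
```

The calling code logs only `"profile updated"`; the user and session fields are added automatically. The same mechanism can carry a correlation ID if one is later introduced.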
Summary
Correlation IDs uniquely tag each user request to trace it across multiple microservices.
They enable easier debugging and monitoring by linking logs from different services.
Proper propagation and logging of correlation IDs are essential for effective observability.

Practice

1. What is the primary purpose of a Correlation ID in microservices?
easy
A. To balance load between servers
B. To encrypt data between services
C. To track a single request across multiple services for easier debugging
D. To store user session information

Solution

  1. Step 1: Understand the role of Correlation ID

    A Correlation ID is a unique identifier attached to a request that travels through multiple services.
  2. Step 2: Identify its main use

    This ID helps developers trace and debug the flow of that request across distributed systems.
  3. Final Answer:

    To track a single request across multiple services for easier debugging -> Option C
  4. Quick Check:

    Correlation ID = request tracking [OK]
Hint: Correlation ID links logs of one request across services [OK]
Common Mistakes:
  • Confusing Correlation ID with encryption keys
  • Thinking it balances load
  • Assuming it stores user data
2. Which of the following is the correct way to pass a Correlation ID between microservices?
easy
A. Add it as a custom HTTP header in the request
B. Include it as a query parameter in the URL
C. Store it in a database before each call
D. Embed it inside the response body

Solution

  1. Step 1: Review common practices for passing metadata

    Metadata such as a Correlation ID is typically passed in HTTP headers, which keeps it out of URLs and payloads and works the same way for every request.
  2. Step 2: Evaluate options

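    Query parameters sit in the URL, where they are easily altered and often end up in access logs; writing to a database before each call is inefficient; the response body arrives too late, after the downstream service has already done its work.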
  3. Final Answer:

    Add it as a custom HTTP header in the request -> Option A
  4. Quick Check:

    Correlation ID in headers = best practice [OK]
Hint: Use HTTP headers to pass Correlation IDs [OK]
Common Mistakes:
  • Using query parameters which can be insecure
  • Storing IDs in database for each call
  • Embedding IDs in response body instead of request
3. Consider this simplified code snippet in a microservice receiving an HTTP request:
def handle_request(request):
    correlation_id = request.headers.get('X-Correlation-ID')
    log(f"Start processing request {correlation_id}")
    # ... process ...
    log(f"End processing request {correlation_id}")
What will be logged if the incoming request has header X-Correlation-ID: abc123?
medium
A. No logs will be generated
B. Start processing request abc123 End processing request abc123
C. Start processing request X-Correlation-ID End processing request X-Correlation-ID
D. Start processing request None End processing request None

Solution

  1. Step 1: Extract Correlation ID from headers

    The code uses request.headers.get('X-Correlation-ID') which returns the header value if present.
  2. Step 2: Check the header value in the request

    The request has X-Correlation-ID: abc123, so correlation_id will be 'abc123'.
  3. Final Answer:

    Start processing request abc123 End processing request abc123 -> Option B
  4. Quick Check:

    Header value read correctly = logs with abc123 [OK]
Hint: headers.get returns the value or None; here the value exists [OK]
Common Mistakes:
  • Assuming header key is logged instead of value
  • Thinking None is logged when header exists
  • Believing no logs are generated
4. A developer notices that Correlation IDs are missing in logs of downstream services. Which is the most likely cause?
medium
A. The Correlation ID header is not forwarded in outgoing requests
B. The Correlation ID is too long and gets truncated
C. The logging system does not support string messages
D. The services are using different programming languages

Solution

  1. Step 1: Understand Correlation ID propagation

    Correlation IDs must be passed along with every outgoing request to maintain traceability.
  2. Step 2: Identify common propagation mistake

    If downstream logs miss the ID, it usually means the header was not forwarded properly.
  3. Final Answer:

    The Correlation ID header is not forwarded in outgoing requests -> Option A
  4. Quick Check:

    Missing ID in logs = header not forwarded [OK]
Hint: Always forward Correlation ID header downstream [OK]
Common Mistakes:
  • Blaming length truncation without evidence
  • Assuming logging system limitation
  • Thinking language differences cause missing IDs
5. You design a microservices system where each service generates its own Correlation ID if none is provided. What is a potential problem with this approach?
hard
A. It ensures better security by hiding original IDs
B. It improves performance by reducing header size
C. It simplifies logging by having unique IDs per service
D. It breaks the traceability because multiple IDs exist for the same request

Solution

  1. Step 1: Analyze the effect of generating new IDs per service

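    If a service creates a fresh Correlation ID whenever it does not receive one, any hop that fails to forward the ID causes the next service to start a new, unrelated ID, and the link to the original request is lost.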
  2. Step 2: Understand impact on traceability

    Multiple different IDs for one request make it impossible to follow the full request path across services.
  3. Final Answer:

    It breaks the traceability because multiple IDs exist for the same request -> Option D
  4. Quick Check:

    Multiple IDs = broken traceability [OK]
Hint: One Correlation ID per request across services only [OK]
Common Mistakes:
  • Thinking multiple IDs improve security
  • Assuming performance improves with new IDs
  • Believing unique IDs per service simplify logs
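The failure mode in question 5 can be reproduced with a short simulation. The `resolve_id` and `run_chain` helpers are hypothetical, a dict stands in for HTTP headers, and the three-hop chain is an assumption chosen to keep the sketch small.

```python
import uuid

HEADER = "X-Correlation-ID"


def resolve_id(headers):
    """Reuse the incoming ID; generate one only if the header is absent."""
    return headers.get(HEADER) or str(uuid.uuid4())


def run_chain(forward_header):
    """Simulate gateway -> service A -> service B and return the ID each hop logged."""
    ids = []
    headers = {}  # the client sends no correlation ID
    for _service in ("gateway", "service-a", "service-b"):
        cid = resolve_id(headers)
        ids.append(cid)
        # A hop that forgets to forward the header sends an empty dict onward.
        headers = {HEADER: cid} if forward_header else {}
    return ids


broken = run_chain(forward_header=False)
fixed = run_chain(forward_header=True)
print(len(set(broken)), "distinct IDs when the header is dropped")
print(len(set(fixed)), "distinct ID when the header is forwarded")
```

Generating an ID when none arrives is fine at the entry point. The trouble starts only when intermediate hops drop the header, because every later service then silently starts its own ID.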