Distributed tracing (Jaeger, Zipkin) in Microservices - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is distributed tracing in microservices?
Distributed tracing is a method to track and observe requests as they flow through multiple microservices, helping to understand system behavior and diagnose issues.
beginner
Name two popular distributed tracing tools.
Jaeger and Zipkin are two widely used open-source distributed tracing tools.
intermediate
What is a 'span' in distributed tracing?
A span represents a single unit of work or operation within a trace, such as a request to a microservice or a database call.
intermediate
How does distributed tracing help in debugging microservices?
It shows the path of a request across services with timing details, helping identify slow or failing components quickly.
beginner
What is the role of a trace ID in distributed tracing?
A trace ID uniquely identifies a single request as it travels through multiple services, linking all spans together.
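The relationship between a trace ID and its spans can be sketched as a small data model. This is an illustrative sketch only, not the internal representation used by Jaeger or Zipkin: the `Span` class, the helper names, and the example operation names are all hypothetical. The ID sizes (16 random bytes for a trace, 8 for a span) follow a common convention and are an assumption here.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


def new_id(num_bytes: int) -> str:
    """Return a random lowercase hex identifier of the given byte length."""
    return secrets.token_hex(num_bytes)


@dataclass
class Span:
    """One unit of work; hypothetical model for illustration."""
    trace_id: str                      # shared by every span of one request
    name: str                          # operation name, e.g. an HTTP route
    parent_id: Optional[str] = None    # None marks the root span
    span_id: str = field(default_factory=lambda: new_id(8))


def start_trace(name: str) -> Span:
    """Create a root span with a fresh trace ID."""
    return Span(trace_id=new_id(16), name=name)


def start_child(parent: Span, name: str) -> Span:
    """Create a child span that inherits the parent's trace ID."""
    return Span(trace_id=parent.trace_id, name=name, parent_id=parent.span_id)


root = start_trace("GET /checkout")          # hypothetical operation names
db_call = start_child(root, "SELECT orders")
print(db_call.trace_id == root.trace_id)     # both spans belong to one trace
```

Every span carries the same trace ID, and each child points at its parent's span ID, which is what lets a tracing backend rebuild the request as a tree.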
Which of the following best describes a 'trace' in distributed tracing?
A. A log file of errors
B. A single operation within a service
C. A collection of spans representing a single request journey
D. A database query
What is the primary purpose of Jaeger and Zipkin?
A. To monitor network traffic
B. To deploy microservices
C. To store user data
D. To trace requests across microservices
Which component in distributed tracing represents a single operation or work unit?
A. Span
B. Service mesh
C. Trace ID
D. Log entry
How does distributed tracing improve system observability?
A. By encrypting data
B. By showing request flow and timing across services
C. By reducing server load
D. By caching responses
What unique identifier links all spans of a single request in distributed tracing?
A. Trace ID
B. User ID
C. Session ID
D. Service ID
Explain how distributed tracing works in a microservices environment using Jaeger or Zipkin.
Think about how a request travels and how tracing tools capture each step.
Describe the benefits of using distributed tracing for debugging and monitoring microservices.
Consider how tracing helps find problems quickly.

      Practice

      1. What is the main purpose of distributed tracing tools like Jaeger or Zipkin in microservices?
      easy
      A. To track and visualize requests as they flow through multiple services
      B. To store large amounts of user data securely
      C. To replace load balancers in service communication
      D. To encrypt network traffic between microservices

      Solution

      1. Step 1: Understand the role of distributed tracing

        Distributed tracing tools help monitor how requests move through different microservices by collecting timing and metadata.
      2. Step 2: Identify the main function of Jaeger and Zipkin

        They visualize and analyze traces made of spans to find bottlenecks or errors in service chains.
      3. Final Answer:

        To track and visualize requests as they flow through multiple services -> Option A
      4. Quick Check:

        Distributed tracing = tracking request flow across services [OK]
      Hint: Distributed tracing = tracking requests across services [OK]
      Common Mistakes:
      • Confusing tracing with data storage
      • Thinking tracing replaces load balancers
      • Assuming tracing encrypts traffic
      2. Which of the following is the correct way to propagate trace context between microservices using HTTP headers?
      easy
      A. Add Cookie header with span ID
      B. Add Authorization header with trace ID
      C. Add X-B3-TraceId and X-B3-SpanId headers to the outgoing request
      D. Add Content-Type header with trace ID value

      Solution

      1. Step 1: Recall standard trace context headers

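        Zipkin's B3 propagation format uses dedicated headers such as X-B3-TraceId and X-B3-SpanId to pass trace context between services. The W3C Trace Context traceparent header serves the same purpose in other setups.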
      2. Step 2: Identify correct header usage

        Headers like Authorization, Content-Type, or Cookie are unrelated to tracing context propagation.
      3. Final Answer:

        Add X-B3-TraceId and X-B3-SpanId headers to the outgoing request -> Option C
      4. Quick Check:

        Trace context headers = X-B3-TraceId, X-B3-SpanId [OK]
      Hint: Trace context uses X-B3 headers, not auth or content-type [OK]
      Common Mistakes:
      • Using unrelated HTTP headers for trace context
      • Forgetting to propagate span ID
      • Confusing trace ID with authentication tokens
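Header propagation for question 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the code of any tracing library: `continue_trace` is a hypothetical helper, plain dictionaries stand in for real HTTP requests, and the incoming span is treated as the direct parent of the outgoing call. The header names themselves (X-B3-TraceId, X-B3-SpanId, X-B3-ParentSpanId) come from the B3 propagation format.

```python
import secrets
from typing import Dict, Mapping


def continue_trace(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Build B3 headers for an outgoing call made while handling `incoming`."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in incoming.items()}
    trace_id = lowered.get("x-b3-traceid")
    parent_span_id = lowered.get("x-b3-spanid")

    outgoing = {
        # Reuse the caller's trace ID, or start a new trace if none arrived.
        "X-B3-TraceId": trace_id or secrets.token_hex(16),
        # Every outgoing call gets its own span ID.
        "X-B3-SpanId": secrets.token_hex(8),
    }
    if trace_id and parent_span_id:
        # Link the new span back to the span that triggered it.
        outgoing["X-B3-ParentSpanId"] = parent_span_id
    return outgoing


# Example request headers (values are made up for illustration).
incoming_headers = {"x-b3-traceid": "a" * 32, "X-B3-SpanId": "b" * 16}
outgoing_headers = continue_trace(incoming_headers)
print(outgoing_headers["X-B3-TraceId"] == incoming_headers["x-b3-traceid"])
```

The trace ID is copied unchanged while the span ID is regenerated, which is why forgetting either header breaks the parent-child chain.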
      3. Given the following trace spans collected by Zipkin, what is the total time taken by the root request?
      Span A (root): start=0ms, duration=50ms
      Span B (child of A): start=10ms, duration=20ms
      Span C (child of A): start=35ms, duration=10ms
      medium
      A. 50ms
      B. 40ms
      C. 30ms
      D. 60ms

      Solution

      1. Step 1: Understand root span duration

        The root span duration represents the total time of the entire request, including child spans.
      2. Step 2: Analyze given spans

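        Span A starts at 0ms and lasts 50ms. Span B (10ms to 30ms) and Span C (35ms to 45ms) both run inside that window, so the total time is 50ms; child durations are not added on top of the root.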
      3. Final Answer:

        50ms -> Option A
      4. Quick Check:

        Root span duration = total request time = 50ms [OK]
      Hint: Root span duration = total request time [OK]
      Common Mistakes:
      • Adding child spans durations incorrectly
      • Ignoring root span duration
      • Confusing start times with total duration
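The arithmetic in question 3 can be checked with a short sketch. The `SpanTiming` class and helper functions are hypothetical names used only for this example; the start and duration values are the ones given in the question. The self-time calculation assumes the child spans do not overlap each other, which holds for these values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SpanTiming:
    """Timing of one span, in milliseconds relative to the request start."""
    name: str
    start_ms: int
    duration_ms: int

    @property
    def end_ms(self) -> int:
        return self.start_ms + self.duration_ms


def is_within(parent: SpanTiming, child: SpanTiming) -> bool:
    """True if the child starts and ends inside the parent's time window."""
    return parent.start_ms <= child.start_ms and child.end_ms <= parent.end_ms


def self_time_ms(parent: SpanTiming, children: List[SpanTiming]) -> int:
    """Time spent in the parent itself, assuming children do not overlap."""
    return parent.duration_ms - sum(c.duration_ms for c in children)


root = SpanTiming("A", start_ms=0, duration_ms=50)
children = [
    SpanTiming("B", start_ms=10, duration_ms=20),
    SpanTiming("C", start_ms=35, duration_ms=10),
]

print(root.duration_ms)                            # 50: total request time
print(all(is_within(root, c) for c in children))   # True: children nest inside A
print(self_time_ms(root, children))                # 20: time not covered by B or C
```

Summing the child durations gives 30ms, which is time already contained in the root's 50ms rather than an addition to it.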
      4. You notice that your distributed tracing data in Jaeger shows many missing spans for some services. What is the most likely cause?
      medium
      A. The network latency is too low
      B. The services have too many CPU cores
      C. The database is down
      D. The services are not propagating the trace context headers correctly

      Solution

      1. Step 1: Identify cause of missing spans

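        If spans are missing, it usually means trace context was not passed properly between services. A service that receives no trace headers starts a new trace, so its spans never join the original one.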
      2. Step 2: Eliminate unrelated causes

        CPU cores, database status, or low network latency do not cause missing trace spans.
      3. Final Answer:

        The services are not propagating the trace context headers correctly -> Option D
      4. Quick Check:

        Missing spans = trace context not propagated [OK]
      Hint: Missing spans? Check trace context propagation [OK]
      Common Mistakes:
      • Blaming unrelated system resources
      • Ignoring header propagation
      • Assuming network latency causes missing spans
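The effect described in question 4 can be reproduced with a toy simulation. This is a sketch under simplifying assumptions: no real network calls are made, `call_chain` is a hypothetical helper, the service names are invented, and only the trace ID header is modelled.

```python
import secrets
from typing import Dict, List, Tuple

TRACE_HEADER = "X-B3-TraceId"


def call_chain(services: List[Tuple[str, bool]]) -> Dict[str, str]:
    """Simulate one request passing through services in order.

    Each entry is (service_name, forwards_headers). Returns the trace ID
    that each service recorded its span under.
    """
    recorded: Dict[str, str] = {}
    headers: Dict[str, str] = {}
    for name, forwards_headers in services:
        # A service with no incoming context has to start a new trace.
        trace_id = headers.get(TRACE_HEADER) or secrets.token_hex(16)
        recorded[name] = trace_id
        # A service that drops the header sends the next hop nothing.
        headers = {TRACE_HEADER: trace_id} if forwards_headers else {}
    return recorded


# Hypothetical service names; "orders" forgets to forward context in `broken`.
healthy = call_chain([("gateway", True), ("orders", True), ("billing", True)])
broken = call_chain([("gateway", True), ("orders", False), ("billing", True)])

print(len(set(healthy.values())))   # 1: every span shares one trace
print(len(set(broken.values())))    # 2: billing ended up in a separate trace
```

In the broken chain, the trace that starts at the gateway shows no billing span, even though billing handled the request and reported a span under a different trace ID.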
      5. You want to design a distributed tracing system for a microservices architecture with 100 services and high request volume. Which approach best ensures scalability and minimal overhead?
      hard
      A. Trace every request fully and store all spans in a single central database
      B. Use sampling to trace only a subset of requests and propagate trace context with lightweight headers
      C. Disable trace context propagation and log spans locally in each service
      D. Use synchronous calls to the tracing backend for every span creation

      Solution

      1. Step 1: Consider scalability needs

        Tracing every request fully in a large system causes high overhead and storage issues.
      2. Step 2: Identify best practice for high volume tracing

        Sampling reduces load by tracing only some requests, and lightweight headers keep propagation efficient.
      3. Step 3: Eliminate poor options

        Disabling propagation loses the link between spans, synchronous calls to the backend add latency to every request, and a single central database can become a bottleneck.
      4. Final Answer:

        Use sampling to trace only a subset of requests and propagate trace context with lightweight headers -> Option B
      5. Quick Check:

        Sampling + lightweight headers = scalable tracing [OK]
      Hint: Sampling + lightweight headers = scalable tracing [OK]
      Common Mistakes:
      • Tracing all requests causing overhead
      • Ignoring trace context propagation
      • Using synchronous calls causing latency
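The sampling idea in question 5 can be sketched as a deterministic decision derived from the trace ID, so that every service reaches the same verdict for a given request without coordination. This is one possible approach, not the sampler shipped with Jaeger or Zipkin: `should_sample` is a hypothetical function, the 10% rate is an arbitrary example, and trace IDs are assumed to be random hex strings of at least 16 characters.

```python
import random


def should_sample(trace_id: str, rate: float) -> bool:
    """Decide whether to keep a trace, using only its ID and the rate."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    # Interpret the low 64 bits of the ID as a uniform number in [0, 2**64).
    bucket = int(trace_id[-16:], 16)
    return bucket < rate * 2**64


# Generate example trace IDs with a fixed seed so the run is repeatable.
rng = random.Random(0)
trace_ids = [f"{rng.getrandbits(128):032x}" for _ in range(10_000)]

kept = sum(should_sample(trace_id, 0.10) for trace_id in trace_ids)
print(f"kept {kept} of {len(trace_ids)} traces")
```

Because the decision depends only on the trace ID, a request is either traced end to end or not at all, which avoids partial traces while keeping storage and overhead proportional to the chosen rate.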