
Mutual TLS between services in Microservices - Scalability & System Analysis

Scalability Analysis - Mutual TLS between services
Growth Table: Mutual TLS between services
Users / Services | What Changes?
100 users / 10 services | Basic mTLS setup with certificates issued by internal CA. Low latency impact. Simple certificate rotation.
10,000 users / 100 services | Certificate management grows complex. Need automated certificate issuance and rotation. Increased CPU usage for TLS handshakes.
1,000,000 users / 1,000+ services | High TLS handshake overhead impacts service latency. Certificate revocation and trust management become challenging. Need centralized certificate management and caching TLS sessions.
100,000,000 users / 10,000+ services | Network bandwidth and CPU load from TLS dominate. Must implement TLS session resumption, hardware acceleration, and distributed trust stores. Monitoring and alerting critical.
First Bottleneck

The first bottleneck is the CPU load on service instances due to TLS handshake overhead. Each mutual TLS connection requires cryptographic operations that consume CPU. As the number of services and connections grows, CPU becomes saturated, increasing latency and reducing throughput.

Scaling Solutions
  • Session Resumption: Use TLS session tickets or session IDs (PSK-based resumption in TLS 1.3) to avoid full handshakes on repeated connections.
  • Connection Pooling: Reuse TLS connections between services to reduce handshake frequency.
  • Hardware Acceleration: Use CPUs with crypto acceleration or dedicated TLS offload hardware.
  • Centralized Certificate Management: Automate certificate issuance, rotation, and revocation with tools like SPIFFE/SPIRE or Vault.
  • Load Balancing: Distribute traffic to avoid CPU hotspots.
  • Caching Trust Data: Cache certificate validation results to reduce repeated expensive operations.
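To make the effect of connection pooling concrete, here is a minimal sketch that only counts handshakes. No real TLS is performed. The names `SimulatedConnection`, `ConnectionPool`, and `count_handshakes` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SimulatedConnection:
    """Stand-in for an established mTLS connection to one peer (hypothetical)."""
    peer: str


@dataclass
class ConnectionPool:
    """Keeps one open connection per peer so the handshake cost is paid once."""
    handshakes: int = 0
    _open: dict = field(default_factory=dict)

    def handshake(self, peer: str) -> SimulatedConnection:
        # A real client would run the full mTLS handshake here.
        self.handshakes += 1
        return SimulatedConnection(peer)

    def get(self, peer: str) -> SimulatedConnection:
        # Reuse the existing connection if there is one; handshake otherwise.
        if peer not in self._open:
            self._open[peer] = self.handshake(peer)
        return self._open[peer]


def count_handshakes(requests: list, pooled: bool) -> int:
    """Count handshakes needed to serve the given sequence of peer names."""
    pool = ConnectionPool()
    for peer in requests:
        if pooled:
            pool.get(peer)
        else:
            pool.handshake(peer)  # a fresh connection for every request
    return pool.handshakes


requests = ["orders", "billing", "inventory"] * 100
print(count_handshakes(requests, pooled=False))  # 300
print(count_handshakes(requests, pooled=True))   # 3
```

A production pool would also need idle timeouts, health checks, and a cap on connections per peer, which this sketch leaves out.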
Back-of-Envelope Cost Analysis
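  • Assume about 1,000 concurrent connections per server and ~10-50 ms of CPU time per full mTLS handshake (a conservative estimate; the real cost depends on key type and hardware).
  • At 10,000 services, with 10 handshakes per second each, total TLS handshakes = 100,000/sec.
  • At 10-50 ms each, 100,000 handshakes/sec keep roughly 1,000-5,000 CPU cores busy, far more than a single server can supply, so both handshake reduction and horizontal scaling are needed.
  • Storage for certificates: Each certificate ~2KB, 10,000 services = ~20MB, manageable in memory.
  • Network bandwidth impact: TLS record framing adds roughly 5-10% overhead for small messages and less for large payloads.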
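The estimates above can be reproduced with a few lines of arithmetic. This is a rough sketch using the section's own inputs (10,000 services, 10 handshakes per second each, 10-50 ms of CPU per handshake, 2 KB per certificate). The function names are made up for the example.

```python
def handshake_cpu_cores(services: int, handshakes_per_service: int, cpu_ms: int) -> float:
    """CPU cores kept busy by handshakes alone (one core supplies 1000 ms of CPU per second)."""
    handshakes_per_sec = services * handshakes_per_service
    return handshakes_per_sec * cpu_ms / 1000


def cert_storage_mb(services: int, cert_kb: int) -> float:
    """Memory needed to hold one certificate per service, in MB (1 MB = 1000 KB)."""
    return services * cert_kb / 1000


low = handshake_cpu_cores(10_000, 10, cpu_ms=10)
high = handshake_cpu_cores(10_000, 10, cpu_ms=50)

print(f"handshakes/sec: {10_000 * 10}")                             # 100000
print(f"CPU cores for handshakes: {low:.0f} to {high:.0f}")         # 1000 to 5000
print(f"certificate storage: {cert_storage_mb(10_000, 2):.0f} MB")  # 20 MB
```

The core count scales linearly with the handshake rate, which is why cutting the number of full handshakes is usually a cheaper first step than adding servers.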
Interview Tip

Start by explaining what mutual TLS is and why it is used for service-to-service authentication and encryption. Then discuss how TLS handshake overhead impacts CPU and latency as scale grows. Mention certificate management complexity. Finally, propose concrete scaling solutions like session resumption, connection pooling, and automated certificate management. Use numbers to justify bottlenecks and solutions.

Self Check

Your database handles 1000 QPS. Traffic grows 10x. What do you do first?

Answer: At 10x traffic the database becomes the bottleneck, so first add read replicas and caching to reduce its load. The same reasoning applies to mutual TLS: if handshake CPU is the bottleneck, first enable TLS session resumption and connection reuse to cut the number of full handshakes.

Key Result
Mutual TLS scales well initially but CPU overhead from TLS handshakes becomes the first bottleneck as services and connections grow; session resumption and automated certificate management are key to scaling.

Practice

1. What is the main purpose of using Mutual TLS between microservices?
easy
A. To allow services to communicate without encryption
B. To speed up the communication between services
C. To ensure both services authenticate each other before communication
D. To store service data securely on disk

Solution

  1. Step 1: Understand Mutual TLS authentication

    Mutual TLS requires both client and server to present certificates proving their identity.
  2. Step 2: Identify the purpose in microservices

    This ensures only trusted services communicate securely, preventing unauthorized access.
  3. Final Answer:

    To ensure both services authenticate each other before communication -> Option C
  4. Quick Check:

    Mutual TLS = mutual authentication [OK]
Hint: Mutual TLS means both sides prove who they are [OK]
Common Mistakes:
  • Thinking it only encrypts data without authentication
  • Assuming it speeds up communication
  • Confusing it with data storage security
2. Which of the following is the correct step to enable Mutual TLS in a microservice?
easy
A. Disable certificate verification on both services
B. Share the same private key among all services
C. Use plain HTTP instead of HTTPS
D. Configure each service with its own certificate and trust store

Solution

  1. Step 1: Identify certificate requirements

    Each service must have its own certificate and trust store to verify others.
  2. Step 2: Understand security best practices

    Disabling verification or sharing keys breaks security and is incorrect.
  3. Final Answer:

    Configure each service with its own certificate and trust store -> Option D
  4. Quick Check:

    Certificates + trust store = Mutual TLS setup [OK]
Hint: Each service needs its own certificate and trust store [OK]
Common Mistakes:
  • Disabling certificate verification to simplify setup
  • Using HTTP which is unencrypted
  • Sharing private keys causing security risks
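As an illustration of "its own certificate and trust store", the following sketch builds server-side and client-side contexts with Python's standard `ssl` module. The function names and the file-path parameters are assumptions for this example. The paths are optional only so the sketch runs without certificate files on disk; a real service would always supply them.

```python
import ssl
from typing import Optional


def build_server_context(cert_file: Optional[str] = None,
                         key_file: Optional[str] = None,
                         ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Server side of mTLS: present our certificate and require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject clients that send no valid certificate
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)  # this service's identity
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # trust store: the internal CA
    return ctx


def build_client_context(cert_file: Optional[str] = None,
                         key_file: Optional[str] = None,
                         ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Client side of mTLS: verify the server and present our own certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # CERT_REQUIRED and hostname checks by default
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)
    return ctx


server_ctx = build_server_context()
client_ctx = build_client_context()
print(server_ctx.verify_mode == ssl.CERT_REQUIRED)  # True
print(client_ctx.check_hostname)                    # True
```

Each service gets its own certificate and private key, while the CA file is the only thing shared, which matches the correct option above.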
3. Given two microservices A and B configured with Mutual TLS, what happens if service B presents an expired certificate during handshake?
medium
A. Service A accepts the connection without checks
B. Service A rejects the connection due to invalid certificate
C. Service B automatically renews the certificate
D. The connection proceeds but logs a warning

Solution

  1. Step 1: Understand certificate validation in Mutual TLS

    Certificates must be valid and trusted; expired certificates are rejected.
  2. Step 2: Identify handshake behavior on invalid certificates

    If service B's certificate is expired, service A will reject the connection to maintain security.
  3. Final Answer:

    Service A rejects the connection due to invalid certificate -> Option B
  4. Quick Check:

    Expired certificate = connection rejected [OK]
Hint: Expired cert means connection is rejected [OK]
Common Mistakes:
  • Assuming expired certs are accepted with warnings
  • Thinking certificates auto-renew during handshake
  • Believing connection proceeds without checks
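The TLS library performs the expiry check itself during the handshake, so services do not normally write it by hand. The sketch below only illustrates the rule, using the `notAfter` string format that Python's `ssl` module reports for certificates. The helper name `is_expired` is hypothetical.

```python
import ssl
import time
from typing import Optional


def is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """Return True if the certificate's notAfter time lies in the past."""
    expires_at = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch
    current = time.time() if now is None else now
    return current > expires_at


# Fix the "current" time so the example is deterministic.
check_time = ssl.cert_time_to_seconds("Jan  1 00:00:00 2024 GMT")
print(is_expired("Jun  1 12:00:00 2023 GMT", now=check_time))  # True
print(is_expired("Jun  1 12:00:00 2025 GMT", now=check_time))  # False
```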
4. A microservice fails to establish Mutual TLS with another service. The error logs show "certificate unknown". What is the most likely cause?
medium
A. The service's certificate is not signed by a trusted CA
B. The service is using HTTP instead of HTTPS
C. The private key is missing from the service
D. The service is using a self-signed certificate but trusts it

Solution

  1. Step 1: Analyze the error "certificate unknown"

    This error means the certificate presented is not recognized or trusted by the other service.
  2. Step 2: Identify cause related to trust

    If the certificate is not signed by a trusted CA, the other service will reject it as unknown.
  3. Final Answer:

    The service's certificate is not signed by a trusted CA -> Option A
  4. Quick Check:

    Untrusted CA = certificate unknown error [OK]
Hint: Certificate unknown means untrusted CA signature [OK]
Common Mistakes:
  • Confusing HTTP usage with certificate errors
  • Assuming missing private key causes this error
  • Believing self-signed certs are trusted by default
5. You need to design a microservices system with Mutual TLS where services dynamically scale up and down. Which approach best ensures secure and scalable certificate management?
hard
A. Use a centralized certificate authority with automated certificate issuance and rotation
B. Manually generate and distribute certificates to each service instance
C. Disable Mutual TLS during scaling to avoid certificate issues
D. Use the same certificate for all service instances to simplify management

Solution

  1. Step 1: Understand challenges of scaling with Mutual TLS

    Dynamic scaling requires automated certificate management to avoid manual errors and delays.
  2. Step 2: Evaluate options for secure and scalable management

    A centralized CA with automation allows issuing and rotating certificates securely as instances scale.
  3. Step 3: Reject insecure or manual approaches

    Manual distribution is error-prone, disabling TLS reduces security, and sharing certificates risks compromise.
  4. Final Answer:

    Use a centralized certificate authority with automated certificate issuance and rotation -> Option A
  5. Quick Check:

    Central CA + automation = scalable Mutual TLS [OK]
Hint: Automate certs with central CA for scaling [OK]
Common Mistakes:
  • Manually managing certs for each instance
  • Disabling Mutual TLS to avoid complexity
  • Sharing certificates across instances
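Automated rotation comes down to deciding when each instance should request a fresh certificate. A minimal sketch follows, assuming a hypothetical policy of renewing once half of the certificate's lifetime has elapsed. The function name and the 0.5 threshold are illustrative and not taken from any specific tool.

```python
from datetime import datetime, timedelta, timezone


def should_rotate(issued_at: datetime, expires_at: datetime,
                  now: datetime, fraction: float = 0.5) -> bool:
    """Return True once `fraction` of the certificate lifetime has elapsed."""
    lifetime = expires_at - issued_at
    return now >= issued_at + lifetime * fraction


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
expires = issued + timedelta(hours=24)  # short-lived certificate (assumed lifetime)

print(should_rotate(issued, expires, now=issued + timedelta(hours=6)))   # False
print(should_rotate(issued, expires, now=issued + timedelta(hours=13)))  # True
```

Renewing well before expiry leaves time to retry if the certificate authority is briefly unavailable, which matters when instances scale up and down continuously.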