
Service mesh concept in Microservices - System Design Guide

Problem Statement
As the number of microservices grows, managing communication between them becomes complex. Requests get lost, retry behavior is inconsistent, and security gaps appear because each service handles networking and observability differently. The result is unreliable service-to-service communication and issues that are hard to debug.
Solution
A service mesh adds a dedicated infrastructure layer that manages all service-to-service communication. It uses lightweight proxies alongside each service to handle routing, retries, security, and monitoring uniformly. This separates communication logic from business code, making interactions reliable and observable without changing the services themselves.
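The idea of moving retry logic out of business code can be sketched in a few lines. This is a toy, in-process illustration only: real sidecar proxies run as separate processes next to the service, and the names `SidecarProxy`, `RetryPolicy`, `TransientError`, and `make_flaky_service` are hypothetical, not part of any mesh product.

```python
from dataclasses import dataclass
from typing import Callable


class TransientError(Exception):
    """Hypothetical stand-in for a retryable network failure."""


@dataclass
class RetryPolicy:
    max_attempts: int = 3  # total tries, including the first one

    def __post_init__(self) -> None:
        if self.max_attempts < 1:
            raise ValueError("max_attempts must be at least 1")


class SidecarProxy:
    """Toy proxy that applies one retry policy to every outbound call."""

    def __init__(self, policy: RetryPolicy) -> None:
        self.policy = policy
        self.attempts_made = 0  # minimal telemetry: attempts seen by the proxy

    def call(self, upstream: Callable[[str], str], request: str) -> str:
        last_error = None
        for _ in range(self.policy.max_attempts):
            self.attempts_made += 1
            try:
                return upstream(request)
            except TransientError as err:
                last_error = err  # remember the failure and try again
        raise last_error  # all attempts failed


def make_flaky_service(failures_before_success: int) -> Callable[[str], str]:
    """Build a fake service that fails a fixed number of times, then succeeds."""
    remaining = {"n": failures_before_success}

    def handler(request: str) -> str:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TransientError("connection reset")
        return f"handled:{request}"

    return handler


# The calling service contains no retry code; the proxy handles it.
proxy = SidecarProxy(RetryPolicy(max_attempts=3))
result = proxy.call(make_flaky_service(2), "GET /orders")
```

In this sketch the service handler knows nothing about retries, so changing the policy means changing proxy configuration rather than service code.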
Architecture
[Diagram: Service A, Service B, Control Plane]

This diagram shows microservices each paired with a proxy that manages communication. The control plane configures and monitors these proxies to enforce policies and collect telemetry.
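The relationship between the control plane and the proxies might be modeled roughly as follows. This is a simplified sketch under stated assumptions: the classes `ControlPlane` and `Proxy`, and the policy keys `mtls` and `max_retries`, are hypothetical, and real meshes distribute configuration over the network rather than through direct method calls.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Proxy:
    """Toy data-plane proxy: holds pushed config and counts requests."""
    service: str
    config: Dict[str, object] = field(default_factory=dict)
    request_count: int = 0

    def apply_config(self, config: Dict[str, object]) -> None:
        self.config = dict(config)  # copy so proxies do not share state

    def handle(self, request: str) -> str:
        self.request_count += 1
        return f"{self.service} handled {request}"


class ControlPlane:
    """Toy control plane: pushes one policy to all proxies, reads telemetry."""

    def __init__(self) -> None:
        self.proxies: List[Proxy] = []
        self.policy: Dict[str, object] = {}

    def register(self, proxy: Proxy) -> None:
        self.proxies.append(proxy)
        proxy.apply_config(self.policy)  # new proxies get the current policy

    def set_policy(self, **policy: object) -> None:
        self.policy.update(policy)
        for proxy in self.proxies:
            proxy.apply_config(self.policy)  # push the change everywhere

    def collect_telemetry(self) -> Dict[str, int]:
        return {p.service: p.request_count for p in self.proxies}


control = ControlPlane()
proxy_a = Proxy("service-a")
proxy_b = Proxy("service-b")
control.register(proxy_a)
control.register(proxy_b)

control.set_policy(mtls=True, max_retries=2)
proxy_a.handle("GET /orders")
telemetry = control.collect_telemetry()
```

The point of the sketch is the direction of flow: policy moves from the control plane down to every proxy, and metrics move from the proxies back up.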

Trade-offs
✓ Pros
Centralizes communication features like retries, load balancing, and security without changing service code.
Improves observability by collecting detailed metrics and traces for all service interactions.
Enables fine-grained security policies such as mutual TLS between services.
Simplifies complex microservice networking with consistent behavior across services.
✗ Cons
Adds operational complexity and resource overhead due to sidecar proxies running alongside each service.
Adds some latency because every request passes through a proxy on both the sending and the receiving side.
Requires expertise to configure and maintain the control plane and proxies correctly.
Use when: You run dozens of microservices or more that require secure, reliable, and observable communication at scale, especially in dynamic environments like Kubernetes.
Avoid when: The system has fewer than 10 services or simple communication needs, as the added complexity and resource cost may outweigh the benefits.
Real World Examples
Google
Co-created the Istio service mesh with IBM and Lyft to manage service communication and security on Kubernetes.
Lyft
Created the Envoy proxy, which meshes such as Istio use as their sidecar, to handle resilient service-to-service communication and observability.
IBM
Uses a service mesh to enforce security policies and monitor microservices in hybrid cloud environments.
Alternatives
API Gateway
API Gateway manages external client-to-service traffic, while service mesh manages internal service-to-service communication.
Use when: You need to control and secure traffic entering your system from outside clients.
Client-side Load Balancing
Client-side load balancing requires each service to implement communication logic, whereas service mesh centralizes this in proxies.
Use when: You want a simpler setup with fewer infrastructure components and can modify service code.
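For comparison, client-side load balancing puts the endpoint selection inside the calling service. A minimal round-robin sketch, assuming a fixed list of endpoints (the class name `RoundRobinClient` and the addresses are hypothetical, and real clients would also refresh the list from service discovery):

```python
import itertools
from typing import List


class RoundRobinClient:
    """Toy client-side balancer: the service itself picks the next endpoint."""

    def __init__(self, endpoints: List[str]) -> None:
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)

    def next_endpoint(self) -> str:
        return next(self._cycle)


client = RoundRobinClient(["10.0.0.1:8080", "10.0.0.2:8080"])
picks = [client.next_endpoint() for _ in range(3)]
```

Every service that makes outbound calls needs code like this (or a shared library), which is the duplication a service mesh moves into the proxies.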
Summary
A service mesh manages communication between microservices using sidecar proxies and a control plane.
It improves reliability, security, and observability without changing application code.
A service mesh is best suited to large, complex microservice environments, but it adds operational overhead.