
Centralized logging (ELK stack) in Microservices - System Design Guide

Problem Statement
When multiple microservices generate logs independently, tracing an issue across services quickly becomes very difficult. Logs scattered across different hosts and files delay debugging and increase the risk of missing critical errors during incidents.
Solution
Centralized logging collects logs from all microservices into a single system where they are indexed and searchable. In the ELK stack, Logstash ingests and processes the logs, Elasticsearch stores and indexes them, and Kibana provides dashboards for real-time analysis and troubleshooting.
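Ingestion is usually easier when every service emits logs in one structured format. The following is a minimal sketch, not a prescribed setup: it uses only Python's standard logging module to write one JSON object per line. The field names ("@timestamp", "service", "level", "logger", "message"), the helper build_logger, and the service name "orders-service" are assumptions chosen for illustration.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    def __init__(self, service_name: str) -> None:
        super().__init__()
        self.service_name = service_name

    def format(self, record: logging.LogRecord) -> str:
        doc = {
            # Field names below are illustrative assumptions, not a required schema.
            "@timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "service": self.service_name,
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            doc["exception"] = self.formatException(record.exc_info)
        return json.dumps(doc)


def build_logger(service_name: str, stream) -> logging.Logger:
    """Hypothetical helper: return a logger that writes JSON lines to `stream`."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service_name))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("orders-service", sys.stdout)
    log.info("order %s created", "A-1001")
```

Because every service would share the same field names, a single query on "service" or "level" can then cut across all of them once the logs are indexed.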
Architecture
[Diagram: Microservice 1, Microservice 2 -> Logstash -> Elasticsearch -> Kibana UI]

This diagram shows multiple microservices sending logs to Logstash, which processes and forwards them to Elasticsearch for storage and indexing. Kibana provides a user interface to visualize and analyze the logs.
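The flow in the diagram can be modeled in a few lines to make the role of each stage concrete. This is a toy, in-memory sketch and does not use the real Logstash, Elasticsearch, or Kibana APIs. The log line format, the names parse_line, InMemoryIndex, and run_pipeline, and the sample services are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Set

# Assumed plain-text format: "<timestamp> <LEVEL> <service> <message>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+(?P<message>.*)$"
)


def parse_line(line: str) -> Optional[Dict[str, str]]:
    """Processing stage (Logstash's role): raw line -> structured event."""
    match = LINE_PATTERN.match(line.strip())
    return match.groupdict() if match else None


class InMemoryIndex:
    """Storage stage (Elasticsearch's role), reduced to a toy inverted index."""

    def __init__(self) -> None:
        self.docs: List[Dict[str, str]] = []
        self.by_term: Dict[str, Set[int]] = defaultdict(set)

    def index(self, doc: Dict[str, str]) -> None:
        doc_id = len(self.docs)
        self.docs.append(doc)
        # Map every word in the message to the documents that contain it.
        for term in re.findall(r"\w+", doc["message"].lower()):
            self.by_term[term].add(doc_id)

    def search(self, term: str, service: Optional[str] = None) -> List[Dict[str, str]]:
        """Query stage (the kind of search a Kibana user would run)."""
        hits = [self.docs[i] for i in sorted(self.by_term.get(term.lower(), set()))]
        if service is not None:
            hits = [d for d in hits if d["service"] == service]
        return hits


def run_pipeline(lines: List[str], index: InMemoryIndex) -> int:
    """Send raw lines through parse -> index; return the count of unparseable lines."""
    rejected = 0
    for line in lines:
        event = parse_line(line)
        if event is None:
            rejected += 1
        else:
            index.index(event)
    return rejected


if __name__ == "__main__":
    idx = InMemoryIndex()
    run_pipeline(
        [
            "2024-01-01T10:00:00Z ERROR payments card declined for order 42",
            "2024-01-01T10:00:01Z INFO orders order 42 created",
        ],
        idx,
    )
    for hit in idx.search("order"):
        print(hit["service"], hit["level"], hit["message"])
```

Searching for one term returns matching events from both services at once, which is the cross-service view that per-service log files cannot give.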

Trade-offs
✓ Pros
Enables quick troubleshooting by aggregating logs in one place.
Supports complex queries and real-time monitoring through Elasticsearch and Kibana.
Improves operational visibility across distributed microservices.
✗ Cons
Introduces additional infrastructure components that require maintenance.
Logstash can become a bottleneck if not scaled properly.
Requires careful configuration to handle log volume and avoid data loss.
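The bottleneck and data-loss concerns above can be illustrated with a bounded buffer placed between a service and its log shipper. This is a minimal sketch under stated assumptions: the class BoundedLogBuffer and its methods offer and drain are hypothetical names, and the drop-when-full policy is only one possible choice (blocking the caller or spilling to disk are others).

```python
import queue
from typing import List


class BoundedLogBuffer:
    """Hypothetical in-process buffer between a service and its log shipper."""

    def __init__(self, capacity: int) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=capacity)
        # Simple counter; not synchronized, so treat it as approximate
        # if many threads drop events at the same time.
        self.dropped = 0

    def offer(self, event: str) -> bool:
        """Try to enqueue without blocking; count the event as dropped if full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, max_batch: int) -> List[str]:
        """Remove and return up to `max_batch` events for shipping downstream."""
        batch: List[str] = []
        while len(batch) < max_batch:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


if __name__ == "__main__":
    buf = BoundedLogBuffer(capacity=2)
    for event in ["a", "b", "c"]:
        buf.offer(event)
    print("shipped:", buf.drain(max_batch=10), "dropped:", buf.dropped)
```

Tracking the dropped count makes loss visible: a steadily rising value suggests the downstream stage is not keeping up and needs more capacity.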
Use when: running multiple microservices that generate high volumes of logs needing centralized analysis, especially in production environments with frequent incidents.
Avoid when: the system is small (fewer than 5 services) or log volume is low, so the overhead of centralized logging outweighs its benefits.
Real World Examples
Netflix
Uses the ELK stack to aggregate logs from thousands of microservices to quickly detect and resolve streaming issues.
Uber
Centralizes logs from diverse services to monitor ride requests and driver activities in real-time.
Shopify
Employs ELK to analyze logs for troubleshooting payment processing and order fulfillment microservices.
Alternatives
Fluentd + Elasticsearch + Kibana
Uses Fluentd instead of Logstash for log collection and processing, which can be lighter and more flexible.
Use when: you need a more lightweight or cloud-native log collector with similar centralized logging capabilities.
Cloud-native logging services (e.g., AWS CloudWatch, Google Cloud Logging)
Uses managed cloud services for log aggregation and analysis instead of a self-hosted ELK stack.
Use when: you operate primarily on cloud platforms and prefer managed services to reduce operational overhead.
Distributed tracing (e.g., Jaeger, Zipkin)
Focuses on tracing requests across services rather than collecting logs, providing a different perspective on system behavior.
Use when: you need to understand request flows and latency across microservices rather than raw log data.
Summary
Centralized logging aggregates logs from multiple microservices into one searchable system.
The ELK stack processes, stores, and visualizes logs to speed up troubleshooting and monitoring.
It is best suited for complex systems with many services and high log volumes.