
Centralized logging (ELK stack) in Microservices - System Design Exercise

Design: Centralized Logging System using ELK Stack
In scope: log collection, storage, search, and visualization. Out of scope: log generation and the internal logging implementation of each microservice.
Functional Requirements
FR1: Collect logs from multiple microservices in real-time
FR2: Store logs centrally for easy search and analysis
FR3: Provide a dashboard for monitoring logs with filtering and alerting
FR4: Support log retention for at least 30 days
FR5: Handle at least 10,000 log events per second
FR6: Ensure logs are searchable with p99 query latency under 200ms
Non-Functional Requirements
NFR1: System must be highly available with 99.9% uptime
NFR2: Logs must be securely transmitted and stored
NFR3: The system should scale horizontally as log volume grows
NFR4: Minimal impact on microservices performance when sending logs
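The throughput and retention requirements (FR4, FR5) imply a storage budget that is worth estimating before choosing components. The sketch below is a back-of-envelope calculation only: the 500-byte average event size and the single replica copy are assumptions, not figures from the requirements, and it ignores index overhead and compression.

```python
# Back-of-envelope sizing for FR4 (30-day retention) and FR5 (10,000 events/s).
# ASSUMPTIONS (not from the requirements): average event size and replica count.
EVENTS_PER_SECOND = 10_000   # FR5
RETENTION_DAYS = 30          # FR4
AVG_EVENT_BYTES = 500        # assumed average size of one log event
REPLICAS = 1                 # assumed number of replica copies per shard

SECONDS_PER_DAY = 24 * 60 * 60


def storage_estimate(events_per_second, retention_days, avg_event_bytes, replicas):
    """Return raw ingest and storage figures in bytes (no compression, no index overhead)."""
    events_per_day = events_per_second * SECONDS_PER_DAY
    primary_bytes_per_day = events_per_day * avg_event_bytes
    total_bytes = primary_bytes_per_day * retention_days * (1 + replicas)
    return {
        "ingest_bytes_per_second": events_per_second * avg_event_bytes,
        "events_per_day": events_per_day,
        "primary_bytes_per_day": primary_bytes_per_day,
        "total_bytes": total_bytes,
    }


estimate = storage_estimate(EVENTS_PER_SECOND, RETENTION_DAYS, AVG_EVENT_BYTES, REPLICAS)
for name, value in estimate.items():
    print(f"{name}: {value:,}")
```

Under these assumptions the system ingests about 5 MB/s, writes about 432 GB of primary data per day, and holds roughly 26 TB across the 30-day window once the replica is counted. A different event size scales every figure linearly.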
Think Before You Design
Key Components
Log shippers (e.g., Filebeat, Logstash)
Message queue or buffer for log ingestion
Elasticsearch cluster for storage and search
Kibana for visualization and dashboards
Security components like TLS and authentication
Design Patterns
Log aggregation
Event streaming with buffering
Indexing and search optimization
Horizontal scaling and sharding
Data retention and archival
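The horizontal scaling and sharding pattern relies on routing each log event to one of several partitions in a stable way. The sketch below shows one possible routing function; `partition_for` is a hypothetical helper, and it uses CRC32 purely for illustration (Kafka clients apply their own hash to message keys).

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index with a stable hash.

    CRC32 is an illustrative choice here, not the hash Kafka itself uses.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Events from the same service always land on the same partition,
# which preserves per-service ordering for downstream consumers.
for service in ("orders", "payments", "inventory"):
    print(service, "->", partition_for(service, 6))
```

Keying by service name keeps each service's logs in order but can create a hot partition when one service is much noisier than the rest. Keying by trace_id or host spreads load more evenly at the cost of per-service ordering.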
Reference Architecture
Microservices --> Filebeat --> Kafka (buffer) --> Logstash --> Elasticsearch Cluster --> Kibana Dashboard
Components
Component | Technology | Responsibility
Microservices | Any microservice framework | Generate application logs in a structured format
Filebeat / Logstash | Elastic Beats / Logstash | Collect and forward logs from microservices reliably
Kafka | Apache Kafka | Buffer logs to absorb spikes and decouple producers from consumers
Elasticsearch Cluster | Elasticsearch | Store, index, and provide fast search over logs
Kibana | Kibana | Visualize logs, create dashboards, and set alerts
Request Flow
1. Microservices generate logs and write to local files or stdout.
2. Filebeat agents installed on microservice hosts read logs and forward them to Kafka (or directly to Logstash when no buffer is deployed).
3. Kafka buffers incoming logs to handle bursts and ensure durability.
4. Logstash consumes logs from Kafka, processes and transforms them if needed, then sends to Elasticsearch.
5. Elasticsearch indexes logs for fast search and stores them with retention policies.
6. Kibana connects to Elasticsearch to provide dashboards and alerting interfaces for users.
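The steps above can be sketched end to end with in-memory stand-ins. This is a simplified illustration, not a real deployment: a `queue.Queue` plays the role of the Kafka topic, `ship` and `drain` are hypothetical helpers standing in for Filebeat and Logstash, and the daily `logs-YYYY.MM.DD` index name is an assumed convention. The output of `bulk_body` follows the newline-delimited format expected by the Elasticsearch `_bulk` endpoint.

```python
import json
import queue
from datetime import datetime, timezone

# Stand-in for a Kafka topic (step 3); the size limit is an arbitrary assumption.
buffer = queue.Queue(maxsize=1000)


def ship(raw_line: str, host: str) -> None:
    """Step 2 stand-in: a shipper forwards a raw log line plus its host."""
    buffer.put({"host": host, "line": raw_line})


def drain(max_batch: int) -> list:
    """Step 4 stand-in: consume up to max_batch items, parse JSON, add the host field."""
    batch = []
    while len(batch) < max_batch:
        try:
            item = buffer.get_nowait()
        except queue.Empty:
            break
        event = json.loads(item["line"])
        event["host"] = item["host"]
        batch.append(event)
    return batch


def daily_index(event: dict, prefix: str = "logs") -> str:
    """Derive a per-day index name from the event timestamp (assumed naming scheme)."""
    ts = datetime.fromisoformat(event["timestamp"]).astimezone(timezone.utc)
    return f"{prefix}-{ts:%Y.%m.%d}"


def bulk_body(events: list) -> str:
    """Step 5 input: one action line and one source line per event, newline-terminated."""
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": daily_index(event)}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"


ship('{"timestamp": "2024-05-01T12:00:00+00:00", "service_name": "orders", '
     '"log_level": "ERROR", "message": "payment failed"}', host="host-1")
print(bulk_body(drain(max_batch=500)))
```

Batching in `drain` is what lets the consumer amortize the cost of each indexing request. In a real pipeline the batch size and flush interval would be tuned against the ingest rate.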
Database Schema
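Elasticsearch stores each log event as a JSON document with the fields timestamp, service_name, log_level, message, trace_id, host, and custom tags. Elasticsearch indexes every field by default, so no separate index is created per field. Instead, map timestamp as a date and service_name as a keyword so that time-range and exact-match filters stay efficient.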
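One way to express this schema is an explicit index mapping plus a filter query, shown here as plain Python dictionaries that could be sent as request bodies. The field types are a suggested choice rather than a prescribed one, the `tags` field name is an assumption for the "custom tags" above, and `errors_for_service` is a hypothetical helper.

```python
# Suggested mapping: keyword fields for exact-match filters, text for full-text search.
LOG_MAPPING = {
    "mappings": {
        "properties": {
            "timestamp": {"type": "date"},
            "service_name": {"type": "keyword"},
            "log_level": {"type": "keyword"},
            "message": {"type": "text"},
            "trace_id": {"type": "keyword"},
            "host": {"type": "keyword"},
            "tags": {"type": "keyword"},  # assumed name for custom tags
        }
    }
}


def errors_for_service(service_name: str, since: str = "now-15m") -> dict:
    """Build a search body for recent ERROR logs of one service.

    Filter clauses skip relevance scoring, which suits log lookups.
    """
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"service_name": service_name}},
                    {"term": {"log_level": "ERROR"}},
                    {"range": {"timestamp": {"gte": since}}},
                ]
            }
        },
        "sort": [{"timestamp": {"order": "desc"}}],
        "size": 100,
    }


print(errors_for_service("orders"))
```

Restricting every query to a time range and a service keeps the number of shards and documents touched small, which is the main lever for the p99 latency target in FR6.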
Scaling Discussion
Bottlenecks
Log ingestion rate exceeding Kafka or Logstash capacity
Elasticsearch cluster storage and query performance limits
Network bandwidth between log shippers on microservice hosts and the Kafka/Logstash tier
Kibana dashboard performance with large datasets
Solutions
Scale Kafka brokers horizontally and partition topics for parallelism
Add Elasticsearch nodes and use sharding to distribute data
Use compression and batching in Filebeat to reduce network load
Implement index lifecycle management to archive or delete old logs
Optimize Kibana queries and use filters to limit data volume
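The index lifecycle management solution above can be sketched as a policy body with a hot phase that rolls indices over and a delete phase that enforces the 30-day retention from FR4. The rollover thresholds (one day, 50 GB per primary shard) are illustrative assumptions, and `ilm_policy` is a hypothetical helper that only builds the request body.

```python
RETENTION_DAYS = 30  # FR4


def ilm_policy(retention_days: int,
               rollover_age: str = "1d",
               max_primary_shard_size: str = "50gb") -> dict:
    """Build an ILM policy body: roll over in the hot phase, delete after retention."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return {
        "policy": {
            "phases": {
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": rollover_age,  # assumed threshold
                            "max_primary_shard_size": max_primary_shard_size,  # assumed threshold
                        }
                    }
                },
                "delete": {
                    # With rollover in use, min_age is measured from the rollover time.
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = ilm_policy(RETENTION_DAYS)
print(policy["policy"]["phases"]["delete"])
```

Deleting whole indices once they age out is much cheaper than deleting individual documents, which is why time-based indices and lifecycle policies are usually paired. An archival phase (for example, snapshots to object storage) could sit between the two phases if logs must outlive the search window.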
Interview Tips
Time: In a 45-minute interview, spend the first 10 minutes understanding requirements and clarifying scale, the next 20 minutes designing components and data flow, and 10 minutes discussing scaling and trade-offs. Keep the last 5 minutes for questions.
Explain how logs flow from microservices to Elasticsearch
Discuss buffering with Kafka to handle spikes
Highlight importance of indexing and search optimization
Mention security and data retention considerations
Describe scaling strategies for each component