Design: Microservices Monitoring Dashboard with Grafana
In scope: Metrics collection, storage, visualization, alerting, and access control. Out of scope: Microservices implementation, detailed alert notification channels.
Functional Requirements
FR1: Display real-time metrics from multiple microservices
FR2: Support customizable dashboards for different teams
FR3: Visualize key performance indicators (KPIs) such as latency, error rates, and throughput
FR4: Allow alerting based on threshold breaches
FR5: Handle up to 100 microservices with 10,000 metrics per second
FR6: Provide historical data for at least 30 days
FR7: Secure access with role-based permissions
Non-Functional Requirements
NFR1: API response latency for dashboard queries should be under 500ms (p99)
NFR2: System availability must be at least 99.9%
NFR3: Data retention for 30 days with efficient storage
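As a rough sanity check, FR5, FR6, and NFR2 can be turned into a back-of-envelope estimate. The sketch below reads "10,000 metrics per second" as 10,000 samples per second and assumes about 2 bytes per compressed sample. Both are assumptions for illustration, not figures from the requirements, and real TSDB compression varies with the data.

```python
# Back-of-envelope sizing for the stated requirements.
SAMPLES_PER_SECOND = 10_000      # FR5, read as samples per second
RETENTION_DAYS = 30              # FR6 / NFR3
AVAILABILITY = 0.999             # NFR2
BYTES_PER_SAMPLE = 2             # assumption, not from the requirements

SECONDS_PER_DAY = 24 * 60 * 60


def total_samples(rate: int, days: int) -> int:
    """Samples ingested over the retention window."""
    return rate * SECONDS_PER_DAY * days


def storage_gb(samples: int, bytes_per_sample: float) -> float:
    """Raw storage in gigabytes (10^9 bytes), before replication."""
    return samples * bytes_per_sample / 1e9


def allowed_downtime_minutes(availability: float, days: int) -> float:
    """Downtime budget over the window for a given availability target."""
    return (1 - availability) * days * 24 * 60


samples = total_samples(SAMPLES_PER_SECOND, RETENTION_DAYS)
gigabytes = storage_gb(samples, BYTES_PER_SAMPLE)
downtime = allowed_downtime_minutes(AVAILABILITY, RETENTION_DAYS)

print(f"samples retained: {samples:,}")
print(f"storage: {gigabytes:.1f} GB")
print(f"downtime budget: {downtime:.1f} minutes per {RETENTION_DAYS} days")
```

Under these assumptions the raw data set is modest in size, which suggests the harder problems are ingestion rate and query latency rather than disk capacity.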
Scaling Challenges
High ingestion rate of metrics causing storage and processing overload
Slow query response times due to large data volume
Authentication service becoming a single point of failure
Alert manager overwhelmed by frequent alerts
Dashboard UI performance degradation with many concurrent users
Solutions
Use a horizontally scalable, Prometheus-compatible storage layer such as Cortex or Thanos to distribute storage and ingestion load
Implement query caching and downsampling of older metrics to speed up queries
Deploy authentication service in a highly available cluster with load balancing
Rate-limit alerts and use deduplication in alert manager to reduce noise
Use Grafana’s built-in caching and optimize dashboard queries; scale Grafana instances behind a load balancer
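The alert deduplication and rate-limiting idea from the list above can be sketched in a few lines. This is a minimal illustration, not how any particular alert manager is implemented: the `AlertGate` class, its `quiet_seconds` parameter, and the service and alert names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AlertGate:
    """Suppresses repeats of the same alert inside a quiet window.

    Hypothetical helper for illustration; not an Alertmanager API.
    """
    quiet_seconds: float = 300.0
    _last_sent: dict = field(default_factory=dict)

    def should_notify(self, service: str, alert_name: str, now: float) -> bool:
        key = (service, alert_name)          # deduplication key
        last = self._last_sent.get(key)
        if last is not None and now - last < self.quiet_seconds:
            return False                     # duplicate inside the window
        self._last_sent[key] = now
        return True


gate = AlertGate(quiet_seconds=300)
events = [
    ("checkout", "HighErrorRate", 0),
    ("checkout", "HighErrorRate", 60),    # repeat inside window, suppressed
    ("payments", "HighLatency", 90),      # different key, sent
    ("checkout", "HighErrorRate", 400),   # window elapsed, sent again
]
sent = [event for event in events if gate.should_notify(*event)]
print(sent)
```

Keying on service and alert name means a noisy service cannot drown out alerts from other services, while a persistent problem is still re-announced once the quiet window passes.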
Interview Tips
Time: Spend 10 minutes understanding requirements and clarifying scope, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing and answering questions.
Explain the choice of Prometheus and Grafana as industry standards for metrics and dashboards
Discuss pull-based metrics collection for reliability and scalability
Highlight security with RBAC and authentication integration
Describe how alerting integrates with monitoring for proactive issue detection
Address scaling challenges with distributed TSDB and caching
Mention data retention and downsampling strategies for storage efficiency
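When discussing RBAC, a small sketch can make the idea concrete. The role names below mirror Grafana's basic roles (Viewer, Editor, Admin), but the numeric ordering, the action names, and the `is_allowed` helper are assumptions made for this example and do not represent Grafana's permission model.

```python
from enum import IntEnum


class Role(IntEnum):
    """Ordered roles; the ordering and action table are assumptions."""
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


# Minimum role required for each action (hypothetical action names).
REQUIRED_ROLE = {
    "view_dashboard": Role.VIEWER,
    "edit_dashboard": Role.EDITOR,
    "manage_users": Role.ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """Allow the action if the role meets the minimum; deny unknown actions."""
    required = REQUIRED_ROLE.get(action)
    return required is not None and role >= required


print(is_allowed(Role.VIEWER, "view_dashboard"))
print(is_allowed(Role.VIEWER, "edit_dashboard"))
print(is_allowed(Role.ADMIN, "delete_everything"))
```

Denying unknown actions by default is the safer choice here: a typo or a newly added action fails closed instead of silently granting access.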
Practice
1. What is the main purpose of a Grafana dashboard in microservices monitoring?
easy
A. To visually display system data for easy monitoring
B. To write code for microservices
C. To store microservice source files
D. To deploy microservices automatically
Solution
Step 1: Understand Grafana's role
Grafana is a tool used to create dashboards that show data visually.
Step 2: Connect purpose to microservices
Dashboards help monitor microservices by showing their data clearly.
Final Answer:
To visually display system data for easy monitoring -> Option A
Quick Check:
Grafana dashboards = Visual monitoring [OK]
Hint: Dashboards show data visually to monitor systems fast [OK]
Common Mistakes:
Confusing dashboards with code editors
Thinking dashboards deploy services
Assuming dashboards store source code
2. Which of the following is the correct way to add a new panel in a Grafana dashboard?
easy
A. Write a new SQL query in the dashboard settings
B. Click the '+' icon and select 'Add Panel'
C. Restart the Grafana server
D. Edit the microservice code
Solution
Step 1: Identify how to add panels in Grafana
Grafana uses a '+' icon to add new panels visually.
Step 2: Eliminate unrelated actions
Writing a SQL query or restarting the server does not add a panel.
Final Answer:
Click the '+' icon and select 'Add Panel' -> Option B
Quick Check:
Add panel = '+' icon click [OK]
Hint: Use '+' icon to add panels quickly [OK]
Common Mistakes:
Trying to add panels by restarting Grafana
Confusing panel addition with code editing
Assuming SQL query alone adds panels
3. Given this Grafana query panel configuration:
SELECT mean("response_time") FROM "service_metrics" WHERE $timeFilter GROUP BY time($__interval) fill(null)
What will this panel display?
medium
A. List of all service names
B. Total number of requests received
C. Current CPU usage of the server
D. Average response time over time intervals
Solution
Step 1: Analyze the query
The query (InfluxQL, which is SQL-like) calculates the mean (average) of "response_time" from "service_metrics", grouped by time intervals. fill(null) leaves intervals with no data empty.
Step 2: Understand the output meaning
This means the panel shows average response time over time, not counts or other metrics.
Final Answer:
Average response time over time intervals -> Option D
Quick Check:
mean(response_time) = average response time [OK]
Hint: mean() shows average values in Grafana queries [OK]
Common Mistakes:
Confusing mean with total count
Assuming query lists service names
Thinking it shows CPU usage
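To see what "mean grouped by time intervals" produces, the following sketch reimplements the idea in plain Python. It is an illustration of the semantics only, not the database's implementation; the `mean_by_interval` function and the sample values are hypothetical. Empty intervals come back as None, mirroring fill(null).

```python
from statistics import mean


def mean_by_interval(points, start, end, interval):
    """Average values per interval in [start, end); empty intervals give None."""
    result = []
    for bucket_start in range(start, end, interval):
        bucket_end = bucket_start + interval
        values = [v for ts, v in points if bucket_start <= ts < bucket_end]
        result.append((bucket_start, mean(values) if values else None))
    return result


# Hypothetical response_time samples as (seconds, milliseconds).
samples = [(0, 120), (20, 180), (70, 90), (190, 300)]
series = mean_by_interval(samples, start=0, end=240, interval=60)
print(series)
```

Each tuple is one point on the panel's time series: one averaged value per interval, with a gap where no samples arrived.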
4. You created a Grafana dashboard but the panels show 'No data'. What is the most likely cause?
medium
A. The data source is not connected or misconfigured
B. The dashboard theme is set to dark mode
C. The Grafana server needs a restart
D. The microservice code has a syntax error
Solution
Step 1: Identify common reasons for 'No data'
Panels usually show 'No data' when the data source is missing or misconfigured.
Step 2: Exclude unrelated causes
The theme has no effect on data, and a server restart rarely fixes the problem; a syntax error in microservice code does not directly affect what Grafana queries.
Final Answer:
The data source is not connected or misconfigured -> Option A
Quick Check:
No data = data source issue [OK]
Hint: Check data source connection first if no data appears [OK]
Common Mistakes:
Restarting server unnecessarily
Changing theme expecting data fix
Blaming microservice code syntax
5. You want to create a Grafana dashboard that shows error rates for multiple microservices over the last 24 hours. Which steps should you follow?
hard
A. Use Grafana to deploy microservices and monitor logs
B. Write microservice code to log errors, then restart Grafana server
C. Connect data source, create a dashboard, add panels with queries filtering errors by service and time
D. Install Grafana plugins, then export dashboard JSON without queries
Solution
Step 1: Connect the correct data source
Grafana needs a data source with microservice metrics to query error rates.
Step 2: Create dashboard and add panels with queries
Panels should query error rates (errors relative to total requests), filtered by service name over the last 24 hours.
Step 3: Customize time range and filters
Set time filter to last 24 hours and group by service for clear visualization.
Final Answer:
Connect data source, create a dashboard, add panels with queries filtering errors by service and time -> Option C
Quick Check:
Data source + queries + filters = dashboard [OK]
Hint: Always start with data source, then build queries in panels [OK]
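The calculation behind such a panel can be sketched in plain Python. This is a hedged illustration of what the query computes, not a Grafana or data source API: the `error_rates` function, the record layout, the service names, and the rule that HTTP status 500 or above counts as an error are all assumptions.

```python
from collections import defaultdict

DAY_SECONDS = 24 * 60 * 60


def error_rates(requests, now):
    """Return {service: errors / total} for requests in the last 24 hours.

    Each request is (timestamp, service, http_status); a status of 500 or
    above counts as an error (an assumption for this sketch).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for ts, service, status in requests:
        if now - DAY_SECONDS <= ts <= now:   # time filter: last 24 hours
            totals[service] += 1
            if status >= 500:
                errors[service] += 1
    return {svc: errors[svc] / totals[svc] for svc in totals}


now = 200_000
requests = [
    (now - 10, "checkout", 200),
    (now - 20, "checkout", 500),
    (now - 30, "payments", 200),
    (now - 40, "payments", 200),
    (now - DAY_SECONDS - 1, "payments", 503),  # older than 24h, ignored
]
rates = error_rates(requests, now)
print(rates)
```

The time filter and the per-service grouping correspond to the dashboard's time range and the group-by-service step described in the solution.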