
Infrastructure monitoring in Elasticsearch - Mini Project: Build & Apply

Infrastructure Monitoring with Elasticsearch
📖 Scenario: You are setting up a simple infrastructure monitoring system using Elasticsearch. You want to store server metrics like CPU and memory usage, then query them to find servers with high CPU usage.
🎯 Goal: Build an Elasticsearch index with server metrics, configure a threshold for high CPU usage, query the index to find servers exceeding that threshold, and display the results.
📋 What You'll Learn
Create an Elasticsearch index called server_metrics with sample server data
Add a variable cpu_threshold to set the CPU usage limit
Write a query to find servers with CPU usage greater than cpu_threshold
Print the names of servers exceeding the CPU threshold
💡 Why This Matters
🌍 Real World
Monitoring server health and performance is critical in IT operations to prevent downtime and optimize resources.
💼 Career
DevOps engineers and system administrators use Elasticsearch to collect, query, and analyze infrastructure metrics for proactive monitoring.
1
Create the server_metrics index with sample data
Create an Elasticsearch index called server_metrics and add these exact documents: {"server": "server1", "cpu": 55, "memory": 70}, {"server": "server2", "cpu": 85, "memory": 60}, {"server": "server3", "cpu": 40, "memory": 80}.
Hint

Use the Elasticsearch bulk API format to add multiple documents to the server_metrics index.
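As a minimal sketch of the bulk format mentioned in the hint, the Python below builds the newline-delimited JSON (NDJSON) body for the three documents. The helper name build_bulk_body is hypothetical; the body would be sent with POST /_bulk and the Content-Type application/x-ndjson, which this sketch does not do.

```python
import json

# The three sample documents from this step.
documents = [
    {"server": "server1", "cpu": 55, "memory": 70},
    {"server": "server2", "cpu": 85, "memory": 60},
    {"server": "server3", "cpu": 40, "memory": 80},
]


def build_bulk_body(index_name, docs):
    """Return an NDJSON string: one action line plus one source line per document.

    Hypothetical helper for illustration; it only builds the text of the request body.
    """
    lines = []
    for doc in docs:
        # Action line: index the following document into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("server_metrics", documents)
print(bulk_body)
```

Each document takes two lines, so the body has six lines in total. Indexing into server_metrics this way also creates the index if it does not exist yet, assuming automatic index creation is left enabled.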

2
Set the CPU usage threshold variable
Create a variable called cpu_threshold and set it to 70 to represent the CPU usage limit.
Hint

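Elasticsearch queries have no variables of their own. Keep cpu_threshold as a variable in your client code, or pass it as a parameter to a search template (stored as a Mustache script), so the limit of 70 is defined in one place.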

3
Query servers with CPU usage above cpu_threshold
Write an Elasticsearch query to find all documents in server_metrics where the cpu field is greater than 70.
Hint

Use a range query on the cpu field with gt set to 70.
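One way to tie the cpu_threshold variable from step 2 to the range query is to build the search body in client code. This is a sketch only: build_high_cpu_query is a hypothetical helper, and the resulting body would be sent with GET /server_metrics/_search.

```python
import json

# CPU usage limit from step 2.
cpu_threshold = 70


def build_high_cpu_query(threshold):
    """Return a search body matching documents whose cpu field is strictly greater than threshold."""
    return {"query": {"range": {"cpu": {"gt": threshold}}}}


search_body = build_high_cpu_query(cpu_threshold)
print(json.dumps(search_body, indent=2))
```

Using gt excludes a server sitting at exactly 70; gte would include it.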

4
Display the names of servers exceeding the CPU threshold
Print the server names from the query results where CPU usage is greater than 70.
Hint

Look at the hits.hits array in the search response and print the _source.server field for each hit.
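The sketch below shows how the server names could be pulled out of a search response. The sample_response dictionary is a trimmed, hand-written stand-in for what the search in step 3 might return for the sample data (only server2, at 85, is above 70); a real response carries more fields, such as took, _id, and _score.

```python
# Trimmed, illustrative stand-in for a search response (not captured from a real cluster).
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {
                "_index": "server_metrics",
                "_source": {"server": "server2", "cpu": 85, "memory": 60},
            },
        ],
    }
}


def servers_over_threshold(response):
    """Return the _source.server value of every hit in the response."""
    return [hit["_source"]["server"] for hit in response["hits"]["hits"]]


high_cpu_servers = servers_over_threshold(sample_response)
for name in high_cpu_servers:
    print(name)  # prints: server2
```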

Practice

1. What is the primary purpose of infrastructure monitoring in Elasticsearch?
easy
A. To create user accounts and manage permissions
B. To store large amounts of data permanently
C. To watch system health and detect issues early
D. To design the user interface of Kibana dashboards

Solution

  1. Step 1: Understand infrastructure monitoring

    Infrastructure monitoring means watching your systems to keep them healthy and catch problems early.
  2. Step 2: Relate to Elasticsearch context

    Elasticsearch provides APIs to check cluster and node status, which helps monitor system health.
  3. Final Answer:

    To watch system health and detect issues early -> Option C
  4. Quick Check:

    Infrastructure monitoring = watch health early [OK]
Hint: Monitoring means watching system health regularly [OK]
Common Mistakes:
  • Confusing monitoring with data storage
  • Thinking monitoring manages user accounts
  • Mixing monitoring with UI design
2. Which Elasticsearch API command correctly checks the cluster health status?
easy
A. GET /_cluster/health
B. POST /_cluster/status
C. GET /_nodes/stats
D. PUT /_cluster/health

Solution

  1. Step 1: Identify the correct HTTP method and endpoint

    The cluster health API uses the GET method, and its endpoint is /_cluster/health.
  2. Step 2: Eliminate incorrect options

    POST and PUT are not used for checking health; /_nodes/stats gives node stats, not cluster health.
  3. Final Answer:

    GET /_cluster/health -> Option A
  4. Quick Check:

    Cluster health API = GET /_cluster/health [OK]
Hint: Use GET method with /_cluster/health to check status [OK]
Common Mistakes:
  • Using POST or PUT instead of GET
  • Confusing node stats with cluster health
  • Using wrong endpoint paths
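For illustration, the request from question 2 could be prepared in Python with the standard library as follows. The base URL http://localhost:9200 is an assumption (an unsecured local node), and the sketch only builds the request object without sending it.

```python
import urllib.request

# Assumption: an unsecured Elasticsearch node listening locally on the default port.
BASE_URL = "http://localhost:9200"


def cluster_health_request(base_url):
    """Build (but do not send) a GET request for the cluster health endpoint."""
    return urllib.request.Request(f"{base_url}/_cluster/health", method="GET")


request = cluster_health_request(BASE_URL)
print(request.get_method(), request.full_url)
# prints: GET http://localhost:9200/_cluster/health

# Against a running cluster, the request could be sent with:
#     with urllib.request.urlopen(request) as response:
#         body = response.read()
```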
3. What will be the output status field when you run GET /_cluster/health on a healthy Elasticsearch cluster?
medium
A. { "status": "red" }
B. { "status": "green" }
C. { "status": "yellow" }
D. { "status": "blue" }

Solution

  1. Step 1: Understand cluster health status colors

    Green means all primary and replica shards are allocated, so the cluster is healthy.
  2. Step 2: Match output with healthy cluster

    A healthy cluster returns "green" as the status in the JSON response.
  3. Final Answer:

    { "status": "green" } -> Option B
  4. Quick Check:

    Healthy cluster status = green [OK]
Hint: Green status means cluster is fully healthy [OK]
Common Mistakes:
  • Confusing yellow or red as healthy
  • Expecting blue status which does not exist
  • Misreading JSON output format
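To make the three status colors concrete, here is a small sketch that interprets the status field of a cluster health response. The sample_health dictionary is a hand-written, trimmed example rather than real output, and describe_health is a hypothetical helper.

```python
# Meaning of each value the status field can take.
STATUS_MEANING = {
    "green": "all primary and replica shards are allocated",
    "yellow": "all primary shards are allocated, but some replicas are not",
    "red": "at least one primary shard is not allocated",
}

# Trimmed, illustrative stand-in for a GET /_cluster/health response.
sample_health = {
    "cluster_name": "demo-cluster",
    "status": "green",
    "number_of_nodes": 3,
    "unassigned_shards": 0,
}


def describe_health(health):
    """Return a one-line description of the cluster status."""
    status = health["status"]
    return f"{status}: {STATUS_MEANING.get(status, 'unknown status')}"


print(describe_health(sample_health))
# prints: green: all primary and replica shards are allocated
```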
4. You run GET /_nodes/stats but get a 404 error. What is the most likely cause?
medium
A. The API endpoint is incorrect or misspelled
B. You used POST instead of GET method
C. The cluster is down and unreachable
D. The node stats API requires authentication

Solution

  1. Step 1: Understand 404 error meaning

    404 means the requested URL or endpoint does not exist on the server.
  2. Step 2: Check API endpoint correctness

    If the endpoint is misspelled or wrong, 404 occurs. The correct endpoint is /_nodes/stats.
  3. Final Answer:

    The API endpoint is incorrect or misspelled -> Option A
  4. Quick Check:

    404 error = wrong endpoint [OK]
Hint: 404 means wrong URL or endpoint [OK]
Common Mistakes:
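  • Assuming a down cluster returns 404 (an unreachable cluster usually produces a connection error instead)
  • Confusing 404 with authentication errors (those return 401 or 403)
  • Expecting 404 from a wrong HTTP method (Elasticsearch typically answers 405 for that)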
5. You want to monitor Elasticsearch nodes for CPU and memory usage continuously. Which approach is best?
hard
A. Restart nodes frequently to reset CPU and memory usage
B. Use GET /_cluster/health to check CPU and memory
C. Install Kibana and create dashboards without data collection
D. Run GET /_nodes/stats regularly and parse CPU/memory fields

Solution

  1. Step 1: Identify API for node resource stats

    The /_nodes/stats API provides detailed CPU and memory usage per node.
  2. Step 2: Understand monitoring approach

    Regularly running this API and parsing results allows continuous monitoring of resource usage.
  3. Final Answer:

    Run GET /_nodes/stats regularly and parse CPU/memory fields -> Option D
  4. Quick Check:

    Node stats API for CPU/memory monitoring [OK]
Hint: Use /_nodes/stats API for detailed resource monitoring [OK]
Common Mistakes:
  • Using cluster health API which lacks CPU/memory details
  • Assuming Kibana dashboards work without data
  • Assuming that restarting nodes monitors usage
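The parsing half of the approach in question 5 might look like the sketch below. The sample_nodes_stats dictionary is a hand-written, heavily trimmed stand-in for a GET /_nodes/stats response, keeping only os.cpu.percent and jvm.mem.heap_used_percent per node. The limits of 80 and 85 and the helper find_busy_nodes are assumptions for illustration.

```python
# Trimmed, illustrative stand-in for a GET /_nodes/stats response.
sample_nodes_stats = {
    "nodes": {
        "node-id-1": {
            "name": "es-node-1",
            "os": {"cpu": {"percent": 82}},
            "jvm": {"mem": {"heap_used_percent": 64}},
        },
        "node-id-2": {
            "name": "es-node-2",
            "os": {"cpu": {"percent": 35}},
            "jvm": {"mem": {"heap_used_percent": 91}},
        },
        "node-id-3": {
            "name": "es-node-3",
            "os": {"cpu": {"percent": 20}},
            "jvm": {"mem": {"heap_used_percent": 40}},
        },
    }
}


def find_busy_nodes(stats, cpu_limit=80, heap_limit=85):
    """Return (name, cpu_percent, heap_percent) for nodes above either limit.

    The default limits are arbitrary example values, not recommendations.
    """
    busy = []
    for node in stats["nodes"].values():
        cpu = node["os"]["cpu"]["percent"]
        heap = node["jvm"]["mem"]["heap_used_percent"]
        if cpu > cpu_limit or heap > heap_limit:
            busy.append((node["name"], cpu, heap))
    return busy


for name, cpu, heap in find_busy_nodes(sample_nodes_stats):
    print(f"{name}: cpu={cpu}% heap={heap}%")
```

Running such a check on a schedule (for example from cron or a small polling loop) is what makes the monitoring continuous; the API call itself only returns a snapshot.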