
Log management pipeline in Elasticsearch - Step-by-Step Execution

Concept Flow - Log management pipeline
1. Log generated by application
2. Log shipper (e.g., Filebeat)
3. Log ingested into Elasticsearch
4. Log processed by ingest pipeline
5. Log stored in Elasticsearch index
6. Log visualized in Kibana
Logs flow from the application through a shipper, get processed and stored in Elasticsearch, then visualized in Kibana.
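To make the hand-off from shipper to Elasticsearch concrete, here is a minimal sketch of the kind of payload a shipper could send. It assumes the standard `_bulk` API with its `pipeline` query parameter; the helper name `build_bulk_body` and the index name `app-logs` are illustrative choices, and nothing is actually sent over the network.

```python
import json

def build_bulk_body(log_lines, index="app-logs"):
    """Build an NDJSON body for the Elasticsearch _bulk API.

    Each raw log line becomes two rows: an action row and a source row.
    The index name is a hypothetical example.
    """
    rows = []
    for line in log_lines:
        rows.append(json.dumps({"index": {"_index": index}}))
        rows.append(json.dumps({"message": line}))
    # The bulk API expects the body to end with a newline.
    return "\n".join(rows) + "\n"

sample = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
body = build_bulk_body([sample])

# A shipper would send this as: POST /_bulk?pipeline=log_pipeline
print(body)
```

Real shippers such as Filebeat add batching, retries, and backpressure on top of this; the sketch only shows the shape of the request.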
Execution Sample
Elasticsearch
PUT _ingest/pipeline/log_pipeline
{
  "processors": [
    {"grok": {"field": "message", "patterns": ["%{COMMONAPACHELOG}"]}}
  ]
}
Defines an ingest pipeline that parses log messages using a grok pattern.
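Before routing real traffic through the pipeline, it can be tried against sample documents with the simulate API. The sketch below only builds the request (method, path, and body) in Python; the helper name `build_simulate_request` is an assumption for illustration, and sending the request is left to whichever HTTP client or Kibana Dev Tools console you use.

```python
import json

SAMPLE = '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'

def build_simulate_request(pipeline_id, messages):
    """Return (method, path, body) for a pipeline simulate call.

    Each message is wrapped in a test document under "_source".
    """
    path = f"/_ingest/pipeline/{pipeline_id}/_simulate"
    body = {"docs": [{"_source": {"message": m}} for m in messages]}
    return "POST", path, body

method, path, body = build_simulate_request("log_pipeline", [SAMPLE])
print(method, path)
print(json.dumps(body, indent=2))
```

The response lists each test document as it would look after the processors run, without indexing anything.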
Execution Table
| Step | Action | Input Log | Processor Applied | Output Document |
|------|--------|-----------|-------------------|-----------------|
| 1 | Receive log from shipper | {"message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] \"GET /index.html HTTP/1.1\" 200 2326"} | None | {"message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] \"GET /index.html HTTP/1.1\" 200 2326"} |
| 2 | Apply grok processor | {"message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] \"GET /index.html HTTP/1.1\" 200 2326"} | grok parsing COMMONAPACHELOG | {"clientip": "127.0.0.1", "ident": "-", "auth": "-", "timestamp": "10/Oct/2023:13:55:36 +0000", "verb": "GET", "request": "/index.html", "httpversion": "1.1", "response": "200", "bytes": "2326", "message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] \"GET /index.html HTTP/1.1\" 200 2326"} |
| 3 | Store document in index | {"clientip": "127.0.0.1", "request": "/index.html", ...} (parsed document from step 2, abbreviated) | None | Document stored in Elasticsearch index |
| 4 | Visualize in Kibana | Stored document | None | Log entry visible in Kibana dashboard |
💡 All logs processed and stored; pipeline completes successfully.
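The transition from step 1 to step 2 can be imitated in plain Python. The regular expression below is a simplified stand-in for the COMMONAPACHELOG grok pattern, not the real pattern definition, and `parse_log` is a hypothetical helper; it is only meant to show how named captures become document fields while the original message is kept.

```python
import re

# Simplified approximation of the legacy COMMONAPACHELOG field layout.
APACHE_COMMON = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\w+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)'
)

def parse_log(doc):
    """Add named captures to the document, keeping the original message."""
    match = APACHE_COMMON.match(doc["message"])
    if match is None:
        # Mirrors the grok processor's default behaviour of failing on no match.
        raise ValueError("pattern does not match field value")
    return {**match.groupdict(), "message": doc["message"]}

step1 = {"message": '127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'}
step2 = parse_log(step1)
print(step2)
```

As in the table, every extracted value is a string; converting `response` or `bytes` to numbers would need an additional step such as a convert processor.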
Variable Tracker
| Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| log_document | {} | {"message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] \"GET /index.html HTTP/1.1\" 200 2326"} | {"clientip": "127.0.0.1", "ident": "-", "auth": "-", "timestamp": "10/Oct/2023:13:55:36 +0000", "verb": "GET", "request": "/index.html", "httpversion": "1.1", "response": "200", "bytes": "2326", "message": "127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] \"GET /index.html HTTP/1.1\" 200 2326"} | {"clientip": "127.0.0.1", "request": "/index.html", ...} (same parsed document, abbreviated) | Stored in Elasticsearch index |
Key Moments - 3 Insights
Why does the log document have both the original message and parsed fields after processing?
Because the grok processor adds the extracted fields to the document and leaves the original message field intact. Step 2 of the Execution Table shows both side by side.
What happens if the grok pattern does not match the log message?
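By default the grok processor raises an error when none of its patterns match, so the pipeline stops for that document and it is not indexed. If the processor sets ignore_failure to true, the document continues with only the original message and no extracted fields; an on_failure handler can instead tag or reroute it.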
Why do we need a shipper like Filebeat before logs reach Elasticsearch?
The shipper collects and forwards logs reliably from sources to Elasticsearch, as shown in concept_flow step 2 before ingestion.
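For the non-matching case raised in the second question, one common option is a processor-level `on_failure` handler, which runs instead of aborting the document. The sketch below builds such a pipeline body as a Python dictionary. The field name `parse_error` is a hypothetical choice; `{{ _ingest.on_failure_message }}` is the ingest metadata value that carries the failure message.

```python
import json

# Variant of the log_pipeline definition with a failure handler on grok.
pipeline = {
    "processors": [
        {
            "grok": {
                "field": "message",
                "patterns": ["%{COMMONAPACHELOG}"],
                "on_failure": [
                    # Record why parsing failed instead of rejecting the document.
                    # "parse_error" is an illustrative field name.
                    {
                        "set": {
                            "field": "parse_error",
                            "value": "{{ _ingest.on_failure_message }}",
                        }
                    }
                ],
            }
        }
    ]
}

print("PUT _ingest/pipeline/log_pipeline")
print(json.dumps(pipeline, indent=2))
```

With this handler, an unparseable line is still stored with its original message plus the error text, which makes bad log formats easy to find later.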
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at step 2, what field contains the client's IP address after grok processing?
A. request
B. message
C. clientip
D. timestamp
💡 Hint
Check the 'Output Document' column at step 2 in execution_table.
At which step is the log document stored in the Elasticsearch index?
A. Step 1
B. Step 3
C. Step 2
D. Step 4
💡 Hint
Look for the action 'Store document in index' in execution_table.
If the grok processor was removed, how would the 'log_document' variable change after step 2?
A. It would remain the original message only
B. It would be empty
C. It would contain parsed fields
D. It would cause an error
💡 Hint
Refer to variable_tracker and key_moments about grok processor effects.
Concept Snapshot
Log management pipeline flow:
1. Logs generated by apps
2. Sent by shipper (Filebeat)
3. Ingested into Elasticsearch
4. Processed by ingest pipeline (e.g., grok parsing)
5. Stored in index
6. Visualized in Kibana
Use ingest pipelines to parse and enrich logs before storage.
Full Transcript
This visual execution shows how logs move from an application through a shipper to Elasticsearch. The ingest pipeline applies processors like grok to parse log messages into fields. The execution table traces each step: receiving the log, applying grok to extract fields, storing the document, and visualizing it in Kibana. Variables track the log document's state as it gains parsed fields. Key moments clarify why original messages remain and what happens if parsing fails. The quiz tests understanding of fields, storage steps, and processor effects. The snapshot summarizes the pipeline stages and purpose.

Practice

1. What is the main purpose of a log management pipeline in Elasticsearch?
easy
A. To encrypt data before sending it to Elasticsearch
B. To create visual dashboards from raw data
C. To collect, process, and store logs for easy searching and alerting
D. To backup Elasticsearch indices automatically

Solution

  1. Step 1: Understand the role of a log management pipeline

    A log management pipeline is designed to handle logs by collecting, processing, and storing them.
  2. Step 2: Identify the main goal

    The goal is to organize logs so they can be searched easily and alerts can be created.
  3. Final Answer:

    To collect, process, and store logs for easy searching and alerting -> Option C
  4. Quick Check:

    Log pipeline purpose = collect, process, store logs [OK]
Hint: Remember: pipeline = collect + process + store logs [OK]
Common Mistakes:
  • Confusing log pipeline with visualization tools
  • Thinking it only backs up data
  • Assuming it encrypts logs by default
2. Which section is NOT part of a typical Logstash pipeline configuration in an Elasticsearch log management setup?
easy
A. authentication
B. filter
C. output
D. input

Solution

  1. Step 1: Recall pipeline sections

    A typical Logstash pipeline has input, filter, and output sections to handle logs.
  2. Step 2: Identify the section not included

    Authentication is not a standard section in the pipeline configuration; it is handled elsewhere.
  3. Final Answer:

    authentication -> Option A
  4. Quick Check:

    Pipeline sections = input, filter, output [OK]
Hint: Pipeline = input + filter + output only [OK]
Common Mistakes:
  • Thinking authentication is part of pipeline config
  • Confusing pipeline sections with security settings
  • Assuming output means authentication
3. Given this simplified, JSON-style pipeline sketch, which statement describes the result after processing?
{
  "input": { "type": "file", "path": "/var/log/app.log" },
  "filter": { "grok": { "match": { "message": "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}" } } },
  "output": { "elasticsearch": { "index": "app-logs" } }
}
medium
A. The original message field is deleted
B. A new field named 'msg' extracted from the log message
C. Logs are sent to a file instead of Elasticsearch
D. The timestamp field is removed

Solution

  1. Step 1: Analyze the filter section

    The grok filter extracts parts of the log message into fields: timestamp, level, and msg.
  2. Step 2: Determine output effect

    The output sends logs to Elasticsearch index 'app-logs' with the new fields added, including 'msg'.
  3. Final Answer:

    A new field named 'msg' extracted from the log message -> Option B
  4. Quick Check:

    Grok adds 'msg' field from message [OK]
Hint: Grok filter extracts fields like 'msg' from logs [OK]
Common Mistakes:
  • Assuming original message is deleted
  • Thinking output sends logs to a file
  • Believing timestamp is removed
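The field extraction in question 3 can be approximated in plain Python. The regular expression below is a rough stand-in for `%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:msg}`, not the real grok definitions, and the sample log line and the helper `extract_fields` are made up for illustration.

```python
import re

# Rough approximation of: TIMESTAMP_ISO8601, LOGLEVEL, GREEDYDATA
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}"
    r"(?:[.,]\d+)?(?:Z|[+-]\d{2}:?\d{2})?) "
    r"(?P<level>[A-Za-z]+) "
    r"(?P<msg>.*)"
)

def extract_fields(event):
    """Return a copy of the event with timestamp, level, and msg added."""
    match = LINE_PATTERN.match(event["message"])
    if match is None:
        # Simplification: hand back the event unchanged when nothing matches.
        return dict(event)
    return {**event, **match.groupdict()}

event = extract_fields({"message": "2023-10-10T13:55:36Z INFO User logged in"})
print(event)
```

The new `msg` field sits alongside the untouched `message` field, which is why option B is correct and option A is not.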
4. Identify the error in this pipeline configuration snippet:
{
  "input": { "type": "file", "path": "/var/log/app.log" },
  "filter": { "grok": { "match": { "message": "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level}" } } },
  "output": { "elasticsearch": { "index": "app-logs" }
}
medium
A. Input type 'file' is invalid
B. Incorrect grok pattern syntax
C. Output index name cannot contain hyphens
D. Missing closing brace for the output section

Solution

  1. Step 1: Check JSON structure

    The output section is missing a closing brace '}' at the end, causing invalid JSON.
  2. Step 2: Validate other parts

    The grok pattern syntax is correct, input type 'file' is valid, and index names can have hyphens.
  3. Final Answer:

    Missing closing brace for the output section -> Option D
  4. Quick Check:

    JSON braces must be balanced [OK]
Hint: Check all braces and commas in JSON config [OK]
Common Mistakes:
  • Ignoring missing braces causing syntax errors
  • Assuming grok pattern is wrong without checking
  • Thinking index names can't have hyphens
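A quick way to catch this class of mistake is to run the configuration through a JSON parser before using it. The sketch below feeds the snippet from question 4 to Python's `json` module, then retries with the missing brace appended; `is_valid_json` is a small helper written for this example.

```python
import json

# The snippet from question 4, with its unbalanced braces left as-is.
broken = '''{
  "input": { "type": "file", "path": "/var/log/app.log" },
  "filter": { "grok": { "match": { "message": "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level}" } } },
  "output": { "elasticsearch": { "index": "app-logs" }
}'''

# Appending one closing brace balances the document.
fixed = broken + "\n}"

def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

print(is_valid_json(broken))
print(is_valid_json(fixed))
```

The parser only reports that the document ends too early; deciding which section the brace belongs to still requires reading the structure.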
5. You want to create a log management pipeline that drops logs with level 'DEBUG' and adds a new field 'environment' with value 'production'. Which filter configuration achieves this?
hard
A. { "drop": { "if": "[level] == 'DEBUG'" }, "mutate": { "add_field": { "environment": "production" } } }
B. { "if": "[level] == 'DEBUG'", "drop": {}, "add_field": { "environment": "production" } }
C. { "mutate": { "drop": "[level] == 'DEBUG'", "add_field": { "environment": "production" } } }
D. { "filter": { "drop": { "condition": "level == 'DEBUG'" }, "add_field": { "environment": "production" } } }

Solution

  1. Step 1: Understand filter syntax for dropping logs

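    In this simplified notation, 'drop' carries an 'if' condition and removes events that match it. An Elasticsearch ingest pipeline's drop processor does accept an "if" option (written in Painless); in a real Logstash config the conditional wraps the filter instead, as in if [level] == "DEBUG" { drop { } }.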
  2. Step 2: Add a new field using 'mutate' filter

    The 'mutate' filter's 'add_field' adds new fields to the log event.
  3. Step 3: Combine drop and mutate correctly

    { "drop": { "if": "[level] == 'DEBUG'" }, "mutate": { "add_field": { "environment": "production" } } } correctly uses 'drop' with 'if' and 'mutate' with 'add_field' in the right structure.
  4. Final Answer:

    { "drop": { "if": "[level] == 'DEBUG'" }, "mutate": { "add_field": { "environment": "production" } } } -> Option A
  5. Quick Check:

    Drop with 'if' + mutate with 'add_field' = Option A [OK]
Hint: Use 'drop' with 'if' and 'mutate' to add fields [OK]
Common Mistakes:
  • Placing 'drop' inside 'mutate' incorrectly
  • Using wrong syntax for conditions
  • Trying to add fields inside 'drop' filter
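For comparison, the same behaviour can be expressed as an Elasticsearch ingest pipeline using the `drop` and `set` processors. The sketch below builds that pipeline body as a Python dictionary and pairs it with a plain-Python emulation, `apply_filter`, which is a hypothetical helper that only mimics the two steps for illustration.

```python
import json

# Ingest-pipeline equivalent of question 5.
# The "if" condition is a Painless expression evaluated against the document (ctx).
pipeline = {
    "processors": [
        {"drop": {"if": "ctx.level == 'DEBUG'"}},
        {"set": {"field": "environment", "value": "production"}},
    ]
}

def apply_filter(event):
    """Emulate the two processors: drop DEBUG events, tag the rest."""
    if event.get("level") == "DEBUG":
        return None  # dropped events never reach the index
    return {**event, "environment": "production"}

print(json.dumps(pipeline, indent=2))
print(apply_filter({"level": "DEBUG", "msg": "cache miss"}))
print(apply_filter({"level": "INFO", "msg": "started"}))
```

Order matters here: placing the drop step first avoids doing extra work on events that are about to be discarded.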