
How to Create Ingest Pipeline in Elasticsearch Quickly

To create an ingest pipeline in Elasticsearch, use the PUT _ingest/pipeline/{pipeline_id} API with a JSON body that defines a list of processors. The pipeline processes documents before they are indexed, enabling transformations such as parsing, enrichment, or field removal.
📐 Syntax

The basic syntax to create an ingest pipeline uses the PUT HTTP method on the _ingest/pipeline/{pipeline_id} endpoint. The request body is a JSON object that defines the pipeline's description and a list of processors that modify documents.

  • pipeline_id: A unique name for your pipeline.
  • description: A short text describing the pipeline.
  • processors: An array of actions to perform on documents, such as parsing or removing fields.
json
PUT _ingest/pipeline/{pipeline_id}
{
  "description": "Description of what this pipeline does",
  "processors": [
    {
      "processor_type": {
        "field": "field_name",
        "target_field": "new_field_name"
      }
    }
  ]
}
💻 Example

This example creates a pipeline named my_pipeline that adds a timestamp field called ingest_timestamp to each document when it is ingested.

json
PUT _ingest/pipeline/my_pipeline
{
  "description": "Adds ingest timestamp",
  "processors": [
    {
      "set": {
        "field": "ingest_timestamp",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}
Output
{ "acknowledged": true }
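
To confirm the pipeline was stored, retrieve its definition; the response echoes back the description and processors you submitted.

json
GET _ingest/pipeline/my_pipeline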
⚠️ Common Pitfalls

Common mistakes when creating ingest pipelines include:

  • Using invalid processor types or misspelling processor names.
  • Not providing required fields for processors, causing errors.
  • Forgetting to specify the pipeline when indexing documents, so the pipeline is never applied.
  • Trying to modify fields that do not exist in the document.
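
To avoid the third pitfall, reference the pipeline explicitly via the pipeline query parameter when indexing (the index and document below are hypothetical):

json
PUT my-index/_doc/1?pipeline=my_pipeline
{
  "message": "Hello"
}

Alternatively, set index.default_pipeline in the index settings so the pipeline runs automatically for every document indexed into that index.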

Always test your pipeline with the _simulate API before using it in production.

json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "set": {
          "field": "ingest_timestamp",
          "value": "{{_ingest.timestamp}}"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Test document"
      }
    }
  ]
}
Output
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_type": "_doc",
        "_id": "_id",
        "_source": {
          "message": "Test document",
          "ingest_timestamp": "2024-06-01T12:00:00.000Z"
        }
      }
    }
  ]
}
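
To guard against the last pitfall, many processors (such as remove) accept an ignore_missing option so an absent field does not fail the whole document. A minimal sketch with a hypothetical temp_field:

json
PUT _ingest/pipeline/cleanup_pipeline
{
  "description": "Removes a temporary field if present",
  "processors": [
    {
      "remove": {
        "field": "temp_field",
        "ignore_missing": true
      }
    }
  ]
}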
📊 Quick Reference

  • pipeline_id: Unique name for the ingest pipeline.
  • description: Short text describing the pipeline's purpose.
  • processors: Array of actions to transform documents.
  • set processor: Adds or updates a field with a specified value.
  • remove processor: Removes a specified field from documents.
  • grok processor: Parses text fields using patterns.
  • _simulate API: Tests a pipeline without indexing data.
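
The grok processor listed above parses unstructured text into named fields; here is a brief sketch for a simple log line (the pattern and field names are illustrative):

json
PUT _ingest/pipeline/parse_logs
{
  "description": "Extracts fields from a simple log line",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IP:client_ip} %{WORD:method} %{URIPATHPARAM:request}"]
      }
    }
  ]
}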

Key Takeaways

  • Use PUT _ingest/pipeline/{pipeline_id} with a JSON body to create a pipeline.
  • Define processors inside the pipeline to transform documents before indexing.
  • Test pipelines with the _simulate API to avoid errors.
  • Always specify the pipeline when indexing documents to apply it.
  • Common processors include set, remove, and grok for flexible data handling.