How to Create an Ingest Pipeline in Elasticsearch Quickly
To create an ingest pipeline in Elasticsearch, use the
PUT _ingest/pipeline/{pipeline_id} API with a JSON body defining processors. An ingest pipeline processes documents before indexing, allowing transformations such as parsing or enrichment.

Syntax
The basic syntax to create an ingest pipeline uses the PUT HTTP method on the _ingest/pipeline/{pipeline_id} endpoint. The request body is a JSON object that defines the pipeline's description and a list of processors that modify documents.
- pipeline_id: A unique name for your pipeline.
- description: A short text describing the pipeline.
- processors: An array of actions to perform on documents, such as parsing or removing fields.
```json
PUT _ingest/pipeline/{pipeline_id}
{
  "description": "Description of what this pipeline does",
  "processors": [
    {
      "processor_type": {
        "field": "field_name",
        "target_field": "new_field_name"
      }
    }
  ]
}
```

Example
This example creates a pipeline named my_pipeline that adds a timestamp field called ingest_timestamp to each document when it is ingested.
```json
PUT _ingest/pipeline/my_pipeline
{
  "description": "Adds ingest timestamp",
  "processors": [
    {
      "set": {
        "field": "ingest_timestamp",
        "value": "{{_ingest.timestamp}}"
      }
    }
  ]
}
```

Output

```json
{
  "acknowledged": true
}
```
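After creation, you can confirm the pipeline was stored by retrieving its definition with the GET pipeline API:

```json
GET _ingest/pipeline/my_pipeline
```

This returns the pipeline's description and processor list as stored in the cluster.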
Common Pitfalls
Common mistakes when creating ingest pipelines include:
- Using invalid processor types or misspelling processor names.
- Not providing required fields for processors, causing errors.
- Forgetting to specify the pipeline when indexing documents, so the pipeline is never applied.
- Trying to modify fields that do not exist in the document.
Always test your pipeline with the _simulate API before using it in production.
```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "set": {
          "field": "ingest_timestamp",
          "value": "{{_ingest.timestamp}}"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "Test document"
      }
    }
  ]
}
```

Output

```json
{
  "docs": [
    {
      "doc": {
        "_index": "_index",
        "_type": "_doc",
        "_id": "_id",
        "_source": {
          "message": "Test document",
          "ingest_timestamp": "2024-06-01T12:00:00.000Z"
        }
      }
    }
  ]
}
```
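Once the pipeline works in simulation, apply it at index time with the pipeline query parameter (my-index below is a placeholder index name):

```json
POST my-index/_doc?pipeline=my_pipeline
{
  "message": "Hello"
}
```

To avoid specifying the parameter on every request, you can instead set the index's index.default_pipeline setting so the pipeline runs automatically for all documents indexed into that index.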
Quick Reference
| Element | Description |
|---|---|
| pipeline_id | Unique name for the ingest pipeline |
| description | Short text describing the pipeline's purpose |
| processors | Array of actions to transform documents |
| set processor | Adds or updates a field with a specified value |
| remove processor | Removes a specified field from documents |
| grok processor | Parses text fields using patterns |
| _simulate API | Tests pipeline without indexing data |
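As a sketch of how the processors in the table combine, the following hypothetical pipeline parses a simple access-log line with grok and then removes the raw field (the pipeline name, field names, and grok pattern are illustrative, not from the example above):

```json
PUT _ingest/pipeline/parse_logs
{
  "description": "Parses log lines and drops the raw message",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IP:client_ip} %{WORD:http_method} %{URIPATHPARAM:request}"]
      }
    },
    {
      "remove": {
        "field": "message"
      }
    }
  ]
}
```

Processors run in order, so the remove step only executes after grok has extracted the structured fields.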
Key Takeaways
- Use PUT _ingest/pipeline/{pipeline_id} with a JSON body to create a pipeline.
- Define processors inside the pipeline to transform documents before indexing.
- Test pipelines with the _simulate API to avoid errors.
- Always specify the pipeline when indexing documents to apply it.
- Common processors include set, remove, and grok for flexible data handling.