Optimize Indexing Performance in Elasticsearch: Best Practices
To optimize indexing performance in Elasticsearch, use the bulk API to reduce per-request overhead, increase the refresh_interval during heavy indexing, and disable replicas temporarily. Also, simplify mappings and avoid unnecessary fields to speed up indexing.
Syntax
Key settings and APIs to optimize indexing include:
- bulk API: Send multiple documents in one request to reduce overhead.
- refresh_interval: Controls how often Elasticsearch refreshes the index to make documents searchable.
- number_of_replicas: Number of replica shards; reducing this speeds up indexing.
- mapping: Define fields and types to avoid costly dynamic mapping.
json
POST /_bulk
{ "index" : {"_index" : "my_index", "_id" : "1"} }
{ "field1" : "value1" }
{ "index" : {"_index" : "my_index", "_id" : "2"} }
{ "field1" : "value2" }
PUT /my_index/_settings
{
  "refresh_interval" : "30s",
  "number_of_replicas" : 0
}
Example
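This example bulk indexes two documents and adjusts the index settings to improve indexing speed. In practice, apply the settings change before a large load starts and restore the previous values once it finishes.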
json
POST /_bulk
{ "index": {"_index": "products", "_id": "1"} }
{ "name": "Laptop", "price": 1200 }
{ "index": {"_index": "products", "_id": "2"} }
{ "name": "Phone", "price": 800 }
PUT /products/_settings
{
  "refresh_interval": "60s",
  "number_of_replicas": 0
}
Output
{
  "took": 30,
  "errors": false,
  "items": [
    {"index": {"_index": "products", "_id": "1", "status": 201}},
    {"index": {"_index": "products", "_id": "2", "status": 201}}
  ]
}
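A bulk request can return HTTP 200 even when some of its documents were rejected, so it helps to inspect the response body rather than the HTTP status alone. The following is a minimal sketch, assuming the response has already been parsed into a Python dict with the `errors` and `items` fields shown above; the function name `failed_bulk_items` and the shape of the summary dicts are illustrative choices, not part of any Elasticsearch client.

```python
from typing import Any


def failed_bulk_items(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one summary dict per bulk item that did not succeed."""
    if not response.get("errors"):
        # Fast path: the top-level flag reports that no item failed.
        return []

    failures = []
    for item in response.get("items", []):
        # Each item is a dict with a single key naming the action
        # (index, create, update, or delete).
        for action, result in item.items():
            if result.get("status", 0) >= 300:
                failures.append({
                    "action": action,
                    "id": result.get("_id"),
                    "status": result.get("status"),
                    "error": result.get("error"),
                })
    return failures
```

Failed items collected this way can be logged or retried individually instead of resending the whole batch.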
Common Pitfalls
Common mistakes that hurt indexing performance include:
- Indexing documents one by one instead of using the bulk API.
- Keeping refresh_interval too low (default 1s) during heavy indexing, causing frequent costly refreshes.
- Having replicas enabled during bulk indexing, which duplicates work.
- Using dynamic mappings that create many fields on the fly, increasing overhead.
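To illustrate the last pitfall, an explicit mapping can be declared when the index is created. The sketch below builds a possible request body for `PUT /products` as a Python dict and prints it as JSON. The field types (`text` for name, `integer` for price) are assumptions based on the sample documents in this article, and `"dynamic": "strict"` is one option among several; it makes Elasticsearch reject documents that contain undeclared fields.

```python
import json

# Body for `PUT /products`, sent once when the index is created.
# Field types are assumptions inferred from the sample documents.
products_index_body = {
    "mappings": {
        # "strict" rejects documents with fields not declared below,
        # so a stray field fails loudly instead of growing the mapping.
        "dynamic": "strict",
        "properties": {
            "name": {"type": "text"},
            "price": {"type": "integer"},
        },
    }
}

print(json.dumps(products_index_body, indent=2))
```

If rejecting unknown fields is too strict for your data, setting `dynamic` to `false` keeps such fields in the stored document but does not index them.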
json
### Wrong: Single document indexing
POST /my_index/_doc
{ "field": "value" }
### Right: Bulk indexing
POST /_bulk
{ "index": {"_index": "my_index"} }
{ "field": "value1" }
{ "index": {"_index": "my_index"} }
{ "field": "value2" }
Quick Reference
- Use bulk API: Batch multiple documents per request.
- Increase refresh_interval: Set to 30s or more during indexing.
- Set replicas to 0: Disable replicas temporarily.
- Optimize mappings: Define explicit fields, avoid dynamic mapping.
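- Disable unnecessary indexing features: For example _source, but only if you never need to update, reindex, or highlight those documents, since those operations depend on it.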
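The batching advice above can be sketched as a small generator that turns any stream of documents into newline-delimited bodies for `POST /_bulk`. It uses only the Python standard library and does not send anything. The function name `bulk_bodies` and the default `batch_size` of 500 are assumptions made for illustration; a suitable batch size depends on document size and cluster capacity and is best found by experiment.

```python
import json
from itertools import islice
from typing import Any, Iterable, Iterator


def bulk_bodies(index: str, docs: Iterable[dict[str, Any]],
                batch_size: int = 500) -> Iterator[str]:
    """Yield NDJSON bodies for POST /_bulk, each with up to batch_size docs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    iterator = iter(docs)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        lines = []
        for doc in batch:
            lines.append(json.dumps({"index": {"_index": index}}))  # action line
            lines.append(json.dumps(doc))                           # source line
        # The bulk API requires the body to end with a newline.
        yield "\n".join(lines) + "\n"
```

Each yielded string can be sent as the body of one bulk request with the `Content-Type: application/x-ndjson` header, using whichever HTTP client you prefer.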
Key Takeaways
Use the bulk API to reduce overhead and speed up indexing.
Increase the refresh_interval during heavy indexing to reduce costly refreshes.
Temporarily disable replicas to avoid duplicate indexing work.
Define explicit mappings to avoid expensive dynamic field creation.
Disable features like _source only if you are sure you don't need them; updates and reindexing rely on _source.
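Because the refresh and replica changes are meant to be temporary, it is easy to forget to undo them when a load fails partway. The sketch below wraps the two settings calls in a context manager so the normal values are restored even on error. The `put_settings` callable is hypothetical: it stands in for whatever code sends `PUT /<index>/_settings` in your application. The defaults of `1s` and one replica reflect Elasticsearch's defaults and may differ from your index's actual values.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator

# Hypothetical signature: put_settings(index_name, settings_body) sends
# PUT /<index_name>/_settings with the given JSON body.
PutSettings = Callable[[str, dict[str, Any]], None]


@contextmanager
def bulk_load_settings(put_settings: PutSettings, index: str,
                       normal_refresh: str = "1s", normal_replicas: int = 1,
                       load_refresh: str = "30s") -> Iterator[None]:
    """Relax refresh and replica settings for the duration of a bulk load."""
    put_settings(index, {"refresh_interval": load_refresh,
                         "number_of_replicas": 0})
    try:
        yield
    finally:
        # Restore the normal values even if the load raised an exception.
        put_settings(index, {"refresh_interval": normal_refresh,
                             "number_of_replicas": normal_replicas})


# Usage with a stub that records calls instead of contacting a cluster.
calls = []
with bulk_load_settings(lambda name, body: calls.append((name, body)), "products"):
    pass  # send the bulk requests here
```

Passing the sender in as a parameter keeps the sketch independent of any particular client library and makes the restore logic easy to test.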