How to Bulk Index Documents in Elasticsearch Quickly
To bulk index documents in Elasticsearch, use the _bulk API endpoint with a newline-delimited JSON payload containing action and data pairs. Each document requires an index action line followed by the document data line, allowing efficient insertion of many documents in one request.
Syntax
The bulk API requires a newline-delimited JSON (NDJSON) format where each action line specifies the operation (like index) and the next line contains the document data. This pattern repeats for each document.
- Action line: Specifies the operation and target index.
- Data line: Contains the JSON document to be indexed.
json
{ "index" : { "_index" : "my_index", "_id" : "1" } }
{ "field1" : "value1", "field2" : "value2" }
{ "index" : { "_index" : "my_index", "_id" : "2" } }
{ "field1" : "value3", "field2" : "value4" }
Example
This example shows how to bulk index two documents into an index named my_index using a curl command. It demonstrates the required NDJSON format and an abbreviated version of the response from Elasticsearch; the real response carries additional fields for each item.
bash
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/x-ndjson' -d '
{ "index" : { "_index" : "my_index", "_id" : "1" } }
{ "name" : "Alice", "age" : 30 }
{ "index" : { "_index" : "my_index", "_id" : "2" } }
{ "name" : "Bob", "age" : 25 }
'
Output
{
"took" : 5,
"errors" : false,
"items" : [
{ "index" : { "_index" : "my_index", "_id" : "1", "status" : 201 } },
{ "index" : { "_index" : "my_index", "_id" : "2", "status" : 201 } }
]
}
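When the documents come from a program rather than a hand-written file, the payload can be generated. The following is a minimal Python sketch, not an official client: the helper name build_bulk_payload is hypothetical, and it assumes every document is indexed with an explicit ID into the same index.

```python
import json


def build_bulk_payload(index_name, docs):
    """Return an NDJSON string: one action line plus one data line per document.

    `docs` is a list of (doc_id, document_dict) pairs. This helper is
    illustrative and not part of any Elasticsearch library.
    """
    lines = []
    for doc_id, doc in docs:
        # Action line: the operation, the target index, and the document ID.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Data line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk payload must end with a newline.
    return "\n".join(lines) + "\n"


payload = build_bulk_payload(
    "my_index",
    [("1", {"name": "Alice", "age": 30}), ("2", {"name": "Bob", "age": 25})],
)
print(payload, end="")
```

The resulting string can be sent as the request body with the Content-Type header set to application/x-ndjson, as in the curl command above.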
Common Pitfalls
Common mistakes when bulk indexing include:
- Not using newline-delimited JSON format correctly, causing parsing errors.
- Missing the action line before each document.
- Sending the entire bulk payload as a single JSON array instead of NDJSON.
- Not setting the Content-Type header to application/x-ndjson.
Always ensure each action line is followed by its document line, and the payload ends with a newline.
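Several of these mistakes can be caught before the request is sent. The sketch below is one possible pre-flight check, written under the assumption that the payload contains only index actions (so every action line has a data line after it); the function name check_bulk_payload is hypothetical.

```python
import json


def check_bulk_payload(payload):
    """Return a list of problems found in an index-only bulk payload."""
    problems = []
    if not payload.endswith("\n"):
        problems.append("payload does not end with a newline")
    if payload.lstrip().startswith("["):
        # A JSON array is not NDJSON; stop here, the line checks would be noise.
        problems.append("payload looks like a JSON array, not NDJSON")
        return problems
    lines = [line for line in payload.split("\n") if line.strip()]
    if len(lines) % 2 != 0:
        problems.append("odd number of lines: an action or data line is missing")
    # Action lines sit at even positions (0, 2, 4, ...) among non-empty lines.
    for i in range(0, len(lines), 2):
        try:
            action = json.loads(lines[i])
        except json.JSONDecodeError:
            problems.append(f"non-empty line {i + 1} is not valid JSON")
            continue
        if not (isinstance(action, dict) and "index" in action):
            problems.append(f"non-empty line {i + 1} is not an index action line")
    return problems
```

An empty list means none of the checked problems were found. It does not guarantee that Elasticsearch will accept every document.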
json
Wrong way:
[
{ "index" : { "_index" : "my_index", "_id" : "1" } },
{ "name" : "Alice", "age" : 30 }
]
Right way:
{ "index" : { "_index" : "my_index", "_id" : "1" } }
{ "name" : "Alice", "age" : 30 }
Quick Reference
Tips for bulk indexing:
- Use the _bulk API endpoint.
- Format data as newline-delimited JSON (NDJSON).
- Each document requires an action line and a data line.
- Set Content-Type to application/x-ndjson.
- Check the response for errors after bulk indexing.
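Checking the response matters because a bulk request can succeed as a whole while individual items fail. The sketch below assumes the response has already been parsed into a Python dict; the helper name failed_items and the sample failure are illustrative, not taken from a real cluster.

```python
def failed_items(bulk_response):
    """Return (id, status, error) for every item in a _bulk response that failed."""
    # The top-level "errors" flag is false when every item succeeded.
    if not bulk_response.get("errors"):
        return []
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming its action, for example "index".
        for result in item.values():
            if "error" in result:
                failures.append((result.get("_id"), result.get("status"), result["error"]))
    return failures


# Illustrative response: the first document was indexed, the second was rejected.
sample = {
    "took": 5,
    "errors": True,
    "items": [
        {"index": {"_index": "my_index", "_id": "1", "status": 201}},
        {"index": {"_index": "my_index", "_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse field [age]"}}},
    ],
}
for doc_id, status, error in failed_items(sample):
    print(doc_id, status, error["type"])
```

Only the failed items need to be corrected and resent; the documents that were indexed successfully are already stored.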
Key Takeaways
Use the Elasticsearch _bulk API with newline-delimited JSON to index multiple documents efficiently.
Each document must be preceded by an action line specifying the index and optional document ID.
Always set the Content-Type header to application/x-ndjson when sending bulk requests.
Check the bulk API response for errors to ensure all documents indexed successfully.
Avoid sending bulk data as a JSON array; use NDJSON format instead.