Async search for expensive queries in Elasticsearch - Step-by-Step Execution

Concept Flow - Async search for expensive queries
Start Async Search Request
Elasticsearch Accepts Request
Query Runs in Background
Client Polls for Results
Results Ready?
No: Wait and Poll Again
Yes: Client Retrieves Results
Process or Display Results
End
The client sends an async search request, Elasticsearch runs the query in the background, the client polls until results are ready, then retrieves and processes them.
Execution Sample
Elasticsearch
POST /_async_search
{
  "query": { "match_all": {} },
  "size": 1000
}
Starts an async search that matches all documents and returns up to 1000 results.
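The same request can also be assembled programmatically. The sketch below is only an illustration: the helper name `build_submit_request` is hypothetical, and it assumes the submit options `wait_for_completion_timeout` and `keep_alive` travel as URL query parameters, with defaults set to what are understood to be the API defaults (1 second and 5 days).

```python
from urllib.parse import urlencode
import json
from typing import Any, Dict, Tuple


def build_submit_request(body: Dict[str, Any],
                         wait_for_completion_timeout: str = "1s",
                         keep_alive: str = "5d") -> Tuple[str, str, str]:
    """Return (method, path_with_query, json_body) for an async search submit.

    Hypothetical helper: it only builds the request and sends nothing.
    """
    # Assumption: submit options go in the query string, not the search body.
    params = urlencode({
        "wait_for_completion_timeout": wait_for_completion_timeout,
        "keep_alive": keep_alive,
    })
    return "POST", f"/_async_search?{params}", json.dumps(body)


method, path, payload = build_submit_request(
    {"query": {"match_all": {}}, "size": 1000},
    wait_for_completion_timeout="2s",
    keep_alive="1h",
)
print(method, path)
print(payload)
```

Keeping the search definition in the body and the execution options in the URL means the body can be reused unchanged with a regular search.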
Execution Table
| Step | Action | Request/Response | Status | Notes |
|------|--------|------------------|--------|-------|
| 1 | Send async search request | POST /_async_search with query | Accepted | Elasticsearch starts query in background |
| 2 | Receive async search ID | { "id": "abc123", "is_running": true } | Running | Client gets search ID to poll later |
| 3 | Poll for results | GET /_async_search/abc123 | Running | Query still running, no results yet |
| 4 | Poll again after wait | GET /_async_search/abc123 | Completed | Results ready, returned in response |
| 5 | Process results | Response contains hits | Success | Client processes or displays results |
| 6 | Delete async search | DELETE /_async_search/abc123 | Deleted | Clean up resources on server |
💡 Execution stops once the results are retrieved and the async search is (optionally) deleted.
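The six steps in the table can be sketched as a small client loop. This is a minimal Python sketch, not an official client: `send` is a hypothetical transport callable taking (method, path, body) and returning the parsed JSON reply, and `poll_interval` and `max_polls` are illustrative parameters. Only the endpoints and the `id`, `is_running`, and `response` fields come from the walkthrough.

```python
import time
from typing import Any, Callable, Dict, Optional

# Hypothetical transport signature: (method, path, body) -> parsed JSON reply.
Send = Callable[[str, str, Optional[Dict[str, Any]]], Dict[str, Any]]


def run_async_search(send: Send, query: Dict[str, Any],
                     poll_interval: float = 1.0,
                     max_polls: int = 60) -> Dict[str, Any]:
    """Submit an async search, poll until it finishes, then delete it."""
    # Steps 1-2: submit the query and keep the returned search ID.
    reply = send("POST", "/_async_search", query)
    search_id = reply.get("id")  # assumed absent if nothing was stored
    try:
        polls = 0
        # Steps 3-4: poll while the search reports that it is still running.
        while reply.get("is_running"):
            if polls >= max_polls:
                raise TimeoutError(f"async search {search_id} still running")
            time.sleep(poll_interval)
            reply = send("GET", f"/_async_search/{search_id}", None)
            polls += 1
        # Step 5: hand the search response (hits, aggregations) to the caller.
        return reply.get("response") or {}
    finally:
        # Step 6: clean up on the server, even if polling failed.
        if search_id is not None:
            send("DELETE", f"/_async_search/{search_id}", None)
```

Passing the transport in keeps the sketch independent of any HTTP library; in practice `send` could wrap whichever client the application already uses.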
Variable Tracker
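| Variable | Start | After Step 1 | After Step 2 | After Step 4 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| search_id | null | "abc123" | "abc123" | "abc123" | null |
| is_running | null | true | true | false | false |
| results | null | null | null | hits data | hits data |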
Key Moments - 3 Insights
Why do we get a search ID instead of immediate results?
Because the query is expensive, Elasticsearch runs it in the background and returns a search ID to let the client check back later, as shown in execution_table step 2.
What happens if we poll too early for results?
The response will indicate the search is still running (step 3), so the client must wait and poll again later.
Why should we delete the async search after retrieving results?
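Elasticsearch keeps the stored search and its results until the keep-alive period expires (five days by default). Deleting the async search frees those server resources right away, as shown in step 6 of the execution_table.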
Visual Quiz - 3 Questions
Test your understanding
According to the execution_table, what is the status after the first async search request is sent?
A. Completed
B. Accepted
C. Deleted
D. Failed
💡 Hint
Check the Status column at Step 1 in the execution_table.
At which step does the client receive the actual search results?
A. Step 4
B. Step 3
C. Step 2
D. Step 6
💡 Hint
Look for 'Results ready' in the Notes column of the execution_table.
If the client never deletes the async search, which variable in variable_tracker remains non-null?
A. results
B. is_running
C. search_id
D. null
💡 Hint
Check the 'search_id' row in variable_tracker and what happens after Step 6.
Concept Snapshot
Async search lets Elasticsearch run expensive queries in background.
Client sends request and gets a search ID.
Client polls with ID until results are ready.
Results returned when complete.
Delete async search to free resources.
Full Transcript
Async search in Elasticsearch helps run heavy queries without waiting for immediate results. The client sends a request and gets back a search ID. Elasticsearch runs the query in the background. The client polls using the ID to check if results are ready. Once ready, the client retrieves and processes the results. Finally, the client can delete the async search to clean up resources. This process avoids long waits and keeps the system responsive.

Practice

1. What is the main benefit of using async search in Elasticsearch for expensive queries?
easy
A. It caches all query results permanently.
B. It automatically speeds up the query execution time.
C. It disables query logging to improve performance.
D. It allows running slow queries without blocking the application.

Solution

  1. Step 1: Understand async search purpose

    Async search lets you run slow or heavy queries without making your app wait or freeze.
  2. Step 2: Identify the main benefit

    This means your app can continue working while the query runs in the background.
  3. Final Answer:

    It allows running slow queries without blocking the application. -> Option D
  4. Quick Check:

    Async search = non-blocking query execution [OK]
Hint: Async search runs queries in background, so app doesn't wait [OK]
Common Mistakes:
  • Thinking async search speeds up queries automatically
  • Assuming async search caches results permanently
  • Believing async search disables logging
2. Which of the following is the correct way to start an async search request in Elasticsearch using the REST API?
easy
A. POST /_async_search { "query": { "match_all": {} } }
B. GET /_async_search { "query": { "match_all": {} } }
C. POST /_search/async { "query": { "match_all": {} } }
D. PUT /_async_search { "query": { "match_all": {} } }

Solution

  1. Step 1: Recall async search API endpoint

    The correct endpoint to start an async search is POST /_async_search with the query in the body.
  2. Step 2: Check HTTP method and path

    GET and PUT are the wrong methods for starting an async search, and /_search/async is not a valid path.
  3. Final Answer:

    POST /_async_search with query body -> Option A
  4. Quick Check:

    Start async search = POST /_async_search [OK]
Hint: Use POST method on /_async_search to start async search [OK]
Common Mistakes:
  • Using GET instead of POST to start async search
  • Using wrong endpoint like /_search/async
  • Using PUT method which is invalid here
3. Given this async search response snippet, what does the id field represent?
{
  "id": "r1A2B3C4D5E6F7G8H9I",
  "is_running": true,
  "response": null
}
medium
A. The timeout duration for the async search.
B. The total number of documents matched by the query.
C. The unique identifier to check status or fetch results later.
D. The Elasticsearch node handling the query.

Solution

  1. Step 1: Understand the async search response fields

    The id is a unique string to identify this async search request.
  2. Step 2: Purpose of the id

    You use this id to check if the search is done or to get the results later.
  3. Final Answer:

    The unique identifier to check status or fetch results later. -> Option C
  4. Quick Check:

    Async search id = unique query handle [OK]
Hint: Async search id tracks query status and results [OK]
Common Mistakes:
  • Confusing id with document count
  • Thinking id is timeout or node info
  • Assuming id changes during query
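A client typically branches on the fields of such a reply. The following is a minimal sketch with a hypothetical helper name, `describe`. The `id` and `is_running` fields come from the snippet in question 3, while `is_partial` is an additional reply field assumed here to mark results that are incomplete.

```python
from typing import Any, Dict


def describe(reply: Dict[str, Any]) -> str:
    """Summarize an async search reply in one line."""
    search_id = reply.get("id", "<no id>")
    if reply.get("is_running"):
        # The id stays the same for the lifetime of the search, so it can be reused.
        return f"{search_id}: still running, poll again"
    if reply.get("is_partial"):
        # Assumption: is_partial=true after completion means some results are missing.
        return f"{search_id}: finished with partial results"
    return f"{search_id}: complete"


print(describe({"id": "r1A2B3C4D5E6F7G8H9I", "is_running": True, "response": None}))
```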
4. You wrote this code to start an async search but get an error:
POST /_async_search
{
  "query": {
    "match": {
      "title": "Elasticsearch"
    }
  },
  "wait_for_completion_timeout": "1s"
}
What is the error in this request?
medium
A. Missing comma between query and wait_for_completion_timeout fields.
B. Using POST instead of GET method.
C. wait_for_completion_timeout cannot be set in the request body.
D. The field name "title" is invalid in match query.

Solution

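  1. Step 1: Check JSON syntax

    The JSON body is well-formed: a comma separates the "query" object from the field that follows it, so option A does not apply.
  2. Step 2: Validate method and fields

    POST is the correct method and "title" is an ordinary field name. wait_for_completion_timeout, however, is a URL query parameter of the submit request, not part of the search body, so Elasticsearch rejects it as an unknown key. Send it as POST /_async_search?wait_for_completion_timeout=1s instead.
  3. Final Answer:

    wait_for_completion_timeout cannot be set in the request body. -> Option C
  4. Quick Check:

    Execution options go in the query string, not the body [OK]
Hint: Options such as wait_for_completion_timeout belong in the URL [OK]
Common Mistakes:
  • Putting wait_for_completion_timeout or keep_alive in the request body
  • Confusing HTTP methods for async search
  • Blaming JSON syntax when the body is actually well-formed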
5. You want to run a very expensive aggregation query on a large dataset without timing out. Which approach using async search is best to get the final results efficiently?
hard
A. Run a normal search with a very high timeout value to wait for results.
B. Start async search with a long wait_for_completion_timeout and poll using the returned id until results are ready.
C. Start async search and immediately request results without waiting for completion.
D. Run the query multiple times with smaller timeouts and merge results manually.

Solution

  1. Step 1: Understand async search timeout and polling

    wait_for_completion_timeout controls how long the submit call waits for the search to finish. If the search takes longer than that, Elasticsearch returns an id and keeps running the query in the background.
  2. Step 2: Use the returned id to poll for completion

    You can check the status later using the id until the results are ready, avoiding timeouts and blocking.
  3. Final Answer:

    Start async search with a long wait_for_completion_timeout and poll using the returned id until results are ready. -> Option B
  4. Quick Check:

    Async search + polling = efficient for expensive queries [OK]
Hint: Use wait_for_completion_timeout + poll with id for big queries [OK]
Common Mistakes:
  • Using normal search with high timeout risking app freeze
  • Requesting results immediately before completion
  • Manually merging partial results instead of async search
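For a very long-running aggregation, polling at a fixed short interval creates needless requests. One common option is to lengthen the wait between polls. The generator below is a sketch of capped exponential backoff; the name `backoff_delays` and its parameters are illustrative and not part of any Elasticsearch client.

```python
from typing import Iterator


def backoff_delays(first: float = 0.5, factor: float = 2.0,
                   cap: float = 30.0) -> Iterator[float]:
    """Yield an endless sequence of poll delays that grows geometrically up to cap."""
    delay = first
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


delays = backoff_delays(first=1.0, factor=2.0, cap=5.0)
print([next(delays) for _ in range(5)])  # [1.0, 2.0, 4.0, 5.0, 5.0]
```

Each value would be passed to a sleep call between successive GET /_async_search/<id> requests, so a query that finishes quickly is noticed early while a slow one is polled less often.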