Cross-cluster search in Elasticsearch - Step-by-Step Execution

Concept Flow - Cross-cluster search
Start Search Request
Identify Target Clusters
Send Query to Local Cluster
Send Query to Remote Clusters
Collect Results from All Clusters
Merge and Sort Results
Return Combined Results to User
The search request is sent to local and remote clusters, results are gathered, merged, and returned as one combined response.
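The "Identify Target Clusters" step can be sketched in Python as splitting the index expression on commas and colons. This is a simplified illustration, not how Elasticsearch implements it; the function name `group_targets` and the `(local)` label are hypothetical.

```python
from collections import defaultdict

LOCAL = "(local)"  # hypothetical label for indices without a cluster prefix


def group_targets(expression: str) -> dict:
    """Split 'idx,alias:idx' into {cluster: [indices]}."""
    groups = defaultdict(list)
    for target in expression.split(","):
        # Text before the first colon is treated as the remote cluster alias.
        cluster, sep, index = target.partition(":")
        if sep:
            groups[cluster].append(index)
        else:
            groups[LOCAL].append(target)
    return dict(groups)


targets = group_targets("local_index,remote_cluster:remote_index")
print(targets)
```

Each key in the result is one cluster that would receive the query, and the list under it holds the indices to search there.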
Execution Sample
Elasticsearch
GET /local_index,remote_cluster:remote_index/_search
{
  "query": { "match_all": {} }
}
This query searches both a local index and a remote cluster's index and returns combined results.
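For readers calling Elasticsearch from code rather than a console, the same request could be prepared with Python's standard library roughly as follows. The address `http://localhost:9200` and the helper name `build_search_request` are assumptions for illustration. The sketch uses POST, which the `_search` endpoint also accepts, because some HTTP clients drop the body of a GET request.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: address of the local cluster


def build_search_request(index_expression: str, query: dict) -> urllib.request.Request:
    """Prepare (but do not send) a _search request for the given targets."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{index_expression}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request(
    "local_index,remote_cluster:remote_index", {"match_all": {}}
)
print(request.get_method(), request.full_url)
# To run it against a live cluster: urllib.request.urlopen(request)
```

Nothing is sent until `urlopen` is called, so the sketch can be inspected without a running cluster.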
Execution Table
Step | Action | Target Cluster | Query Sent | Response Received | Result Status
1 | Receive search request | Local cluster | match_all query on local_index | Pending | Waiting for responses
2 | Send query to local cluster | Local cluster | match_all query on local_index | Results from local_index | Success
3 | Send query to remote cluster | Remote cluster | match_all query on remote_index | Results from remote_index | Success
4 | Merge results | Local + Remote | N/A | Combined sorted results | Success
5 | Return results to user | Client | N/A | Combined results JSON | Complete
💡 All clusters responded successfully and results merged for final output.
Variable Tracker
Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final
local_results | empty | results from local_index | results from local_index | results from local_index | included in combined results
remote_results | empty | empty | results from remote_index | results from remote_index | included in combined results
combined_results | empty | empty | empty | merged local + remote | returned to user
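The three tracked variables can be mirrored in a small Python sketch of the merge step. The hits and scores below are made up for illustration (a real match_all query gives every hit the same score), and `merge_hits` is a hypothetical helper, not an Elasticsearch API.

```python
def merge_hits(*result_sets):
    """Combine per-cluster hits and order them by descending _score."""
    combined = [hit for hits in result_sets for hit in hits]
    return sorted(combined, key=lambda hit: hit["_score"], reverse=True)


# Made-up hits; remote hits carry the cluster alias in their _index value.
local_results = [
    {"_index": "local_index", "_id": "1", "_score": 1.2},
    {"_index": "local_index", "_id": "2", "_score": 0.4},
]
remote_results = [
    {"_index": "remote_cluster:remote_index", "_id": "7", "_score": 0.9},
]

combined_results = merge_hits(local_results, remote_results)
print([hit["_id"] for hit in combined_results])  # ['1', '7', '2']
```

The remote hit lands between the two local hits because ordering depends only on the score, not on which cluster produced the hit.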
Key Moments - 3 Insights
Why do we specify the remote cluster name before the index in the query?
Because the remote cluster name tells Elasticsearch where to send the query; see execution_table step 3 where the query targets 'remote_cluster:remote_index'.
What happens if the remote cluster does not respond?
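It depends on that remote cluster's skip_unavailable setting. When it is true, the search proceeds and returns results from the clusters that did respond; when it is false, the whole search fails with an error. The default has changed between Elasticsearch versions, so check the setting for your release. Step 3 of the execution_table assumes the remote cluster responds.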
How are results from multiple clusters combined?
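Results from all clusters are merged into one list and sorted by relevance score by default, or by the sort order the request specifies (for example a timestamp field), as shown in step 4 of the execution_table.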
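How an unresponsive remote cluster affects the outcome can be sketched as a small simulation. It models the per-cluster `skip_unavailable` setting; the function `collect`, the exception `RemoteUnavailable`, and the fallback of `False` for clusters without an explicit setting are assumptions made for this example only, since the real default depends on the Elasticsearch version.

```python
from typing import Dict, List, Optional


class RemoteUnavailable(Exception):
    """Hypothetical error for a required remote cluster that did not respond."""


def collect(responses: Dict[str, Optional[List[str]]],
            skip_unavailable: Dict[str, bool]) -> List[str]:
    """Gather hits per cluster; None stands for a cluster that did not respond."""
    gathered = []
    for cluster, hits in responses.items():
        if hits is None:
            if skip_unavailable.get(cluster, False):  # assumed fallback: False
                continue  # tolerated: the search proceeds without this cluster
            raise RemoteUnavailable(cluster)  # otherwise the whole search fails
        gathered.extend(hits)
    return gathered


responses = {"(local)": ["doc-1"], "remote_cluster": None}
print(collect(responses, {"remote_cluster": True}))  # ['doc-1']
try:
    collect(responses, {"remote_cluster": False})
except RemoteUnavailable as err:
    print("search failed, unavailable cluster:", err)
```

With the setting enabled the caller gets partial results; with it disabled the caller gets an error instead of a silently incomplete answer.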
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, at which step are results from the remote cluster received?
A. Step 3
B. Step 2
C. Step 4
D. Step 5
💡 Hint
Check the 'Response Received' column for remote cluster results.
According to variable_tracker, what is the state of combined_results after Step 3?
A. Contains remote results only
B. Empty
C. Contains local results only
D. Contains merged local and remote results
💡 Hint
Look at the 'combined_results' row and the 'After Step 3' column.
If the remote cluster name is omitted in the query, what changes in the execution flow?
A. Query fails with an error
B. Query is sent to all clusters automatically
C. Query is sent only to local cluster
D. Query is sent only to remote cluster
💡 Hint
Refer to concept_flow where target clusters are identified explicitly.
Concept Snapshot
Cross-cluster search lets you query multiple Elasticsearch clusters at once.
Use the syntax 'remote_cluster:index' to target remote data.
Elasticsearch sends queries to all specified clusters.
Results are merged and sorted before returning.
This helps search across distributed data easily.
Full Transcript
Cross-cluster search in Elasticsearch allows a user to send one search query that targets both local and remote clusters. The process starts when the search request is received. The system identifies which clusters to query: the local cluster and any remote clusters specified by name. The query is sent to the local cluster and to each remote cluster. Each cluster processes the query and returns its results. These results are collected and merged, sorted by relevance score by default or by the sort order the request specifies, such as a timestamp. Finally, the combined results are returned to the user as a single response. Variables like local_results and remote_results hold partial results before merging, and the combined_results variable holds the final merged data. If a remote cluster does not respond, the outcome depends on its skip_unavailable setting: the search either returns results from the clusters that did respond or fails with an error. The syntax for remote queries requires prefixing the index with the remote cluster name, such as 'remote_cluster:index'. This feature enables searching across multiple Elasticsearch clusters from one query.

Practice

1. What is the main purpose of cross-cluster search in Elasticsearch?
easy
A. To monitor cluster health status remotely
B. To backup data from one cluster to another
C. To merge two clusters into one
D. To search data across multiple Elasticsearch clusters using a single query

Solution

  1. Step 1: Understand cross-cluster search concept

    Cross-cluster search allows querying data from multiple clusters in one search request.
  2. Step 2: Differentiate from other cluster operations

    It does not merge clusters, backup data, or monitor health but focuses on searching data.
  3. Final Answer:

    To search data across multiple Elasticsearch clusters using a single query -> Option D
  4. Quick Check:

    Cross-cluster search = search across clusters [OK]
Hint: Cross-cluster search = one query, many clusters [OK]
Common Mistakes:
  • Confusing search with backup or monitoring
  • Thinking it merges clusters
  • Assuming it manages cluster health
2. Which syntax correctly specifies a remote cluster alias in a cross-cluster search query?
easy
A. GET /remote_cluster:index/_search
B. GET /index@remote_cluster/_search
C. GET /index/remote_cluster/_search
D. GET /remote_cluster/_search/index

Solution

  1. Step 1: Recall remote cluster alias syntax

    The correct syntax uses remote_cluster:index to specify the cluster alias and index.
  2. Step 2: Check each option format

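    Only option A, GET /remote_cluster:index/_search, puts the cluster alias before the index and joins them with a colon. The other options place the alias after the index or separate the parts with a slash or an @ sign.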
  3. Final Answer:

    GET /remote_cluster:index/_search -> Option A
  4. Quick Check:

    Alias:index/_search = correct syntax [OK]
Hint: Use alias:index/_search to target remote cluster data [OK]
Common Mistakes:
  • Placing alias after index
  • Using slashes instead of colon
  • Misordering parts of the URL
3. Given this cross-cluster search query:
GET /clusterA:logs-2023/_search
{
  "query": { "match_all": {} }
}

What data will this query return?
medium
A. All documents from the local cluster's logs-2023 index
B. All documents from the logs-2023 index in clusterA
C. Documents matching "clusterA" in the logs-2023 index
D. An error because cluster alias is missing

Solution

  1. Step 1: Identify cluster alias usage

    The query uses clusterA:logs-2023, meaning it targets the logs-2023 index on remote cluster named clusterA.
  2. Step 2: Understand the query body

    The match_all query matches every document in that index on clusterA. The response itself lists only the first page of hits, which is 10 by default.
  3. Final Answer:

    All documents from the logs-2023 index in clusterA -> Option B
  4. Quick Check:

    Alias:index with match_all = all remote docs [OK]
Hint: Alias:index means search that index on remote cluster [OK]
Common Mistakes:
  • Assuming it searches local cluster
  • Thinking it filters by cluster name in data
  • Believing alias is optional
4. You run this cross-cluster search query:
GET /remoteCluster:products/_search
{
  "query": { "term": { "category": "electronics" } }
}

But get an error: no such remote cluster. What is the likely cause?
medium
A. The query syntax is invalid for cross-cluster search
B. The index 'products' does not exist on the remote cluster
C. The remote cluster alias 'remoteCluster' is not configured in the local cluster
D. The term query cannot be used in cross-cluster search

Solution

  1. Step 1: Analyze the error message

    The error no such remote cluster means the alias 'remoteCluster' is unknown to the local cluster.
  2. Step 2: Check configuration requirements

    Remote clusters must be configured before use; missing alias causes this error.
  3. Final Answer:

    The remote cluster alias 'remoteCluster' is not configured in the local cluster -> Option C
  4. Quick Check:

    Missing alias config = no such remote cluster error [OK]
Hint: Configure remote cluster alias before querying [OK]
Common Mistakes:
  • Assuming index absence causes this error
  • Blaming query syntax for alias errors
  • Thinking term queries are unsupported
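Registering the alias is done through the cluster settings API on the local cluster (PUT /_cluster/settings). A minimal sketch of building that request body in Python follows; the seed address `127.0.0.1:9300` is an assumption standing in for the remote cluster's transport address, and `remote_cluster_settings` is a hypothetical helper name.

```python
import json


def remote_cluster_settings(alias: str, seeds: list) -> dict:
    """Body for PUT /_cluster/settings that registers a remote cluster."""
    return {"persistent": {"cluster": {"remote": {alias: {"seeds": seeds}}}}}


# Assumption: the remote cluster's transport port is reachable at this address.
body = remote_cluster_settings("remoteCluster", ["127.0.0.1:9300"])
print(json.dumps(body, indent=2))
```

Once the setting is applied, GET /_remote/info on the local cluster lists the configured aliases, which is a quick way to confirm that 'remoteCluster' is known before retrying the search.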
5. You want to search the sales-2023 index across two remote clusters named clusterX and clusterY. Which query correctly searches both clusters and returns combined results?
hard
A. GET /clusterX:sales-2023,clusterY:sales-2023/_search { "query": { "match_all": {} } }
B. GET /sales-2023/_search { "query": { "match_all": {} }, "clusters": ["clusterX", "clusterY"] }
C. GET /clusterX:clusterY:sales-2023/_search { "query": { "match_all": {} } }
D. GET /sales-2023/_search { "query": { "match_all": {} }, "remote_clusters": ["clusterX", "clusterY"] }

Solution

  1. Step 1: Recall syntax for multiple remote clusters

    To search multiple clusters, use a comma-separated list of cluster_alias:index targets, like clusterX:sales-2023,clusterY:sales-2023.
  2. Step 2: Evaluate each option

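    Option A uses clusterX:sales-2023,clusterY:sales-2023, which is the correct syntax for searching the same index on two remote clusters. Options B and D put the cluster names in the request body, which is not how cross-cluster search selects clusters, and option C chains two aliases with colons instead of separating the targets with a comma.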
  3. Final Answer:

    GET /clusterX:sales-2023,clusterY:sales-2023/_search { "query": { "match_all": {} } } -> Option A
  4. Quick Check:

    clusterX:sales-2023,clusterY:sales-2023/_search = multi-cluster search [OK]
Hint: comma-separate alias:index for multi-cluster search [OK]
Common Mistakes:
  • Using multiple colons instead of commas
  • Adding cluster names inside query body
  • Assuming local index searches multiple clusters
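When the list of clusters comes from code rather than being typed by hand, the comma-separated target list can be assembled with a short helper. This is only an illustrative sketch; `multi_cluster_path` is a hypothetical name.

```python
def multi_cluster_path(index: str, clusters: list) -> str:
    """Build the _search path that targets one index on several remote clusters."""
    targets = ",".join(f"{cluster}:{index}" for cluster in clusters)
    return f"/{targets}/_search"


path = multi_cluster_path("sales-2023", ["clusterX", "clusterY"])
print(path)  # /clusterX:sales-2023,clusterY:sales-2023/_search
```

The cluster names stay in the URL path, so the request body remains an ordinary search body.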