Cross-cluster search in Elasticsearch - Time & Space Complexity

Time Complexity: Cross-cluster search
O(n)
Understanding Time Complexity

When using cross-cluster search, we want to know how the search time changes as we add more clusters or data.

We ask: How does the search cost grow when searching across multiple clusters?

Scenario Under Consideration

Analyze the time complexity of the following Elasticsearch cross-cluster search query.


GET /cluster_one:index_one,cluster_two:index_two/_search
{
  "query": {
    "match": { "field": "value" }
  }
}
    

This query searches for documents matching "value" in "field" across two clusters and their indexes.
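As a minimal sketch, the request path above can be assembled from (cluster alias, index) pairs. The helper name `build_ccs_path` is hypothetical, and the snippet only builds strings; it does not contact any cluster.

```python
import json


def build_ccs_path(targets):
    """Join (cluster_alias, index) pairs into a cross-cluster search path."""
    if not targets:
        raise ValueError("at least one (alias, index) pair is required")
    # Each remote target is written as alias:index; targets are comma-separated.
    expression = ",".join(f"{alias}:{index}" for alias, index in targets)
    return f"/{expression}/_search"


# The same two targets and match query as the request above.
path = build_ccs_path([("cluster_one", "index_one"), ("cluster_two", "index_two")])
body = {"query": {"match": {"field": "value"}}}

print("GET", path)
print(json.dumps(body, indent=2))
```

Adding a third pair to the list extends the path with one more comma-separated target, which is the input size the rest of this analysis counts.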

Identify Repeating Operations

Look at what repeats when the query runs:

  • Primary operation: Searching each cluster's index for matching documents.
  • How many times: Once per cluster-index pair involved in the search.
How Execution Grows With Input

As you add more clusters or indexes, more searches run: one per cluster-index pair named in the request.

Input Size (number of clusters) | Approx. Operations
2                               | 2 searches
10                              | 10 searches
100                             | 100 searches

Pattern observation: The total work grows directly with the number of clusters searched.

Final Time Complexity

Time Complexity: O(n), where n is the number of cluster-index pairs searched

This means the total search work grows linearly as you add more clusters to search.
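The linear growth can be illustrated with a toy cost model. This is a sketch under stated assumptions, not a measurement: the function names are hypothetical, and the flat 5 ms cost per target is an invented figure chosen only to make the arithmetic visible.

```python
def fan_out_searches(num_pairs):
    """Number of per-target searches for a request naming num_pairs targets."""
    # One search is dispatched per cluster-index pair, so the count equals n.
    return num_pairs


def total_work(per_target_cost_ms):
    """Sum hypothetical per-target costs to model total work across clusters."""
    return sum(per_target_cost_ms)


for n in (2, 10, 100):
    # Assume every target costs the same 5 ms (an illustrative figure only).
    costs = [5] * n
    print(n, fan_out_searches(n), total_work(costs))
```

In this model, doubling the number of targets doubles both the number of searches and the summed cost, which is what O(n) describes.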

Common Mistake

[X] Wrong: "Searching multiple clusters happens all at once with no extra cost."

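[OK] Correct: Each cluster runs its own search and the coordinating node must merge every response, so total work adds up with more clusters, even when the remote requests are sent in parallel.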

Interview Connect

Understanding how cross-cluster search scales helps you explain real-world search performance and design better queries.

Self-Check

What if we limited the search to only a subset of clusters? How would the time complexity change?

Practice

1. What is the main purpose of cross-cluster search in Elasticsearch?
easy
A. To monitor cluster health status remotely
B. To back up data from one cluster to another
C. To merge two clusters into one
D. To search data across multiple Elasticsearch clusters using a single query

Solution

  1. Step 1: Understand cross-cluster search concept

    Cross-cluster search allows querying data from multiple clusters in one search request.
  2. Step 2: Differentiate from other cluster operations

    It does not merge clusters, back up data, or monitor health; it only searches data.
  3. Final Answer:

    To search data across multiple Elasticsearch clusters using a single query -> Option D
  4. Quick Check:

    Cross-cluster search = search across clusters [OK]
Hint: Cross-cluster search = one query, many clusters [OK]
Common Mistakes:
  • Confusing search with backup or monitoring
  • Thinking it merges clusters
  • Assuming it manages cluster health
2. Which syntax correctly specifies a remote cluster alias in a cross-cluster search query?
easy
A. GET /remote_cluster:index/_search
B. GET /index@remote_cluster/_search
C. GET /index/remote_cluster/_search
D. GET /remote_cluster/_search/index

Solution

  1. Step 1: Recall remote cluster alias syntax

    The correct syntax uses remote_cluster:index to specify the cluster alias and index.
  2. Step 2: Check each option format

    Only GET /remote_cluster:index/_search follows the alias:index pattern. The other options either place the alias after the index or separate the parts with slashes or an @ sign.
  3. Final Answer:

    GET /remote_cluster:index/_search -> Option A
  4. Quick Check:

    Alias:index/_search = correct syntax [OK]
Hint: Use alias:index/_search to target remote cluster data [OK]
Common Mistakes:
  • Placing alias after index
  • Using slashes instead of colon
  • Misordering parts of the URL
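To make the alias:index rule from question 2 concrete, here is a simplified sketch that splits a search target into its cluster alias and index. The function name `parse_target` is hypothetical, and the snippet does not validate names against Elasticsearch's naming rules; it only checks the position of the colon.

```python
def parse_target(target):
    """Split a search target into (cluster_alias, index).

    Returns (None, index) for a local index with no alias prefix.
    """
    alias, sep, index = target.partition(":")
    if not sep:
        # No colon at all: the target names an index on the local cluster.
        return None, target
    if not alias or not index:
        raise ValueError(f"malformed target: {target!r}")
    return alias, index


print(parse_target("remote_cluster:index"))
print(parse_target("index"))
```

A target such as index@remote_cluster contains no colon, so this sketch would treat the whole string as a local index name rather than as a remote target.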
3. Given this cross-cluster search query:
GET /clusterA:logs-2023/_search
{
  "query": { "match_all": {} }
}

What data will this query return?
medium
A. All documents from the local cluster's logs-2023 index
B. All documents from the logs-2023 index in clusterA
C. Documents matching "clusterA" in the logs-2023 index
D. An error because cluster alias is missing

Solution

  1. Step 1: Identify cluster alias usage

    The query uses clusterA:logs-2023, meaning it targets the logs-2023 index on the remote cluster registered under the alias clusterA.
  2. Step 2: Understand the query body

    The match_all query matches every document in that index on clusterA. By default the response contains only the first 10 hits.
  3. Final Answer:

    All documents from the logs-2023 index in clusterA -> Option B
  4. Quick Check:

    Alias:index with match_all = all remote docs [OK]
Hint: Alias:index means search that index on remote cluster [OK]
Common Mistakes:
  • Assuming it searches local cluster
  • Thinking it filters by cluster name in data
  • Believing alias is optional
4. You run this cross-cluster search query:
GET /remoteCluster:products/_search
{
  "query": { "term": { "category": "electronics" } }
}

But get an error: no such remote cluster. What is the likely cause?
medium
A. The query syntax is invalid for cross-cluster search
B. The index 'products' does not exist on the remote cluster
C. The remote cluster alias 'remoteCluster' is not configured in the local cluster
D. The term query cannot be used in cross-cluster search

Solution

  1. Step 1: Analyze the error message

    The error no such remote cluster means the alias 'remoteCluster' is unknown to the local cluster.
  2. Step 2: Check configuration requirements

    Remote clusters must be configured on the local cluster before use; querying an alias that has not been configured causes this error.
  3. Final Answer:

    The remote cluster alias 'remoteCluster' is not configured in the local cluster -> Option C
  4. Quick Check:

    Missing alias config = no such remote cluster error [OK]
Hint: Configure remote cluster alias before querying [OK]
Common Mistakes:
  • Assuming index absence causes this error
  • Blaming query syntax for alias errors
  • Thinking term queries are unsupported
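For question 4, the fix is to register the alias on the local cluster. The sketch below builds a body for the cluster settings API, assuming a sniff-mode connection configured through the `cluster.remote.<alias>.seeds` setting. The helper name `remote_cluster_settings` and the seed address are hypothetical placeholders, and the snippet only prints the request rather than sending it.

```python
import json


def remote_cluster_settings(alias, seeds):
    """Build a cluster settings body that registers a remote cluster alias."""
    return {
        "persistent": {
            "cluster": {
                "remote": {
                    # seeds are transport addresses (host:port) of remote nodes
                    alias: {"seeds": list(seeds)}
                }
            }
        }
    }


# Hypothetical seed address; replace it with a real node of the remote cluster.
body = remote_cluster_settings("remoteCluster", ["remote-host.example:9300"])
print("PUT /_cluster/settings")
print(json.dumps(body, indent=2))
```

Once a setting like this has been applied, the alias remoteCluster is known to the local cluster and the remoteCluster:products target can be resolved.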
5. You want to search the sales-2023 index across two remote clusters named clusterX and clusterY. Which query correctly searches both clusters and returns combined results?
hard
A. GET /clusterX:sales-2023,clusterY:sales-2023/_search { "query": { "match_all": {} } }
B. GET /sales-2023/_search { "query": { "match_all": {} }, "clusters": ["clusterX", "clusterY"] }
C. GET /clusterX:clusterY:sales-2023/_search { "query": { "match_all": {} } }
D. GET /sales-2023/_search { "query": { "match_all": {} }, "remote_clusters": ["clusterX", "clusterY"] }

Solution

  1. Step 1: Recall syntax for multiple remote clusters

    To search multiple clusters, use a comma-separated list of cluster_alias:index targets, like clusterX:sales-2023,clusterY:sales-2023.
  2. Step 2: Evaluate each option

    Only option A, GET /clusterX:sales-2023,clusterY:sales-2023/_search, lists each cluster as its own alias:index target. Option C chains two aliases with colons, and options B and D put the cluster names in the request body, which searches only the local sales-2023 index.
  3. Final Answer:

    GET /clusterX:sales-2023,clusterY:sales-2023/_search { "query": { "match_all": {} } } -> Option A
  4. Quick Check:

    clusterX:sales-2023,clusterY:sales-2023/_search = multi-cluster search [OK]
Hint: comma-separate alias:index for multi-cluster search [OK]
Common Mistakes:
  • Using multiple colons instead of commas
  • Adding cluster names inside query body
  • Assuming local index searches multiple clusters