Shard allocation awareness helps Elasticsearch keep data copies on different machines or racks. This makes sure your data stays safe even if one machine or rack fails.
Shard allocation awareness in Elasticsearch
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}

PUT /my_index
{
  "settings": {
    "index.routing.allocation.include.rack_id": "rack1,rack2"
  }
}

The cluster.routing.allocation.awareness.attributes setting tells Elasticsearch which node attribute to use for awareness (like rack_id).
The index.routing.allocation.include.rack_id setting is an index-level allocation filter: it restricts the index's shards to nodes whose rack_id matches one of the listed values.
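For scripted setups, the two request bodies can be generated rather than written by hand. The following is a minimal Python sketch using only the standard library. The function names awareness_settings and index_filter_settings are illustrative, and the sketch assumes the index-level filter key has the form index.routing.allocation.include.<attribute>. It only builds the JSON bodies; sending them to a cluster is left out.

```python
import json


def awareness_settings(attributes):
    """Body for PUT /_cluster/settings enabling awareness on the given node attributes."""
    return {
        "persistent": {
            # Several attributes are passed as one comma-separated string.
            "cluster.routing.allocation.awareness.attributes": ",".join(attributes)
        }
    }


def index_filter_settings(attribute, values):
    """Body for PUT /<index> restricting shards to nodes whose attribute matches a listed value."""
    return {
        "settings": {
            f"index.routing.allocation.include.{attribute}": ",".join(values)
        }
    }


if __name__ == "__main__":
    print(json.dumps(awareness_settings(["rack_id"]), indent=2))
    print(json.dumps(index_filter_settings("rack_id", ["rack1", "rack2"]), indent=2))
```

Keeping the attribute name as a parameter means the same helpers cover the rack_id and zone variants.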
Use zone as the awareness attribute, so Elasticsearch will try to spread shards across different zones.

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "zone"
  }
}

Restrict an index so that its shards are only allocated to nodes in zone1 or zone2.

PUT /my_index
{
  "settings": {
    "index.routing.allocation.include.zone": "zone1,zone2"
  }
}

Use both rack and zone for more detailed shard spreading.

PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack,zone"
  }
}

This example sets the cluster to use rack_id for shard allocation awareness. Then it creates an index my_index that only allocates shards to nodes with rack1 or rack2. Finally, it shows where shards are placed.
PUT /_cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id"
  }
}

PUT /my_index
{
  "settings": {
    "index.routing.allocation.include.rack_id": "rack1,rack2",
    "number_of_shards": 2,
    "number_of_replicas": 1
  }
}

GET /_cat/shards/my_index?v

Make sure each Elasticsearch node has the attribute set in its elasticsearch.yml file, for example node.attr.rack_id: rack1.
Shard allocation awareness helps prevent data loss by spreading shards, but it does not guarantee perfect balance if nodes are uneven.
Use the GET /_cat/shards API to check where shards are allocated.
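Reading the shard listing by eye gets tedious on larger clusters, so the check can be scripted. The sketch below is a hedged example, not an official tool: it assumes the rows come from GET /_cat/shards/my_index?format=json (fields index, shard, state, node) and that node_to_rack is a mapping you build yourself from each node's node.attr.rack_id value. The function name copies_sharing_a_rack is hypothetical.

```python
from collections import defaultdict


def copies_sharing_a_rack(shard_rows, node_to_rack):
    """Return {(index, shard): rack} for shards whose started copies all sit on one rack."""
    racks_by_shard = defaultdict(list)
    for row in shard_rows:
        if row.get("state") != "STARTED":
            continue  # skip unassigned or relocating copies, which have no settled node
        # Nodes without a known rack map to None and are grouped together.
        rack = node_to_rack.get(row["node"])
        racks_by_shard[(row["index"], row["shard"])].append(rack)
    return {
        key: racks[0]
        for key, racks in racks_by_shard.items()
        if len(racks) > 1 and len(set(racks)) == 1
    }
```

An empty result means every shard with more than one started copy spans at least two racks. A result whose rack is None points at nodes that are missing the attribute.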
Shard allocation awareness spreads data copies across different physical locations.
Set cluster awareness attributes and index allocation rules to control shard placement.
This improves data safety and availability in case of machine or rack failures.
Practice
Solution
Step 1: Understand shard allocation awareness concept
Shard allocation awareness ensures that shard copies are placed on different physical locations like racks or machines.
Step 2: Identify the benefit of spreading shards
This spreading improves fault tolerance by preventing data loss if one location fails.
Final Answer:
To spread shard copies across different physical locations for better fault tolerance -> Option D
Quick Check:
Shard allocation awareness = spreading shards for fault tolerance [OK]
- Confusing shard allocation awareness with shard count increase
- Thinking it speeds up queries directly
- Assuming it compresses data
elasticsearch.yml file?
Solution
Step 1: Recall the correct setting syntax
The correct setting for awareness attributes is cluster.routing.allocation.awareness.attributes.
Step 2: Match the option with correct syntax
cluster.routing.allocation.awareness.attributes: rack_id matches the exact syntax used in Elasticsearch configuration files.
Final Answer:
cluster.routing.allocation.awareness.attributes: rack_id -> Option A
Quick Check:
Correct config key = cluster.routing.allocation.awareness.attributes [OK]
- Omitting 'cluster.routing' prefix
- Swapping order of words in the key
- Using incomplete or wrong keys
{
  "settings": {
    "index.routing.allocation.include.rack_id": "rack1,rack2"
  }
}
Solution
Step 1: Understand the setting meaning
The setting index.routing.allocation.include.rack_id with the values rack1,rack2 means shards should only go to nodes whose rack_id is one of those values.
Step 2: Apply to given values
Since rack1 and rack2 are included, shards will only be allocated on nodes labeled with rack1 or rack2.
Final Answer:
Shards will only be allocated on nodes with rack_id rack1 or rack2 -> Option C
Quick Check:
Allocation include rack1,rack2 = shards on rack1 or rack2 only [OK]
- Thinking shards can go to any rack
- Confusing include with exclude
- Assuming syntax error due to JSON format
You set cluster.routing.allocation.awareness.attributes: rack_id but shards are still allocated on the same rack. What is the likely cause?
Solution
Step 1: Check cluster awareness prerequisites
For awareness to work, each node must have node.attr.rack_id set to identify its rack.
Step 2: Identify missing node attribute effect
If nodes lack this attribute, Elasticsearch cannot distinguish racks and may place shards on the same rack.
Final Answer:
Nodes do not have the node.attr.rack_id setting defined -> Option B
Quick Check:
Missing node.attr.rack_id = shards not spread by rack [OK]
- Assuming replicas count affects awareness
- Thinking cluster read-only blocks allocation
- Blaming shard size for allocation issues
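A quick way to confirm this diagnosis is to list the nodes that do not report the attribute at all. The sketch below assumes the attribute rows come from GET /_cat/nodeattrs?format=json (fields node, attr, value) and the node names from GET /_cat/nodes?h=name&format=json. The function name nodes_missing_attribute is hypothetical.

```python
def nodes_missing_attribute(node_names, nodeattrs_rows, attribute="rack_id"):
    """Return the sorted names of nodes that do not report the given custom attribute."""
    # Each row describes one attribute of one node, so collect the nodes that have ours.
    tagged = {row["node"] for row in nodeattrs_rows if row.get("attr") == attribute}
    return sorted(set(node_names) - tagged)
```

Any node in the result needs node.attr.rack_id added to its elasticsearch.yml and a restart before awareness can take it into account.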
Solution
Step 1: Identify setting to enforce shard separation
Forced awareness is configured by listing the expected attribute values, for example cluster.routing.allocation.awareness.force.rack_id.values: rack1,rack2. With this set, Elasticsearch leaves replica shards unassigned rather than placing them on the same rack as the primary when the other rack is unavailable.
Step 2: Combine with cluster awareness attribute
Setting cluster.routing.allocation.awareness.attributes: rack_id enables awareness based on the rack_id attribute.
Step 3: Confirm other options do not enforce separation
Simply setting the awareness attribute does not force separation: if only one rack has nodes, all copies can still end up there. Using include settings restricts available racks but does not ensure primary and replica are on different ones.
Final Answer:
Set cluster.routing.allocation.awareness.attributes: rack_id and cluster.routing.allocation.awareness.force.rack_id.values: rack1,rack2 -> Option A
Quick Check:
Forced awareness values + rack_id attribute = shards separated by rack [OK]
- Forgetting to configure forced awareness values
- Only setting awareness attribute without force
- Confusing include with force settings
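The difference between plain and forced awareness can be illustrated with a toy model. This is a deliberately simplified sketch, not Elasticsearch's allocator: it assumes forced awareness is configured by listing the expected attribute values, that copies of one shard are capped per value at ceil(total copies / number of known values), and that every live rack has enough nodes. The function names are hypothetical.

```python
import math


def max_copies_per_value(total_copies, live_values, forced_values=()):
    """Cap on copies of one shard per attribute value in this simplified model."""
    # Forced values count as known racks even when no node there is running.
    known_values = set(live_values) | set(forced_values)
    return math.ceil(total_copies / len(known_values))


def assignable_copies(total_copies, live_values, forced_values=()):
    """How many copies of one shard can be assigned when only live_values have nodes."""
    live = set(live_values)
    if not live:
        return 0  # no nodes at all, so nothing can be assigned
    cap = max_copies_per_value(total_copies, live, forced_values)
    return min(total_copies, cap * len(live))


if __name__ == "__main__":
    # One primary plus one replica, and only rack1 is up.
    print(assignable_copies(2, ["rack1"]))                      # plain awareness
    print(assignable_copies(2, ["rack1"], ["rack1", "rack2"]))  # forced awareness
```

In this model plain awareness lets both copies land on rack1 when it is the only rack available, while forced awareness holds the replica back until a node in rack2 joins.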
