
How to Scale an Elasticsearch Cluster: Simple Steps and Tips

To scale an Elasticsearch cluster, add more data nodes to distribute data and queries, and adjust the number of shards and replicas for better load balancing and fault tolerance. You can scale horizontally by increasing nodes or vertically by upgrading hardware resources.

📐

Syntax

Scaling an Elasticsearch cluster involves configuring nodes and shard settings. Key parts include:

Nodes: Servers that hold data and perform operations.
Shards: Pieces of an index that distribute data.
Replicas: Copies of shards for fault tolerance.
Settings: Node roles and cluster membership are set in each node's elasticsearch.yml; index-level values such as the replica count are changed through the index settings API.

json

PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}

💻

Example

This example shows how to add a new data node to the cluster and update an index to increase replicas for better availability.

yaml + json

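# Add a new node by configuring elasticsearch.yml on the new server
cluster.name: my-cluster
node.name: node-3
# Data-only node. Elasticsearch 7.9+ uses node.roles; older versions
# used node.master: false and node.data: true instead.
node.roles: [ data ]
network.host: 192.168.1.3
# Hypothetical addresses of existing nodes, so the new node can find the cluster
discovery.seed_hosts: ["192.168.1.1", "192.168.1.2"]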

# Update index settings to increase replicas
PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}

Output

{ "acknowledged": true }
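Adding a node matters for replicas because Elasticsearch never places two copies of the same shard on one node. A replica count higher than the number of data nodes minus one therefore leaves some replicas unassigned and the cluster yellow. The Python sketch below works through that arithmetic. The function names are illustrative and not part of any Elasticsearch client, and the three-node cluster is an assumption (node-3 joining two existing data nodes).

```python
def max_assignable_replicas(data_nodes: int) -> int:
    """Highest number_of_replicas that can be fully assigned.

    Each copy of a shard (primary or replica) must live on a different
    data node, so at most data_nodes - 1 replicas can be placed.
    """
    if data_nodes < 1:
        raise ValueError("a cluster needs at least one data node")
    return data_nodes - 1


def total_shard_copies(primaries: int, replicas: int) -> int:
    """Total shards (primaries plus replicas) the cluster hosts for one index."""
    return primaries * (1 + replicas)


if __name__ == "__main__":
    # Assumed cluster: three data nodes once node-3 has joined.
    print(max_assignable_replicas(3))  # 2
    print(total_shard_copies(1, 2))    # 3
```

With three data nodes, number_of_replicas: 2 is the highest value that can be fully assigned; on a two-node cluster the same setting would leave one replica per shard unassigned.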

⚠️

Common Pitfalls

Common mistakes when scaling Elasticsearch include:

Setting too many shards for small data, causing overhead.
Not balancing shards evenly across nodes.
Ignoring hardware limits like CPU, memory, and disk I/O.
Failing to update replica counts after adding nodes.

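After scaling, monitor cluster health and confirm that shards have rebalanced. Elasticsearch relocates shards to new nodes automatically, so manual moves are rarely needed.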

json

## Wrong: Creating too many shards for a small index
PUT /small-index
{
  "settings": {
    "number_of_shards": 50,
    "number_of_replicas": 1
  }
}

## Right: Use fewer shards for small data
PUT /small-index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}
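One rough way to choose a shard count is to divide the expected index size by a target shard size. The sketch below assumes a target of 30 GB per shard, a value inside the 10 to 50 GB range that Elastic's sizing guidance commonly suggests. Treat it as a starting point rather than a rule; the function name and default are illustrative.

```python
import math


def suggest_primary_shards(index_size_gb: float, target_shard_gb: float = 30.0) -> int:
    """Suggest a primary shard count for an index of the given size."""
    if index_size_gb < 0 or target_shard_gb <= 0:
        raise ValueError("index size must be non-negative and target size positive")
    # Always keep at least one primary shard, even for tiny indices.
    return max(1, math.ceil(index_size_gb / target_shard_gb))


if __name__ == "__main__":
    print(suggest_primary_shards(2))    # 1
    print(suggest_primary_shards(300))  # 10
```

A 2 GB index lands on a single shard, which matches the "Right" example above; 50 shards would only make sense for an index in the terabyte range.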

📊

Quick Reference

Add nodes: Configure new servers as data nodes (node.roles: [ data ], or node.data: true on versions before 7.9) and join them to the cluster.
Adjust shards: Set number_of_shards at index creation; it cannot be changed in place afterwards (use the split or shrink APIs, or reindex).
Adjust replicas: Change number_of_replicas anytime for fault tolerance.
Monitor: Use _cluster/health API to check cluster status.
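As a sketch of the monitoring step, the Python below fetches _cluster/health and decides whether the cluster has settled after scaling. The URL http://localhost:9200 is an assumption (an unsecured local node); a secured cluster would need HTTPS and credentials. The fields status, relocating_shards, initializing_shards, and unassigned_shards come from the cluster health response; the helper names are illustrative.

```python
import json
from urllib.request import urlopen


def is_settled(health: dict) -> bool:
    """True when the cluster is green and no shards are moving or unassigned."""
    return (
        health.get("status") == "green"
        and health.get("relocating_shards", 0) == 0
        and health.get("initializing_shards", 0) == 0
        and health.get("unassigned_shards", 0) == 0
    )


def fetch_health(base_url: str = "http://localhost:9200") -> dict:
    """GET _cluster/health from a node assumed to be reachable without auth."""
    with urlopen(f"{base_url}/_cluster/health", timeout=10) as resp:
        return json.load(resp)
```

Calling is_settled(fetch_health()) every few seconds after adding a node gives a simple signal for when relocation has finished.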

✅

Key Takeaways

Scale Elasticsearch by adding data nodes to distribute load horizontally.

Set shard count wisely at index creation; replicas can be changed anytime.

Avoid too many shards for small datasets to reduce overhead.

Monitor cluster health after scaling and confirm that shards have rebalanced.

Upgrade hardware resources for vertical scaling if needed.