What Is a Shard in Elasticsearch: Explanation and Example
A shard is a basic unit of storage that holds a subset of the data in an index. Shards distribute data and search load across multiple nodes, which lets Elasticsearch scale out and keeps searches fast.
How It Works
Think of an Elasticsearch index like a big book. Instead of storing the whole book in one place, Elasticsearch splits it into smaller chapters called shards. Each shard holds part of the data, so the system can handle large amounts of information efficiently.
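To make the "chapters" idea concrete, here is a minimal Python sketch of how a document can be assigned to a shard: hash a routing value (by default the document ID) and take the remainder modulo the number of primary shards. This is a simplified model, not Elasticsearch's code. The function name `pick_shard`, the document IDs, and the CRC32 hash are assumptions for illustration; Elasticsearch uses a Murmur3 hash internally, and its real routing involves extra factors not modeled here.

```python
import zlib

NUM_PRIMARY_SHARDS = 3  # hypothetical shard count for this sketch


def pick_shard(routing_value: str, num_primary_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Map a routing value (by default the document ID) to a shard number."""
    # Stand-in hash: Elasticsearch uses Murmur3 internally, not CRC32.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# Hypothetical document IDs, each assigned to exactly one shard.
doc_ids = [f"product-{n}" for n in range(9)]
placement = {doc_id: pick_shard(doc_id) for doc_id in doc_ids}

for doc_id, shard in placement.items():
    print(f"{doc_id} -> shard {shard}")
```

Because the result depends only on the routing value and the shard count, the same document always maps to the same shard, which is how a lookup by ID can go straight to the right place.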
These shards are spread across different servers (called nodes). A search is sent to all the relevant shards in parallel and the partial results are merged, while each new document is written to a single shard, so indexing load is spread out as well. This is like asking several friends to read different chapters of a book at the same time and then combining their answers.
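The "several friends reading chapters" picture can be sketched as a small scatter-gather search in Python. This is a toy model under stated assumptions: the shard contents, the scores, and the helper names `search_shard` and `search_index` are hypothetical, and threads stand in for separate nodes.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical shard contents: each hit is (score, doc_id).
SHARDS = [
    [(0.9, "doc-a"), (0.4, "doc-d")],
    [(0.7, "doc-b"), (0.2, "doc-e")],
    [(0.8, "doc-c"), (0.1, "doc-f")],
]


def search_shard(hits, size):
    """Return one shard's top `size` hits, best score first."""
    return heapq.nlargest(size, hits)


def search_index(shards, size):
    """Query every shard in parallel, then merge the partial results."""
    # Scatter phase: each shard computes its own top hits.
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        per_shard = list(pool.map(lambda hits: search_shard(hits, size), shards))
    # Gather phase: combine the per-shard results into one global top list.
    merged = [hit for hits in per_shard for hit in hits]
    return heapq.nlargest(size, merged)


top_hits = search_index(SHARDS, size=3)
print(top_hits)  # [(0.9, 'doc-a'), (0.8, 'doc-c'), (0.7, 'doc-b')]
```

Each shard only has to rank its own documents, so the expensive part of the search runs in parallel and the final merge works on a short list.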
Example
This example shows how to create an index with 3 primary shards, each with 1 replica. Primary shards hold the original data, and replicas are copies that take over if a primary is lost and can also serve searches.
PUT /my-index
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
When to Use
Every index has at least one primary shard, so the real choice is how many. Use multiple shards when you need to store and search large amounts of data quickly. Shards let Elasticsearch split the work across many servers, so it can handle more data and more users at the same time.
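For example, if you run a website with millions of products or logs, shards help keep searches fast and reliable. As your data grows, you can add nodes and Elasticsearch rebalances the shards across them, and you can change the number of replicas at any time. The number of primary shards is fixed when the index is created; changing it later requires the split or shrink API, or reindexing into a new index.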
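When planning capacity, it helps to count shard copies rather than just primary shards. The Python sketch below does that arithmetic for 3 primaries with 1 replica each. The function names are hypothetical; the node-count rule relies on the fact that Elasticsearch does not place a replica on the same node as its primary.

```python
def total_shard_copies(primaries: int, replicas: int) -> int:
    """Every primary exists once plus `replicas` copies."""
    return primaries * (1 + replicas)


def min_nodes_for_full_allocation(replicas: int) -> int:
    """Copies of the same shard never share a node, so a shard with
    `replicas` copies needs at least replicas + 1 nodes."""
    return replicas + 1


primaries, replicas = 3, 1
copies = total_shard_copies(primaries, replicas)
nodes_needed = min_nodes_for_full_allocation(replicas)

print(f"{copies} shard copies in total")         # 6 shard copies in total
print(f"at least {nodes_needed} nodes needed")   # at least 2 nodes needed
```

On a single-node cluster the replicas in this setup would stay unassigned, which is why a replica count of 1 only adds safety once a second node is available.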
Key Points
- A shard is a piece of an Elasticsearch index that stores part of the data.
- Shards allow Elasticsearch to scale by distributing data and search load.
- Primary shards hold original data; replica shards are copies that provide failover and extra search capacity.
- You set the number of primary shards when creating an index, balancing performance and resources; the number of replicas can be changed later.
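One reason the shard count is chosen up front: if documents are placed by hashing a routing value modulo the number of primary shards, changing that number changes where existing documents belong. The Python sketch below counts how many hypothetical documents would map to a different shard when going from 3 to 4 shards. It is a simplified model; `pick_shard`, the document IDs, and the CRC32 stand-in hash are assumptions, not Elasticsearch internals.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Simplified routing: hash the routing value, then take the remainder."""
    # Stand-in hash: Elasticsearch uses Murmur3 internally, not CRC32.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


# Hypothetical document IDs.
doc_ids = [f"doc-{n}" for n in range(1000)]

# Count documents whose shard would differ between 3 and 4 primary shards.
moved = sum(1 for doc_id in doc_ids if pick_shard(doc_id, 3) != pick_shard(doc_id, 4))

print(f"{moved} of {len(doc_ids)} documents would land on a different shard")
```

Under this model, documents that map to a different shard would have to be moved, which is why a change in shard count is handled by dedicated operations or a reindex rather than a simple settings update.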