
Shard sizing strategy in Elasticsearch - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the recommended shard size for optimal performance?

According to general Elasticsearch shard sizing guidelines, what is the recommended shard size range for optimal performance?

Options are in gigabytes (GB).

A. 1-5 GB
B. 10-50 GB
C. 100-200 GB
D. 500-1000 GB
Hint

Think about balancing shard size to avoid overhead and ensure fast recovery.

Conceptual
intermediate
Why avoid too many small shards in Elasticsearch?

What is the main reason to avoid having too many small shards in an Elasticsearch cluster?

A. Small shards cause excessive overhead and resource consumption
B. Small shards improve search speed significantly
C. Small shards increase disk space usage
D. Small shards reduce cluster stability
Hint

Think about how Elasticsearch manages shards internally.

Predict Output
advanced
What happens if shard size exceeds recommended limits?

Consider an Elasticsearch index with shards sized around 200 GB each. What is the most likely impact on cluster behavior?

A. No impact on cluster performance
B. Faster indexing and search performance
C. Reduced disk usage and faster backups
D. Slower shard recovery and increased search latency
Hint

Think about how large shards affect recovery and search operations.

Debug
advanced
Identify the shard sizing mistake in this scenario

An Elasticsearch cluster has 100 indices, each with 20 shards sized about 1 GB. The cluster is experiencing high CPU usage and slow queries. What is the likely shard sizing mistake?

A. Too many small shards causing overhead
B. Shards are too large causing slow recovery
C. Insufficient number of shards per index
D. Shard size is optimal; issue is unrelated
Hint

Consider how shard count and size affect resource usage.

Application
expert
Calculate the shard count for a 5 TB index

You have a 5 TB Elasticsearch index. You want to keep shard sizes between 20 GB and 40 GB for optimal performance. How many shards should you create?

Choose the correct shard count range.

A. 500 to 1000 shards
B. 10 to 25 shards
C. 125 to 250 shards
D. 50 to 100 shards
Hint

Divide total index size by shard size range to find shard count.

Practice

1. What is the main reason to choose an appropriate shard size in Elasticsearch?
easy
A. To balance data storage and search performance
B. To increase the number of replicas
C. To reduce the number of indices
D. To avoid using any replicas

Solution

  1. Step 1: Understand shard purpose

    Shards split data to distribute storage and speed up search operations.
  2. Step 2: Connect shard size to performance

    Choosing the right shard size balances storage efficiency and search speed.
  3. Final Answer:

    To balance data storage and search performance -> Option A
  4. Quick Check:

    Shard size affects performance balance = A [OK]
Hint: Shard size balances storage and speed [OK]
Common Mistakes:
  • Thinking replicas control shard size
  • Confusing shard count with replica count
  • Assuming more shards always improve speed
2. Which setting controls the number of primary shards when creating an Elasticsearch index?
easy
A. number_of_shards
B. number_of_replicas
C. shard_size
D. index_refresh_interval

Solution

  1. Step 1: Identify shard count setting

    The setting number_of_shards defines how many primary shards an index has.
  2. Step 2: Differentiate from replicas

    number_of_replicas controls copies, not primary shard count.
  3. Final Answer:

    number_of_shards -> Option A
  4. Quick Check:

    Primary shards = number_of_shards [OK]
Hint: Primary shards set by number_of_shards [OK]
Common Mistakes:
  • Confusing replicas with shards
  • Using shard_size, which is not an index setting
  • Mixing index refresh with shard count
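
As a minimal sketch of where this setting lives, the request body for creating an index can be built as a plain dictionary. The index name `logs-demo` and the values 3 and 1 are illustrative assumptions, not recommendations from this page.

```python
import json

# Hypothetical values chosen for illustration only.
index_name = "logs-demo"  # assumed index name
create_index_body = {
    "settings": {
        "index": {
            "number_of_shards": 3,    # primary shards, set when the index is created
            "number_of_replicas": 1,  # copies of each primary shard
        }
    }
}

# Print the JSON that would be sent as the body of: PUT /logs-demo
print(f"PUT /{index_name}")
print(json.dumps(create_index_body, indent=2))
```

Note that there is no `shard_size` key here: shard size is a result of how much data lands in each primary shard, not something configured directly.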
3. Given an index with 5 primary shards and each shard sized at 20GB, what is the total data size stored in the index?
medium
A. 20GB
B. 100GB
C. 25GB
D. 5GB

Solution

  1. Step 1: Calculate total size from shards

    Total size = number of shards x size per shard = 5 x 20GB = 100GB.
  2. Step 2: Confirm no replicas included

    Replicas add copies but do not affect primary data size calculation here.
  3. Final Answer:

    100GB -> Option B
  4. Quick Check:

    5 shards x 20GB = 100GB [OK]
Hint: Multiply shards by shard size [OK]
Common Mistakes:
  • Adding replica size to primary data size
  • Confusing shard count with replica count
  • Choosing shard size instead of total
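
The multiplication above, and the way replicas change the on-disk footprint without changing the primary data size, can be sketched as follows. The helper name `index_sizes_gb` and the replica count of 1 are assumptions made for illustration; the question itself does not specify replicas.

```python
def index_sizes_gb(primary_shards, shard_size_gb, replicas=0):
    """Return (primary data size, total size including replica copies) in GB."""
    primary_total = primary_shards * shard_size_gb
    # Each replica is a full copy of every primary shard.
    total_with_replicas = primary_total * (1 + replicas)
    return primary_total, total_with_replicas


# 5 primary shards of 20GB each, with one replica assumed for illustration.
primary, on_disk = index_sizes_gb(primary_shards=5, shard_size_gb=20, replicas=1)
print(primary)  # 100
print(on_disk)  # 200
```

The answer to the question is the first value, since it asks only for the data stored in the primary shards.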
4. You set number_of_shards to 1 but your data size grows to 200GB. What is the main problem with this shard sizing?
medium
A. Index refresh interval is too short
B. Too many shards causing overhead
C. Replica count is zero
D. Shard size is too large, causing slower search and indexing

Solution

  1. Step 1: Analyze shard size impact

    One shard holding 200GB is large and can slow down search and indexing.
  2. Step 2: Identify correct problem

    Too few shards for large data causes performance issues, not replica count or refresh interval.
  3. Final Answer:

    Shard size is too large, causing slower search and indexing -> Option D
  4. Quick Check:

    Large shard size = slower performance [OK]
Hint: Avoid very large single shards [OK]
Common Mistakes:
  • Blaming replica count instead of shard size
  • Thinking many shards cause this problem
  • Ignoring shard size impact on speed
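
To see how far a single 200GB shard is from a more typical size, the sketch below computes the smallest number of primary shards that keeps each one under a chosen ceiling. The 50GB ceiling is an assumption taken from the commonly cited 10-50GB guideline, and `min_primary_shards` is a hypothetical helper, not an Elasticsearch API.

```python
import math


def min_primary_shards(total_gb, max_shard_gb):
    """Smallest shard count that keeps every shard at or below max_shard_gb."""
    return max(1, math.ceil(total_gb / max_shard_gb))


# 200GB of data with an assumed 50GB ceiling per shard.
print(min_primary_shards(200, 50))  # 4
```

Because number_of_shards is fixed when an index is created, moving from 1 shard to a higher count generally means creating a new index (or splitting the existing one) rather than editing the setting in place.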
5. You have 500GB of data and want to keep shard sizes between 10GB and 40GB. Which shard count is best to set for your index?
hard
A. 5 shards
B. 10 shards
C. 50 shards
D. 100 shards

Solution

  1. Step 1: Calculate shard count range

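    Minimum shards = 500GB / 40GB = 12.5, rounded up to 13 shards; maximum shards = 500GB / 10GB = 50 shards.
  2. Step 2: Choose shard count within range

    Of the options, only 50 falls within the 13 to 50 range: 5 and 10 shards give shards larger than 40GB, and 100 shards gives 5GB shards, below the 10GB minimum.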
  3. Final Answer:

    50 shards -> Option C
  4. Quick Check:

    500GB ÷ 50 shards = 10GB per shard [OK]
Hint: Divide total data by desired shard size [OK]
Common Mistakes:
  • Choosing too few shards causing large shard size
  • Choosing too many shards causing overhead
  • Ignoring shard size limits
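
The division in the solution can be sketched as a small helper that returns the allowed shard count range and then filters the answer options against it. The function name `shard_count_range` is hypothetical and used only for this illustration.

```python
import math


def shard_count_range(total_gb, min_shard_gb, max_shard_gb):
    """Return (fewest, most) shards that keep shard size within the bounds."""
    fewest = math.ceil(total_gb / max_shard_gb)   # larger shards -> fewer shards
    most = math.floor(total_gb / min_shard_gb)    # smaller shards -> more shards
    return fewest, most


fewest, most = shard_count_range(total_gb=500, min_shard_gb=10, max_shard_gb=40)
options = [5, 10, 50, 100]
valid = [n for n in options if fewest <= n <= most]

print(fewest, most)  # 13 50
print(valid)         # [50]
```

Rounding up for the lower bound and down for the upper bound keeps every shard inside the 10GB to 40GB window rather than slightly outside it.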