Shard sizing strategy in Elasticsearch - Interactive Code Practice

Practice - 5 Tasks
Answer the questions below
1. Fill in the blank
easy

Complete the code to set the number of shards to 5 in the index settings.

Elasticsearch
{
  "settings": {
    "number_of_shards": [1]
  }
}
A. 10
B. 1
C. 3
D. 5
Common Mistakes
Setting number_of_shards to 1 when more shards are needed for scaling.
Using a very high number like 10 without reason.
2. Fill in the blank
medium

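Complete the code to cap the number of this index's shards on a single node at 50 using the index.routing.allocation.total_shards_per_node setting. This setting limits how many shards are allocated per node; it does not control shard size in GB.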

Elasticsearch
{
  "settings": {
    "index.routing.allocation.total_shards_per_node": [1]
  }
}
A. "5"
B. "50"
C. "20"
D. "10"
Common Mistakes
Confusing shard count with shard size.
Setting the value too low, which can leave some shards unassigned.
3. Fill in the blank
hard

Fix the error in the shard size calculation formula to estimate shard size correctly.

Elasticsearch
shard_size_gb = total_index_size_gb [1] number_of_shards
A. *
B. +
C. /
D. -
Common Mistakes
Using multiplication instead of division.
Using addition or subtraction which are incorrect here.
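The formula in this task can be sketched as a small Python helper. This is an illustration only: the function name is hypothetical, and it assumes data is spread evenly across primary shards, which real indices only approximate.

```python
def shard_size_gb(total_index_size_gb: float, number_of_shards: int) -> float:
    """Average primary shard size, assuming an even spread of data."""
    if number_of_shards < 1:
        raise ValueError("number_of_shards must be at least 1")
    # Size per shard is the total divided by the shard count.
    return total_index_size_gb / number_of_shards


print(shard_size_gb(100, 5))  # 20.0
```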
4. Fill in the blank
hard

Fill both blanks to create a dictionary comprehension that maps each shard to the square of its size if the size is less than 30GB.

Elasticsearch
shard_sizes = {shard: size [1] 2 for shard, size in shards.items() if size [2] 30}
A. **
B. >
C. <
D. //
Common Mistakes
Using > instead of < in the condition.
Using multiplication instead of exponentiation.
5. Fill in the blank
hard

Fill both blanks to create a dictionary comprehension that maps uppercase shard names to their sizes if size is greater than 20GB.

Elasticsearch
filtered_shards = {shard[1]: size for shard, size in shards.items() if size [2] 20}
A. .upper()
C. >
D. <
Common Mistakes
Using < instead of > in the condition.
Adding extra characters to size.
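The filtering pattern used in tasks 4 and 5 can be tried on a small sample. The `shards` dictionary below is made up for illustration; its names and sizes (in GB) are assumptions, not values from a real cluster.

```python
# Hypothetical shard sizes in GB, for illustration only.
shards = {"shard_a": 12, "shard_b": 25, "shard_c": 45}

# Keep only shards smaller than 30 GB.
small_shards = {shard: size for shard, size in shards.items() if size < 30}

# Uppercase the names of shards larger than 20 GB.
large_shards = {shard.upper(): size for shard, size in shards.items() if size > 20}

print(small_shards)  # {'shard_a': 12, 'shard_b': 25}
print(large_shards)  # {'SHARD_B': 25, 'SHARD_C': 45}
```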

Practice

1. What is the main reason to choose an appropriate shard size in Elasticsearch?
easy
A. To balance data storage and search performance
B. To increase the number of replicas
C. To reduce the number of indices
D. To avoid using any replicas

Solution

  1. Step 1: Understand shard purpose

    Shards split an index's data across nodes, which distributes storage and lets searches run in parallel.
  2. Step 2: Connect shard size to performance

    Choosing the right shard size balances storage efficiency and search speed.
  3. Final Answer:

    To balance data storage and search performance -> Option A
  4. Quick Check:

    Shard size affects performance balance = A [OK]
Hint: Shard size balances storage and speed [OK]
Common Mistakes:
  • Thinking replicas control shard size
  • Confusing shard count with replica count
  • Assuming more shards always improve speed
2. Which setting controls the number of primary shards when creating an Elasticsearch index?
easy
A. number_of_shards
B. number_of_replicas
C. shard_size
D. index_refresh_interval

Solution

  1. Step 1: Identify shard count setting

    The setting number_of_shards defines how many primary shards an index has.
  2. Step 2: Differentiate from replicas

    number_of_replicas controls copies, not primary shard count.
  3. Final Answer:

    number_of_shards -> Option A
  4. Quick Check:

    Primary shards = number_of_shards [OK]
Hint: Primary shards set by number_of_shards [OK]
Common Mistakes:
  • Confusing replicas with shards
  • Using shard_size, which is not an index setting
  • Mixing index refresh with shard count
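As a rough sketch, the settings body from this question can be built and serialized in Python before being sent with an index-creation request. The helper name `index_settings` is hypothetical, and no Elasticsearch client is used here; the block only produces the JSON body.

```python
import json


def index_settings(number_of_shards: int, number_of_replicas: int = 1) -> dict:
    """Build an index-creation body (hypothetical helper, not a client API)."""
    return {
        "settings": {
            # Primary shard count is fixed when the index is created.
            "number_of_shards": number_of_shards,
            # Replica count controls copies of each primary shard.
            "number_of_replicas": number_of_replicas,
        }
    }


body = index_settings(5)
print(json.dumps(body, indent=2))
```

Keeping the two settings side by side makes it harder to confuse primary shard count with replica count.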
3. Given an index with 5 primary shards and each shard sized at 20GB, what is the total data size stored in the index?
medium
A. 20GB
B. 100GB
C. 25GB
D. 5GB

Solution

  1. Step 1: Calculate total size from shards

    Total size = number of shards x size per shard = 5 x 20GB = 100GB.
  2. Step 2: Confirm no replicas included

    Replicas add copies of each shard, but the question asks only about primary data, so they are not counted here.
  3. Final Answer:

    100GB -> Option B
  4. Quick Check:

    5 shards x 20GB = 100GB [OK]
Hint: Multiply shards by shard size [OK]
Common Mistakes:
  • Adding replica size to primary data size
  • Confusing shard count with replica count
  • Choosing shard size instead of total
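The arithmetic above, and the effect of replicas on disk usage, can be sketched as follows. The function name is hypothetical, and the calculation assumes every primary shard has the same size.

```python
def index_storage_gb(primary_shards: int, shard_size_gb: float, replicas: int = 0):
    """Return (primary data size, total size including replica copies) in GB."""
    primary = primary_shards * shard_size_gb
    # Each replica is a full copy of the primary data.
    total = primary * (1 + replicas)
    return primary, total


print(index_storage_gb(5, 20))              # (100, 100)
print(index_storage_gb(5, 20, replicas=1))  # (100, 200)
```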
4. You set number_of_shards to 1 but your data size grows to 200GB. What is the main problem with this shard sizing?
medium
A. Index refresh interval is too short
B. Too many shards causing overhead
C. Replica count is zero
D. Shard size is too large, causing slower search and indexing

Solution

  1. Step 1: Analyze shard size impact

    One shard holding 200GB is far above the commonly recommended range of roughly 10GB to 50GB per shard, and it can slow down search, indexing, and recovery.
  2. Step 2: Identify correct problem

    Too few shards for a large data set cause the performance issues; replica count and refresh interval are not the problem.
  3. Final Answer:

    Shard size is too large, causing slower search and indexing -> Option D
  4. Quick Check:

    Large shard size = slower performance [OK]
Hint: Avoid very large single shards [OK]
Common Mistakes:
  • Blaming replica count instead of shard size
  • Thinking many shards cause this problem
  • Ignoring shard size impact on speed
5. You have 500GB of data and want to keep shard sizes between 10GB and 40GB. Which shard count is best to set for your index?
hard
A. 5 shards
B. 10 shards
C. 50 shards
D. 100 shards

Solution

  1. Step 1: Calculate shard count range

    Minimum shards = 500GB / 40GB = 12.5, rounded up to 13 shards; maximum shards = 500GB / 10GB = 50 shards.
  2. Step 2: Choose shard count within range

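    Any shard count from 13 to 50 keeps shard size between 10GB and 40GB. Of the options, only 50 falls in that range: 5 and 10 shards give shards larger than 40GB, and 100 shards gives 5GB shards.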
  3. Final Answer:

    50 shards -> Option C
  4. Quick Check:

    500GB ÷ 50 shards = 10GB per shard [OK]
Hint: Divide total data by desired shard size [OK]
Common Mistakes:
  • Choosing too few shards causing large shard size
  • Choosing too many shards causing overhead
  • Ignoring shard size limits
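The range calculation in this solution can be checked with a short Python sketch. The function name is hypothetical, and it assumes data is distributed evenly across shards.

```python
import math


def shard_count_range(total_gb: float, min_shard_gb: float, max_shard_gb: float):
    """Return the (fewest, most) shard counts that keep shards within the size limits."""
    # Fewest shards: each shard at the maximum size, rounded up.
    fewest = math.ceil(total_gb / max_shard_gb)
    # Most shards: each shard at the minimum size, rounded down.
    most = math.floor(total_gb / min_shard_gb)
    return fewest, most


fewest, most = shard_count_range(500, 10, 40)
print(fewest, most)  # 13 50

# Check the answer options from the question against the range.
options = [5, 10, 50, 100]
print([n for n in options if fewest <= n <= most])  # [50]
```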