Sparse Index in DynamoDB: What It Is and How It Works
A sparse index in DynamoDB is a secondary index that contains only the items that have the index key attribute, which makes it smaller and cheaper to query than an index that covers every item. DynamoDB writes an item to a secondary index only when the item has the index's key attribute, so an index becomes sparse whenever that attribute is present on just a subset of the table's items.
How It Works
Imagine you have a big filing cabinet with many folders, but you only want to quickly find folders that have a special sticker on them. A sparse index in DynamoDB works like a smaller cabinet that only holds folders with that sticker. This means you don't have to search through everything, just the important ones.
Technically, a sparse index only contains entries for items in the main table that have the attribute used as the index key. If an item does not have that attribute, it won't appear in the sparse index. This reduces the size of the index and speeds up queries that target those specific items.
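One way to see this behavior is to compare item counts between a table and its index. The sketch below assumes a table named Users with a global secondary index named ActiveUsersIndex (the names used in the Example section) and an AWS CLI that is already configured with credentials. The helper names count_table_items and count_index_items are hypothetical, chosen only for illustration.

```shell
# Assumes: a table with a GSI keyed on an attribute that only some items have,
# and AWS CLI credentials already configured. Function names are illustrative.

# Print the number of items in the base table ($1 = table name).
count_table_items() {
  aws dynamodb scan --table-name "$1" --select COUNT \
    --query 'Count' --output text
}

# Print the number of items in a secondary index ($1 = table, $2 = index).
count_index_items() {
  aws dynamodb scan --table-name "$1" --index-name "$2" --select COUNT \
    --query 'Count' --output text
}
```

Calling count_table_items Users and then count_index_items Users ActiveUsersIndex should report a smaller number for the index whenever some items lack the key attribute. A scan reads every item, so this check is only sensible on small tables, and because global secondary indexes are updated asynchronously the index count can lag briefly after a write.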
Example
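This example creates a Users table with a global secondary index on the Status attribute. Only items that have a Status attribute are indexed, so setting Status only on active users keeps the index limited to them.
# Create the table and the index
aws dynamodb create-table \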
  --table-name Users \
  --attribute-definitions \
    AttributeName=UserId,AttributeType=S \
    AttributeName=Status,AttributeType=S \
  --key-schema AttributeName=UserId,KeyType=HASH \
  --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5 \
  --global-secondary-indexes '[
    {
      "IndexName": "ActiveUsersIndex",
      "KeySchema": [
        {"AttributeName":"Status","KeyType":"HASH"}
      ],
      "Projection": {"ProjectionType":"ALL"},
      "ProvisionedThroughput": {"ReadCapacityUnits":5,"WriteCapacityUnits":5}
    }
  ]'
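# Wait for the table to become active before writing to it
aws dynamodb wait table-exists --table-name Users
# Insert items: Alice has a Status attribute, Bob does not, so only Alice appears in ActiveUsersIndex
aws dynamodb put-item --table-name Users --item '{"UserId": {"S": "1"}, "Name": {"S": "Alice"}, "Status": {"S": "active"}}'
aws dynamodb put-item --table-name Users --item '{"UserId": {"S": "2"}, "Name": {"S": "Bob"}}'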
# Query the sparse index for active users
aws dynamodb query \
  --table-name Users \
  --index-name ActiveUsersIndex \
  --key-condition-expression "#s = :status" \
  --expression-attribute-names '{"#s":"Status"}' \
  --expression-attribute-values '{":status":{"S":"active"}}'
When to Use
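Use a sparse index when you want to efficiently query only a subset of items that share a common attribute. For example, if you have a table of users but only want to quickly find those who are currently active, set the Status attribute only on active users and remove it when a user becomes inactive. The index contains every item that has a Status attribute, whatever its value, so it stays limited to active users only as long as inactive users do not carry the attribute.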
This approach saves storage and speeds up queries because the index excludes all items without the attribute, so a Query or Scan against the index reads far less data than the same operation against the full table.
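Because index membership follows the presence of the attribute, moving an item in or out of the index comes down to setting or removing that attribute. The following is a minimal sketch, assuming the Users table and ActiveUsersIndex from the Example section. The function names activate_user and deactivate_user are hypothetical, and the user ID is interpolated directly into the JSON key, which is only safe for simple IDs. Status is a DynamoDB reserved word, so the expressions refer to it through the placeholder #s.

```shell
# Assumes the Users table and ActiveUsersIndex from the Example section.
# Function names are illustrative; $1 is a simple UserId such as 2.

# Mark a user active: setting Status adds the item to ActiveUsersIndex.
activate_user() {
  aws dynamodb update-item \
    --table-name Users \
    --key "{\"UserId\": {\"S\": \"$1\"}}" \
    --update-expression "SET #s = :active" \
    --expression-attribute-names '{"#s": "Status"}' \
    --expression-attribute-values '{":active": {"S": "active"}}'
}

# Mark a user inactive: removing Status drops the item from ActiveUsersIndex.
deactivate_user() {
  aws dynamodb update-item \
    --table-name Users \
    --key "{\"UserId\": {\"S\": \"$1\"}}" \
    --update-expression "REMOVE #s" \
    --expression-attribute-names '{"#s": "Status"}'
}
```

For instance, activate_user 2 would add Bob to the index, and deactivate_user 2 would take him out again while leaving his item in the table. Writing a value such as 'inactive' instead of removing the attribute would keep the item in the index under a different partition key value, which defeats the purpose of keeping the index sparse.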
Key Points
- A sparse index only includes items with the indexed attribute present.
- It reduces index size and improves query performance for selective data.
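- Useful for isolating a subset of items marked by an attribute such as a status or category flag; membership depends on the attribute being present, not on its value.
- Applies to any secondary index in DynamoDB, and is most often used with global secondary indexes (GSIs).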