GSI key selection strategy in DynamoDB - Time & Space Complexity
When using Global Secondary Indexes (GSIs) in DynamoDB, the way we pick keys affects how fast queries run.
We want to know how the choice of keys changes the work DynamoDB does as data grows.
Analyze the time complexity of querying a GSI with a chosen partition key and sort key.
const params = {
TableName: "Orders",
IndexName: "CustomerDateIndex",
KeyConditionExpression: "CustomerId = :cid AND OrderDate >= :start",
ExpressionAttributeValues: {
":cid": "12345",
":start": "2023-01-01"
}
};
const result = await dynamodb.query(params).promise();
This code queries a GSI named CustomerDateIndex using CustomerId as partition key and OrderDate as sort key.
Look for repeated work done during the query.
- Primary operation: Scanning items with the same partition key in the GSI.
- How many times: Once per matching partition key group; then filtering by sort key range.
As more items share the same partition key, the query checks more items in that group.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 items with same key | 10 checks |
| 100 items with same key | 100 checks |
| 1000 items with same key | 1000 checks |
Pattern observation: The work grows linearly with the number of items sharing the partition key.
Time Complexity: O(n)
This means the query time grows directly with how many items share the same partition key in the GSI.
[X] Wrong: "Choosing any partition key for the GSI will always give fast queries regardless of data distribution."
[OK] Correct: If many items share the same partition key, the query must check all those items, making it slower as data grows.
Understanding how GSI key choices affect query speed shows you can design databases that stay fast even as data grows.
What if we changed the GSI partition key to a more unique attribute? How would the time complexity change?