
GSI key selection strategy in DynamoDB - Step-by-Step Execution

Concept Flow - GSI key selection strategy
Identify access patterns
Choose GSI partition key
Choose GSI sort key (optional)
Check key distribution and uniqueness (GSI keys need not be unique)
Validate query efficiency
Create GSI with chosen keys
This flow shows how to select keys for a Global Secondary Index (GSI) by analyzing access patterns, choosing partition and sort keys, and validating their effectiveness.
Execution Sample
DynamoDB
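{
  "TableName": "Orders",
  "AttributeDefinitions": [
    { "AttributeName": "user_id", "AttributeType": "S" },
    { "AttributeName": "order_date", "AttributeType": "S" }
  ],
  "GlobalSecondaryIndexUpdates": [{
    "Create": {
      "IndexName": "GSI1",
      "KeySchema": [
        {
          "AttributeName": "user_id",
          "KeyType": "HASH"
        },
        {
          "AttributeName": "order_date",
          "KeyType": "RANGE"
        }
      ],
      "Projection": {
        "ProjectionType": "ALL"
      }
    }
  }]
}
This JSON payload for the DynamoDB UpdateTable API creates a GSI named GSI1 with user_id as the partition key and order_date as the sort key, to support queries by user and date. UpdateTable requires AttributeDefinitions for the key attributes of a new index; the string type ("S") used here is an assumption about how the Orders table stores these attributes. If the table uses provisioned capacity, the Create block also needs a ProvisionedThroughput setting.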
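Once the index exists, a query by user and date range could look roughly like the sketch below. It only builds the request parameters in the low-level client format; the function name build_gsi_query, the sample values, and the assumption that both attributes are strings holding ISO 8601 dates are illustrative and not part of the original example.

```python
def build_gsi_query(user_id, start_date, end_date):
    """Build Query parameters that target GSI1 instead of the base table."""
    return {
        "TableName": "Orders",
        "IndexName": "GSI1",
        # Equality on the partition key, range condition on the sort key.
        "KeyConditionExpression": (
            "user_id = :uid AND order_date BETWEEN :start AND :end"
        ),
        # Assumption: both key attributes are stored as strings ("S").
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":start": {"S": start_date},
            ":end": {"S": end_date},
        },
    }


params = build_gsi_query("u-123", "2024-01-01", "2024-01-31")
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```

ISO 8601 date strings sort lexicographically in date order, which is why a BETWEEN condition on a string sort key behaves like a date range here.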
Execution Table
Step | Action | Evaluation | Result
1 | Identify access pattern: query orders by user | Access pattern found | Proceed to key selection
2 | Choose partition key: user_id | user_id has high cardinality | Good distribution expected
3 | Choose sort key: order_date | order_date allows sorting by date | Enables range queries
4 | Check uniqueness | user_id + order_date combination mostly unique | Few items per key value (duplicates are allowed in a GSI)
5 | Validate query efficiency | Queries use GSI keys | Efficient query execution
6 | Create GSI with keys | GSI created successfully | Ready for queries
💡 GSI created after validating keys for efficient queries and good data distribution
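Step 5 of the table can be approximated before the index is created. The sketch below checks whether an access pattern can be expressed purely as key conditions on a given key schema, which is what lets a Query avoid filtering or scanning. The function served_by_index and the "status" attribute are hypothetical names used only for illustration.

```python
def served_by_index(key_schema, equality_attr, range_attr=None):
    """Return True if the pattern maps onto the index keys alone.

    A Query needs an equality condition on the partition key; any further
    key condition may only target the sort key.
    """
    hash_key = next(
        k["AttributeName"] for k in key_schema if k["KeyType"] == "HASH"
    )
    range_key = next(
        (k["AttributeName"] for k in key_schema if k["KeyType"] == "RANGE"),
        None,
    )
    if equality_attr != hash_key:
        return False
    return range_attr is None or range_attr == range_key


gsi1_schema = [
    {"AttributeName": "user_id", "KeyType": "HASH"},
    {"AttributeName": "order_date", "KeyType": "RANGE"},
]

by_user = served_by_index(gsi1_schema, "user_id")
by_user_and_date = served_by_index(gsi1_schema, "user_id", "order_date")
by_date_only = served_by_index(gsi1_schema, "order_date")
by_user_and_status = served_by_index(gsi1_schema, "user_id", "status")
```

Under these assumptions the first two patterns fit GSI1, while a lookup by date alone, or by user plus a non-key attribute, would need a different index or a filter expression.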
Variable Tracker
Variable | Start | After Step 2 | After Step 3 | After Step 4 | Final
partition_key | none | user_id | user_id | user_id | user_id
sort_key | none | none | order_date | order_date | order_date
key_uniqueness | unknown | unknown | unknown | mostly unique | mostly unique
query_efficiency | unknown | unknown | unknown | unknown | efficient
Key Moments - 3 Insights
Why do we choose a partition key with high cardinality?
Choosing a partition key with many unique values (high cardinality) helps distribute data evenly across partitions, avoiding hot spots. See execution_table step 2 where user_id is chosen for this reason.
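One way to compare candidate partition keys is to measure, on a sample of items, how many distinct values each candidate has and what share of items falls under its most common value. This is a rough sketch on made-up data; the helper partition_key_stats and the "status" attribute are hypothetical.

```python
from collections import Counter


def partition_key_stats(items, key_name):
    """Summarise how evenly a candidate partition key spreads the items."""
    counts = Counter(item[key_name] for item in items)
    total = sum(counts.values())
    top_key, top_count = counts.most_common(1)[0]
    return {
        "distinct_values": len(counts),
        "largest_key": top_key,
        # Share of all items that land under the single most common value.
        "largest_share": top_count / total,
    }


sample_orders = [
    {"user_id": "u1", "status": "SHIPPED"},
    {"user_id": "u2", "status": "SHIPPED"},
    {"user_id": "u3", "status": "SHIPPED"},
    {"user_id": "u4", "status": "PENDING"},
]

stats_by_user = partition_key_stats(sample_orders, "user_id")
stats_by_status = partition_key_stats(sample_orders, "status")
```

In this sample, user_id has four distinct values with at most a quarter of the items under any one of them, while status has two values and concentrates three quarters of the items under "SHIPPED", which is the kind of skew that tends to produce hot partitions.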
Why is a sort key optional in a GSI?
A sort key is optional because it is needed only when you want to sort or range-filter items that share the same partition key value; a GSI with just a partition key still supports equality lookups. Step 3 shows adding order_date as sort key to enable sorting by date.
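The optional nature of the sort key can be seen in how the UpdateTable payload is assembled. The sketch below builds the payload with or without a RANGE key; build_gsi_create is a hypothetical helper, the index name GSI2 is made up, and the string attribute type is an assumption. A table using provisioned capacity would also need a ProvisionedThroughput entry in the Create block.

```python
def build_gsi_create(index_name, partition_key, sort_key=None, attr_type="S"):
    """Build an UpdateTable payload that creates a GSI on the Orders table."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    # UpdateTable expects definitions for every key attribute of the new index.
    attribute_definitions = [
        {"AttributeName": partition_key, "AttributeType": attr_type}
    ]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append(
            {"AttributeName": sort_key, "AttributeType": attr_type}
        )
    return {
        "TableName": "Orders",
        "AttributeDefinitions": attribute_definitions,
        "GlobalSecondaryIndexUpdates": [
            {
                "Create": {
                    "IndexName": index_name,
                    "KeySchema": key_schema,
                    "Projection": {"ProjectionType": "ALL"},
                }
            }
        ],
    }


with_sort = build_gsi_create("GSI1", "user_id", "order_date")
without_sort = build_gsi_create("GSI2", "user_id")
```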
What happens if the key combination is not unique?
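Nothing fails. Unlike the base table's primary key, a GSI key does not have to be unique: items that share the same partition and sort key values are all stored in the index, and a query on that key returns every one of them. Step 4 checks uniqueness only to estimate how many items each key value will hold, which affects data distribution and the size of query results.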
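To see how unique a candidate key pair is on sample data, items can be grouped by their index key. Note that DynamoDB does not require GSI key values to be unique, and items missing an index key attribute are simply not written to the index. The sketch below is an illustration on made-up data; group_by_index_key and the order_id attribute (standing in for the base table key) are assumptions.

```python
from collections import defaultdict


def group_by_index_key(items, partition_key, sort_key):
    """Group order ids by the (partition key, sort key) pair of the index."""
    groups = defaultdict(list)
    for item in items:
        # Items lacking a key attribute would not appear in the GSI at all.
        if partition_key in item and sort_key in item:
            groups[(item[partition_key], item[sort_key])].append(item["order_id"])
    return dict(groups)


orders = [
    {"order_id": "o1", "user_id": "u1", "order_date": "2024-01-05"},
    {"order_id": "o2", "user_id": "u1", "order_date": "2024-01-05"},
    {"order_id": "o3", "user_id": "u2", "order_date": "2024-01-06"},
    {"order_id": "o4", "user_id": "u3"},  # no order_date: not indexed
]

groups = group_by_index_key(orders, "user_id", "order_date")
# Key pairs that more than one item shares.
shared_keys = {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Here orders o1 and o2 share the same index key, so a query for that user and date would return both.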
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table, what key is chosen as the partition key at step 2?
A. user_id
B. order_date
C. GSI1
D. Table name
💡 Hint
Refer to execution_table row with Step 2 under 'Action' and 'Result'
At which step is the sort key selected?
A. Step 1
B. Step 2
C. Step 3
D. Step 5
💡 Hint
Check execution_table row where 'Choose sort key' is the action
If the partition key had low cardinality, what would likely happen?
A. Data would be evenly distributed
B. Hot partitions could form, causing performance issues
C. Queries would be faster
D. GSI creation would fail
💡 Hint
See key_moments explanation about partition key cardinality and execution_table step 2
Concept Snapshot
GSI Key Selection Strategy:
1. Identify query access patterns.
2. Choose a partition key with high cardinality for even data spread.
3. Optionally choose a sort key for sorting/filtering.
4. Check how unique the key combination is; GSI keys may repeat, but heavy repetition concentrates many items under one key.
5. Validate query efficiency before creating the GSI.
Full Transcript
To select keys for a DynamoDB Global Secondary Index (GSI), first identify how you want to query your data. Then pick a partition key that has many unique values to spread data evenly. Optionally, pick a sort key to sort or range-filter items within the partition. Check how unique the combination of partition and sort keys is: a GSI accepts duplicate key values, so this check is about keeping the number of items per key manageable, not about avoiding write failures. Finally, validate that your queries will efficiently use the GSI before creating it.