How to Design Schema in DynamoDB: Best Practices and Examples
To design a schema in DynamoDB, start by choosing a Partition Key that distributes your data evenly and, optionally, a Sort Key to organize related items. Use Secondary Indexes for additional query patterns, and model your data around your application's access patterns rather than a traditional relational design.
Syntax
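A DynamoDB table schema mainly consists of a Partition Key and optionally a Sort Key. You define these keys when creating the table, and they cannot be changed afterwards. Secondary Indexes support different query patterns: Global Secondary Indexes can be added at any time, while Local Secondary Indexes must be defined when the table is created.
- Partition Key: The attribute DynamoDB hashes to decide where an item is stored. It is unique per item only when the table has no sort key.
- Sort Key: Optional; orders items that share a partition key. The combination of partition key and sort key must be unique.
- Global Secondary Index (GSI): Allows queries by a different partition key (and optional sort key) across all partitions.
- Local Secondary Index (LSI): Allows queries by an alternate sort key within the same partition key.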
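```sql
-- Illustrative pseudo-SQL only: DynamoDB has no CREATE TABLE statement.
-- Real tables are created through the API, CLI, or console.
CREATE TABLE ExampleTable (
  UserId STRING,      -- Partition Key
  Timestamp NUMBER,   -- Sort Key
  Data STRING,
  PRIMARY KEY (UserId, Timestamp)
);

-- Adding a Global Secondary Index
CREATE GLOBAL SECONDARY INDEX GSI1 ON ExampleTable (Data, Timestamp);
```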
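Since the notation above is not something DynamoDB can execute, it may help to see roughly how the same keys and index could be expressed as parameters for boto3's `create_table`. This is a minimal sketch: the helper name `build_table_definition` is hypothetical, and on-demand billing (`PAY_PER_REQUEST`) is an assumption chosen so that no throughput settings are needed.

```python
def build_table_definition(table_name: str) -> dict:
    """Return keyword arguments for the DynamoDB client's create_table()."""
    return {
        "TableName": table_name,
        # Assumption: on-demand billing, so no ProvisionedThroughput is required.
        "BillingMode": "PAY_PER_REQUEST",
        # Only attributes used in a key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "UserId", "AttributeType": "S"},
            {"AttributeName": "Timestamp", "AttributeType": "N"},
            {"AttributeName": "Data", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},      # Partition key
            {"AttributeName": "Timestamp", "KeyType": "RANGE"},  # Sort key
        ],
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "GSI1",
                "KeySchema": [
                    {"AttributeName": "Data", "KeyType": "HASH"},
                    {"AttributeName": "Timestamp", "KeyType": "RANGE"},
                ],
                # Copy every attribute into the index (simplest, not cheapest).
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
    }


definition = build_table_definition("ExampleTable")

# To create the table for real (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("dynamodb").create_table(**definition)
```

Building the parameters as a plain dictionary keeps the schema easy to inspect and review before any AWS call is made.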
Example
This example shows how to design a DynamoDB table for a user activity log where UserId is the partition key and Timestamp is the sort key. This schema allows efficient queries for all activities of a user sorted by time.
```python
import boto3

# Create DynamoDB client
client = boto3.client('dynamodb')

# Create table with partition and sort key
response = client.create_table(
    TableName='UserActivity',
    KeySchema=[
        {'AttributeName': 'UserId', 'KeyType': 'HASH'},      # Partition key
        {'AttributeName': 'Timestamp', 'KeyType': 'RANGE'},  # Sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'UserId', 'AttributeType': 'S'},
        {'AttributeName': 'Timestamp', 'AttributeType': 'N'},
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5,
    },
)

print('Table status:', response['TableDescription']['TableStatus'])
```
Output
Table status: CREATING
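To read a user's activities back sorted by time, the `UserActivity` table could be queried roughly as follows. This is a sketch under assumptions: the helper `build_activity_query`, the sample user id, and the epoch value are all hypothetical. `Timestamp` is aliased as `#ts` because it collides with a DynamoDB reserved word.

```python
def build_activity_query(user_id: str, since_epoch: int, limit: int = 20) -> dict:
    """Return keyword arguments for the DynamoDB client's query()."""
    return {
        "TableName": "UserActivity",
        # Equality on the partition key, range condition on the sort key.
        "KeyConditionExpression": "UserId = :uid AND #ts >= :since",
        "ExpressionAttributeNames": {"#ts": "Timestamp"},
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            # The low-level client sends numbers as strings under "N".
            ":since": {"N": str(since_epoch)},
        },
        "ScanIndexForward": False,  # Descending sort key order: newest first
        "Limit": limit,
    }


# Hypothetical user id and start time, for illustration only.
params = build_activity_query("user-123", 1700000000)

# To run the query (needs boto3 and AWS credentials):
#   import boto3
#   items = boto3.client("dynamodb").query(**params)["Items"]
```

Because the condition uses only the partition key and sort key, DynamoDB can answer it with a Query on a single partition instead of a Scan over the whole table.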
Common Pitfalls
Common mistakes when designing DynamoDB schemas include:
- Choosing a partition key that distributes data or traffic unevenly, which creates hot partitions.
- Applying relational database thinking, such as relying on joins (which DynamoDB does not support) or on heavy multi-item transactions.
- Not planning for query patterns upfront, leading to inefficient scans.
- Overusing secondary indexes, which can increase costs and complexity.
Correct design focuses on access patterns and data distribution.
```sql
-- Illustrative pseudo-SQL only: DynamoDB has no CREATE TABLE statement.

/* Wrong: Using a timestamp as partition key causes hot partitions */
CREATE TABLE Logs (
  Timestamp NUMBER,   -- Partition Key (bad choice)
  LogId STRING,       -- Sort Key
  Message STRING,
  PRIMARY KEY (Timestamp, LogId)
);

/* Right: Use a user or category as partition key to spread load
   (this helps only if there are many distinct values) */
CREATE TABLE Logs (
  Category STRING,    -- Partition Key (better choice)
  Timestamp NUMBER,   -- Sort Key
  LogId STRING,
  Message STRING,
  PRIMARY KEY (Category, Timestamp)
);
```
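The effect of the partition key choice can be estimated before any table exists by counting how writes would spread across key values. The sketch below uses synthetic data (1,000 hypothetical log writes from 100 users within a two-second window) and treats the share of writes hitting the busiest key value as a rough proxy for partition heat. It does not model DynamoDB's actual internal partitioning.

```python
from collections import Counter


def hottest_share(items: list, key_attr: str) -> float:
    """Fraction of all items that land on the single busiest key value."""
    counts = Counter(item[key_attr] for item in items)
    return max(counts.values()) / len(items)


# Synthetic workload: 100 users writing 1,000 events across two seconds.
events = [
    {"UserId": f"user-{i % 100}", "Timestamp": 1700000000 + i // 500}
    for i in range(1000)
]

timestamp_share = hottest_share(events, "Timestamp")  # 0.5: half the writes hit one key
user_share = hottest_share(events, "UserId")          # 0.01: writes spread over 100 keys

print(f"Timestamp as partition key: busiest key takes {timestamp_share:.0%}")
print(f"UserId as partition key:    busiest key takes {user_share:.0%}")
```

Running the same check against a sample of real traffic gives a quick signal on whether a candidate key has enough distinct, evenly used values.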
Quick Reference
| Concept | Description | Best Practice |
|---|---|---|
| Partition Key | Attribute hashed to distribute data across partitions | Choose a high-cardinality attribute |
| Sort Key | Organizes items within partition | Use for sorting or grouping related items |
| Global Secondary Index | Alternate query key across partitions | Use for different query patterns |
| Local Secondary Index | Alternate sort key within partition | Use for sorting/filtering in same partition |
| Data Modeling | Design based on access patterns | Denormalize and avoid joins |
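As an illustration of the "denormalize and avoid joins" row, one option is to copy the user fields a read needs onto each activity item at write time, so a single query returns everything. This is only a sketch: the helper `to_activity_item` and the attribute names `Action` and `UserDisplayName` are hypothetical and not part of the schema shown earlier.

```python
def to_activity_item(user: dict, activity: dict) -> dict:
    """Build a DynamoDB item (low-level format) with user fields embedded."""
    return {
        "UserId": {"S": user["user_id"]},                 # Partition key
        "Timestamp": {"N": str(activity["timestamp"])},   # Sort key
        "Action": {"S": activity["action"]},
        # Duplicated on purpose so reads need no second lookup.
        "UserDisplayName": {"S": user["display_name"]},
    }


# Hypothetical sample data, for illustration only.
user = {"user_id": "user-123", "display_name": "Ada"}
activity = {"timestamp": 1700000000, "action": "login"}
item = to_activity_item(user, activity)

# To store the item (needs boto3 and AWS credentials):
#   import boto3
#   boto3.client("dynamodb").put_item(TableName="UserActivity", Item=item)
```

The trade-off is that duplicated fields have to be updated on every copy when the source value changes, so this tends to suit attributes that are read often and changed rarely.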
Key Takeaways
Design your DynamoDB schema based on how your application queries data, not on relational models.
Choose a partition key that evenly distributes data to avoid hot partitions.
Use sort keys to organize related items and enable efficient range queries.
Add secondary indexes only when you need additional query flexibility.
Plan your access patterns early to optimize performance and cost.