
How to Design Schema in DynamoDB: Best Practices and Examples

To design a schema in DynamoDB, start by choosing a Partition Key that evenly distributes your data and optionally a Sort Key to organize related items. Use Secondary Indexes for additional query patterns and model your data to fit your application's access needs rather than traditional relational designs.

Syntax

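A DynamoDB table schema consists of a primary key: a Partition Key and, optionally, a Sort Key. You define these keys when you create the table and cannot change them afterward; other attributes need no upfront definition. Secondary Indexes (Global or Local) support additional query patterns.

  • Partition Key: Hashed to decide which partition stores an item. It must be unique only when the table has no sort key.
  • Sort Key: Optional. Orders items that share a partition key; the partition key and sort key together must be unique.
  • Global Secondary Index (GSI): Defines an alternate partition key (and optional sort key) so you can query across all partitions. Can be added after table creation.
  • Local Secondary Index (LSI): Keeps the table's partition key but defines an alternate sort key. Must be defined when the table is created.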
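sql
-- Pseudocode only: DynamoDB has no SQL DDL; tables are created through the API, CLI, or console
CREATE TABLE ExampleTable (
  UserId STRING,          -- Partition Key
  Timestamp NUMBER,       -- Sort Key
  Data STRING,
  PRIMARY KEY (UserId, Timestamp)
);

-- Adding a Global Secondary Index (partition key: Data, sort key: Timestamp)
CREATE GLOBAL SECONDARY INDEX GSI1 ON ExampleTable (Data, Timestamp);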
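
In practice, the keys and indexes described in this section are passed as parameters to the CreateTable API. The following is a minimal sketch of how those parameters might be assembled for boto3's `create_table`. The helper name `build_table_spec`, the LSI name `LSI1`, the projection types, and the on-demand billing mode are assumptions for illustration only.

```python
def build_table_spec(table_name="ExampleTable"):
    """Return keyword arguments for client.create_table (hypothetical helper)."""
    return {
        "TableName": table_name,
        # Primary key: partition key plus sort key
        "KeySchema": [
            {"AttributeName": "UserId", "KeyType": "HASH"},
            {"AttributeName": "Timestamp", "KeyType": "RANGE"},
        ],
        # Only attributes used as table or index keys are declared here
        "AttributeDefinitions": [
            {"AttributeName": "UserId", "AttributeType": "S"},
            {"AttributeName": "Timestamp", "AttributeType": "N"},
            {"AttributeName": "Data", "AttributeType": "S"},
        ],
        # GSI: alternate partition key (Data), queried across all partitions
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "GSI1",
                "KeySchema": [
                    {"AttributeName": "Data", "KeyType": "HASH"},
                    {"AttributeName": "Timestamp", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # LSI: same partition key as the table, alternate sort key (Data)
        "LocalSecondaryIndexes": [
            {
                "IndexName": "LSI1",  # assumed name
                "KeySchema": [
                    {"AttributeName": "UserId", "KeyType": "HASH"},
                    {"AttributeName": "Data", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        # Assumption: on-demand billing, so no per-index throughput settings
        "BillingMode": "PAY_PER_REQUEST",
    }


spec = build_table_spec()
# With AWS credentials configured, this could be passed on as:
# boto3.client("dynamodb").create_table(**spec)
```

Building the parameters separately from the API call keeps the schema easy to review and to check without touching AWS.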

Example

This example shows how to design a DynamoDB table for a user activity log where UserId is the partition key and Timestamp is the sort key. This schema allows efficient queries for all activities of a user sorted by time.

python
import boto3

# Create DynamoDB client
client = boto3.client('dynamodb')

# Create table with partition and sort key
response = client.create_table(
    TableName='UserActivity',
    KeySchema=[
        {'AttributeName': 'UserId', 'KeyType': 'HASH'},  # Partition key
        {'AttributeName': 'Timestamp', 'KeyType': 'RANGE'}  # Sort key
    ],
    AttributeDefinitions=[
        {'AttributeName': 'UserId', 'AttributeType': 'S'},
        {'AttributeName': 'Timestamp', 'AttributeType': 'N'}
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

print('Table status:', response['TableDescription']['TableStatus'])
Output
Table status: CREATING
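
Once the table is active, the access pattern this schema targets (one user's activities ordered by time) maps to a single `Query` call. Below is a rough sketch of the request parameters for boto3's `client.query`; the helper `build_activity_query`, the sample user ID, and the `limit` default are hypothetical. Placeholders such as `#ts` are used because `Timestamp` is a DynamoDB reserved word.

```python
def build_activity_query(user_id, since_ts, limit=20):
    """Build kwargs for client.query: one user's activity since `since_ts`, newest first."""
    return {
        "TableName": "UserActivity",
        # Equality on the partition key, range condition on the sort key
        "KeyConditionExpression": "#uid = :uid AND #ts >= :since",
        "ExpressionAttributeNames": {"#uid": "UserId", "#ts": "Timestamp"},
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":since": {"N": str(since_ts)},  # the low-level client sends numbers as strings
        },
        "ScanIndexForward": False,  # descending sort key order (newest first)
        "Limit": limit,
    }


params = build_activity_query("user-123", 1700000000)
# With AWS credentials configured and the table active:
# items = boto3.client("dynamodb").query(**params)["Items"]
```

Because the condition names the partition key, DynamoDB reads only that user's items instead of scanning the whole table.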

Common Pitfalls

Common mistakes when designing DynamoDB schemas include:

  • Choosing a partition key with low cardinality or uneven access, which creates hot partitions.
  • Applying relational database thinking, such as normalizing data into many tables and expecting joins.
  • Not planning query patterns upfront, which leads to inefficient scans.
  • Overusing secondary indexes, which increases cost and complexity.

Correct design focuses on access patterns and data distribution.

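sql
/* Pseudocode only: DynamoDB has no SQL DDL */

/* Wrong: with a timestamp partition key, writes made at the same time
   share one key value and concentrate on one partition */
CREATE TABLE Logs (
  Timestamp NUMBER,       -- Partition Key (bad choice)
  LogId STRING,           -- Sort Key
  Message STRING,
  PRIMARY KEY (Timestamp, LogId)
);

/* Right: partition by user or category to spread load, sort by time.
   This helps only if there are many distinct values with similar traffic.
   Note: two items with the same Category and Timestamp would overwrite each other. */
CREATE TABLE Logs (
  Category STRING,        -- Partition Key (better choice)
  Timestamp NUMBER,       -- Sort Key
  LogId STRING,
  Message STRING,
  PRIMARY KEY (Category, Timestamp)
);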
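
If a natural partition key such as a category has only a few distinct values, one commonly described mitigation is write sharding: appending a computed suffix so that a single category spans several partition key values. The sketch below is illustrative only; the shard count, the `#` separator, and both helper names are assumptions rather than part of the schema above.

```python
import hashlib

SHARD_COUNT = 10  # assumed number of shards per category


def sharded_partition_key(category, log_id, shard_count=SHARD_COUNT):
    """Append a deterministic shard suffix derived from the log ID."""
    digest = hashlib.sha256(log_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{category}#{shard}"


def all_shard_keys(category, shard_count=SHARD_COUNT):
    """List every partition key value a reader must query to cover one category."""
    return [f"{category}#{n}" for n in range(shard_count)]


key = sharded_partition_key("auth", "log-0001")
assert key in all_shard_keys("auth")
```

The trade-off is on the read side: fetching everything for a category now takes one query per shard, with the results merged in the application.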

Quick Reference

| Concept | Description | Best Practice |
|---|---|---|
| Partition Key | Key hashed to distribute data across partitions | Choose a high-cardinality attribute |
| Sort Key | Organizes items within a partition | Use for sorting or grouping related items |
| Global Secondary Index | Alternate query key across partitions | Use for different query patterns |
| Local Secondary Index | Alternate sort key within a partition | Use for sorting/filtering in the same partition |
| Data Modeling | Design based on access patterns | Denormalize and avoid joins |

Key Takeaways

  • Design your DynamoDB schema based on how your application queries data, not on relational models.
  • Choose a partition key that evenly distributes data to avoid hot partitions.
  • Use sort keys to organize related items and enable efficient range queries.
  • Add secondary indexes only when you need additional query flexibility.
  • Plan your access patterns early to optimize performance and cost.