
How to Choose a Partition Key in DynamoDB: Best Practices

Choose a partition key in DynamoDB that evenly distributes your data across partitions to avoid hotspots. It should have high cardinality (many unique values) and be frequently used in queries to optimize performance.
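
To see why cardinality matters, the following minimal sketch simulates hash-based placement of items. DynamoDB's internal hash function is not public, so MD5 from Node's crypto module stands in for it here. The partition count of 4 and the helper names bucketFor and simulateDistribution are assumptions made purely for illustration.

```javascript
const crypto = require('crypto');

// Map a key value to one of `partitionCount` buckets.
// Assumption: MD5 is only a stand-in; DynamoDB's real hash function is internal.
function bucketFor(keyValue, partitionCount) {
  const digest = crypto.createHash('md5').update(String(keyValue)).digest();
  return digest.readUInt32BE(0) % partitionCount;
}

// Count how many items land in each bucket for a list of key values.
function simulateDistribution(keyValues, partitionCount = 4) {
  const counts = new Array(partitionCount).fill(0);
  for (const value of keyValues) {
    counts[bucketFor(value, partitionCount)] += 1;
  }
  return counts;
}

// High cardinality: 10,000 distinct user IDs.
const userIds = Array.from({ length: 10000 }, (_, i) => `user-${i}`);
// Low cardinality: 10,000 items keyed by one of only two status values.
const statuses = Array.from({ length: 10000 }, (_, i) =>
  i % 2 === 0 ? 'active' : 'inactive'
);

console.log('UserId buckets:', simulateDistribution(userIds));
console.log('Status buckets:', simulateDistribution(statuses));
```

With many distinct keys the counts come out roughly equal. Two status values can occupy at most two buckets, no matter how many partitions exist.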

📐

Syntax

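The partition key is part of every primary key in DynamoDB. Used alone (a simple primary key), it must be unique for each item. Combined with a sort key (a composite primary key), several items can share the same partition key value, and the pair of partition key and sort key identifies each item.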

A partition key is defined in two places when creating a table:

KeySchema: Names the attribute and marks it as the partition key with KeyType HASH.
AttributeDefinitions: Declares the attribute's data type, which for key attributes must be S (String), N (Number), or B (Binary).

json

{
  "TableName": "ExampleTable",
  "KeySchema": [
    { "AttributeName": "UserId", "KeyType": "HASH" }  
  ],
  "AttributeDefinitions": [
    { "AttributeName": "UserId", "AttributeType": "S" }
  ],
  "ProvisionedThroughput": {
    "ReadCapacityUnits": 5,
    "WriteCapacityUnits": 5
  }
}
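
When a sort key is added, the same structure gains a second KeySchema entry with KeyType RANGE. The sketch below builds such a definition as a plain object without calling AWS. The Orders table and the UserId and OrderDate attributes are hypothetical names, and buildTableParams is a helper invented for this illustration, not part of the AWS SDK.

```javascript
// Key attributes may only be String, Number, or Binary.
const KEY_ATTRIBUTE_TYPES = ['S', 'N', 'B'];

// Build createTable parameters for a simple or composite primary key.
// partitionKey and sortKey are objects like { name: 'UserId', type: 'S' }.
// Hypothetical helper: not part of the AWS SDK.
function buildTableParams(tableName, partitionKey, sortKey) {
  for (const key of [partitionKey, sortKey]) {
    if (key && !KEY_ATTRIBUTE_TYPES.includes(key.type)) {
      throw new Error(`Invalid key attribute type: ${key.type}`);
    }
  }
  const keySchema = [{ AttributeName: partitionKey.name, KeyType: 'HASH' }];
  const attributeDefinitions = [
    { AttributeName: partitionKey.name, AttributeType: partitionKey.type }
  ];
  if (sortKey) {
    keySchema.push({ AttributeName: sortKey.name, KeyType: 'RANGE' });
    attributeDefinitions.push({
      AttributeName: sortKey.name,
      AttributeType: sortKey.type
    });
  }
  return {
    TableName: tableName,
    KeySchema: keySchema,
    AttributeDefinitions: attributeDefinitions,
    ProvisionedThroughput: { ReadCapacityUnits: 5, WriteCapacityUnits: 5 }
  };
}

// Composite key: UserId (partition key) plus OrderDate (sort key).
const ordersParams = buildTableParams(
  'Orders',
  { name: 'UserId', type: 'S' },
  { name: 'OrderDate', type: 'S' }
);
console.log(JSON.stringify(ordersParams, null, 2));
```

Omitting the third argument produces a simple primary key with a single HASH entry, matching the JSON above.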

💻

Example

This example creates a user data table with UserId as the partition key. Because each user has a unique ID, items spread evenly across partitions, provided that no single user accounts for a disproportionate share of the traffic.

javascript

const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

const params = {
  TableName: 'Users',
  KeySchema: [
    { AttributeName: 'UserId', KeyType: 'HASH' }  // Partition key
  ],
  AttributeDefinitions: [
    { AttributeName: 'UserId', AttributeType: 'S' }
  ],
  ProvisionedThroughput: {
    ReadCapacityUnits: 5,
    WriteCapacityUnits: 5
  }
};

dynamodb.createTable(params, (err, data) => {
  if (err) console.log('Error:', err);
  else console.log('Table Created:', data.TableDescription.TableName);
});

Output

Table Created: Users

⚠️

Common Pitfalls

Choosing a poor partition key can cause uneven data distribution, throttled requests on hot partitions, and slow performance. Common mistakes include:

Using low-cardinality keys like boolean or status fields, which cause hotspots.
Choosing keys that are not used in queries, making access inefficient.
Using coarse time-based keys such as the current date, where every new write shares the same key value and therefore lands on the same partition.

Always pick a key that has many unique values and matches your query patterns.

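javascript

/* Wrong: Using a true/false flag as partition key.
   Key attributes must be S, N, or B (BOOL is not accepted),
   so the flag is stored as a string here. */
KeySchema: [
  { AttributeName: 'IsActive', KeyType: 'HASH' }
],
AttributeDefinitions: [
  { AttributeName: 'IsActive', AttributeType: 'S' }
]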

/* Right: Using UserId with many unique values */
KeySchema: [
  { AttributeName: 'UserId', KeyType: 'HASH' }
],
AttributeDefinitions: [
  { AttributeName: 'UserId', AttributeType: 'S' }
]
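
One way to compare candidate keys before creating a table is to measure them against a sample of real items. The sketch below illustrates that idea. The keyStats helper and the sample records are hypothetical, and a small sample only approximates production traffic.

```javascript
// For one attribute, report how many distinct values a sample contains
// and what fraction of items share the most common value.
// Hypothetical helper for illustration only.
function keyStats(items, attribute) {
  const counts = new Map();
  for (const item of items) {
    const value = String(item[attribute]);
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  const largest = Math.max(...counts.values());
  return {
    attribute,
    distinctValues: counts.size,
    topValueShare: largest / items.length
  };
}

// Hypothetical sample records.
const sample = [
  { UserId: 'u1', IsActive: true },
  { UserId: 'u2', IsActive: true },
  { UserId: 'u3', IsActive: false },
  { UserId: 'u4', IsActive: true }
];

console.log(keyStats(sample, 'UserId'));
console.log(keyStats(sample, 'IsActive'));
```

In this sample, UserId has 4 distinct values with a top share of 0.25, while IsActive has only 2 distinct values and 0.75 of the items share one of them. A high top-value share is a warning sign for hotspots.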

📊

Quick Reference

Tip	Explanation
High Cardinality	Choose keys with many unique values to spread data evenly.
Query Usage	Pick keys that your application queries often for fast lookups.
Avoid Hotspots	Do not use keys with few distinct values or coarse time-based values such as the current date.
Composite Keys	Use sort keys to organize data within partitions if needed.
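
To illustrate the Query Usage and Composite Keys rows, here is a sketch of Query parameters in the shape accepted by the DocumentClient in the AWS SDK for JavaScript v2. It only builds the request object and does not call AWS. The Orders table, its UserId partition key, its OrderDate sort key, and the buildOrdersQuery helper are assumed names for illustration.

```javascript
// Build parameters for a Query that reads one user's orders for a given month.
// The partition key is matched by equality; the sort key narrows the result
// to items whose OrderDate starts with the given prefix.
// Assumed schema: table 'Orders', partition key UserId, sort key OrderDate.
function buildOrdersQuery(userId, monthPrefix) {
  return {
    TableName: 'Orders',
    KeyConditionExpression: 'UserId = :uid AND begins_with(OrderDate, :month)',
    ExpressionAttributeValues: {
      ':uid': userId,
      ':month': monthPrefix
    }
  };
}

const queryParams = buildOrdersQuery('user-42', '2024-05');
console.log(queryParams);
// With the AWS SDK v2, this object could be passed to
// new AWS.DynamoDB.DocumentClient().query(queryParams, callback).
```

Because the condition names the partition key, DynamoDB can read a single item collection instead of scanning the whole table.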

✅

Key Takeaways

Pick a partition key with many unique values to avoid uneven data distribution.

Ensure the partition key matches your query patterns for efficient access.

Avoid keys with low cardinality or coarse time-based values, such as the current date, to prevent hotspots.

Add a sort key to form a composite key when you need to organize data within partitions.