How to Implement Time Series Data in DynamoDB
To implement time series data in DynamoDB, use a partition key to group related data (such as a device ID) and a sort key to store timestamps, so that items can be read back in ascending or descending time order. This design allows efficient queries by time range. Use TTL (Time To Live) to automatically delete old data and keep your table size manageable.
Syntax
In DynamoDB, time series data is stored using a composite primary key:
- Partition Key: Groups data by an entity, e.g., device ID or sensor ID.
- Sort Key: Stores the timestamp, allowing sorting and range queries.
This lets you efficiently query data for a specific entity over a time range.
Example table schema:
- PartitionKey (string): e.g., 'device123'
- SortKey (string or number): e.g., ISO 8601 timestamp '2024-06-01T12:00:00Z'
- Other attributes: sensor readings, metadata
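sql
-- Illustrative schema notation only: DynamoDB has no CREATE TABLE statement.
-- Tables are created through the CreateTable API, the AWS CLI, or the console.
CREATE TABLE TimeSeriesData (
  DeviceID STRING,      -- partition key
  Timestamp STRING,     -- sort key (ISO 8601)
  Temperature NUMBER,
  Humidity NUMBER,
  PRIMARY KEY (DeviceID, Timestamp)
);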
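Because the schema above is notation rather than something DynamoDB executes, it may help to see roughly what the equivalent CreateTable input looks like. The following is a minimal sketch, not a definitive setup: the table and attribute names mirror the schema above, and the on-demand billing mode is an assumption chosen to keep the example short. The object could be passed to `CreateTableCommand` from `@aws-sdk/client-dynamodb`.

```javascript
// Sketch of a CreateTable input for the composite key described above.
// BillingMode is an assumption; provisioned capacity would work as well.
const createTableParams = {
  TableName: "TimeSeriesData",
  // Only key attributes are declared; non-key attributes such as
  // Temperature and Humidity need no definition.
  AttributeDefinitions: [
    { AttributeName: "DeviceID", AttributeType: "S" },
    { AttributeName: "Timestamp", AttributeType: "S" }
  ],
  KeySchema: [
    { AttributeName: "DeviceID", KeyType: "HASH" },   // partition key
    { AttributeName: "Timestamp", KeyType: "RANGE" }  // sort key
  ],
  BillingMode: "PAY_PER_REQUEST"
};

console.log(JSON.stringify(createTableParams, null, 2));
```

Note that `HASH` and `RANGE` are the API's names for the partition key and sort key.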
Example
This example shows how to insert and query time series data for a device using AWS SDK for JavaScript (v3).
It inserts temperature readings with timestamps and queries data for a device within a time range.
javascript
import { DynamoDBClient, PutItemCommand, QueryCommand } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({ region: "us-east-1" });

// Write one reading; DeviceID is the partition key and Timestamp the sort key.
async function insertReading(deviceId, timestamp, temperature) {
  const params = {
    TableName: "TimeSeriesData",
    Item: {
      DeviceID: { S: deviceId },
      Timestamp: { S: timestamp },
      Temperature: { N: temperature.toString() }
    }
  };
  await client.send(new PutItemCommand(params));
}

// Read all readings for one device between startTime and endTime (inclusive).
async function queryReadings(deviceId, startTime, endTime) {
  const params = {
    TableName: "TimeSeriesData",
    // Timestamp is a DynamoDB reserved word, so it is referenced through the #ts alias.
    KeyConditionExpression: "DeviceID = :deviceId AND #ts BETWEEN :start AND :end",
    ExpressionAttributeNames: { "#ts": "Timestamp" },
    ExpressionAttributeValues: {
      ":deviceId": { S: deviceId },
      ":start": { S: startTime },
      ":end": { S: endTime }
    }
  };
  const data = await client.send(new QueryCommand(params));
  return data.Items;
}

(async () => {
  await insertReading("device123", "2024-06-01T12:00:00Z", 22.5);
  await insertReading("device123", "2024-06-01T12:05:00Z", 23.0);
  const readings = await queryReadings("device123", "2024-06-01T12:00:00Z", "2024-06-01T12:10:00Z");
  console.log(readings);
})();
Output
[
{
DeviceID: { S: 'device123' },
Timestamp: { S: '2024-06-01T12:00:00Z' },
Temperature: { N: '22.5' }
},
{
DeviceID: { S: 'device123' },
Timestamp: { S: '2024-06-01T12:05:00Z' },
Temperature: { N: '23' }
}
]
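The results above come back in ascending timestamp order, which is the default for a query. To read in descending order instead (for example, to fetch only the most recent readings), the query input can set `ScanIndexForward` to `false` and cap the result with `Limit`. The sketch below only builds the input object; `buildLatestReadingsParams` is a hypothetical helper name, and the result would be passed to `QueryCommand` as in the example above.

```javascript
// Hypothetical helper: builds a Query input for the newest readings of one device.
function buildLatestReadingsParams(deviceId, limit) {
  return {
    TableName: "TimeSeriesData",
    KeyConditionExpression: "DeviceID = :deviceId",
    ExpressionAttributeValues: { ":deviceId": { S: deviceId } },
    ScanIndexForward: false, // descending sort key order: newest timestamp first
    Limit: limit             // evaluate at most this many items
  };
}

console.log(buildLatestReadingsParams("device123", 1));
```

With `Limit: 1`, this returns just the latest reading for the device without scanning its full history.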
Common Pitfalls
Common mistakes when implementing time series data in DynamoDB include:
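- Using only a partition key: With a simple primary key, each key value identifies exactly one item, so a table keyed only on device ID keeps a single reading per device and every new write overwrites the previous one.
- Not sorting by timestamp: Without a timestamp sort key, you cannot query by time range and must fall back to scanning the whole table.
- Storing timestamps in an inconsistent format: Mixing epoch seconds with milliseconds, or ISO 8601 strings with different padding, precision, or time zone offsets, breaks sort order and range queries.
- Ignoring TTL: Without automatic expiration, your table grows indefinitely, increasing costs. The TTL attribute must be a Number holding a Unix epoch time in seconds, so it is stored alongside an ISO 8601 sort key rather than replacing it.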
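The timestamp-format pitfall in the list above is easy to reproduce locally. The sketch below uses plain JavaScript string sorting as a stand-in for how DynamoDB orders string sort keys (byte order, which matches for ASCII timestamps); the sample values are made up for illustration.

```javascript
// Consistently formatted ISO 8601 UTC strings: string order equals time order.
const isoStamps = [
  "2024-06-01T12:05:00Z",
  "2024-05-31T23:59:59Z",
  "2024-06-01T09:00:00Z"
];
const sortedIso = [...isoStamps].sort();
const chronological = [...isoStamps].sort((a, b) => Date.parse(a) - Date.parse(b));
const isoOrderMatches = sortedIso.every((stamp, i) => stamp === chronological[i]);

// One unpadded value mixed in: it is the 09:00 reading on 1 June, but it
// sorts after the 12:05 reading because "6" compares greater than "0".
const mixedStamps = [
  "2024-6-1T9:00:00Z",
  "2024-06-01T12:05:00Z",
  "2024-05-31T23:59:59Z"
];
const sortedMixed = [...mixedStamps].sort();

console.log(isoOrderMatches); // true
console.log(sortedMixed[2]);  // "2024-6-1T9:00:00Z"
```

A range query with `BETWEEN` would miss or misplace the unpadded item in the same way, which is why a single fixed format matters.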
Correct approach example:
PRIMARY KEY (DeviceID, Timestamp) -- Timestamp as ISO 8601 string for lexicographical sorting
Wrong approach example:
PRIMARY KEY (Timestamp) -- Timestamp alone becomes the partition key: no range queries, and readings that share a timestamp overwrite each other
sql
/* Wrong: Timestamp alone is a simple primary key, so it acts as the partition key.
   Time range queries are impossible and equal timestamps overwrite each other. */
CREATE TABLE TimeSeriesData (
  Timestamp STRING,
  Temperature NUMBER,
  PRIMARY KEY (Timestamp)
);

/* Right: Composite key with partition and sort key */
CREATE TABLE TimeSeriesData (
  DeviceID STRING,
  Timestamp STRING,
  Temperature NUMBER,
  PRIMARY KEY (DeviceID, Timestamp)
);
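Returning to the TTL pitfall listed above: DynamoDB TTL reads expiry from a Number attribute containing a Unix epoch time in seconds, so each item needs such an attribute in addition to its timestamp sort key. The sketch below shows one way to derive it. The attribute name `ExpiresAt`, the helper names, and the 30-day retention are assumptions for illustration, and TTL still has to be enabled on the table for that attribute.

```javascript
// Hypothetical helper: expiry time in epoch seconds, retentionDays after the reading.
function expiresAtEpochSeconds(isoTimestamp, retentionDays) {
  const readingSeconds = Math.floor(Date.parse(isoTimestamp) / 1000);
  return readingSeconds + retentionDays * 24 * 60 * 60;
}

// Hypothetical helper: builds a PutItem "Item" that carries the TTL attribute.
function buildReadingItem(deviceId, timestamp, temperature, retentionDays) {
  return {
    DeviceID: { S: deviceId },
    Timestamp: { S: timestamp },
    Temperature: { N: temperature.toString() },
    // Assumed attribute name; it must match the attribute configured for TTL.
    ExpiresAt: { N: expiresAtEpochSeconds(timestamp, retentionDays).toString() }
  };
}

console.log(buildReadingItem("device123", "2024-06-01T12:00:00Z", 22.5, 30));
```

Expired items are removed by a background process, so they can remain visible for a while after the expiry time has passed.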
Quick Reference
- Partition Key: Use an entity ID like device or sensor ID.
- Sort Key: Use ISO 8601 timestamp strings for sorting.
- TTL: Enable Time To Live to auto-delete old data.
- Query: Use KeyConditionExpression with BETWEEN for time ranges.
- Indexes: Use Global Secondary Indexes if you need alternate query patterns.
Key Takeaways
Use a composite primary key with partition key as entity ID and sort key as timestamp for efficient time series queries.
Store timestamps as ISO 8601 strings in one consistent format (UTC, fixed precision) to maintain correct sorting order.
Enable TTL to automatically remove old time series data and control table size.
Avoid a partition-key-only table: without a timestamp sort key, each key value can hold only one item and time range queries are not possible.
Use KeyConditionExpression with BETWEEN to query data within time ranges.