Query vs Scan in DynamoDB: Key Differences and When to Use Each
A Query finds items based on primary key values and is efficient and fast, while a Scan reads every item in a table and is slower and more costly. Use Query when you know the partition key, and Scan when you need to examine all items without key constraints.
Quick Comparison
This table summarizes the main differences between Query and Scan operations in DynamoDB.
| Factor | Query | Scan |
|---|---|---|
| Purpose | Retrieve items by primary key | Read all items in a table |
| Performance | Fast and efficient | Slower, reads entire table |
| Cost | Lower (reads fewer items) | Higher (reads all items) |
| Filter Capability | Filters after key lookup | Filters after full scan |
| Use Case | Known partition key queries | Full table analysis or unknown keys |
| Result Size | Only items matching the key condition (up to 1 MB per request) | Any item in the table (up to 1 MB per request) |
Key Differences
A Query operation in DynamoDB requires the partition key and optionally a condition on the sort key. DynamoDB uses the partition key to go straight to the partition that holds the matching items, so it reads only those items, which makes it much faster and cheaper than scanning the entire table. Filters can be applied after the key lookup to narrow results further, but they do not lower the read cost; the main selection is based on keys.
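As a rough illustration of how the key condition and the filter divide the work, the sketch below builds Query parameters that combine a partition key, a sort key range, and a filter. The sort key name `OrderDate`, its ISO-8601 string format, and the helper `buildOrderQuery` are assumptions made for this example and do not come from the article's table definition.

```javascript
// Builds Query parameters for one customer's orders within a date range.
// Assumption: the Orders table has a sort key named OrderDate that stores
// ISO-8601 date strings. This is hypothetical, for illustration only.
function buildOrderQuery(customerId, startDate, endDate, minAmount) {
  return {
    TableName: 'Orders',
    // Key condition: decides which items DynamoDB actually reads.
    KeyConditionExpression:
      'CustomerId = :cid AND OrderDate BETWEEN :start AND :end',
    // Filter: applied after the read, so it trims the response
    // but not the read capacity consumed.
    FilterExpression: 'Amount > :amt',
    ExpressionAttributeValues: {
      ':cid': customerId,
      ':start': startDate,
      ':end': endDate,
      ':amt': minAmount
    }
  };
}

const rangeParams = buildOrderQuery('12345', '2024-01-01', '2024-01-31', 50);
console.log(JSON.stringify(rangeParams, null, 2));
```

The resulting object can be passed to `dynamodb.query(rangeParams, callback)` on a `DocumentClient`. Putting the date range in the key condition rather than in the filter is what keeps the read small.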
In contrast, a Scan operation reads every item in the table, which can be very slow and costly for large datasets. It does not require any key information and is useful when you need to examine all data or when you don't know the keys. Filters applied during a scan only reduce the data returned but do not reduce the read cost because all items are still read.
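One way to see this effect is to compare the `Count` and `ScannedCount` fields that DynamoDB returns with each Scan or Query page. The sketch below is a small helper for that comparison; the name `summarizeRead` and the sample numbers are invented for illustration, and `ConsumedCapacity` appears in a response only when the request sets `ReturnConsumedCapacity` (for example to `'TOTAL'`).

```javascript
// Summarizes how much of a Scan (or Query) page was read versus returned.
// ScannedCount: items examined before the filter was applied.
// Count: items left after the filter.
function summarizeRead(response) {
  const examined = response.ScannedCount;
  const returned = response.Count;
  return {
    examined,
    returned,
    discarded: examined - returned,
    // Present only if the request asked for ReturnConsumedCapacity.
    capacityUnits: response.ConsumedCapacity
      ? response.ConsumedCapacity.CapacityUnits
      : null
  };
}

// Made-up response values, for illustration only (Items omitted).
const sampleResponse = {
  Count: 12,
  ScannedCount: 4000,
  ConsumedCapacity: { TableName: 'Orders', CapacityUnits: 128 }
};

console.log(summarizeRead(sampleResponse));
```

A large gap between `examined` and `returned` suggests the filter is discarding most of what was read, which is usually a sign that the access pattern would be better served by a key-based Query.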
Because Query targets specific partitions, it is the preferred method for most lookups. Scan should be used sparingly, mainly for tasks like data migration, backups, or analytics where full table access is necessary.
Code Comparison
Here is an example of a Query operation that retrieves all orders for a specific customer using their customer ID as the partition key.
```javascript
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// Select every item whose partition key (CustomerId) equals '12345'.
const params = {
  TableName: 'Orders',
  KeyConditionExpression: 'CustomerId = :cid',
  ExpressionAttributeValues: { ':cid': '12345' }
};

dynamodb.query(params, (err, data) => {
  if (err) console.error('Query failed:', err);
  else console.log('Query succeeded:', data.Items);
});
```
Scan Equivalent
This example shows a Scan operation that reads all orders and filters those with an amount greater than 50 after reading the entire table.
```javascript
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// No key condition: every item is read, then the filter is applied.
const params = {
  TableName: 'Orders',
  FilterExpression: 'Amount > :amt',
  ExpressionAttributeValues: { ':amt': 50 }
};

dynamodb.scan(params, (err, data) => {
  if (err) console.error('Scan failed:', err);
  else console.log('Scan succeeded:', data.Items);
});
```
When to Use Which
Choose Query when you know the partition key and want fast, cost-effective access to specific items. It is ideal for most application lookups and real-time data retrieval.
Choose Scan only when you need to process or analyze the entire table, such as for reporting, backups, or when you do not have key information. Be aware that scans are slower and more expensive, so use them sparingly and consider pagination or parallel scans for large tables.
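The pagination and parallel scan suggestions could be sketched as follows. This is a minimal outline, not a production implementation: `client` is assumed to be an `AWS.DynamoDB.DocumentClient` from the AWS SDK v2, as in the examples above, and the helper names `scanAllPages` and `parallelScan` are invented here. Pagination relies on the `LastEvaluatedKey` and `ExclusiveStartKey` fields, and parallel scans on the `Segment` and `TotalSegments` parameters of the Scan API.

```javascript
// Reads every page of a Scan by following LastEvaluatedKey.
// `client` is assumed to be an AWS.DynamoDB.DocumentClient (SDK v2).
async function scanAllPages(client, params) {
  const items = [];
  let startKey;
  do {
    // Only send ExclusiveStartKey once a previous page has supplied one.
    const request = startKey
      ? { ...params, ExclusiveStartKey: startKey }
      : params;
    const page = await client.scan(request).promise();
    items.push(...page.Items);
    startKey = page.LastEvaluatedKey; // absent on the last page
  } while (startKey);
  return items;
}

// Parallel scan: each worker pages through one Segment of TotalSegments.
async function parallelScan(client, params, totalSegments) {
  const workers = [];
  for (let segment = 0; segment < totalSegments; segment++) {
    workers.push(
      scanAllPages(client, {
        ...params,
        Segment: segment,
        TotalSegments: totalSegments
      })
    );
  }
  const results = await Promise.all(workers);
  return results.flat();
}
```

A call such as `parallelScan(dynamodb, { TableName: 'Orders' }, 4)` would split the table into four segments and read them concurrently. Both helpers hold all items in memory, so for very large tables it may be preferable to process each page as it arrives. More segments also consume read capacity faster, so the segment count is a trade-off rather than a free speedup.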