
Query vs Scan in DynamoDB: Key Differences and When to Use Each

In DynamoDB, a Query finds items based on primary key values and is efficient and fast, while a Scan reads every item in a table and is slower and more costly. Use Query when you know the partition key, and Scan when you need to examine all items without key constraints.

Quick Comparison

This table summarizes the main differences between Query and Scan operations in DynamoDB.

| Factor | Query | Scan |
| --- | --- | --- |
| Purpose | Retrieve items by primary key | Read all items in a table |
| Performance | Fast and efficient | Slower, reads entire table |
| Cost | Lower (reads fewer items) | Higher (reads all items) |
| Filter Capability | Filters after key lookup | Filters after full scan |
| Use Case | Known partition key queries | Full table analysis or unknown keys |
| Result Size | Limited to matching keys | Can return all items |

Key Differences

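A Query operation in DynamoDB requires an equality condition on the partition key and accepts an optional condition on the sort key. DynamoDB uses the partition key to go directly to the items that share it, so a Query reads only those items instead of the whole table, which makes it much faster and cheaper than a Scan. A filter can narrow the results further, but it is applied after the matching items have been read, so it reduces what is returned and not the read capacity consumed. The main selection is always based on keys.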
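To make the split between key conditions and filters concrete, here is a minimal sketch of Query parameters that use both. It assumes the `Orders` table has a sort key named `OrderDate` stored as an ISO date string; that attribute and the helper name `buildOrdersQuery` are hypothetical and not part of the example table described in this article.

```javascript
// Sketch only: 'OrderDate' as the sort key is an assumption made for illustration.
function buildOrdersQuery(customerId, startDate, endDate, minAmount) {
  return {
    TableName: 'Orders',
    // Key condition: equality on the partition key, optional range on the sort key.
    // This part decides which items DynamoDB reads.
    KeyConditionExpression: 'CustomerId = :cid AND OrderDate BETWEEN :start AND :end',
    // Filter: evaluated after the items are read, so it trims the response
    // but does not lower the read capacity consumed.
    FilterExpression: 'Amount > :amt',
    ExpressionAttributeValues: {
      ':cid': customerId,
      ':start': startDate,
      ':end': endDate,
      ':amt': minAmount
    }
  };
}

const queryParams = buildOrdersQuery('12345', '2024-01-01', '2024-01-31', 50);
console.log(queryParams.KeyConditionExpression);
```

The resulting object can be passed to `query` on a DocumentClient in the same way as any other Query parameters. Pushing as much of the selection as possible into the key condition is usually what keeps a Query cheap.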

In contrast, a Scan operation reads every item in the table, which can be very slow and costly for large datasets. It does not require any key information and is useful when you need to examine all data or when you don't know the keys. Filters applied during a scan only reduce the data returned but do not reduce the read cost because all items are still read.
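One way to see this effect is to compare the `Count` and `ScannedCount` fields that Query and Scan responses carry: `ScannedCount` is the number of items evaluated before the filter, and `Count` is the number that remain after it. The sketch below uses a made-up response object rather than a real table, and the helper name `summarizeRead` is hypothetical.

```javascript
// Summarize how much of what an operation read was actually returned.
// Count and ScannedCount are fields of Query and Scan responses.
function summarizeRead(response) {
  const scanned = response.ScannedCount;
  const returned = response.Count;
  return {
    scanned,
    returned,
    // Items that were read (and paid for) but removed by the filter.
    discarded: scanned - returned,
    // Share of evaluated items that passed the filter (0 when nothing was read).
    efficiency: scanned === 0 ? 0 : returned / scanned
  };
}

// Hypothetical response: 1000 items evaluated, 2 passed the filter.
const sampleResponse = {
  Items: [{ OrderId: 'A2' }, { OrderId: 'B1' }],
  Count: 2,
  ScannedCount: 1000
};

console.log(summarizeRead(sampleResponse));
```

A low efficiency value on a frequently run Scan is a hint that the access pattern may be better served by a key or an index that a Query can use.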

Because Query targets specific partitions, it is the preferred method for most lookups. Scan should be used sparingly, mainly for tasks like data migration, backups, or analytics where full table access is necessary.


Code Comparison

Here is an example of a Query operation that retrieves all orders for a specific customer using their customer ID as the partition key.

```javascript
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// A Query needs an equality condition on the partition key (CustomerId).
const params = {
  TableName: 'Orders',
  KeyConditionExpression: 'CustomerId = :cid',
  ExpressionAttributeValues: {
    ':cid': '12345'
  }
};

// Only items whose CustomerId is '12345' are read.
dynamodb.query(params, (err, data) => {
  if (err) console.error('Query failed:', err);
  else console.log('Query succeeded:', data.Items);
});
```
Output
Query succeeded: [ { OrderId: 'A1', CustomerId: '12345', Amount: 50 }, { OrderId: 'A2', CustomerId: '12345', Amount: 75 } ]

Scan Equivalent

This example shows a Scan operation that reads all orders and filters those with an amount greater than 50 after reading the entire table.

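```javascript
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB.DocumentClient();

// No key condition: the Scan reads the table and then applies the filter.
const params = {
  TableName: 'Orders',
  FilterExpression: 'Amount > :amt',
  ExpressionAttributeValues: {
    ':amt': 50
  }
};

// data.Items holds only the items that passed the filter,
// but read capacity is consumed for every item scanned.
dynamodb.scan(params, (err, data) => {
  if (err) console.error('Scan failed:', err);
  else console.log('Scan succeeded:', data.Items);
});
```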
Output
Scan succeeded: [ { OrderId: 'A2', CustomerId: '12345', Amount: 75 }, { OrderId: 'B1', CustomerId: '67890', Amount: 100 } ]

When to Use Which

Choose Query when you know the partition key and want fast, cost-effective access to specific items. It is ideal for most application lookups and real-time data retrieval.

Choose Scan only when you need to process or analyze the entire table, such as for reporting, backups, or when you do not have key information. Be aware that scans are slower and more expensive, so use them sparingly and consider pagination or parallel scans for large tables.
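As a rough sketch of what pagination and parallel scans can look like, the helpers below follow `LastEvaluatedKey` from page to page and split a table into segments with the `Segment` and `TotalSegments` parameters. They assume `client` is an AWS SDK v2 `DocumentClient` like the one used in the examples above (anything exposing the same `scan(params).promise()` shape would work); the function names are hypothetical.

```javascript
// Collect every page of a Scan by following LastEvaluatedKey.
// Assumption: `client` behaves like an AWS SDK v2 DocumentClient.
async function scanAllPages(client, params) {
  const items = [];
  let startKey; // undefined for the first request
  do {
    // Only send ExclusiveStartKey once a previous page has supplied one.
    const request = startKey ? { ...params, ExclusiveStartKey: startKey } : params;
    const page = await client.scan(request).promise();
    items.push(...page.Items);
    // LastEvaluatedKey is absent once the final page has been read.
    startKey = page.LastEvaluatedKey;
  } while (startKey);
  return items;
}

// Parallel scan: each worker pages through its own segment of the table.
async function parallelScan(client, params, totalSegments) {
  const workers = [];
  for (let segment = 0; segment < totalSegments; segment++) {
    workers.push(
      scanAllPages(client, { ...params, Segment: segment, TotalSegments: totalSegments })
    );
  }
  const results = await Promise.all(workers);
  return results.flat();
}

// Usage sketch (needs AWS credentials and a real table):
// const AWS = require('aws-sdk');
// const client = new AWS.DynamoDB.DocumentClient();
// parallelScan(client, { TableName: 'Orders' }, 4).then(items => console.log(items.length));
```

Both helpers hold every item in memory, which is fine for a sketch but may not suit a very large table; processing each page as it arrives is the usual alternative. A parallel scan also consumes read capacity faster, so the segment count is worth choosing with the table's throughput in mind.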

Key Takeaways

Use Query for fast, efficient lookups by partition key in DynamoDB.
Scan reads the entire table and is slower and more costly.
Filters reduce the data returned but not the read cost: a Query still reads every item that matches the key condition, and a Scan still reads every item in the table.
Prefer Query for most use cases; reserve Scan for full table operations or unknown keys.
Always design your table and access patterns to minimize the need for Scan.