Scan vs query performance in AWS - Performance Comparison
When working with AWS databases, it is important to know how fast different data retrieval methods work as data grows.
We want to understand how the time to get data changes when using scan versus query operations.
Analyze the time complexity of the following operation sequence.
# Query operation
aws dynamodb query --table-name MyTable --key-condition-expression "UserId = :userId" --expression-attribute-values '{":userId":{"S":"123"}}'
# Scan operation
aws dynamodb scan --table-name MyTable
The query fetches items matching a specific key, while the scan reads all items in the table.
- Primary operation: Reading items from the database.
- How many times: Query reads only matching items; Scan reads every item in the table.
As the number of items in the table grows, query reads only a small subset, but scan reads all items.
| Input Size (n) | Approx. Api Calls/Operations |
|---|---|
| 10 | Query: few calls; Scan: 10 calls |
| 100 | Query: few calls; Scan: 100 calls |
| 1000 | Query: few calls; Scan: 1000 calls |
Pattern observation: Query time stays mostly the same; Scan time grows directly with table size.
Time Complexity: O(n) for scan, O(1) for query
This means scan time grows with the number of items, but query time stays mostly constant for targeted lookups.
[X] Wrong: "Scan and query take the same time because they both read data from the table."
[OK] Correct: Scan reads every item, so time grows with table size. Query reads only matching items, so time stays small.
Understanding how scan and query scale helps you design efficient data access in cloud apps, a key skill for real projects.
"What if we added an index to the table? How would that affect the time complexity of queries and scans?"