
Unprocessed items handling in DynamoDB - Deep Dive

Overview - Unprocessed items handling
What is it?
Unprocessed items handling in DynamoDB is about managing items that were not processed during batch write or batch get operations. When you send multiple items to DynamoDB in one request, some items might not be processed due to capacity limits or throttling. This concept helps you detect those items and retry them to ensure all your data operations complete successfully.
Why it matters
Without handling unprocessed items, some data changes might be lost silently, leading to incomplete or inconsistent data in your database. This can cause errors in your application, wrong reports, or lost user data. Properly managing unprocessed items ensures reliability and data integrity in systems that use batch operations.
Where it fits
Before learning this, you should understand basic DynamoDB operations like PutItem, BatchWriteItem, and BatchGetItem. After mastering unprocessed items handling, you can explore advanced topics like exponential backoff retries, error handling patterns, and optimizing throughput for batch operations.
Mental Model
Core Idea
Unprocessed items handling is the process of detecting and retrying items that DynamoDB could not process in batch requests to ensure no data is lost.
Think of it like...
Imagine sending a batch of letters through the mail, but some letters get returned because the mailbox was full. You collect those returned letters and resend them until all are delivered.
┌───────────────────────────────┐
│ Batch Request with Multiple   │
│ Items to DynamoDB             │
└──────────────┬────────────────┘
               │
               ▼
┌──────────────┴────────────────┐
│ DynamoDB Processes Items      │
│ ┌───────────────────────┐     │
│ │ Processed Items       │     │
│ └───────────────────────┘     │
│ ┌───────────────────────┐     │
│ │ Unprocessed Items Set │     │
│ └───────────────────────┘     │
└──────────────┬────────────────┘
               │
               ▼
┌──────────────┴────────────────┐
│ Client Retries Unprocessed    │
│ Items Until All Are Processed │
└───────────────────────────────┘
Build-Up - 6 Steps
1
Foundation: Basics of Batch Operations
🤔
Concept: Batch operations allow sending multiple items in one request to DynamoDB.
DynamoDB supports BatchWriteItem and BatchGetItem to process multiple items at once. This reduces the number of network calls and improves efficiency. However, these batch operations have limits on size and throughput.
Result
A single BatchWriteItem request can contain up to 25 items or 16 MB of data; BatchGetItem allows up to 100 items per request.
Understanding batch operations is essential because unprocessed items only occur in these multi-item requests, not single-item operations.
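Because of the 25-item cap, larger workloads must be split client-side before any BatchWriteItem call. A minimal sketch in plain Python (no AWS calls; the helper name and sample items are illustrative):

```python
# Split a list of write requests into DynamoDB-sized batches.
# BatchWriteItem accepts at most 25 items per request.
MAX_BATCH_SIZE = 25

def chunk_requests(requests, size=MAX_BATCH_SIZE):
    """Yield successive groups of at most `size` write requests."""
    for start in range(0, len(requests), size):
        yield requests[start:start + size]

# Example: 60 put requests become batches of 25, 25, and 10.
items = [{"PutRequest": {"Item": {"pk": {"S": str(i)}}}} for i in range(60)]
batches = list(chunk_requests(items))
print([len(b) for b in batches])  # → [25, 25, 10]
```

Each batch would then be sent as the RequestItems of its own BatchWriteItem call.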
2
Foundation: What Are Unprocessed Items?
🤔
Concept: Unprocessed items are those that DynamoDB did not handle in a batch request due to capacity or throttling.
When you send a batch request, DynamoDB tries to process all items. If it cannot process some items because of limits, it returns them in the response: BatchWriteItem lists them under 'UnprocessedItems', and BatchGetItem lists them under 'UnprocessedKeys'. These items were not written or read.
Result
The response includes a list of unprocessed items you need to handle.
Knowing that DynamoDB explicitly tells you which items were not processed helps you avoid silent data loss.
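The returned items are already in request form, keyed by table name, so they can be resent as-is. A sketch of inspecting a response shaped like what BatchWriteItem returns (the table name and item are made up):

```python
# A BatchWriteItem response echoes back anything it could not process
# under 'UnprocessedItems', keyed by table name and already in request form.
response = {
    "UnprocessedItems": {
        "Orders": [
            {"PutRequest": {"Item": {"order_id": {"S": "o-17"}}}},
        ]
    }
}

unprocessed = response.get("UnprocessedItems", {})
if unprocessed:
    count = sum(len(reqs) for reqs in unprocessed.values())
    print(f"{count} item(s) still need to be retried")  # → 1 item(s) still need to be retried
else:
    print("all items were processed")
```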
3
Intermediate: Detecting Unprocessed Items in Responses
🤔Before reading on: do you think DynamoDB automatically retries unprocessed items for you, or do you need to handle them yourself? Commit to your answer.
Concept: You must check the response for unprocessed items and decide how to retry them.
After a batch request, inspect the 'UnprocessedItems' field in the response. If it is empty, all items were processed. If not, it contains the items you need to resend. DynamoDB does not retry automatically.
Result
You identify exactly which items need to be retried.
Understanding that retrying unprocessed items is your responsibility prevents data inconsistencies.
4
Intermediate: Retrying Unprocessed Items Safely
🤔Before reading on: do you think retrying unprocessed items immediately in a tight loop is a good idea, or should you wait between retries? Commit to your answer.
Concept: Retries should be done with delays to avoid repeated throttling.
When retrying unprocessed items, use exponential backoff: wait a short time before the first retry, then increase the wait time for subsequent retries. This reduces load and gives DynamoDB time to recover capacity.
Result
Retries succeed more often and reduce throttling errors.
Knowing how to retry with backoff improves reliability and prevents overwhelming the database.
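The backoff schedule itself is simple to compute: double the delay each attempt, cap it, and add a little randomness. A pure-Python sketch (the base, cap, and jitter values here are illustrative defaults, not DynamoDB requirements):

```python
import random

def backoff_delays(max_retries, base=0.1, cap=5.0, jitter=0.1):
    """Exponential backoff schedule: base, 2*base, 4*base, ...
    capped at `cap`, with a small random jitter added to each delay."""
    delays = []
    delay = base
    for _ in range(max_retries):
        delays.append(min(delay, cap) + random.uniform(0, jitter))
        delay *= 2
    return delays

# Six retries give roughly 0.1s, 0.2s, 0.4s, 0.8s, 1.6s, 3.2s (plus jitter).
print([round(d, 2) for d in backoff_delays(6)])
```

Each retry of the unprocessed items would sleep for the next delay in this schedule before resending.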
5
Advanced: Handling Partial Success in Batch Writes
🤔Before reading on: do you think a batch write is atomic (all or nothing), or can some items succeed while others fail? Commit to your answer.
Concept: Batch writes are not atomic; some items can succeed while others remain unprocessed.
In BatchWriteItem, DynamoDB processes items individually. Some may succeed, others may be unprocessed. You must handle partial success by retrying only unprocessed items, not the entire batch.
Result
You avoid duplicating writes and ensure all items are eventually written.
Understanding partial success prevents data duplication and wasted effort.
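The retry loop can be exercised without AWS access by stubbing the client. In this sketch a fake client "fails" the last item of each table on the first call, so the loop demonstrates resending only the unprocessed remainder (class and function names are made up for illustration; backoff is omitted for brevity):

```python
# Fake client: first call leaves the last item of each table unprocessed,
# later calls succeed completely (no real AWS access needed).
class FakeDynamoDB:
    def __init__(self):
        self.calls = 0

    def batch_write_item(self, RequestItems):
        self.calls += 1
        if self.calls == 1:
            leftover = {t: reqs[-1:] for t, reqs in RequestItems.items()}
            return {"UnprocessedItems": leftover}
        return {"UnprocessedItems": {}}

def write_all(client, batch):
    """Keep resending only what DynamoDB reports as unprocessed."""
    response = client.batch_write_item(RequestItems=batch)
    while response.get("UnprocessedItems"):
        response = client.batch_write_item(RequestItems=response["UnprocessedItems"])
    return client.calls

client = FakeDynamoDB()
batch = {"Orders": [{"PutRequest": {"Item": {"id": {"S": str(i)}}}} for i in range(3)]}
print(write_all(client, batch))  # → 2 (one initial call plus one retry)
```

Note that the retry request contains only the leftover item, not the original three, so nothing is written twice.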
6
Expert: Optimizing Unprocessed Items Handling in Production
🤔Before reading on: do you think always retrying unprocessed items immediately is best, or can batching retries improve performance? Commit to your answer.
Concept: Batching retries and monitoring throughput helps optimize performance and cost.
In production, collect unprocessed items from multiple batch responses and retry them together in a new batch. Monitor consumed capacity and adjust request rates. Use adaptive backoff and jitter to avoid retry storms.
Result
Your application handles high loads efficiently without excessive retries or throttling.
Knowing how to optimize retries at scale improves system stability and cost-effectiveness.
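Collecting leftovers from several responses into one retry batch is a plain dictionary merge, since 'UnprocessedItems' is keyed by table name. A sketch (function name and sample data are illustrative; a real implementation would still respect the 25-item cap when resending):

```python
from collections import defaultdict

def merge_unprocessed(responses):
    """Combine 'UnprocessedItems' from several batch responses into one
    retry batch, keyed by table name (the shape BatchWriteItem expects)."""
    merged = defaultdict(list)
    for resp in responses:
        for table, reqs in resp.get("UnprocessedItems", {}).items():
            merged[table].extend(reqs)
    return dict(merged)

responses = [
    {"UnprocessedItems": {"Orders": [{"PutRequest": {"Item": {"id": {"S": "a"}}}}]}},
    {"UnprocessedItems": {}},
    {"UnprocessedItems": {"Orders": [{"PutRequest": {"Item": {"id": {"S": "b"}}}}]}},
]
retry_batch = merge_unprocessed(responses)
print(len(retry_batch["Orders"]))  # → 2
```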
Under the Hood
DynamoDB processes batch requests by attempting each item individually within the batch. If the provisioned throughput or internal limits are exceeded, it stops processing some items and returns them as unprocessed. This prevents overloading the system and maintains overall performance. The client must then retry these items.
Why designed this way?
DynamoDB prioritizes availability and performance over atomic batch operations. Returning unprocessed items allows the system to gracefully handle load spikes without failing entire batch requests. This design balances throughput and reliability in a distributed environment.
┌───────────────┐
│ Client Sends  │
│ Batch Request │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ DynamoDB Node │
│ Processes     │
│ Items One-by- │
│ One           │
└──────┬────────┘
       │
       ├─► Processed Items
       │
       ├─► Unprocessed Items (due to limits)
       │
       ▼
┌───────────────┐
│ Response to   │
│ Client with   │
│ Unprocessed   │
│ Items List    │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do you think DynamoDB automatically retries unprocessed items for you? Commit to yes or no.
Common Belief: DynamoDB automatically retries unprocessed items until they succeed.
Reality: DynamoDB returns unprocessed items to the client; the client must retry them explicitly.
Why it matters: Assuming automatic retries leads to lost data or incomplete operations if the client does not handle retries.
Quick: Is a batch write operation atomic, meaning all items succeed or all fail? Commit to yes or no.
Common Belief: Batch write operations are atomic; either all items succeed or none do.
Reality: Batch writes are not atomic; some items can succeed while others remain unprocessed.
Why it matters: Believing in atomicity causes incorrect error handling and potential data duplication.
Quick: Do you think retrying unprocessed items immediately without delay is best? Commit to yes or no.
Common Belief: Retrying unprocessed items immediately in a tight loop is the best way to ensure success.
Reality: Immediate retries can cause repeated throttling; exponential backoff with delays is recommended.
Why it matters: Ignoring backoff leads to wasted resources, increased latency, and possible request failures.
Quick: Do you think unprocessed items only happen when the batch size is too large? Commit to yes or no.
Common Belief: Unprocessed items only occur if the batch request exceeds size limits.
Reality: Unprocessed items can occur due to throughput limits or temporary throttling, even with small batches.
Why it matters: Assuming size is the only cause leads to ignoring throughput management and retry strategies.
Expert Zone
1
Unprocessed items can vary between retries due to changing capacity, so retry logic must be dynamic and adaptive.
2
Using jitter (randomized delay) in backoff prevents retry storms when many clients retry simultaneously.
3
Monitoring consumed capacity and throttling metrics helps tune batch sizes and retry intervals for optimal performance.
When NOT to use
Unprocessed items handling is not needed for single-item operations like PutItem or GetItem. For transactional operations requiring atomicity, use DynamoDB transactions instead of batch writes.
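With transactions the request shape also changes: instead of per-table batches, TransactWriteItems takes a flat list of actions that succeed or fail as a unit (up to 100 actions per transaction). A sketch that only builds such a payload, without calling AWS (the table and attribute names are made up; with boto3 this dict would be passed to client.transact_write_items):

```python
def build_transaction(table, items):
    """Build a TransactWriteItems-style payload: each item becomes a Put
    action, and the whole list commits or fails as a unit."""
    return {
        "TransactItems": [
            {"Put": {"TableName": table, "Item": item}} for item in items
        ]
    }

tx = build_transaction("Orders", [{"id": {"S": "a"}}, {"id": {"S": "b"}}])
print(len(tx["TransactItems"]))  # → 2
```

Unlike BatchWriteItem, a transaction never returns partially unprocessed items; it either commits everything or raises a cancellation error.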
Production Patterns
In production, developers implement retry queues that collect unprocessed items and retry them asynchronously with exponential backoff and jitter. They also monitor CloudWatch metrics to adjust throughput and batch sizes dynamically.
Connections
Exponential Backoff
Builds on: the retry strategy for unprocessed items
Understanding exponential backoff helps implement efficient retries that reduce throttling and improve success rates.
Distributed Systems Throttling
Same pattern of handling load limits and retrying requests
Knowing how distributed systems throttle requests clarifies why unprocessed items occur and how to handle them gracefully.
Postal Mail Delivery
Analogy for retrying undelivered items
Recognizing that unprocessed items are like returned mail helps grasp the importance of retrying to ensure delivery.
Common Pitfalls
#1 Ignoring unprocessed items and assuming all batch items succeeded.
Wrong approach:
response = dynamodb.batch_write_item(RequestItems=batch)  # no check for unprocessed items
Correct approach:
response = dynamodb.batch_write_item(RequestItems=batch)
while response.get('UnprocessedItems'):
    response = dynamodb.batch_write_item(RequestItems=response['UnprocessedItems'])
Root cause: Misunderstanding that batch operations can partially fail and that unprocessed items must be retried.
#2 Retrying unprocessed items immediately in a tight loop without delay.
Wrong approach:
while unprocessed_items:
    response = dynamodb.batch_write_item(RequestItems=unprocessed_items)
    unprocessed_items = response.get('UnprocessedItems', {})
Correct approach:
import time
import random

retry_delay = 0.1
while unprocessed_items:
    time.sleep(retry_delay)
    response = dynamodb.batch_write_item(RequestItems=unprocessed_items)
    unprocessed_items = response.get('UnprocessedItems', {})
    retry_delay = min(retry_delay * 2, 5) + random.uniform(0, 0.1)
Root cause: Not applying exponential backoff and jitter leads to repeated throttling and inefficient retries.
#3 Retrying the entire batch instead of only unprocessed items.
Wrong approach:
response = dynamodb.batch_write_item(RequestItems=batch)
if response['UnprocessedItems']:
    response = dynamodb.batch_write_item(RequestItems=batch)  # retries the whole batch
Correct approach:
response = dynamodb.batch_write_item(RequestItems=batch)
if response['UnprocessedItems']:
    response = dynamodb.batch_write_item(RequestItems=response['UnprocessedItems'])  # retries only unprocessed items
Root cause: Confusing partial success with full failure causes unnecessary duplicate writes and wasted resources.
Key Takeaways
Unprocessed items occur when DynamoDB cannot process all items in a batch due to capacity or throttling limits.
DynamoDB returns unprocessed items in the response; clients must detect and retry them explicitly.
Retries should use exponential backoff with jitter to avoid repeated throttling and improve success.
Batch write operations are not atomic; some items can succeed while others remain unprocessed.
Proper handling of unprocessed items ensures data integrity and reliability in batch operations.