
Batch limits and retries in DynamoDB - Deep Dive

Overview - Batch limits and retries
What is it?
Batch limits and retries in DynamoDB refer to the rules and mechanisms that control how many items you can process at once and how the system handles requests that don't succeed immediately. When you send multiple read or write requests together, DynamoDB caps the number of items per batch to keep the system stable. If some requests fail, retries resend them, either automatically through the SDK or manually in your code.
Why it matters
Without batch limits, sending too many requests at once could overload the database, causing slowdowns or failures. Without retries, failed requests would be lost, leading to incomplete data operations and errors in your application. These controls ensure your app stays fast, reliable, and consistent even when many users access data simultaneously.
Where it fits
Before learning batch limits and retries, you should understand basic DynamoDB operations like single-item reads and writes. After this, you can explore advanced topics like error handling, exponential backoff, and optimizing throughput for large-scale applications.
Mental Model
Core Idea
Batch limits set the maximum number of items per request to protect the database, and retries automatically handle failed requests to ensure data operations complete successfully.
Think of it like...
Imagine sending packages through a mail service that only allows a certain number of parcels per shipment to avoid overloading the delivery truck. If some parcels get lost or delayed, the service automatically resends those parcels until they arrive safely.
┌───────────────────────────────┐
│        Client Request         │
│  (Batch of items to process)  │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│     DynamoDB Batch Limit      │
│  Max items per batch enforced │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│ Processed + Unprocessed Items │
│       (if any failures)       │
└──────────────┬────────────────┘
               │
               ▼
┌───────────────────────────────┐
│          Retry Logic          │
│   Resend unprocessed items    │
└───────────────────────────────┘
Build-Up - 6 Steps
1
Foundation: Understanding DynamoDB Batch Operations
🤔
Concept: Learn what batch operations are and why they exist in DynamoDB.
DynamoDB allows you to read or write multiple items in a single request using batch operations. BatchGetItem lets you retrieve up to 100 items, and BatchWriteItem lets you write or delete up to 25 items at once. This helps reduce the number of network calls and improves efficiency.
Result
You can send multiple items in one request, reducing overhead and speeding up your app.
Knowing batch operations lets you handle many items efficiently instead of one by one.
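The request shape above can be sketched in code; a minimal sketch of building a BatchWriteItem payload, assuming a hypothetical table named Movies (the helper function is illustrative, not part of the SDK; with boto3 you would pass the resulting dict to `client.batch_write_item(RequestItems=...)`):

```python
# Sketch: building a BatchWriteItem payload for a hypothetical "Movies" table.
# With boto3: boto3.client("dynamodb").batch_write_item(RequestItems=request_items)

def build_put_requests(table_name, items):
    """Wrap plain items in the PutRequest envelopes BatchWriteItem expects."""
    return {
        table_name: [{"PutRequest": {"Item": item}} for item in items]
    }

items = [
    {"pk": {"S": "movie#1"}, "title": {"S": "Alien"}},
    {"pk": {"S": "movie#2"}, "title": {"S": "Arrival"}},
]
request_items = build_put_requests("Movies", items)
```

Two items travel in one network call instead of two separate PutItem requests, which is the efficiency gain batching provides.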
2
Foundation: Limits on Batch Size in DynamoDB
🤔
Concept: Discover the exact limits DynamoDB imposes on batch requests.
DynamoDB limits BatchGetItem to 100 items per request and BatchWriteItem to 25 put or delete requests per request; each request is also capped at 16 MB of data. These limits prevent overloading the system. If you exceed a limit, DynamoDB rejects the entire request with a validation error rather than processing part of it.
Result
Requests larger than the limits are rejected outright, so oversized workloads must be split into multiple batches.
Understanding these limits helps you design your requests to fit within allowed sizes.
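Fitting within the limits usually means chunking your item list before sending; a minimal sketch (the chunking helper is illustrative, not an SDK feature):

```python
# Sketch: split a large item list into chunks that respect the
# BatchWriteItem limit of 25 items per request (BatchGetItem allows 100).

BATCH_WRITE_LIMIT = 25

def chunk(items, size=BATCH_WRITE_LIMIT):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

batches = list(chunk(list(range(60))))
# 60 items split into batches of 25, 25, and 10
```

Each chunk then becomes one batch request, keeping every call inside the allowed size.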
3
Intermediate: Handling Unprocessed Items in Batches
🤔 Before reading on: do you think DynamoDB processes all items in a batch request at once, or can some items be left unprocessed? Commit to your answer.
Concept: Learn that DynamoDB may not process all items in a batch and returns unprocessed items.
When you send a batch request, DynamoDB might not process all items due to throttling or capacity limits. It returns a list of unprocessed items that you need to handle. This means your app must check the response and retry those unprocessed items.
Result
You receive a response with processed and unprocessed items, enabling you to retry only the unprocessed ones.
Knowing that batches can be partially processed prevents data loss and helps build reliable retry logic.
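Checking the response looks roughly like this; the `UnprocessedItems` key is the real DynamoDB API shape, while the sample data is invented for illustration:

```python
# Sketch: reading the UnprocessedItems field of a BatchWriteItem response.
# The response key names match the DynamoDB API; the sample data is made up.

def unprocessed_from(response):
    """Return the UnprocessedItems map, or {} when every item was processed."""
    return response.get("UnprocessedItems") or {}

full_success = {"UnprocessedItems": {}}
partial = {
    "UnprocessedItems": {
        "Movies": [{"PutRequest": {"Item": {"pk": {"S": "movie#9"}}}}]
    }
}

assert unprocessed_from(full_success) == {}           # nothing left to retry
assert len(unprocessed_from(partial)["Movies"]) == 1  # one item to resend
```

Because the leftover entries keep the same request format, they can be passed straight back into the next batch call.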
4
Intermediate: Implementing Retry Logic for Unprocessed Items
🤔 Before reading on: do you think retrying unprocessed items immediately or after a delay is better? Commit to your answer.
Concept: Introduce retry strategies to handle unprocessed items effectively.
To handle unprocessed items, your app should retry sending them. Immediate retries might cause repeated failures, so using exponential backoff—waiting longer between retries—helps reduce load and increases success chances. AWS SDKs often provide built-in retry mechanisms.
Result
Retries eventually succeed in processing all items without overwhelming DynamoDB.
Understanding retry timing improves app stability and avoids throttling.
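The "wait longer between retries" idea can be expressed as a simple delay schedule; a minimal sketch with illustrative base and cap values (DynamoDB does not prescribe these numbers):

```python
# Sketch: exponential backoff delays for retrying unprocessed items.
# The base and cap values here are illustrative, not DynamoDB-prescribed.

def backoff_delay(attempt, base=0.1, cap=5.0):
    """Delay doubles each attempt: 0.1s, 0.2s, 0.4s, ... capped at 5s."""
    return min(cap, base * (2 ** attempt))

delays = [backoff_delay(n) for n in range(6)]
# 0.1, 0.2, 0.4, 0.8, 1.6, 3.2
```

In practice you would `time.sleep(backoff_delay(attempt))` before each resend of the unprocessed items; the cap keeps a long-failing retry loop from waiting unbounded amounts of time.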
5
Advanced: Optimizing Batch Sizes for Throughput
🤔 Before reading on: do you think sending maximum batch sizes always leads to better performance? Commit to your answer.
Concept: Learn how to balance batch size and throughput for best performance.
While sending the maximum allowed items per batch can reduce network calls, it may increase unprocessed items if your table's capacity is limited. Sometimes smaller batches reduce retries and improve overall throughput. Monitoring and adjusting batch sizes based on your table's capacity is key.
Result
Optimized batch sizes lead to faster, more reliable data operations.
Knowing how batch size affects throughput helps you tune your app for real-world conditions.
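One way to adapt batch size to observed capacity is an additive-increase / multiplicative-decrease rule, similar to TCP congestion control; a minimal sketch with an illustrative policy (the halving and +1 growth are assumptions, not DynamoDB guidance):

```python
# Sketch: adaptive batch sizing, shrinking when DynamoDB returned
# unprocessed items and cautiously growing when a batch fully succeeded.

MAX_BATCH = 25  # BatchWriteItem ceiling

def next_batch_size(current, had_unprocessed):
    """Halve under pressure, grow by one when a batch fully succeeds."""
    if had_unprocessed:
        return max(1, current // 2)
    return min(MAX_BATCH, current + 1)

size = 25
size = next_batch_size(size, had_unprocessed=True)   # throttled -> 12
size = next_batch_size(size, had_unprocessed=False)  # recovered -> 13
```

The point is not the exact constants but the feedback loop: unprocessed-item counts are the signal you tune against.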
6
Expert: Advanced Retry Patterns and Error Handling
🤔 Before reading on: do you think all retry failures are due to throttling? Commit to your answer.
Concept: Explore complex retry patterns and error causes beyond simple throttling.
Retries can fail for reasons other than throttling, like network errors or conditional check failures. Advanced patterns include jitter (randomized delays) to avoid retry storms, and handling specific error codes differently. Also, tracking retry counts prevents infinite loops. Combining these techniques ensures robust, production-grade retry logic.
Result
Your app gracefully handles diverse failure scenarios and maintains data integrity.
Understanding nuanced retry causes and patterns prevents subtle bugs and downtime in production.
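Jitter and per-error handling can be sketched as two small helpers; the error-code names are real DynamoDB/AWS codes, but the retry policy itself is an illustrative assumption:

```python
import random

# Sketch: "full jitter" delays plus simple error classification.
# The error-code strings are real AWS codes; the policy is illustrative.

RETRYABLE = {
    "ProvisionedThroughputExceededException",
    "ThrottlingException",
    "InternalServerError",
}

def should_retry(error_code, attempt, max_attempts=5):
    """Retry only transient errors, and never past max_attempts."""
    return error_code in RETRYABLE and attempt < max_attempts

def jittered_delay(attempt, base=0.1, cap=5.0):
    """Full jitter: uniform random wait in [0, capped exponential delay]."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))
```

A `ConditionalCheckFailedException`, for example, will never succeed on retry, so classifying it as non-retryable avoids pointless load; randomizing the delay keeps many clients from retrying in lockstep.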
Under the Hood
DynamoDB processes batch requests by splitting them internally into smaller units that fit capacity limits. If the system is busy or capacity is exceeded, it returns unprocessed items instead of failing the whole batch. The client is responsible for retrying these unprocessed items. This design balances throughput and availability by preventing overload while still letting clients make steady forward progress.
Why designed this way?
Batch limits and retries were designed to protect DynamoDB's performance and stability. Allowing unlimited batch sizes could cause resource exhaustion and slowdowns. Returning unprocessed items instead of failing entire requests lets clients handle retries flexibly. This approach supports high availability and scales well under heavy load.
┌───────────────┐
│ Client sends  │
│ batch request │
└──────┬────────┘
       │
       ▼
┌─────────────────────────┐
│ DynamoDB internal split │
│   into smaller units    │
└───────┬────────┬────────┘
        │        │
        ▼        ▼
┌─────────────┐ ┌─────────────┐
│  Processed  │ │ Unprocessed │
│    items    │ │    items    │
└──────┬──────┘ └──────┬──────┘
       │               │
       ▼               ▼
┌───────────────┐ ┌────────────────┐
│ Return result │ │ Client retries │
│ to client     │ │ unprocessed    │
└───────────────┘ │ items later    │
                  └────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do you think DynamoDB processes all items in a batch request at once without any unprocessed items? Commit yes or no.
Common Belief: Batch requests always process every item successfully in one go.
Reality: DynamoDB may return some items as unprocessed due to capacity limits or throttling, requiring retries.
Why it matters: Assuming all items are processed can cause data loss or incomplete operations if unprocessed items are ignored.
Quick: Do you think retrying unprocessed items immediately without delay is best? Commit yes or no.
Common Belief: Retrying unprocessed items immediately is the fastest and best approach.
Reality: Immediate retries can cause repeated failures and increase throttling; exponential backoff with delays is more effective.
Why it matters: Ignoring retry delays can overload DynamoDB and degrade app performance.
Quick: Do you think sending the largest possible batch size always improves performance? Commit yes or no.
Common Belief: Always send the maximum batch size allowed to maximize throughput.
Reality: Large batches can increase unprocessed items and retries if capacity is limited; sometimes smaller batches perform better.
Why it matters: Blindly maximizing batch size can reduce overall throughput and increase latency.
Quick: Do you think all retry failures are caused by throttling? Commit yes or no.
Common Belief: Retries fail only because of throttling issues.
Reality: Retries can fail due to network errors, conditional check failures, or other transient issues.
Why it matters: Assuming only throttling causes failures can lead to incomplete error handling and bugs.
Expert Zone
1
DynamoDB's unprocessed items response is a soft failure, allowing partial success and flexible client-side retry strategies.
2
Using jitter in retry delays prevents synchronized retry storms that can cause cascading throttling.
3
BatchWriteItem combines PutItem and DeleteItem requests, but each counts toward the 25-item limit, requiring careful batch composition.
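That third point can be made concrete by composing a mixed request; a minimal sketch (the `user#` key scheme is hypothetical, and the entry shapes follow the BatchWriteItem API):

```python
# Sketch: one BatchWriteItem request can mix PutRequest and DeleteRequest
# entries, but both kinds count toward the same 25-item ceiling.

BATCH_WRITE_LIMIT = 25

puts = [{"PutRequest": {"Item": {"pk": {"S": f"user#{i}"}}}} for i in range(20)]
deletes = [{"DeleteRequest": {"Key": {"pk": {"S": f"user#{i}"}}}} for i in range(10)]

combined = puts + deletes            # 30 entries: too many for one request
first = combined[:BATCH_WRITE_LIMIT] # 20 puts + 5 deletes fill the first batch
rest = combined[BATCH_WRITE_LIMIT:]  # remaining 5 deletes need a second request
```

Composing batches this way means a delete-heavy cleanup job shares capacity with writes in the same request, which is worth accounting for when sizing batches.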
When NOT to use
Batch operations are not suitable for transactional or strongly consistent operations requiring atomicity; use DynamoDB transactions instead. Also, for very large datasets, consider parallel scans or streams for efficient processing.
Production Patterns
In production, developers implement exponential backoff with jitter for retries, monitor unprocessed item rates to adjust batch sizes dynamically, and combine batch operations with conditional writes to maintain data integrity.
Connections
Exponential Backoff
Batch retries use exponential backoff to space out retry attempts.
Understanding exponential backoff helps optimize retry timing and avoid overwhelming the database.
Network Protocols
Batch limits and retries resemble flow control and retransmission in network protocols.
Recognizing this connection clarifies why partial failures happen and how retries improve reliability.
Supply Chain Management
Batch limits and retries are like shipment size limits and resending lost packages in supply chains.
Seeing this cross-domain similarity highlights the universal need to balance load and handle failures gracefully.
Common Pitfalls
#1 Ignoring unprocessed items returned by DynamoDB and assuming all batch items succeeded.
Wrong approach:
response = dynamodb.batch_write_item(RequestItems=batch)
# No check for unprocessed items
print('Batch write complete')
Correct approach:
response = dynamodb.batch_write_item(RequestItems=batch)
while response.get('UnprocessedItems'):
    response = dynamodb.batch_write_item(RequestItems=response['UnprocessedItems'])
print('All items processed')
Root cause:Misunderstanding that batch operations can partially fail and require explicit retry handling.
#2 Retrying unprocessed items immediately without any delay, causing repeated throttling.
Wrong approach:
while unprocessed_items:
    response = dynamodb.batch_write_item(RequestItems=unprocessed_items)
    unprocessed_items = response.get('UnprocessedItems', {})
Correct approach:
import time
import random

retry_count = 0
while unprocessed_items and retry_count < 5:
    delay = (2 ** retry_count) + random.uniform(0, 1)  # exponential backoff with jitter
    time.sleep(delay)
    response = dynamodb.batch_write_item(RequestItems=unprocessed_items)
    unprocessed_items = response.get('UnprocessedItems', {})
    retry_count += 1
Root cause:Lack of exponential backoff and jitter in retry logic leads to retry storms and persistent failures.
#3 Always sending maximum batch sizes without monitoring throughput or unprocessed items.
Wrong approach:
batch = create_batch(items[:25])
response = dynamodb.batch_write_item(RequestItems=batch)
# No adjustment based on response
Correct approach:
batch_size = 25
while items:
    batch = create_batch(items[:batch_size])
    items = items[batch_size:]  # consume what was actually sent, before resizing
    response = dynamodb.batch_write_item(RequestItems=batch)
    if response.get('UnprocessedItems'):
        retry_unprocessed(response['UnprocessedItems'])  # resend leftovers with backoff
        batch_size = max(1, batch_size // 2)  # reduce batch size
    else:
        batch_size = min(25, batch_size + 1)  # increase batch size
Root cause:Not adapting batch size to current table capacity and throughput causes inefficiency.
Key Takeaways
DynamoDB batch operations have strict limits on the number of items per request to protect system stability.
Batch requests can be partially processed, returning unprocessed items that must be retried to ensure data consistency.
Implementing retries with exponential backoff and jitter prevents overload and improves success rates.
Optimizing batch sizes based on throughput and monitoring unprocessed items leads to better performance.
Advanced retry patterns and error handling are essential for robust production applications using DynamoDB batch operations.