Overview - Index capacity and cost

What is it?

In DynamoDB, secondary indexes are data structures that let you query a table by attributes other than its primary key without scanning the entire table. They come in two types: Global Secondary Indexes (GSIs) and Local Secondary Indexes (LSIs). Reads and writes against an index consume capacity units: a GSI has its own throughput, separate from the table, while an LSI draws on the base table's throughput. Understanding index capacity and cost means knowing how indexes consume these resources and how to manage them efficiently.

Why it matters

Without understanding index capacity and cost, you might create indexes that slow down your app or make your bill unexpectedly high. Indexes speed up data retrieval but use extra capacity, so balancing performance and cost is key. If you ignore this, your database might become expensive or slow, hurting user experience and your budget.

Where it fits

Before learning about index capacity and cost, you should know basic DynamoDB concepts like tables, items, and primary keys. After this, you can explore advanced topics like capacity auto-scaling, adaptive capacity, and cost optimization strategies.

Mental Model

Core Idea

Indexes in DynamoDB are like extra sorted lists that speed up searches but need their own read and write resources, which cost money and affect performance.

Think of it like...

Imagine a library where the main catalog is a big book listing all books. An index is like a special card catalog for a specific topic that helps you find books faster, but maintaining this card catalog takes extra librarian time and space.

┌─────────────┐       ┌───────────────┐
│   DynamoDB  │       │    Indexes    │
│    Table    │──────▶│  GSI and LSI  │
└─────────────┘       └───────────────┘
       │                      │
       │ Uses capacity units   │
       ▼                      ▼
┌─────────────┐        ┌─────────────┐
│ Read/Write  │        │ Read/Write  │
│ Capacity    │        │ Capacity    │
│ Units       │        │ Units       │
└─────────────┘        └─────────────┘

Build-Up - 7 Steps

1

Foundation: Understanding DynamoDB Capacity Units

Concept: Learn what read and write capacity units are and how they measure throughput.

DynamoDB uses read capacity units (RCUs) and write capacity units (WCUs) to measure how much data you can read or write per second. One RCU covers one strongly consistent read per second, or two eventually consistent reads per second, of an item up to 4 KB. One WCU covers one write per second of an item up to 1 KB. Larger items consume additional units, rounded up to the next 4 KB for reads or the next 1 KB for writes. These units determine both throughput and cost.
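The rounding rules above can be sketched as a small calculator. This is a minimal illustration for single-item reads and writes, not an official pricing tool; the function names `read_capacity_units` and `write_capacity_units` are hypothetical.

```python
import math

RCU_BLOCK_BYTES = 4 * 1024  # one read unit covers an item of up to 4 KB
WCU_BLOCK_BYTES = 1024      # one write unit covers an item of up to 1 KB


def read_capacity_units(item_bytes: int, strongly_consistent: bool = True) -> float:
    """Capacity units for reading one item (hypothetical helper)."""
    blocks = max(1, math.ceil(item_bytes / RCU_BLOCK_BYTES))
    # An eventually consistent read costs half as much as a strongly consistent one.
    return float(blocks) if strongly_consistent else blocks / 2


def write_capacity_units(item_bytes: int) -> int:
    """Capacity units for writing one item (hypothetical helper)."""
    return max(1, math.ceil(item_bytes / WCU_BLOCK_BYTES))


if __name__ == "__main__":
    print(read_capacity_units(6 * 1024))         # 6 KB item, strongly consistent
    print(read_capacity_units(6 * 1024, False))  # same item, eventually consistent
    print(write_capacity_units(1500))            # 1.5 KB item
```

A 6 KB item rounds up to two 4 KB blocks, so it costs 2 units for a strongly consistent read and 1 unit for an eventually consistent one; a 1,500-byte write rounds up to 2 units.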

Result

You understand that capacity units limit how fast you can read or write data and that bigger items use more units.

Knowing capacity units is essential because indexes consume these units separately, affecting your overall throughput and cost.

2

Foundation: What Are DynamoDB Indexes?

3

Intermediate: How Indexes Consume Capacity Units

4

Intermediate: Differences in Capacity Cost Between GSI and LSI

5

Intermediate: Impact of Item Size on Index Capacity

6

Advanced: Managing Capacity with Auto Scaling and Adaptive Capacity

7

Expert: Surprising Costs: Sparse Indexes and Write Amplification

Under the Hood

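DynamoDB maintains each index as a separate structure alongside the base table. When you write to the table, DynamoDB propagates the relevant attributes to every index that contains the item. GSIs are updated asynchronously, so they support only eventually consistent reads, and each update consumes write capacity from the GSI's own provisioned or on-demand throughput. LSIs live in the same partition as the table data, support strongly consistent reads, and consume the base table's capacity. Reads from a GSI are charged to the GSI; reads from an LSI are charged to the table. This separation allows fast queries on different keys but means GSI capacity must be managed independently.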
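How many index writes a single table write triggers depends on what happens to the index key. The sketch below encodes the commonly documented cases for one GSI. It is a simplified model, and the function name `gsi_write_operations` and its parameters are hypothetical. Each index write is then billed in WCUs according to the size of the index entry.

```python
from typing import Optional


def gsi_write_operations(
    old_key: Optional[str],
    new_key: Optional[str],
    projected_changed: bool = False,
) -> int:
    """Index writes caused by one table write, for a single GSI.

    old_key / new_key: value of the GSI key attribute before and after the
    write (None means the attribute is absent, so the item is not indexed).
    projected_changed: True if a projected non-key attribute changed.
    """
    if old_key is None and new_key is None:
        return 0  # item is not in the index before or after: no index cost
    if old_key is None or new_key is None:
        return 1  # one put into the index, or one delete from it
    if old_key != new_key:
        return 2  # delete the old index entry, then put the new one
    # Key unchanged: the index is touched only if projected data changed.
    return 1 if projected_changed else 0


if __name__ == "__main__":
    print(gsi_write_operations(None, "SHIPPED"))       # item enters the index
    print(gsi_write_operations("PENDING", "SHIPPED"))  # index key changes
    print(gsi_write_operations("SHIPPED", "SHIPPED"))  # nothing indexed changed
```

The two-write case is the one that most often surprises people: changing an index key value costs a delete plus a put in that index, on top of the base table write.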

Why designed this way?

This design balances fast query performance with scalability. By separating indexes, DynamoDB can optimize queries on different keys without scanning the main table. The tradeoff is extra storage and capacity cost. Alternatives like scanning the main table would be slower and less scalable, so this design prioritizes speed and flexibility.

                      ┌───────────────┐
                      │   Main Table  │
                      │ (Primary Key) │
                      └───────────────┘
                              │
              ┌───────────────┴───────────────┐
              ▼                               ▼
      ┌───────────────┐               ┌───────────────┐
      │      GSI      │               │      LSI      │
      │ (Separate PK) │               │ (Same PK, Diff│
      │               │               │  Sort Key)    │
      └───────────────┘               └───────────────┘
              │                               │
              ▼                               ▼
      ┌───────────────┐               ┌───────────────┐
      │ Write uses the│               │ Write uses the│
      │ GSI's own WCU │               │ table's WCU   │
      └───────────────┘               └───────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Do you think adding an index does not affect write costs? Commit yes or no.

Common Belief: Adding an index only affects read performance and cost, not writes.

Quick: Do you think LSIs can be added after table creation? Commit yes or no.

Common Belief: You can add Local Secondary Indexes anytime after creating the table.

Quick: Do you think projecting all attributes to an index always improves performance? Commit yes or no.

Common Belief: Projecting all attributes to an index is always better for query speed.

Quick: Do you think DynamoDB automatically shares capacity units between table and indexes? Commit yes or no.

Common Belief: DynamoDB automatically shares capacity units between the main table and its indexes.

Expert Zone

1

Sparse indexes can reduce write costs by only including items with specific attributes, but they require careful query design to avoid missing data.

2

Write amplification means that even a small update to an item can incur several write capacity charges when the changed attributes are keyed or projected in multiple GSIs, which makes cost harder to predict.

3

Adaptive capacity helps hot partitions in indexes get more throughput without manual scaling, but it does not eliminate the need for proper capacity planning.
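To make the first two points concrete, the sketch below counts the writes triggered by inserting a batch of new items into a table with two GSIs, one of them sparse. It is a rough model under stated assumptions: every item and index entry fits in 1 KB (so each write costs one WCU), and the table, attribute, and index names (`by_customer`, `open_orders`, `open_since`) are made up for illustration.

```python
from typing import Dict, List, Tuple


def wcus_for_new_items(
    items: List[dict],
    gsi_keys: Dict[str, Tuple[str, ...]],
) -> Tuple[int, Dict[str, int]]:
    """Return (table WCUs, WCUs per GSI) for inserting brand-new items.

    An item is written to a GSI only if it carries every key attribute of
    that GSI; this is what makes an index sparse.
    Assumes each item and index entry is at most 1 KB (1 WCU per write).
    """
    table_wcus = len(items)
    index_wcus = {
        name: sum(1 for item in items if all(k in item for k in keys))
        for name, keys in gsi_keys.items()
    }
    return table_wcus, index_wcus


# Hypothetical orders: only open orders carry the "open_since" attribute.
orders = [
    {"order_id": "o1", "customer_id": "c1", "open_since": "2024-01-02"},
    {"order_id": "o2", "customer_id": "c1"},
    {"order_id": "o3", "customer_id": "c2"},
    {"order_id": "o4", "customer_id": "c3"},
]
gsis = {"by_customer": ("customer_id",), "open_orders": ("open_since",)}

table_wcus, index_wcus = wcus_for_new_items(orders, gsis)
total_wcus = table_wcus + sum(index_wcus.values())

if __name__ == "__main__":
    print(table_wcus, index_wcus, total_wcus)
```

In this example the dense `by_customer` index doubles the write cost of the batch, while the sparse `open_orders` index adds only one extra write because just one order carries its key attribute.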

When NOT to use

Avoid using many GSIs on write-heavy tables because write amplification can cause high costs and throttling. Instead, consider denormalizing data or using single-table design patterns. For infrequent queries that can tolerate higher latency, an occasional Scan with a filter may cost less than maintaining an index on every write.

Production Patterns

In production, teams often limit GSIs to essential queries, use sparse indexes to reduce size, and enable auto scaling to handle traffic spikes. Monitoring index usage and capacity consumption is standard to optimize cost and performance.
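Auto scaling for a GSI is configured through Application Auto Scaling, separately from the table. The sketch below only builds the request parameters as plain dictionaries so it runs without AWS access. The helper name, the policy name format, and the 70% target are illustrative assumptions; the dictionary keys follow the `register_scalable_target` and `put_scaling_policy` operations.

```python
from typing import Tuple


def gsi_write_autoscaling_requests(
    table_name: str,
    index_name: str,
    min_wcu: int,
    max_wcu: int,
    target_utilization: float = 70.0,  # assumed target, tune for your workload
) -> Tuple[dict, dict]:
    """Build parameters for scaling a GSI's write capacity (hypothetical helper)."""
    # A GSI is addressed as a sub-resource of its table.
    resource_id = f"table/{table_name}/index/{index_name}"
    dimension = "dynamodb:index:WriteCapacityUnits"

    scalable_target = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_wcu,
        "MaxCapacity": max_wcu,
    }
    scaling_policy = {
        "PolicyName": f"{table_name}-{index_name}-wcu-target-tracking",
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_utilization,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    }
    return scalable_target, scaling_policy


target, policy = gsi_write_autoscaling_requests("Orders", "by_customer", 5, 100)

# With boto3 installed and credentials configured, these could be applied as:
#   client = boto3.client("application-autoscaling")
#   client.register_scalable_target(**target)
#   client.put_scaling_policy(**policy)
```

Read capacity would need a second pair of requests using the `dynamodb:index:ReadCapacityUnits` dimension and the `DynamoDBReadCapacityUtilization` metric. This applies to provisioned-mode tables only; on-demand tables do not use scaling policies.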

Connections

Caching

Builds-on

Understanding index capacity helps decide when to use caching layers to reduce read load and save capacity units.

Cost Optimization

Builds-on

Knowing index costs is key to optimizing cloud spending and designing cost-effective data architectures.

Supply Chain Inventory Management

Analogy in resource allocation

Just like managing warehouse space and restocking costs, managing index capacity balances resource use and cost for efficient operations.

Common Pitfalls

#1 Creating many GSIs without considering write capacity impact.

Wrong approach: Create 5 GSIs on a write-heavy table without adjusting write capacity units or monitoring costs.

Correct approach: Limit GSIs to essential queries, monitor write capacity usage, and enable auto scaling to handle increased load.

Root cause: Not realizing that every GSI containing an item consumes its own write capacity when that item changes, which leads to unexpected throttling and high costs.

#2 Projecting all attributes to indexes by default.

Wrong approach: Define GSIs with ProjectionType = ALL regardless of query needs.

Correct approach: Project only necessary attributes to indexes to reduce size and write costs.

Root cause: Assuming more data in indexes always improves performance leads to higher costs and slower writes.
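The effect of the projection choice on write cost can be estimated by summing the sizes of the attributes that end up in the index. The sketch below is a simplified estimate: the attribute sizes are invented, each size is assumed to cover both attribute name and value, and the helper names are hypothetical.

```python
import math
from typing import Dict, Iterable


def projected_bytes(
    attr_sizes: Dict[str, int],
    key_attrs: Iterable[str],
    projection_type: str,
    include: Iterable[str] = (),
) -> int:
    """Approximate size of one index entry for a given ProjectionType."""
    if projection_type == "ALL":
        names = set(attr_sizes)
    elif projection_type == "INCLUDE":
        names = set(key_attrs) | set(include)
    elif projection_type == "KEYS_ONLY":
        names = set(key_attrs)
    else:
        raise ValueError(f"unknown projection type: {projection_type}")
    return sum(attr_sizes[n] for n in names if n in attr_sizes)


def wcus_per_index_write(entry_bytes: int) -> int:
    """One WCU per 1 KB of index entry, rounded up."""
    return max(1, math.ceil(entry_bytes / 1024))


# Hypothetical item: small keys and status, one large payload attribute.
sizes = {"order_id": 40, "customer_id": 40, "status": 20, "payload": 3500}
keys = ("order_id", "customer_id")  # table key plus index key

cost_all = wcus_per_index_write(projected_bytes(sizes, keys, "ALL"))
cost_include = wcus_per_index_write(
    projected_bytes(sizes, keys, "INCLUDE", include=("status",))
)
cost_keys_only = wcus_per_index_write(projected_bytes(sizes, keys, "KEYS_ONLY"))

if __name__ == "__main__":
    print(cost_all, cost_include, cost_keys_only)
```

With these made-up sizes, projecting everything makes each index write cost 4 WCUs, while `KEYS_ONLY` or a narrow `INCLUDE` keeps it at 1 WCU. The trade-off is that queries needing unprojected attributes must fetch them from the base table.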

#3 Trying to add LSIs after table creation.

Wrong approach: Attempt to add an LSI to an existing table using update commands.

Correct approach: Define LSIs only during table creation or redesign the table if needed.

Root cause: Not knowing LSIs are fixed at table creation causes wasted effort and delays.
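Because an LSI has to be declared up front, it belongs in the table creation request itself. The sketch below builds such a request as a plain dictionary shaped like the `CreateTable` parameters. The helper name, the example table and attribute names, the string attribute types, and the on-demand billing mode are all illustrative assumptions.

```python
def create_table_request_with_lsi(
    table_name: str,
    partition_key: str,
    sort_key: str,
    lsi_name: str,
    lsi_sort_key: str,
) -> dict:
    """Build CreateTable parameters that declare one LSI (hypothetical helper)."""
    return {
        "TableName": table_name,
        # Every attribute used in any key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},
            {"AttributeName": sort_key, "AttributeType": "S"},
            {"AttributeName": lsi_sort_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
            {"AttributeName": sort_key, "KeyType": "RANGE"},
        ],
        "LocalSecondaryIndexes": [
            {
                "IndexName": lsi_name,
                # Same partition key as the table, different sort key.
                "KeySchema": [
                    {"AttributeName": partition_key, "KeyType": "HASH"},
                    {"AttributeName": lsi_sort_key, "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


request = create_table_request_with_lsi(
    "Orders", "customer_id", "order_id", "by_order_date", "order_date"
)

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**request)
```

Note that the LSI definition carries no throughput settings of its own, which reflects the fact that it draws on the table's capacity.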

Key Takeaways

DynamoDB indexes improve query speed but consume separate read and write capacity units, affecting cost and performance.

Global Secondary Indexes (GSIs) have their own throughput and consume write capacity whenever a table write changes an item they contain, while Local Secondary Indexes (LSIs) share the table's partition key and draw on the table's capacity.

Item size and projected attributes directly impact index capacity consumption; careful design reduces unnecessary costs.

Auto scaling and adaptive capacity help manage index capacity dynamically but do not replace thoughtful capacity planning.

Misunderstanding index capacity leads to unexpected costs, throttling, and poor application performance.