
Local Secondary Index (LSI) concept in DynamoDB - Deep Dive

Overview - Local Secondary Index (LSI) concept
What is it?
A Local Secondary Index (LSI) in DynamoDB is a way to create an alternate view of your data within the same partition key but with a different sort key. It allows you to query data using the same partition key but sorted or filtered differently. This helps you find related items quickly without scanning the entire table.
Why it matters
Without an LSI, getting a different ordering of one partition's items means reading every item under that partition key and sorting or filtering afterwards, or maintaining a second table. Both approaches cost extra read capacity and extra code. An LSI lets DynamoDB return the items already ordered by the alternate sort key, so a query reads only the matching range while the data stays organized under the same partition key.
Where it fits
Before learning about LSIs, you should understand DynamoDB tables, partition keys, and sort keys. After LSIs, you can learn about Global Secondary Indexes (GSIs), which allow indexing on different partition keys, and advanced query optimization techniques.
Mental Model
Core Idea
A Local Secondary Index lets you look at the same group of items (same partition key) but sorted or filtered by a different attribute.
Think of it like...
Imagine a filing cabinet drawer labeled with a category (partition key). Inside, you have folders sorted by date (sort key). An LSI is like adding a second way to sort those same folders inside the drawer, maybe by client name instead of date, without moving them to a new drawer.
┌──────────────────────────────────┐
│ DynamoDB Table                   │
│ Partition Key: UserID            │
│ Sort Key: OrderDate              │
│                                  │
│ Items:                           │
│ UserID=123, OrderDate=2023-01-01 │
│ UserID=123, OrderDate=2023-02-01 │
│                                  │
│ Local Secondary Index (LSI)      │
│ Partition Key: UserID            │
│ Sort Key: OrderStatus            │
│                                  │
│ Same items, different sort       │
└──────────────────────────────────┘
Build-Up - 7 Steps
1. Foundation: Understanding DynamoDB Keys
Concept: Learn what partition keys and sort keys are in DynamoDB tables.
In DynamoDB, each item is uniquely identified by a partition key and optionally a sort key. The partition key decides which storage partition the item goes to. The sort key orders items within the same partition key. Together, they form the primary key.
Result
You can uniquely identify and organize data in DynamoDB using partition and sort keys.
Understanding keys is essential because LSIs rely on the partition key staying the same while changing the sort key.
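To make the roles of the two keys concrete, here is a toy in-memory model in Python. It is only an illustration, not how DynamoDB is implemented: the class name `ToyTable` and the sample items are hypothetical, while the attribute names `UserID`, `OrderDate`, and `OrderStatus` follow the example used on this page.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: items grouped by partition key, ordered by sort key."""

    def __init__(self):
        # partition key value -> {sort key value -> item}
        self._partitions = defaultdict(dict)

    def put(self, item, pk="UserID", sk="OrderDate"):
        # The (partition key, sort key) pair is the primary key,
        # so writing the same pair again replaces the earlier item.
        self._partitions[item[pk]][item[sk]] = item

    def query(self, pk_value):
        # Items under one partition key come back ordered by sort key.
        partition = self._partitions.get(pk_value, {})
        return [partition[key] for key in sorted(partition)]


table = ToyTable()
table.put({"UserID": "123", "OrderDate": "2023-02-01", "OrderStatus": "Shipped"})
table.put({"UserID": "123", "OrderDate": "2023-01-01", "OrderStatus": "Pending"})
# Same UserID and OrderDate as the previous item: this replaces it.
table.put({"UserID": "123", "OrderDate": "2023-01-01", "OrderStatus": "Delivered"})

orders = table.query("123")
```

After these three writes the partition for user 123 holds two items, returned in `OrderDate` order, because the third write reused an existing primary key.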
2. Foundation: What is an Index in DynamoDB?
Concept: Indexes let you query data in different ways without scanning the whole table.
An index is like a shortcut to find data faster. Instead of searching every item, you use an index to quickly locate items based on certain attributes. DynamoDB supports two types: Local Secondary Index (LSI) and Global Secondary Index (GSI).
Result
You can query data efficiently using indexes instead of scanning the entire table.
Knowing what an index is helps you understand why LSIs exist and how they improve query speed.
3. Intermediate: Defining Local Secondary Index (LSI)
Concept: LSI uses the same partition key but a different sort key to create an alternate view of data.
You define an LSI when you want to query items that share a partition key but need them sorted or filtered by a different attribute. For example, your table's sort key might be OrderDate, while your LSI's sort key could be OrderStatus. This lets you find one user's orders by status.
Result
You can query the same partition key with multiple sort keys, enabling flexible data retrieval.
Understanding that LSIs share the partition key but differ in sort keys clarifies their unique role compared to GSIs.
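As a sketch of what such a definition could look like, the Python function below builds the parameters for a DynamoDB `CreateTable` request with one LSI. The table name `Orders` and index name `StatusIndex` are the ones used in the snippets on this page. The string attribute types, the `ALL` projection, and on-demand billing are assumptions made for the example. The code only builds a dictionary, so it runs without AWS access.

```python
def orders_table_definition():
    """Build CreateTable parameters for an Orders table with one LSI."""
    return {
        "TableName": "Orders",
        # Every attribute used in any key schema must be declared here.
        "AttributeDefinitions": [
            {"AttributeName": "UserID", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "OrderStatus", "AttributeType": "S"},
        ],
        # Base table: partition key UserID, sort key OrderDate.
        "KeySchema": [
            {"AttributeName": "UserID", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        # LSI: same partition key, alternate sort key OrderStatus.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "StatusIndex",
                "KeySchema": [
                    {"AttributeName": "UserID", "KeyType": "HASH"},
                    {"AttributeName": "OrderStatus", "KeyType": "RANGE"},
                ],
                # Assumption: copy all attributes into the index.
                "Projection": {"ProjectionType": "ALL"},
            }
        ],
        # Assumption: on-demand billing keeps the example short.
        "BillingMode": "PAY_PER_REQUEST",
    }


params = orders_table_definition()
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").create_table(**params)
```

Note that the index's `HASH` key repeats the table's partition key; only the `RANGE` key changes.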
4. Intermediate: Querying with Local Secondary Index
Before reading on: do you think you can query an LSI without specifying the partition key? Commit to your answer.
Concept: Queries on LSIs require the partition key and can use the alternate sort key for filtering or sorting.
When querying an LSI, you must provide the partition key value. You can then filter or sort results using the LSI's sort key. This is different from GSIs, where you can query by different partition keys.
Result
Queries on LSIs return items with the same partition key but ordered or filtered by the LSI's sort key.
Knowing that partition key is mandatory for LSI queries prevents common mistakes and clarifies their use case.
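A query against the index might be assembled as in the Python sketch below. It builds the parameters of a DynamoDB `Query` request using the low-level typed value format (`{"S": ...}`). The helper name `build_status_query` is hypothetical, and the `Orders` table and `StatusIndex` index are assumed to exist with string keys. The optional `ConsistentRead` flag is included because LSI queries, unlike GSI queries, may request strongly consistent reads.

```python
def build_status_query(user_id, status, consistent_read=False):
    """Return Query parameters for the StatusIndex LSI on the Orders table."""
    return {
        "TableName": "Orders",
        "IndexName": "StatusIndex",
        # The partition key condition is mandatory.
        # The condition on the LSI sort key is optional and narrows the range.
        "KeyConditionExpression": "UserID = :user AND OrderStatus = :status",
        "ExpressionAttributeValues": {
            ":user": {"S": user_id},
            ":status": {"S": status},
        },
        "ConsistentRead": consistent_read,
    }


params = build_status_query("123", "Shipped", consistent_read=True)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```

Dropping the `UserID = :user` part of the key condition would make the request invalid, which is the mistake the question above is probing.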
5. Intermediate: LSI Storage and Size Limits
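Concept: LSI entries are stored in the same partition as the base table items they index, and the combined size per partition key value is capped.
An LSI is not free storage. For every indexed item, DynamoDB keeps a copy of the index key attributes and any projected attributes, and you are billed for that storage. These index entries are kept together with the base table items that share the same partition key value; the whole group is called an item collection. On a table with LSIs, an item collection (base table items plus all of their LSI entries) cannot exceed 10 GB. This limit affects how you design your data model.
Result
You must design your data so that no item collection, meaning all items and index entries under one partition key value, exceeds 10 GB when using LSIs.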
Understanding storage sharing and size limits helps avoid performance issues and errors in production.
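One way to watch for this limit is to ask DynamoDB for item collection metrics on writes. The Python sketch below builds a `PutItem` request with `ReturnItemCollectionMetrics` set to `SIZE` and inspects the `SizeEstimateRangeGB` field of a response. The helper names, the 80% warning threshold, and the sample response are assumptions for illustration; the sample is hand-written, not real service output.

```python
ITEM_COLLECTION_LIMIT_GB = 10.0


def put_order_request(item):
    """Build PutItem parameters that ask for an item collection size estimate."""
    return {
        "TableName": "Orders",
        "Item": item,
        # Only has an effect on tables that have local secondary indexes.
        "ReturnItemCollectionMetrics": "SIZE",
    }


def near_limit(response, warn_fraction=0.8):
    """True if the upper size estimate reaches warn_fraction of the 10 GB limit."""
    metrics = response.get("ItemCollectionMetrics")
    if not metrics:
        return False  # no metrics returned, nothing to check
    lower_gb, upper_gb = metrics["SizeEstimateRangeGB"]
    return upper_gb >= ITEM_COLLECTION_LIMIT_GB * warn_fraction


request = put_order_request(
    {
        "UserID": {"S": "123"},
        "OrderDate": {"S": "2023-03-01"},
        "OrderStatus": {"S": "Pending"},
    }
)

# Hand-written sample shaped like a PutItem response (not real output).
sample_response = {
    "ItemCollectionMetrics": {
        "ItemCollectionKey": {"UserID": {"S": "123"}},
        "SizeEstimateRangeGB": [8.0, 9.0],
    }
}
warn = near_limit(sample_response)
```

The size is reported as an estimated range rather than an exact figure, so the check uses the upper bound to err on the side of warning early.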
6. Advanced: LSI vs GSI: Key Differences
Before reading on: do you think LSIs and GSIs can have different partition keys? Commit to your answer.
Concept: LSIs share the partition key with the base table; GSIs can have different partition keys and sort keys.
LSIs are limited to the base table's partition key but allow alternate sort keys. GSIs allow completely different partition and sort keys, enabling more flexible queries but with separate storage and eventual consistency.
Result
You choose LSIs for alternate sorting within the same partition; GSIs for different partition keys and more query flexibility.
Knowing the fundamental difference guides you to pick the right index type for your use case.
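The partition key rule can be expressed as a small check over key schemas. The Python sketch below is a hypothetical helper, not part of any AWS SDK: it tests whether an index key schema could be an LSI for a given table, and it covers only the key rules described on this page. `ProductID` is an invented attribute used to show a schema that would need a GSI.

```python
def key_attribute(key_schema, key_type):
    """Return the attribute name with the given KeyType ('HASH' or 'RANGE'), or None."""
    for element in key_schema:
        if element["KeyType"] == key_type:
            return element["AttributeName"]
    return None


def can_be_lsi(table_schema, index_schema):
    """True if the index key schema satisfies the LSI key rules checked here."""
    return (
        # The table itself needs a partition key and a sort key.
        key_attribute(table_schema, "RANGE") is not None
        # The index must reuse the table's partition key.
        and key_attribute(index_schema, "HASH") == key_attribute(table_schema, "HASH")
        # The index needs its own sort key.
        and key_attribute(index_schema, "RANGE") is not None
    )


table_schema = [
    {"AttributeName": "UserID", "KeyType": "HASH"},
    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
]
by_status = [
    {"AttributeName": "UserID", "KeyType": "HASH"},
    {"AttributeName": "OrderStatus", "KeyType": "RANGE"},
]
# ProductID is a hypothetical attribute: a different partition key needs a GSI.
by_product = [
    {"AttributeName": "ProductID", "KeyType": "HASH"},
    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
]

status_ok = can_be_lsi(table_schema, by_status)
product_ok = can_be_lsi(table_schema, by_product)
```

Here `status_ok` is true and `product_ok` is false: the second schema changes the partition key, so it can only be built as a GSI.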
7. Expert: LSI Impact on Write Operations
Before reading on: do you think adding an LSI affects write throughput? Commit to your answer.
Concept: LSIs increase write costs because every write must update the base table and all LSIs synchronously.
When you write or update an item, DynamoDB updates the base table and every affected LSI as part of the same write. An LSI has no throughput of its own, so these index updates consume write capacity from the base table, and each additional LSI can raise the cost and latency of a write. Planning the number of LSIs is important to balance query flexibility and write performance.
Result
Writes become more expensive and slower with more LSIs, so use them judiciously.
Understanding the write cost impact prevents overusing LSIs and helps optimize performance and cost.
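The extra cost depends on what a write does to each index entry. The Python function below is a simplified reading of DynamoDB's documented rules for LSI write costs, not an official calculator. It counts index write operations for a single LSI and assumes every index entry is at most 1 KB (larger entries consume more capacity). The function name and parameters are hypothetical; `None` stands for "the item has no value for the index sort key".

```python
def lsi_write_ops(old_sort_value, new_sort_value, projected_changed=False):
    """Estimate index write operations on one LSI for a single table write.

    old_sort_value / new_sort_value: the item's value for the LSI sort key
    before and after the write, or None if the attribute is absent.
    projected_changed: whether a projected non-key attribute changed.
    """
    if old_sort_value is None and new_sort_value is None:
        return 0  # item is not in the index before or after the write
    if old_sort_value is None:
        return 1  # put a new entry into the index
    if new_sort_value is None:
        return 1  # delete the old entry from the index
    if old_sort_value != new_sort_value:
        return 2  # delete the old entry, then put the new one
    # Index key unchanged: a write is needed only if projected data changed.
    return 1 if projected_changed else 0


new_order = lsi_write_ops(None, "Pending")
status_change = lsi_write_ops("Pending", "Shipped")
status_removed = lsi_write_ops("Shipped", None)
```

Changing the indexed attribute is the most expensive case in this model, because the old entry has to be removed before the new one is written.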
Under the Hood
For each LSI, DynamoDB maintains index entries, ordered by the alternate sort key, for the items that share a partition key value. Each entry holds the index key attributes plus any projected attributes. When an item is written, DynamoDB updates the base table and all affected LSI entries together within that partition, so a strongly consistent query on an LSI reflects the latest write.
Why designed this way?
LSIs were designed to provide strongly consistent, fast queries on alternate sort keys by keeping index entries in the same partition as the items they index. This co-location is what makes it practical to update the index together with the base item; the trade-off is the 10 GB limit on each item collection. GSIs, in contrast, are stored and scaled separately from the table and are eventually consistent.
┌───────────────┐
│ Write Request │
└──────┬────────┘
       │
       ▼
┌─────────────────────────────┐
│ Partition (UserID=123)      │
│ ┌──────────────────┐        │
│ │ Base Table       │        │
│ │ Sort Key: Date   │        │
│ └──────────────────┘        │
│ ┌──────────────────┐        │
│ │ LSI              │        │
│ │ Sort Key: Status │        │
│ └──────────────────┘        │
└─────────────────────────────┘
       ▲
       │
┌──────┴────────┐
│ Atomic Update │
└───────────────┘
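The flow in the diagram can be mimicked with a toy Python model. This is an illustration only, not DynamoDB's actual storage engine: the class `ToyPartition` is hypothetical and holds the items for a single partition key value, keeping a base view ordered by `OrderDate` and an LSI view ordered by `OrderStatus` that are updated in the same `put` call.

```python
class ToyPartition:
    """Toy model of one partition key value with a base view and an LSI view."""

    def __init__(self):
        self.base = {}  # OrderDate -> item (base table sort key)
        self.lsi = []   # sorted (OrderStatus, OrderDate) index entries

    def put(self, item):
        # Remove the stale index entry if this write replaces an item.
        old = self.base.get(item["OrderDate"])
        if old is not None and "OrderStatus" in old:
            self.lsi.remove((old["OrderStatus"], old["OrderDate"]))
        # Update the base view and the index view in the same step.
        self.base[item["OrderDate"]] = item
        # Items without the index sort key do not appear in the index.
        if "OrderStatus" in item:
            self.lsi.append((item["OrderStatus"], item["OrderDate"]))
            self.lsi.sort()

    def query_base(self):
        return [self.base[date] for date in sorted(self.base)]

    def query_lsi(self, status):
        return [self.base[date] for s, date in self.lsi if s == status]


partition = ToyPartition()
partition.put({"OrderDate": "2023-01-01", "OrderStatus": "Pending"})
partition.put({"OrderDate": "2023-02-01", "OrderStatus": "Shipped"})
# Updating the first order's status moves its entry within the index.
partition.put({"OrderDate": "2023-01-01", "OrderStatus": "Shipped"})

shipped = partition.query_lsi("Shipped")
pending = partition.query_lsi("Pending")
```

Because both views change inside one `put`, a read of the index right after the write already sees the new status, which is the behavior the diagram labels as an atomic update.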
Myth Busters - 4 Common Misconceptions
Quick: Can you query an LSI without specifying the partition key? Commit yes or no.
Common Belief: You can query an LSI using only the alternate sort key without the partition key.
Reality: LSI queries always require the partition key; you cannot query an LSI without it.
Why it matters: Trying to query without the partition key leads to errors and confusion about how LSIs work.
Quick: Do LSIs add to the storage your table uses? Commit yes or no.
Common Belief: An LSI is only a different sort order over the same stored items, so it adds no storage.
Reality: Each LSI stores its key attributes plus any projected attributes for every indexed item. These entries are kept in the same item collection as the base table items, count toward the 10 GB limit, and are billed as storage.
Why it matters: Misunderstanding storage leads to wrong assumptions about costs and size limits.
Quick: Does adding more LSIs have no effect on write performance? Commit yes or no.
Common Belief: Adding LSIs does not affect write throughput or latency.
Reality: Each LSI adds overhead to writes because updates must be applied atomically to all LSIs.
Why it matters: Ignoring write cost impact can cause unexpected slowdowns and higher costs.
Quick: Can LSIs have different partition keys than the base table? Commit yes or no.
Common Belief: LSIs can have different partition keys from the base table.
Reality: LSIs must use the same partition key as the base table; only the sort key differs.
Why it matters: Confusing LSIs with GSIs leads to wrong data modeling and query design.
Expert Zone
1. LSI queries can request strongly consistent reads because index entries are updated together with the base item in the same partition. GSI reads are always eventually consistent.
2. The 10 GB limit applies to each item collection, including its LSI entries. Once a collection reaches the limit, writes that would grow it fail with ItemCollectionSizeLimitExceededException, so large partitions need careful design.
3. LSIs cannot be added after table creation; they must be defined at table creation time, requiring careful upfront planning.
When NOT to use
Avoid LSIs when you need to query by a different partition key, need more than five LSIs on one table, or need to add an index to a table that already exists. Use Global Secondary Indexes (GSIs) in those cases, accepting that their reads are eventually consistent.
Production Patterns
In production, LSIs are used to support multiple query patterns on the same partition key, such as sorting user orders by date and status. Teams carefully limit LSIs to avoid write bottlenecks and size limits, often combining LSIs with GSIs for full query flexibility.
Connections
Global Secondary Index (GSI)
LSIs and GSIs are both DynamoDB indexing methods but differ in partition key usage and consistency.
Understanding LSIs clarifies why GSIs exist and when to use each for efficient querying.
Database Normalization
LSIs deliberately copy key and projected attributes into the index, which contrasts with normalization's goal of storing each fact once.
Knowing that DynamoDB maintains these copies for you explains why this redundancy is safe, and why it still costs storage and write capacity.
Library Cataloging Systems
Like LSIs, library catalogs index books by author and title under the same category, enabling multiple search paths.
Seeing LSIs as multiple indexes in a catalog helps grasp their purpose in organizing data efficiently.
Common Pitfalls
#1 Querying an LSI without specifying the partition key.
Wrong approach: Query({ TableName: 'Orders', IndexName: 'StatusIndex', KeyConditionExpression: 'OrderStatus = :status', ExpressionAttributeValues: { ':status': 'Shipped' } })
Correct approach: Query({ TableName: 'Orders', IndexName: 'StatusIndex', KeyConditionExpression: 'UserID = :user and OrderStatus = :status', ExpressionAttributeValues: { ':user': '123', ':status': 'Shipped' } })
Root cause: Misunderstanding that LSIs require the partition key for queries.
#2 Adding LSIs after table creation.
Wrong approach: UpdateTable({ TableName: 'Orders', AttributeDefinitions: [...], LocalSecondaryIndexes: [...] })
Correct approach: Define LSIs only during the CreateTable operation, not UpdateTable.
Root cause: Not knowing LSIs must be created with the table, unlike GSIs.
#3 Ignoring the 10 GB item collection size limit with LSIs.
Wrong approach: Designing a table with many large items under one partition key and multiple LSIs without size checks.
Correct approach: Distribute data so that the items and index entries under each partition key value stay below 10 GB, or avoid LSIs for large partitions.
Root cause: Lack of awareness that the size limit covers base table items and LSI entries together.
Key Takeaways
Local Secondary Indexes let you query the same partition key with different sort keys for flexible data views.
LSIs require the partition key in queries. Their entries are stored in the same partition as the base table items, which adds storage but allows strongly consistent reads.
They must be defined when creating the table and have a 10 GB size limit per partition key including all LSIs.
Adding LSIs increases write costs because updates apply atomically to the base table and all LSIs.
Choosing between LSIs and GSIs depends on your query needs: LSIs for alternate sorting within partitions, GSIs for different partition keys.