0
0
DynamoDBquery~15 mins

Adjacency list pattern in DynamoDB - Deep Dive

Choose your learning style9 modes available
Overview - Adjacency list pattern
What is it?
The adjacency list pattern is a way to represent hierarchical data, like trees or graphs, inside a database. Each item stores a reference to its parent, creating a chain of connections. This pattern helps organize data with parent-child relationships, such as categories or organizational charts. It is simple and works well for many use cases.
Why it matters
Without a way to represent hierarchies, databases would struggle to model real-world relationships like company structures or folder trees. The adjacency list pattern solves this by linking items through parent references, making it easy to find related data. Without it, applications would need complex workarounds, slowing down development and making data harder to understand.
Where it fits
Before learning this, you should understand basic database tables and how to store simple records. After this, you can explore other hierarchical patterns like materialized paths or nested sets, which optimize querying. This pattern is a foundation for working with trees and graphs in databases.
Mental Model
Core Idea
Each item points to its parent, forming a chain that builds a hierarchy.
Think of it like...
Imagine a family tree where each person knows who their parent is, but not necessarily their children. By following parent links, you can trace back to the root ancestor.
Root
  │
  ├─ Child A
  │    ├─ Grandchild A1
  │    └─ Grandchild A2
  └─ Child B
       └─ Grandchild B1

Each node stores a pointer to its parent node.
Build-Up - 7 Steps
1
FoundationBasic parent reference concept
🤔
Concept: Learn how each item stores a reference to its parent to form a hierarchy.
In the adjacency list pattern, every record has a field called 'parentId' that holds the ID of its parent item. The root item has a null or special value indicating no parent. For example, a category might have parentId = null if it's top-level, or parentId = 5 if its parent has ID 5.
Result
You can see a chain of items connected by their parentId fields, forming a tree structure.
Understanding that each item only needs to know its parent simplifies storing complex hierarchies.
2
FoundationStoring adjacency list in DynamoDB
🤔
Concept: How to represent the adjacency list pattern using DynamoDB tables and attributes.
In DynamoDB, each item is stored with a unique primary key (like 'id') and an attribute 'parentId' that points to its parent. For example: { "id": "3", "name": "Child A", "parentId": "1" } The root item might have parentId set to null or omitted. }
Result
A DynamoDB table where each item knows its parent, enabling hierarchical queries.
Knowing how to map the adjacency list to DynamoDB's key-value model is essential for practical use.
3
IntermediateQuerying children of a parent
🤔Before reading on: Do you think you can find all children of a parent with a single query in DynamoDB? Commit to yes or no.
Concept: Learn how to find all items that have a specific parentId, i.e., the children of a node.
To find children of a parent, you query the table for items where 'parentId' equals the parent's ID. In DynamoDB, this requires a secondary index on 'parentId' because it is not the primary key. For example, create a Global Secondary Index (GSI) with 'parentId' as the partition key, then query the GSI with the parent's ID.
Result
You get a list of all direct children of the given parent node.
Understanding the need for a secondary index to efficiently find children is key to performant queries.
4
IntermediateNavigating up the hierarchy
🤔Before reading on: Can you find an item's ancestors by querying DynamoDB multiple times? Commit to yes or no.
Concept: Learn how to find the chain of parents (ancestors) by following parentId links upwards.
To find ancestors, start from an item and repeatedly query DynamoDB for the item whose 'id' matches the current item's 'parentId'. Repeat until you reach the root (parentId null). This requires multiple queries, one per level up the tree.
Result
You get the full path from the item up to the root ancestor.
Knowing that upward traversal requires multiple queries helps plan efficient data access.
5
IntermediateLimitations of adjacency list pattern
🤔Before reading on: Do you think adjacency list pattern is efficient for deep tree queries? Commit to yes or no.
Concept: Understand the challenges when querying deep or complex hierarchies with adjacency lists.
Because each item only knows its parent, finding all descendants or deep ancestors requires multiple queries or recursive logic outside DynamoDB. This can be slow and costly. Also, DynamoDB does not support joins, so complex tree queries need extra work in application code.
Result
You realize adjacency lists are simple but have performance trade-offs for deep queries.
Recognizing these limits guides when to use adjacency lists versus other patterns.
6
AdvancedOptimizing adjacency list with denormalization
🤔Before reading on: Can adding extra fields reduce query complexity in adjacency lists? Commit to yes or no.
Concept: Learn how to add extra attributes to items to speed up queries at the cost of data duplication.
You can store extra data like 'path' or 'ancestorIds' in each item to quickly find descendants or ancestors without multiple queries. For example, store a list of all ancestor IDs in an attribute. This denormalization speeds up reads but requires careful updates when the hierarchy changes.
Result
Queries become faster and simpler, but data maintenance becomes more complex.
Understanding trade-offs between query speed and data consistency is crucial for production systems.
7
ExpertHandling cycles and data integrity
🤔Before reading on: Do you think adjacency lists prevent cycles automatically? Commit to yes or no.
Concept: Explore how to detect and prevent cycles in adjacency lists to maintain valid hierarchies.
Adjacency lists do not inherently prevent cycles (where an item is its own ancestor). Applications must enforce rules to avoid cycles, such as checking ancestors before updating parentId. In DynamoDB, this requires extra queries and logic, as the database does not enforce foreign key constraints.
Result
You maintain a valid tree structure without loops, avoiding infinite loops in queries.
Knowing that data integrity is the application's responsibility prevents subtle bugs and data corruption.
Under the Hood
The adjacency list pattern stores each item's parent ID, creating a linked structure. DynamoDB stores items as key-value pairs, so each item has a unique key and attributes including 'parentId'. Queries for children use secondary indexes on 'parentId'. Upward traversal requires multiple queries following parentId links. DynamoDB does not support joins or recursive queries, so the application handles hierarchy logic.
Why designed this way?
This pattern is simple and flexible, requiring minimal data per item. It fits well with DynamoDB's key-value model and scales horizontally. Alternatives like nested sets or materialized paths optimize queries but add complexity. The adjacency list was chosen historically for its simplicity and ease of updates.
┌─────────┐       ┌─────────┐       ┌─────────┐
│  Item 1 │◄──────│  Item 2 │◄──────│  Item 3 │
│ id=1    │       │ id=2    │       │ id=3    │
│ parentId│ null  │ parentId│ 1     │ parentId│ 2
└─────────┘       └─────────┘       └─────────┘

Query children: find items where parentId = given id
Query parent: find item where id = parentId of current
Myth Busters - 3 Common Misconceptions
Quick: Does adjacency list pattern automatically prevent cycles in the hierarchy? Commit to yes or no.
Common Belief:The adjacency list pattern ensures no cycles because each item has only one parent.
Tap to reveal reality
Reality:Adjacency lists do not prevent cycles; an item can be set as a parent of its ancestor, creating loops.
Why it matters:Cycles cause infinite loops in queries and corrupt the hierarchy, leading to application crashes or wrong data.
Quick: Can you get all descendants of a node with a single DynamoDB query using adjacency lists? Commit to yes or no.
Common Belief:You can retrieve all descendants of a node with one query by filtering on parentId.
Tap to reveal reality
Reality:You can only get direct children in one query; retrieving all descendants requires multiple queries or extra data.
Why it matters:Assuming single-query descendant retrieval leads to inefficient or incorrect implementations.
Quick: Is adjacency list pattern always the best choice for hierarchical data? Commit to yes or no.
Common Belief:Adjacency list is the best and simplest pattern for all hierarchical data needs.
Tap to reveal reality
Reality:Adjacency list is simple but can be inefficient for deep or complex queries; other patterns like materialized paths may be better.
Why it matters:Using adjacency lists blindly can cause performance problems in large or deep hierarchies.
Expert Zone
1
Secondary indexes on 'parentId' are critical for performance but add cost and complexity in DynamoDB.
2
Denormalizing ancestor paths speeds up queries but requires careful update logic to maintain consistency.
3
DynamoDB's lack of join and recursive query support means application logic must handle hierarchy traversal.
When NOT to use
Avoid adjacency lists when you need frequent queries for all descendants or deep ancestors. Instead, use materialized path or nested set patterns that store path information for faster queries.
Production Patterns
In production, adjacency lists are combined with GSIs on 'parentId' for child queries and denormalized ancestor lists for upward traversal. Applications implement cycle detection and update ancestor paths on hierarchy changes.
Connections
Graph theory
Adjacency list pattern is a direct application of adjacency lists in graph theory.
Understanding graph adjacency lists helps grasp how database items link to parents, enabling traversal algorithms.
Filesystem directory trees
Both represent hierarchical data with parent-child relationships.
Knowing how files and folders link to parents clarifies how adjacency lists model real-world hierarchies.
Linked lists (data structures)
Adjacency lists use parent pointers similar to linked list nodes pointing to next nodes.
Recognizing adjacency lists as linked structures helps understand traversal and update operations.
Common Pitfalls
#1Trying to get all descendants with a single query on parentId.
Wrong approach:Query DynamoDB with parentId = rootId expecting all descendants: Query({ IndexName: 'parentId-index', KeyConditionExpression: 'parentId = :rootId', ExpressionAttributeValues: { ':rootId': '1' } })
Correct approach:Query children with parentId = rootId, then recursively query children of those children in application code or use denormalized paths.
Root cause:Misunderstanding that parentId only links to direct children, not all descendants.
#2Not creating a secondary index on parentId for child queries.
Wrong approach:Query DynamoDB table directly on parentId without an index, causing errors or full table scans.
Correct approach:Create a Global Secondary Index on parentId attribute and query that index for efficient child lookups.
Root cause:Not knowing DynamoDB requires indexes for querying non-key attributes.
#3Allowing cycles by setting an item's parentId to its own descendant.
Wrong approach:Update item with id=3 to have parentId=5, where 5 is a child of 3, creating a cycle.
Correct approach:Before updating parentId, check ancestor chain to prevent cycles.
Root cause:Assuming database enforces hierarchy integrity automatically.
Key Takeaways
The adjacency list pattern stores each item's parent ID to build hierarchies simply and flexibly.
DynamoDB requires secondary indexes on parentId to efficiently query children of a node.
Upward traversal in adjacency lists needs multiple queries following parent links, which can be slow.
Denormalization can speed queries but adds complexity in keeping data consistent.
Applications must handle cycle detection and data integrity because adjacency lists do not enforce it.