Overview - Embedding vs referencing

What is it?

Embedding and referencing are two ways to organize related data in Redis, a fast key-value database. Embedding means storing related data together under one key, while referencing means storing each piece under its own key and linking the pieces by key name. The choice determines how many keys a read or an update has to touch.

Why it matters

If you do not choose deliberately between embedding and referencing, data can become slow to read or hard to update. Embedding makes reading related data fast but can cause duplication, while referencing avoids duplication but needs extra lookups to gather the related pieces. Picking the right method for each kind of data improves app speed and simplicity.

Where it fits

Before learning this, you should understand basic Redis data types like strings, hashes, and sets. After this, you can learn about Redis data modeling and performance tuning to build efficient applications.

Mental Model

Core Idea

Embedding stores related data together inside one Redis key, while referencing stores data separately and links them by keys.

Think of it like...

Imagine a photo album: embedding is like putting all photos of a trip in one album page, while referencing is like having separate photo pages linked by a table of contents.

┌───────────────┐       ┌─────────────────┐
│ Embedded Key  │──────▶│ All data inside │
│ (one record)  │       │ one place       │
└───────────────┘       └─────────────────┘

┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Reference Key │──────▶│ Separate Data │       │ Separate Data │
│ (links keys)  │       │ (key 1)       │       │ (key 2)       │
└───────────────┘       └───────────────┘       └───────────────┘

Build-Up - 6 Steps

1

Foundation: Understanding Redis Basic Storage

Concept: Learn how Redis stores simple data types like strings and hashes.

Redis stores data as keys with values. Values can be simple strings or more complex types like hashes, which hold multiple fields and values inside one key.

Result

You can store simple and grouped data with commands like SET and HSET, and read it back with GET and HGETALL.

Knowing Redis data types is essential before deciding how to organize related data using embedding or referencing.

2

Foundation: What Is Embedding in Redis

3

Intermediate: What Is Referencing in Redis

4

Intermediate: Comparing Embedding and Referencing

5

Advanced: Handling Data Consistency with Referencing

6

Expert: Optimizing Redis Models with Hybrid Approaches

Under the Hood

Redis stores data as key-value pairs in memory for fast access. Embedding stores multiple fields inside one key using hashes, which are efficient for grouped data. Referencing stores separate keys and requires multiple lookups to gather related data. Redis does not support joins or foreign keys, so references are managed by the application.
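The difference in lookups can be sketched in a few lines of Python. This is only an illustration: CountingStore is a hypothetical in-memory stand-in, not a real Redis client, and its methods merely imitate HSET, RPUSH, HGETALL, and LRANGE so the example runs without a server. The key names (user:1, user:1:posts, post:post1) follow the naming used elsewhere on this page.

```python
class CountingStore:
    """Hypothetical in-memory stand-in for Redis that counts key lookups."""

    def __init__(self):
        self.keys = {}
        self.lookups = 0

    def hset(self, key, mapping):
        # Imitates HSET: merge fields into the hash stored at `key`.
        self.keys.setdefault(key, {}).update(mapping)

    def rpush(self, key, *values):
        # Imitates RPUSH: append values to the list stored at `key`.
        self.keys.setdefault(key, []).extend(values)

    def hgetall(self, key):
        # Imitates HGETALL: one lookup returns every field of the hash.
        self.lookups += 1
        return dict(self.keys.get(key, {}))

    def lrange_all(self, key):
        # Imitates LRANGE key 0 -1: one lookup returns the whole list.
        self.lookups += 1
        return list(self.keys.get(key, []))


store = CountingStore()

# Embedded: the user's name and address fields live in one hash.
store.hset("user:1", {"name": "Alice", "city": "Paris", "zip": "75001"})

# Referenced: the user keeps a list of post IDs; each post is its own hash.
store.rpush("user:1:posts", "post1", "post2")
store.hset("post:post1", {"content": "Hello"})
store.hset("post:post2", {"content": "World"})

# Reading the embedded profile touches a single key.
profile = store.hgetall("user:1")
embedded_lookups = store.lookups

# Reading the referenced posts touches the ID list plus one key per post.
store.lookups = 0
post_ids = store.lrange_all("user:1:posts")
posts = [store.hgetall(f"post:{pid}") for pid in post_ids]
referenced_lookups = store.lookups

print(embedded_lookups, referenced_lookups)
```

The application, not the store, is what turns the IDs in user:1:posts into post:... key names. That is the sense in which references are managed by the application.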

Why designed this way?

Redis was designed for speed and simplicity, focusing on key-value access. Embedding fits this by grouping related data in one key, minimizing lookups. Referencing allows flexible, normalized data but requires extra commands. This design avoids complex joins to keep Redis fast and scalable.

┌───────────────┐
│ Redis Memory  │
│  (Key-Value)  │
└──────┬────────┘
       │
┌──────▼───────┐          ┌─────────────────┐
│ Embedded Key │─────────▶│ Hash with       │
│ (one record) │          │ multiple fields │
└──────────────┘          └─────────────────┘

┌───────────────┐          ┌───────────────┐          ┌───────────────┐
│ Reference Key │─────────▶│ Separate Key 1│          │ Separate Key 2│
│ (links keys)  │          │ (data part 1) │          │ (data part 2) │
└───────────────┘          └───────────────┘          └───────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Does embedding always use more memory than referencing? Commit yes or no.

Common Belief: Embedding always uses more memory because it duplicates data.

Quick: Does referencing automatically keep data consistent in Redis? Commit yes or no.

Common Belief: Referencing keys in Redis automatically keeps data consistent like foreign keys in SQL.

Quick: Is embedding always faster than referencing? Commit yes or no.

Common Belief: Embedding is always faster because all data is in one key.

Quick: Can you use Redis commands to join referenced data automatically? Commit yes or no.

Common Belief: Redis supports automatic joins between referenced keys like SQL databases.

Expert Zone

1

Embedding small, frequently accessed fields reduces network round-trips and improves latency.

2

Referencing large or rarely changed data avoids duplication and reduces memory usage.

3

Using Redis Lua scripts or pipelines can fetch referenced data in fewer network round-trips.

When NOT to use

Avoid embedding when data grows large or changes independently, as it causes duplication and slow writes. Avoid referencing when you need very fast reads of all related data. If neither fits, use a hybrid model, or a database that supports joins when the relationships are complex.

Production Patterns

In production, developers embed user profile info but reference user-generated content like posts or comments. They use Redis pipelines to batch fetch referenced keys and Lua scripts to maintain consistency during updates.

Connections

Normalization in Relational Databases

Referencing in Redis is similar to normalization, separating data to avoid duplication.

Understanding normalization helps grasp why referencing reduces data duplication but adds complexity.

Caching Strategies

Embedding resembles caching full objects, while referencing is like caching pointers or IDs.

Knowing caching trade-offs clarifies when embedding or referencing improves Redis performance.

Object Composition in Software Design

Embedding is like composing objects with all parts inside, while referencing is like linking separate objects.

Recognizing this helps software developers design data models that match application logic.

Common Pitfalls

#1 Storing all related data in separate keys without linking them.

Wrong approach:
HSET user:1 name 'Alice'
HSET user:1:posts post1 'Hello'
# No key storing list of posts

Correct approach:
HSET user:1 name 'Alice'
LPUSH user:1:posts post1
HSET post:post1 content 'Hello'

Root cause: Not linking referenced data causes inability to find related records.

#2 Embedding large, frequently changing data, causing slow writes.

Wrong approach:
HSET user:1 name 'Alice' posts 'very large JSON string with all posts'

Correct approach:
HSET user:1 name 'Alice'
LPUSH user:1:posts post1
HSET post:post1 content 'Hello'

Root cause: When large data is embedded in one field, every change rewrites the whole value, so updates get slower as the data grows.
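A rough sketch of the write cost behind this pitfall, assuming the embedded version serializes all posts into one JSON string. The post count and sizes are made-up illustrative numbers, and the byte counts ignore Redis protocol overhead.

```python
import json

# 100 existing posts of about 200 bytes each (illustrative sizes).
posts = [{"id": f"post{i}", "content": "x" * 200} for i in range(100)]
new_post = {"id": "post100", "content": "x" * 200}

# Embedded: the posts field holds one JSON string, so adding a post
# means serializing and writing the whole array again.
embedded_bytes = len(json.dumps(posts + [new_post]).encode())

# Referenced: only the new post's content (its own hash) and its ID
# (pushed onto the user's list) are written.
referenced_bytes = len(new_post["content"].encode()) + len(new_post["id"].encode())

print(embedded_bytes, referenced_bytes)
```

The embedded write grows with the number of existing posts, while the referenced write depends only on the size of the new post.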

#3 Assuming Redis enforces reference integrity automatically.

Wrong approach:
DEL user:1
# expecting posts to be deleted automatically

Correct approach:
DEL user:1
DEL user:1:posts
DEL post:post1
# application deletes all related keys

Root cause: Redis lacks foreign key constraints; app must manage consistency.
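The cleanup the application has to do can be sketched as below. The keyspace dict is a hypothetical stand-in for the Redis keyspace, and delete_user is an illustrative helper, not a Redis or client-library function. The point is the order of operations: read the list of post IDs first, then delete the user, the list, and every referenced post.

```python
def delete_user(keyspace, user_id):
    """Delete a user and everything it references; return the keys removed."""
    posts_key = f"user:{user_id}:posts"
    # Collect the referenced post keys before the list itself is deleted.
    doomed = [f"user:{user_id}", posts_key]
    doomed += [f"post:{pid}" for pid in keyspace.get(posts_key, [])]
    for key in doomed:
        keyspace.pop(key, None)
    return doomed


# Stand-in keyspace: hashes are dicts, lists are Python lists.
keyspace = {
    "user:1": {"name": "Alice"},
    "user:1:posts": ["post1", "post2"],
    "post:post1": {"content": "Hello"},
    "post:post2": {"content": "World"},
    "user:2": {"name": "Bob"},
}

removed = delete_user(keyspace, 1)
print(removed)
print(keyspace)
```

Against a real server the same steps would likely run inside a transaction or a Lua script, so that no other client sees the user half-deleted.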

Key Takeaways

Embedding stores related data together inside one Redis key for fast, simple reads.

Referencing stores data separately and links them by keys, reducing duplication but needing extra commands.

Choosing embedding or referencing depends on data size, update frequency, and access patterns.

Redis does not enforce data consistency between referenced keys; applications must handle it.

Hybrid models combining embedding and referencing often provide the best balance in real-world apps.