Overview - Message delivery guarantees

What is it?

Message delivery guarantees describe how a system ensures that messages sent between components arrive reliably. They define whether a message is delivered at most once, at least once, or exactly once, and how lost or duplicated messages are handled. These guarantees let systems communicate without silently losing or repeating information. They are essential in distributed systems, where messages travel over networks that can fail or introduce delays.

Why it matters

Without message delivery guarantees, systems could lose important data or process the same message multiple times, causing errors and inconsistent results. Imagine sending a payment request twice or missing a notification; this could lead to financial loss or user frustration. Guarantees make communication trustworthy and predictable, which is critical for applications like banking, messaging apps, and online shopping.

Where it fits

Before learning message delivery guarantees, you should understand basic networking and distributed systems concepts like message passing and failures. After this, you can explore related topics like consensus algorithms, fault tolerance, and event-driven architectures to build robust systems.

Mental Model

Core Idea

Message delivery guarantees define how a system ensures messages are delivered reliably, without loss or unwanted duplication, despite failures.

Think of it like...

It's like sending a letter through the mail with different options: a regular letter that might get lost (at most once), a letter that is re-sent until you confirm receipt and so may arrive more than once (at least once), or a certified letter that is delivered and recorded exactly one time (exactly once).

┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│  Sender App   │─────▶│  Message Bus  │─────▶│ Receiver App  │
└───────────────┘      └───────────────┘      └───────────────┘
       │                      │                      │
       │                      │                      │
       │◀─────── Delivery Guarantees Control ───────▶│

Delivery Guarantees:
- At most once: message sent once, may be lost
- At least once: message retried until received, may duplicate
- Exactly once: message delivered once, no loss or duplicates
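
The difference between the first two guarantees can be sketched with a toy channel. This is a minimal illustration, not a real messaging API: `ScriptedChannel`, the outcome constants, and both send functions are hypothetical names introduced here, and the channel's failures are scripted in advance so the behavior is repeatable.

```python
from typing import List

# Hypothetical outcomes for a single send attempt.
OK = "ok"              # message arrives and the sender sees the acknowledgment
LOST = "lost"          # message never arrives
ACK_LOST = "ack_lost"  # message arrives but the acknowledgment is lost


class ScriptedChannel:
    """Toy channel whose behavior per attempt is scripted in advance."""

    def __init__(self, outcomes: List[str]) -> None:
        self.outcomes = list(outcomes)
        self.received: List[str] = []  # what the receiver actually got

    def send(self, message: str) -> bool:
        """Attempt one delivery; return True only if an ack comes back."""
        outcome = self.outcomes.pop(0) if self.outcomes else OK
        if outcome != LOST:
            self.received.append(message)
        return outcome == OK


def send_at_most_once(channel: ScriptedChannel, message: str) -> None:
    """Fire and forget: one attempt, no retry, result ignored."""
    channel.send(message)


def send_at_least_once(channel: ScriptedChannel, message: str,
                       max_attempts: int = 5) -> int:
    """Retry until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if channel.send(message):
            return attempt
    raise RuntimeError("no acknowledgment after %d attempts" % max_attempts)


if __name__ == "__main__":
    lossy = ScriptedChannel([LOST])
    send_at_most_once(lossy, "m1")
    print(lossy.received)   # the single attempt was lost

    flaky = ScriptedChannel([ACK_LOST, OK])
    send_at_least_once(flaky, "m1")
    print(flaky.received)   # delivered twice because the first ack was lost
```

The sender cannot tell a lost message from a lost acknowledgment, which is why retrying trades possible loss for possible duplication.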

Build-Up - 7 Steps

1

Foundation: Understanding basic message passing

Concept: Introduce the idea of sending messages between two systems or components.

In distributed systems, components communicate by sending messages. A message is a piece of data sent from one part to another. For example, a user clicks a button, and the app sends a message to the server to save data. This communication can fail due to network issues or crashes.

Result

You understand that messages are the basic unit of communication and that sending them is not always guaranteed to succeed.

Understanding that messages can fail to arrive is the foundation for why delivery guarantees are needed.

2

Foundation: Recognizing message loss and duplication

3

Intermediate: At most once delivery explained

4

Intermediate: At least once delivery explained

5

Intermediate: Exactly once delivery explained

6

Advanced: Techniques to achieve delivery guarantees

7

Expert: Tradeoffs and challenges in real systems

Under the Hood

Message delivery guarantees rely on protocols that track message state between sender and receiver. Senders tag each message with a unique ID and wait for an acknowledgment; if none arrives within a timeout, they resend the message with the same ID. Receivers record the IDs they have processed so that a retried message can be recognized and skipped. Exactly once delivery is usually built on top of this: the receiver uses transactional storage to process a message and record its ID in one atomic step, so a retry never applies the same effect twice.
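
The protocol described above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not a production design: `Message`, `Receiver`, `Sender`, and `flaky_transport` are hypothetical names, the "network" is a plain function call, and the processed-ID set lives in memory, so it would not survive a receiver crash.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class Message:
    payload: str
    # The ID is assigned once and reused on every retry of this message.
    msg_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class Receiver:
    def __init__(self) -> None:
        self.processed_ids: Set[str] = set()
        self.applied: List[str] = []  # effects that were actually applied

    def receive(self, msg: Message) -> bool:
        """Apply the message unless its ID was seen before; always ack."""
        if msg.msg_id not in self.processed_ids:
            self.applied.append(msg.payload)
            self.processed_ids.add(msg.msg_id)
        return True  # acknowledge duplicates too, so the sender stops retrying


class Sender:
    def __init__(self, transport: Callable[[Message], bool],
                 max_attempts: int = 5) -> None:
        self.transport = transport
        self.max_attempts = max_attempts

    def send(self, payload: str) -> int:
        """Send with retries; return the number of attempts it took."""
        msg = Message(payload)
        for attempt in range(1, self.max_attempts + 1):
            if self.transport(msg):  # True means an ack came back
                return attempt
        raise TimeoutError("no acknowledgment for message %s" % msg.msg_id)


def flaky_transport(receiver: Receiver,
                    lose_acks: int) -> Callable[[Message], bool]:
    """Deliver every message but drop the first `lose_acks` acknowledgments."""
    remaining = {"n": lose_acks}

    def transport(msg: Message) -> bool:
        ack = receiver.receive(msg)
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return False  # the ack was lost on the way back
        return ack

    return transport


if __name__ == "__main__":
    receiver = Receiver()
    sender = Sender(flaky_transport(receiver, lose_acks=2))
    attempts = sender.send("charge $10")
    print(attempts, receiver.applied)
```

Here the receiver sees the message three times but applies it once, because every retry carries the same ID.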

Why designed this way?

These guarantees were designed to handle unreliable networks and system failures common in distributed computing. Early systems either lost messages or duplicated them, causing errors. The design balances complexity and reliability: simple systems accept loss, while critical systems invest in complex protocols to ensure correctness. Alternatives like synchronous calls or shared memory were impractical at scale or across networks.

┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│   Sender      │──────▶│  Network/Bus  │──────▶│   Receiver    │
│  (assigns ID) │       │ (may lose/dup)│       │ (checks ID)   │
└───────────────┘       └───────────────┘       └───────────────┘
       │                      ▲                      │
       │                      │                      │
       │◀──── Acknowledgment ──┘                      │
       │                                             │
       │◀──────────── Deduplication check ──────────┘

Myth Busters - 4 Common Misconceptions

Quick: Does 'at most once' guarantee no message loss? Commit yes or no.

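Common Belief: At most once delivery means messages are never lost.

Reality: At most once delivery only rules out duplicates. The message is sent once with no retry, so it can still be lost.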

Quick: Does 'at least once' guarantee no duplicates? Commit yes or no.

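Common Belief: At least once delivery means messages are never duplicated.

Reality: At least once delivery only rules out loss. Retries can deliver the same message more than once, so the receiver must tolerate or filter duplicates.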

Quick: Is exactly once delivery easy and cheap to implement? Commit yes or no.

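Common Belief: Exactly once delivery is simple and has no performance cost.

Reality: Exactly once delivery is the hardest guarantee to provide. It requires mechanisms such as transactions and deduplication, which add complexity and performance cost.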

Quick: Does unique message ID alone guarantee exactly once delivery? Commit yes or no.

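Common Belief: Using unique IDs automatically ensures exactly once delivery.

Reality: Unique IDs only make duplicates detectable. The receiver must also record each processed ID atomically with the message's effect, or a retry can still be applied twice.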

Expert Zone

1

Exactly once delivery often requires idempotent operations on the receiver side to handle rare edge cases.

2

Network partitions can cause split-brain scenarios where duplicate messages appear despite guarantees.

3

Some systems use 'effectively once' delivery, accepting rare duplicates but simplifying design.
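
To illustrate the first point, here is a minimal sketch contrasting an idempotent operation with a non-idempotent one under duplicate delivery. The function names and the `balances` dictionary are hypothetical, chosen only for this example.

```python
from typing import Dict


def apply_increment(balances: Dict[str, int], account: str, amount: int) -> None:
    """Not idempotent: every duplicate delivery changes the result again."""
    balances[account] = balances.get(account, 0) + amount


def apply_set(balances: Dict[str, int], account: str, new_balance: int) -> None:
    """Idempotent: applying the same message twice leaves the same state."""
    balances[account] = new_balance


if __name__ == "__main__":
    incremented = {"alice": 100}
    for _ in range(2):  # the same "add 50" message is delivered twice
        apply_increment(incremented, "alice", 50)

    assigned = {"alice": 100}
    for _ in range(2):  # the same "set to 150" message is delivered twice
        apply_set(assigned, "alice", 150)

    print(incremented["alice"], assigned["alice"])
```

When an operation can be expressed in the idempotent form, a duplicate that slips past deduplication does no harm.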

When NOT to use

Do not reach for the strongest guarantee by default. At most once delivery suits non-critical data where loss is acceptable, such as logging. At least once fits when duplicates can be handled or are less harmful than loss. Reserve exactly once for critical operations like financial transactions, and weigh its performance impact and complexity before adopting it.

Production Patterns

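Real systems combine delivery guarantees with idempotent processing and deduplication caches. For example, Kafka uses offsets together with idempotent, transactional producers for exactly once semantics. Amazon SQS standard queues provide at least once delivery, while its FIFO queues add deduplication within a fixed time window. Payment systems often rely on idempotency keys or distributed transactions so that a retried request is applied only once.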
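
A deduplication cache cannot remember every message ID forever, so it is usually bounded. The sketch below is one possible shape, assuming a time window plus a size cap. `DedupCache` and its parameters are hypothetical and do not reflect how any particular broker implements this. The caller passes in the current time so the example stays deterministic.

```python
from collections import OrderedDict


class DedupCache:
    """Remembers message IDs for `ttl_seconds`, up to `max_entries` IDs."""

    def __init__(self, ttl_seconds: float, max_entries: int = 10_000) -> None:
        self.ttl = ttl_seconds
        self.max_entries = max_entries
        # Insertion order equals time order, assuming `now` never goes backward.
        self._seen: "OrderedDict[str, float]" = OrderedDict()

    def is_duplicate(self, msg_id: str, now: float) -> bool:
        """Return True if msg_id was seen within the window; else record it."""
        self._evict_expired(now)
        if msg_id in self._seen:
            return True
        self._seen[msg_id] = now
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # drop the oldest ID
        return False

    def _evict_expired(self, now: float) -> None:
        while self._seen:
            _, seen_at = next(iter(self._seen.items()))
            if now - seen_at < self.ttl:
                break
            self._seen.popitem(last=False)


if __name__ == "__main__":
    cache = DedupCache(ttl_seconds=300)
    print(cache.is_duplicate("m1", now=0))    # first sighting
    print(cache.is_duplicate("m1", now=100))  # retry inside the window
    print(cache.is_duplicate("m1", now=400))  # retry after the window expired
```

A duplicate that arrives after its ID has been evicted is processed again, which is why a bounded cache is normally paired with idempotent processing.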

Connections

Distributed Consensus

Builds-on

Understanding message delivery guarantees helps grasp how consensus algorithms like Paxos or Raft ensure agreement despite message loss or duplication.

Database Transactions

Shares principles

Exactly once delivery relies on atomic processing similar to database transactions, ensuring operations happen fully or not at all.

Human Communication

Analogous pattern

Message delivery guarantees mirror how humans confirm messages by repeating or acknowledging to avoid misunderstandings.

Common Pitfalls

#1 Ignoring duplicate messages in at least once delivery systems.

Wrong approach: Process every received message without checking IDs or state.

Correct approach: Implement deduplication by checking each message ID against the set of already processed IDs before processing.

Root cause: Not realizing that retries cause duplicates and that receivers must handle them.

#2 Assuming at most once delivery is reliable for critical data.

Wrong approach: Send message once without retries or acknowledgments for important transactions.

Correct approach: Use at least once or exactly once guarantees with retries and acknowledgments for critical messages.

Root cause: Underestimating network failures and message loss probability.

#3 Trying to implement exactly once delivery without transactional support.

Wrong approach: Use unique IDs but process messages without atomic state updates.

Correct approach: Combine unique IDs with transactional processing to atomically record message handling.

Root cause: Not realizing that deduplication is only safe when the ID record and the message's effect are committed atomically.
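
The correct approach for pitfall #3 can be sketched with Python's standard `sqlite3` module. This is a minimal sketch under stated assumptions: the table names, the `alice` account, and the `handle_deposit` function are hypothetical, and an in-memory database stands in for durable storage. The message ID and the balance update are written in one transaction, so either both are committed or neither is.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a processed-ID table and one account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (msg_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (name, balance) VALUES ('alice', 0)")
    conn.commit()
    return conn


def handle_deposit(conn: sqlite3.Connection, msg_id: str,
                   account: str, amount: int) -> bool:
    """Apply a deposit once per msg_id; return False if it is a duplicate."""
    try:
        # `with conn` commits on success and rolls back on any exception.
        with conn:
            # The primary key rejects an ID that was already recorded.
            conn.execute("INSERT INTO processed (msg_id) VALUES (?)", (msg_id,))
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, account),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate delivery: nothing was changed


def balance_of(conn: sqlite3.Connection, account: str) -> int:
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (account,)
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = open_store()
    print(handle_deposit(conn, "m1", "alice", 50))  # first delivery
    print(handle_deposit(conn, "m1", "alice", 50))  # retried delivery
    print(balance_of(conn, "alice"))
```

If the ID were recorded in one transaction and the balance updated in another, a crash between the two could leave a message marked as processed but never applied, or applied but never marked.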

Key Takeaways

Message delivery guarantees define how systems handle message loss and duplication to ensure reliable communication.

At most once delivery avoids duplicates but risks losing messages; at least once ensures delivery but may duplicate messages.

Exactly once delivery is the strongest guarantee but requires complex mechanisms like transactions and deduplication.

Choosing the right guarantee depends on application needs, balancing reliability, complexity, and performance.

Understanding these guarantees is essential for designing robust distributed systems that behave predictably under failure.