
Message ordering guarantees in HLD - System Design Exercise

Design: Message Ordering Guarantee System
This design covers the messaging system's ordering guarantees and core architecture. Detailed security, user interface, and persistence backup strategies are out of scope.
Functional Requirements
FR1: Ensure messages sent between distributed components are received in the correct order.
FR2: Support multiple ordering guarantees: no ordering, per-producer ordering, global ordering.
FR3: Handle message loss and retries without breaking ordering guarantees.
FR4: Allow scalable message throughput with low latency.
FR5: Provide APIs for producers to send messages and consumers to receive messages in order.
Non-Functional Requirements
NFR1: System must handle up to 100,000 messages per second.
NFR2: End-to-end message delivery latency should be under 200 ms at the 99th percentile.
NFR3: Availability target of 99.9% uptime.
NFR4: Support horizontal scaling of producers and consumers.
NFR5: Ordering guarantees must be maintained even under failures and retries.
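The targets above translate into a few concrete budgets. The following is a back-of-the-envelope sketch only: the variable names are made up for illustration, and the in-flight figure applies Little's law with the p99 latency bound in place of the mean, so it is a rough upper estimate rather than a measured value.

```python
# Rough budgets derived from NFR1-NFR3 (illustrative arithmetic only).
MESSAGES_PER_SECOND = 100_000   # NFR1
P99_LATENCY_SECONDS = 0.200     # NFR2
AVAILABILITY = 0.999            # NFR3

HOURS_PER_YEAR = 365 * 24       # ignoring leap years

# Time per year the system may be unavailable while still meeting 99.9%.
downtime_hours_per_year = (1 - AVAILABILITY) * HOURS_PER_YEAR

# Total volume the broker must absorb per day at the sustained peak rate.
messages_per_day = MESSAGES_PER_SECOND * 24 * 60 * 60

# Little's law: items in the system = arrival rate x time in the system.
# Using the p99 bound instead of the mean gives a pessimistic estimate.
in_flight_upper_estimate = MESSAGES_PER_SECOND * P99_LATENCY_SECONDS

print(f"Downtime budget: {downtime_hours_per_year:.2f} hours/year")
print(f"Messages per day: {messages_per_day:,}")
print(f"In flight at the latency bound: {in_flight_upper_estimate:,.0f}")
```

This works out to about 8.76 hours of allowed downtime per year, 8.64 billion messages per day at the sustained peak rate, and roughly 20,000 messages in flight at any moment.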
Think Before You Design
Key Components
Message producers
Message brokers or queues
Partitions or shards for scaling
Message consumers
Ordering metadata (sequence numbers, timestamps)
Retry and dead-letter handling
Design Patterns
Partitioned queues with per-partition ordering
Sequence numbers for ordering enforcement
Idempotent message processing
Leader election for global ordering
Exactly-once or at-least-once delivery semantics
Reference Architecture
 +-------------+       +----------------+       +--------------+       +--------------+
 |  Producers  | ----> | Message Broker | ----> | Partitions   | ----> | Consumers    |
 +-------------+       +----------------+       +--------------+       +--------------+
                           |                        |                      |
                           |                        |                      |
                           +----> Ordering Metadata (sequence numbers) <----+

Notes:
- Producers send messages with sequence numbers per partition.
- Broker routes messages to partitions to maintain order.
- Consumers read from partitions in order.
- Retries and duplicate detection keep the order intact when messages are resent.
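The routing step in the notes above can be sketched as a stable hash of the message key. This is a minimal in-memory illustration, not how any particular broker does it: `Broker`, `partition_for`, and the partition count of 8 are hypothetical. A cryptographic hash is used instead of Python's built-in `hash()` because the latter is randomized per process for strings, which would send the same key to different partitions after a restart.

```python
import hashlib

NUM_PARTITIONS = 8  # assumed partition count, chosen only for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition index that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


class Broker:
    """Hypothetical in-memory broker: one append-only list per partition."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS) -> None:
        self.partitions = [[] for _ in range(num_partitions)]

    def publish(self, key: str, payload: str) -> int:
        """Append the message to the partition chosen by its key."""
        index = partition_for(key, len(self.partitions))
        self.partitions[index].append((key, payload))
        return index


broker = Broker()
for payload in ["created", "paid", "shipped"]:
    broker.publish("order-42", payload)
broker.publish("order-7", "created")

# Every event for one key lands in one partition, in publish order.
target = partition_for("order-42")
order_42_events = [p for k, p in broker.partitions[target] if k == "order-42"]
print(order_42_events)  # ['created', 'paid', 'shipped']
```

Because all messages with the same key share a partition, their relative order is preserved, while messages with different keys can be processed in parallel.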
Components
Producers
Any client application or service
Send messages with ordering metadata (e.g., sequence numbers) to the broker.
Message Broker
Kafka, RabbitMQ, or custom queue system
Receive messages, assign them to partitions, and maintain order per partition.
Partitions
Logical or physical queues
Divide message stream to enable parallelism while preserving order within each partition.
Consumers
Client applications or services
Consume messages from partitions in order, process them reliably.
Ordering Metadata
Sequence numbers or timestamps
Track message order per producer or globally to enforce ordering guarantees.
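One way a consumer might use sequence numbers to enforce per-producer order is a small reorder buffer: messages that arrive early are held until the gap before them is filled. The sketch below is illustrative only. The class name `InOrderReceiver` is hypothetical, and it assumes sequence numbers start at 0 and increase by 1.

```python
from dataclasses import dataclass, field


@dataclass
class InOrderReceiver:
    """Releases one producer's messages strictly in sequence-number order."""

    next_expected: int = 0                       # assumption: sequences start at 0
    pending: dict = field(default_factory=dict)  # sequence_number -> payload

    def receive(self, sequence_number: int, payload: str) -> list:
        """Accept one message and return the payloads that are now deliverable."""
        if sequence_number < self.next_expected or sequence_number in self.pending:
            return []  # duplicate: already delivered or already buffered
        self.pending[sequence_number] = payload
        deliverable = []
        # Drain the buffer for as long as the next expected number is present.
        while self.next_expected in self.pending:
            deliverable.append(self.pending.pop(self.next_expected))
            self.next_expected += 1
        return deliverable

    def missing(self) -> list:
        """Sequence numbers that are still blocking buffered messages."""
        if not self.pending:
            return []
        highest = max(self.pending)
        return [n for n in range(self.next_expected, highest) if n not in self.pending]


receiver = InOrderReceiver()
early = receiver.receive(1, "b")      # arrives before sequence 0, so it is held
gap = receiver.missing()              # sequence 0 is blocking delivery
released = receiver.receive(0, "a")   # fills the gap and releases both messages
duplicate = receiver.receive(0, "a")  # a retry of sequence 0 is ignored
```

A real consumer would also bound the buffer size and time out on gaps that never fill. Both are left out here to keep the sketch short.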
Request Flow
1. Producer assigns a sequence number to each message per partition and sends it to the broker.
2. Broker receives messages and routes them to the correct partition based on key or producer ID.
3. Within each partition, messages are stored in the order received.
4. Consumers subscribe to partitions and receive messages in the exact order they were stored.
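5. If a message fails processing, the consumer retries it before moving on to later messages in the same partition; a message that keeps failing goes to dead-letter handling so it does not block the partition.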
6. Ordering metadata is used to detect missing or out-of-order messages and handle them appropriately.
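Steps 1 to 3 and 6 can be sketched from the broker's side: a partition appends a message only if it carries the next sequence number for its producer, so retried sends are recognised as duplicates and early arrivals are rejected as gaps. This is a simplified sketch under stated assumptions: `PartitionLog` and the three return strings are hypothetical, and sequence numbers are assumed to start at 0 per producer.

```python
class PartitionLog:
    """Hypothetical partition that appends idempotently using sequence numbers."""

    def __init__(self) -> None:
        self.entries = []        # (producer_id, sequence_number, payload)
        self.last_sequence = {}  # producer_id -> highest sequence appended

    def append(self, producer_id: str, sequence_number: int, payload: str) -> str:
        """Append only the next expected sequence number for this producer."""
        last = self.last_sequence.get(producer_id, -1)
        if sequence_number <= last:
            return "duplicate"   # already stored; the producer can treat it as acked
        if sequence_number > last + 1:
            return "gap"         # an earlier message is missing; the producer resends
        self.entries.append((producer_id, sequence_number, payload))
        self.last_sequence[producer_id] = sequence_number
        return "appended"


log = PartitionLog()
results = [
    log.append("producer-a", 0, "create"),
    log.append("producer-a", 0, "create"),  # retry after a lost acknowledgement
    log.append("producer-a", 2, "delete"),  # arrives before sequence 1
    log.append("producer-a", 1, "update"),
    log.append("producer-a", 2, "delete"),  # resent after the gap response
]
stored_payloads = [payload for _, _, payload in log.entries]
print(results)
print(stored_payloads)  # ['create', 'update', 'delete']
```

With this check in place, at-least-once delivery from the producer does not produce duplicates or reordering in the stored log.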
Database Schema
Entities:
- Message: {message_id (PK), partition_id, producer_id, sequence_number, payload, timestamp, status}
- Partition: {partition_id (PK), broker_id}
- Producer: {producer_id (PK), metadata}
Relationships:
- Each Message belongs to one Partition.
- Each Message is produced by one Producer.
- sequence_number is unique per (producer_id, partition_id) pair to maintain order.
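The schema above could be expressed as tables along the following lines. This sketch uses Python's built-in `sqlite3` module so that it runs as-is. The column types, the plural table names, and the `'stored'` status value are assumptions, since the schema does not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE producers (
    producer_id TEXT PRIMARY KEY,
    metadata    TEXT
);
CREATE TABLE partitions (
    partition_id INTEGER PRIMARY KEY,
    broker_id    TEXT NOT NULL
);
CREATE TABLE messages (
    message_id      TEXT PRIMARY KEY,
    partition_id    INTEGER NOT NULL REFERENCES partitions(partition_id),
    producer_id     TEXT NOT NULL REFERENCES producers(producer_id),
    sequence_number INTEGER NOT NULL,
    payload         BLOB,
    timestamp       TEXT NOT NULL,
    status          TEXT NOT NULL,
    -- One sequence number per (producer, partition) pair.
    UNIQUE (producer_id, partition_id, sequence_number)
);
""")

conn.execute("INSERT INTO producers VALUES ('producer-a', NULL)")
conn.execute("INSERT INTO partitions VALUES (0, 'broker-1')")
rows = [
    # Inserted out of order on purpose; the query below restores the order.
    ("m2", 0, "producer-a", 1, b"second", "2024-01-01T00:00:01Z", "stored"),
    ("m1", 0, "producer-a", 0, b"first", "2024-01-01T00:00:00Z", "stored"),
]
conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# Reusing sequence number 1 for the same producer and partition is rejected.
try:
    conn.execute(
        "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?, ?)",
        ("m3", 0, "producer-a", 1, b"retry", "2024-01-01T00:00:02Z", "stored"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

ordered_ids = [
    row[0]
    for row in conn.execute(
        "SELECT message_id FROM messages "
        "WHERE partition_id = 0 AND producer_id = 'producer-a' "
        "ORDER BY sequence_number"
    )
]
print(duplicate_rejected, ordered_ids)  # True ['m1', 'm2']
```

The unique constraint turns a duplicate write into an explicit error, and ordering by `sequence_number` recovers the producer's order regardless of insertion order.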
Scaling Discussion
Bottlenecks
Single partition can become a throughput bottleneck limiting parallelism.
Global ordering requires coordination, which can increase latency and reduce availability.
Message broker storage and network bandwidth can be overwhelmed at high message rates.
Consumer processing speed can limit end-to-end latency.
Solutions
Increase number of partitions to allow parallel processing and higher throughput.
Use partition keys to shard messages so ordering is guaranteed per partition, not globally.
Implement leader election and consensus protocols only when global ordering is strictly required.
Use efficient storage and replication strategies in the broker to handle load.
Scale consumers horizontally and implement backpressure to handle bursts.
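Where global ordering is strictly required, one common shape is a single sequencer (the elected leader) that stamps every message with a global sequence number, after which consumers merge the partitions by that number. The sketch below shows only the stamping and merging. Leader election and consensus are not modelled, `Sequencer` and `merge_partitions` are hypothetical names, and the merge assumes each partition's contents are fully available.

```python
import heapq
import itertools


class Sequencer:
    """Single coordination point that assigns global sequence numbers."""

    def __init__(self) -> None:
        self._counter = itertools.count()

    def stamp(self, payload: str) -> tuple:
        """Return (global_sequence_number, payload)."""
        return (next(self._counter), payload)


def merge_partitions(partitions: list) -> list:
    """Merge per-partition lists, each already ascending, into global order."""
    return list(heapq.merge(*partitions))


sequencer = Sequencer()
partitions = [[], [], []]
for i, payload in enumerate(["a", "b", "c", "d", "e", "f"]):
    # Round-robin placement spreads load; order is carried by the stamp.
    partitions[i % 3].append(sequencer.stamp(payload))

global_order = [payload for _, payload in merge_partitions(partitions)]
print(global_order)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Every message passes through the one sequencer, which is why global ordering costs latency and availability, and why per-partition ordering is preferred whenever it is sufficient.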
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.
Clarify types of ordering guarantees and their impact on design.
Explain partitioning strategy to balance ordering and scalability.
Describe how sequence numbers or metadata enforce order.
Discuss failure handling to maintain ordering guarantees.
Highlight trade-offs between global ordering and system performance.