NoSQL database types (document, key-value, column, graph) in HLD - System Design Exercise

Design: NoSQL Database Types Overview System

Focus on explaining and demonstrating the four NoSQL types with architecture and data flow. Out of scope: detailed implementation of each database engine.

Functional Requirements

FR1: Explain the four main types of NoSQL databases: document, key-value, column, and graph.

FR2: Show use cases for each NoSQL type with simple examples.

FR3: Demonstrate how data is stored and retrieved in each type.

FR4: Support queries typical for each database type.

FR5: Handle up to 10,000 queries per second with average latency under 100ms.

Non-Functional Requirements

NFR1: System must be scalable to handle growing data and query load.

NFR2: Latency for read and write operations should be under 100ms at p99.

NFR3: Availability target of 99.9% uptime.

NFR4: Data consistency can be eventual for some types but must be explained.
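As a rough sanity check on these targets, the arithmetic below converts the 99.9% availability goal into a yearly downtime budget and uses Little's law to estimate how many requests are in flight at 10,000 queries per second. The function names are illustrative, and the 100 ms figure is treated as an upper bound on average latency.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_budget_hours(availability: float,
                          period_seconds: int = SECONDS_PER_YEAR) -> float:
    """Allowed downtime, in hours, for a given availability fraction."""
    return (1.0 - availability) * period_seconds / 3600


def in_flight_requests(qps: float, latency_seconds: float) -> float:
    """Little's law: average concurrency = arrival rate x time in system."""
    return qps * latency_seconds


if __name__ == "__main__":
    # 99.9% uptime leaves roughly 8.76 hours of downtime per year.
    print(round(downtime_budget_hours(0.999), 2))
    # 10,000 QPS at 100 ms each means about 1,000 requests in flight.
    print(round(in_flight_requests(10_000, 0.1)))
```

The concurrency estimate is what the API layer and query processor need to sustain together, which helps when sizing connection pools and worker counts.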

Think Before You Design

Key Components

API layer to accept queries

Separate storage modules for each NoSQL type

Query processor tailored to each database type

Cache layer for frequently accessed data

Monitoring and logging components
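One common way to realize the cache layer is the cache-aside pattern: read from the cache first, fall back to the database on a miss, then store the result with a time-to-live. The sketch below is a minimal in-memory version. The `CacheAside` class, its `load` callback, and the TTL default are assumptions for illustration, not part of the design above.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CacheAside:
    """Read-through helper: serve from memory, else load and remember."""

    def __init__(self, load: Callable[[str], Any],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # hypothetical function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit, still fresh
        value = self._load(key)    # cache miss or expired entry
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so the next read reloads it."""
        self._entries.pop(key, None)
```

Invalidating on write keeps the cache from serving stale data for longer than one TTL. A production cache such as Redis would also need an eviction policy to bound memory.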

Design Patterns

Sharding and partitioning for scaling

Eventual consistency vs strong consistency

Indexing strategies for fast queries

Graph traversal algorithms for graph DB

Document schema design and validation
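For the consistency trade-off, replicated stores with tunable consistency commonly rely on the quorum overlap rule: with N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to share at least one replica when R + W > N. The helper below is a small sketch of that check. The function name is illustrative, and the rule is a necessary overlap condition rather than a full consistency proof.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Return True when every read quorum intersects every write quorum."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must each be between 1 and n")
    return r + w > n


if __name__ == "__main__":
    print(quorums_overlap(n=3, w=2, r=2))  # quorum reads and writes overlap
    print(quorums_overlap(n=3, w=1, r=1))  # fast, but reads may be stale
```

Lower values of W and R reduce latency and improve availability at the cost of possibly stale reads, which is the eventual consistency side of the trade-off.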

Reference Architecture

               +-----------------------+
               |      Client/API       |
               +-----------+-----------+
                           |
               +-----------+-----------+
               | Query Processor Layer |
               +-----------+-----------+
                           |
      +-------------+------+------+-------------+
      |             |             |             |
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
| Document  | | Key-Value | |  Column   | |   Graph   |
|    DB     | |    DB     | |    DB     | |    DB     |
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
      |             |             |             |
+-----+-----+ +-----+-----+ +-----+-----+ +-----+-----+
|  Storage  | |  Storage  | |  Storage  | |  Storage  |
+-----------+ +-----------+ +-----------+ +-----------+

Components

Client/API

REST/GraphQL API

Accepts user queries and routes them to the query processor.

Query Processor Layer

Custom middleware

Interprets queries and directs them to the appropriate NoSQL database type.

Document Database

MongoDB or Couchbase

Stores data as JSON-like documents, good for flexible schema and nested data.

Key-Value Database

Redis or DynamoDB

Stores data as simple key-value pairs for fast lookups.

Column Database

Cassandra or HBase

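Stores each row as a sparse set of columns grouped into column families. Good for wide tables and write-heavy workloads that are read by partition key, rather than for ad hoc aggregation.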

Graph Database

Neo4j or Amazon Neptune

Stores data as nodes and edges for complex relationships and graph queries.

Storage

Distributed storage clusters

Persist data reliably with replication and partitioning.

Request Flow

1. Client sends query to API layer.

2. API forwards query to Query Processor.

3. Query Processor identifies NoSQL type based on query and data model.

4. Query Processor sends query to the corresponding NoSQL database.

5. Database executes query and returns results.

6. Query Processor formats the results and sends them back to the client.
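The request flow above can be sketched as a small dispatcher. This is a minimal illustration, not a definitive implementation: it assumes each query carries an explicit `model` field naming the target database type, and the `QueryProcessor` class and handler signature are hypothetical.

```python
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


class QueryProcessor:
    """Routes each query to the handler registered for its data model."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, model: str, handler: Handler) -> None:
        self._handlers[model] = handler

    def handle(self, query: Dict[str, Any]) -> Dict[str, Any]:
        model = query.get("model")            # step 3: identify the NoSQL type
        handler = self._handlers.get(model)
        if handler is None:
            return {"ok": False, "error": f"unsupported model: {model!r}"}
        data = handler(query)                 # steps 4-5: execute on that store
        return {"ok": True, "model": model, "data": data}  # step 6: format


# Usage with an in-memory dict standing in for a key-value store.
sessions = {"session:42": "alice"}
processor = QueryProcessor()
processor.register("key-value", lambda q: sessions.get(q["key"]))
print(processor.handle({"model": "key-value", "key": "session:42"}))
```

Because the processor holds no per-request state, several instances can run behind a load balancer.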

Database Schema

Document DB: Collections of JSON documents with flexible fields.

Key-Value DB: Simple key and value pairs, no fixed schema.

Column DB: Tables with rows and dynamic columns grouped in families.

Graph DB: Nodes (entities) and edges (relationships) with properties.
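To make the four data models concrete, the sketch below represents each one with plain Python structures and shows a typical access for each. The sample records and helper names are invented for illustration. Real engines add indexing, persistence, and their own query languages.

```python
from collections import deque

# Document: one self-contained, nested record per entity.
documents = {"u1": {"name": "Ada", "address": {"city": "London"}, "tags": ["admin"]}}

# Key-value: an opaque value fetched by exact key.
key_value = {"session:u1": "token-abc"}

# Wide-column: row key -> column family -> sparse, dynamic columns.
wide_column = {
    "u1": {
        "profile": {"name": "Ada"},
        "logins": {"2024-01-01": 3, "2024-01-02": 5},
    }
}

# Graph: adjacency list of directed "follows" edges between nodes.
follows = {"u1": ["u2"], "u2": ["u3"], "u3": []}


def city_of(user_id: str) -> str:
    """Document access: navigate into nested fields."""
    return documents[user_id]["address"]["city"]


def total_logins(user_id: str) -> int:
    """Wide-column access: scan the columns of one family in one row."""
    return sum(wide_column[user_id]["logins"].values())


def reachable(start: str) -> list:
    """Graph access: breadth-first traversal over outgoing edges."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in follows.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order
```

The key-value lookup is simply `key_value["session:u1"]`, which is the point of that model: no query layer beyond the key itself.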

Scaling Discussion

Bottlenecks

Query Processor can become a bottleneck under high load.

Storage nodes may face partition hotspots.

Network latency between components can increase with scale.

Replication lag under eventual consistency can cause stale reads.

Indexing overhead for large datasets.

Solutions

Scale Query Processor horizontally with load balancers.

Use consistent hashing for even data partitioning.

Deploy components in the same region to cut inter-service latency, and use a CDN only for cacheable client responses.

Use tunable consistency settings per use case.

Implement background index rebuilding and incremental indexing.
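The consistent hashing solution can be sketched as a hash ring with virtual nodes. This is a minimal illustration under stated assumptions: the `HashRing` class, the choice of MD5 as the hash function, and the default of 100 virtual nodes per physical node are all choices made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring; each node owns many small arcs (virtual nodes)."""

    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self._vnodes = vnodes
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            position = _hash(f"{node}#{i}")
            self._owner[position] = node
            bisect.insort(self._positions, position)

    def remove(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first position clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        index = bisect.bisect(self._positions, _hash(key)) % len(self._positions)
        return self._owner[self._positions[index]]
```

When a node joins, only the keys that land on its new arcs move to it and every other key keeps its owner. This is what limits data movement during scaling, and spreading each node over many virtual nodes reduces partition hotspots.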

Interview Tips

Time: Spend 10 minutes explaining NoSQL types and use cases, 15 minutes on architecture and data flow, 10 minutes on scaling and trade-offs, and 10 minutes on questions (45 minutes in total).

Clear understanding of each NoSQL type and when to use it.

How the system routes queries to the right database type.

Trade-offs between consistency, availability, and partition tolerance.

Scaling strategies for each database type.

Realistic latency and availability targets.