NoSQL database types (document, key-value, column, graph) in HLD - System Design Exercise

Design: NoSQL Database Types Overview System
In scope: explaining and demonstrating the four NoSQL types, with architecture and data flow. Out of scope: detailed implementation of each database engine.
Functional Requirements
FR1: Explain the four main types of NoSQL databases: document, key-value, column, and graph.
FR2: Show use cases for each NoSQL type with simple examples.
FR3: Demonstrate how data is stored and retrieved in each type.
FR4: Support queries typical for each database type.
FR5: Handle up to 10,000 queries per second while meeting the latency target in NFR2 (under 100ms at p99).
Non-Functional Requirements
NFR1: System must be scalable to handle growing data and query load.
NFR2: Latency for read and write operations should be under 100ms at p99.
NFR3: Availability target of 99.9% uptime.
NFR4: Data consistency can be eventual for some types but must be explained.
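To make the numeric targets concrete, here is a rough back-of-envelope sketch. It assumes 100 ms is an upper bound on per-request latency and a 365-day year; the variable names are illustrative and not part of the exercise.

```python
# Back-of-envelope numbers for the throughput and availability targets.
# Assumption: every request takes at most 100 ms (the stated latency bound).

QPS = 10_000           # peak queries per second (FR5)
LATENCY_S = 0.100      # 100 ms latency bound, in seconds
AVAILABILITY = 0.999   # 99.9% uptime target (NFR3)

# Little's law: requests in flight = arrival rate x time spent in the system.
max_in_flight = QPS * LATENCY_S

# Downtime budget implied by the availability target.
downtime_hours_per_year = (1 - AVAILABILITY) * 365 * 24
downtime_minutes_per_30_days = (1 - AVAILABILITY) * 30 * 24 * 60

print(f"requests in flight at peak: up to {max_in_flight:.0f}")
print(f"downtime budget: {downtime_hours_per_year:.2f} h/year, "
      f"{downtime_minutes_per_30_days:.1f} min per 30 days")
```

Under these assumptions the system must sustain roughly 1,000 concurrent requests at peak, and 99.9% availability leaves about 8.76 hours of downtime per year (about 43 minutes per 30 days).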
Think Before You Design
Key Components
API layer to accept queries
Separate storage modules for each NoSQL type
Query processor tailored to each database type
Cache layer for frequently accessed data
Monitoring and logging components
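The cache layer listed above is often implemented with a cache-aside pattern: check the cache first and load from the database only on a miss. The sketch below is a minimal in-memory illustration; `CacheAside`, its `load` callback, and the TTL default are hypothetical names chosen for this example, and a real deployment would more likely use an external cache.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class CacheAside:
    """Cache-aside helper: serve from cache, fall back to the store on a miss."""

    def __init__(self, load: Callable[[str], Optional[str]],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load          # function that reads from the backing database
        self._ttl = ttl_seconds    # how long a cached entry stays valid
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            self.hits += 1
            return entry[1]
        # Miss or expired entry: read from the store and repopulate the cache.
        self.misses += 1
        value = self._load(key)
        self._entries[key] = (self._clock() + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example after the underlying record is written."""
        self._entries.pop(key, None)


# A plain dict stands in for the database here.
backing_store = {"user:1": "Ada"}
cache = CacheAside(load=backing_store.get)
first = cache.get("user:1")    # miss: loaded from backing_store
second = cache.get("user:1")   # hit: served from the cache
```

The trade-off is staleness: until the TTL expires or `invalidate` is called, readers may see an old value after a write.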
Design Patterns
Sharding and partitioning for scaling
Eventual consistency vs strong consistency
Indexing strategies for fast queries
Graph traversal algorithms for graph DB
Document schema design and validation
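As one example of the graph traversal pattern, the sketch below runs a breadth-first search over an adjacency list to find a shortest path by hop count. It is a plain Python illustration of the idea, not how any particular graph database executes queries; the `follows` data and the `shortest_path` function are made up for the example.

```python
from collections import deque
from typing import Dict, List, Optional

Graph = Dict[str, List[str]]  # node -> list of nodes it points to


def shortest_path(graph: Graph, start: str, goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns a shortest path by hop count, or None."""
    if start == goal:
        return [start]
    parent: Dict[str, Optional[str]] = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in parent:      # already visited
                continue
            parent[neighbor] = node
            if neighbor == goal:
                # Walk parent links back to the start, then reverse.
                path = [goal]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical "follows" relationships between users.
follows: Graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
}
path = shortest_path(follows, "alice", "erin")
print(path)  # ['alice', 'bob', 'dave', 'erin']
```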
Reference Architecture
             +---------------------+
             |     Client/API      |
             +----------+----------+
                        |
  +---------------------+----------------------+
  |           Query Processor Layer            |
  +--+------------+------------+------------+--+
     |            |            |            |
+----+----+  +----+----+  +----+----+  +----+----+
|Document |  |Key-Value|  | Column  |  |  Graph  |
|   DB    |  |   DB    |  |   DB    |  |   DB    |
+----+----+  +----+----+  +----+----+  +----+----+
     |            |            |            |
+----+----+  +----+----+  +----+----+  +----+----+
| Storage |  | Storage |  | Storage |  | Storage |
+---------+  +---------+  +---------+  +---------+
Components
Client/API
REST/GraphQL API
Accepts user queries and routes them to the query processor.
Query Processor Layer
Custom middleware
Interprets queries and directs them to the appropriate NoSQL database type.
Document Database
MongoDB or Couchbase
Stores data as JSON-like documents, good for flexible schema and nested data.
Key-Value Database
Redis or DynamoDB
Stores data as simple key-value pairs for fast lookups.
Column Database
Cassandra or HBase
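Wide-column stores: each row is located by a partition key and holds a flexible set of columns grouped into families. Good for write-heavy workloads and wide, sparse tables; large aggregations across partitions typically need careful data modeling or an external analytics engine.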
Graph Database
Neo4j or Amazon Neptune
Stores data as nodes and edges for complex relationships and graph queries.
Storage
Distributed storage clusters
Persist data reliably with replication and partitioning.
Request Flow
1. Client sends query to API layer.
2. API forwards query to Query Processor.
3. Query Processor identifies NoSQL type based on query and data model.
4. Query Processor sends query to the corresponding NoSQL database.
5. Database executes query and returns results.
6. Query Processor formats the results and sends them back to the client.
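The request flow above can be sketched as a small dispatcher. This is a minimal illustration that assumes each query carries an explicit `model` tag; the source does not specify how the type is identified, so the `Query` and `QueryProcessor` classes and the in-memory handler are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


@dataclass(frozen=True)
class Query:
    model: str                 # "document", "key_value", "column", or "graph"
    payload: Dict[str, Any]    # model-specific query parameters


class QueryProcessor:
    """Routes each query to the handler registered for its data model."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, model: str, handler: Handler) -> None:
        self._handlers[model] = handler

    def execute(self, query: Query) -> Dict[str, Any]:
        # Step 3: identify the NoSQL type.
        handler = self._handlers.get(query.model)
        if handler is None:
            raise ValueError(f"unsupported data model: {query.model!r}")
        # Steps 4-5: send the query to that database and collect the result.
        result = handler(query.payload)
        # Step 6: format the response for the client.
        return {"model": query.model, "result": result}


# An in-memory dict stands in for the key-value database.
kv_store = {"session:42": "active"}
processor = QueryProcessor()
processor.register("key_value", lambda payload: kv_store.get(payload["key"]))

response = processor.execute(Query("key_value", {"key": "session:42"}))
print(response)  # {'model': 'key_value', 'result': 'active'}
```

Keeping the processor stateless, as here, is what makes it straightforward to run several copies behind a load balancer.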
Database Schema
Document DB: Collections of JSON documents with flexible fields.
Key-Value DB: Simple key and value pairs, no fixed schema.
Column DB: Tables with rows and dynamic columns grouped in families.
Graph DB: Nodes (entities) and edges (relationships) with properties.
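To compare the four shapes side by side, the sketch below models the same user data in each style using plain Python structures. The field names and values are invented for illustration, and real databases add indexing, persistence, and query languages on top of these shapes.

```python
import json

# Document: one self-contained record with nested fields.
document = {
    "_id": "u1",
    "name": "Ada",
    "addresses": [{"city": "London", "primary": True}],
}

# Key-value: an opaque value behind a single key; the store does not inspect it.
key_value = {"session:u1": b'{"cart": ["sku-9"]}'}

# Wide-column: row key -> column family -> column name -> value.
wide_column = {
    "u1": {
        "profile": {"name": "Ada"},
        "activity": {
            "2024-01-01T10:00": "login",
            "2024-01-01T10:05": "view:sku-9",
        },
    },
}

# Graph: nodes and edges, each with properties.
nodes = {
    "u1": {"label": "User", "name": "Ada"},
    "u2": {"label": "User", "name": "Grace"},
}
edges = [("u1", "FOLLOWS", "u2", {"since": 2023})]

# Typical retrieval for each shape.
primary_city = next(a["city"] for a in document["addresses"] if a["primary"])
session = json.loads(key_value["session:u1"])              # lookup by exact key
activity = sorted(wide_column["u1"]["activity"].items())   # columns ordered by name
followed = [dst for src, rel, dst, _ in edges
            if src == "u1" and rel == "FOLLOWS"]           # one-hop traversal
```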
Scaling Discussion
Bottlenecks
Query Processor can become a bottleneck under high load.
Storage nodes may face partition hotspots.
Network latency between components can increase with scale.
Replication lag under eventual consistency can cause stale reads.
Indexing overhead for large datasets.
Solutions
Scale Query Processor horizontally with load balancers.
Use consistent hashing and high-cardinality partition keys to spread data evenly and avoid hot partitions.
Deploy components in the same region to cut inter-component latency; a CDN helps clients only for cacheable, read-only responses.
Use tunable consistency settings per use case.
Implement background index rebuilding and incremental indexing.
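The consistent hashing solution above can be sketched as a hash ring with virtual nodes. This is a simplified illustration under stated assumptions: SHA-256 as the hash, 100 virtual nodes per server, and hypothetical node names; production systems differ in details such as replication and rebalancing.

```python
import bisect
import hashlib
from typing import Dict, List


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes: List[str], vnodes: int = 100) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        self._ring: Dict[int, str] = {}
        for node in nodes:
            # Several virtual points per node smooth out the key distribution.
            for i in range(vnodes):
                self._ring[_hash(f"{node}#{i}")] = node
        self._points = sorted(self._ring)

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring point clockwise from the key."""
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[self._points[idx]]


keys = [f"user:{i}" for i in range(1000)]
before = HashRing(["node-a", "node-b", "node-c"])
after = HashRing(["node-a", "node-b", "node-c", "node-d"])

# Only keys claimed by the new node should change owner.
moved = [k for k in keys if before.node_for(k) != after.node_for(k)]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

The point of the design is that adding a fourth node relocates only the keys that node takes over (on the order of a quarter of them here), whereas `hash(key) % node_count` would reshuffle most keys.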
Interview Tips
Time: Spend 10 minutes explaining NoSQL types and use cases, 15 minutes on architecture and data flow, 10 minutes on scaling and trade-offs, 10 minutes for questions.
Clear understanding of each NoSQL type and when to use it.
How the system routes queries to the right database type.
Trade-offs between consistency, availability, and partition tolerance.
Scaling strategies for each database type.
Realistic latency and availability targets.
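When discussing the consistency trade-off, a quorum calculation is a handy illustration. The sketch below assumes a Dynamo-style store with N replicas, R replicas contacted per read, and W per write; the function names are hypothetical, and overlapping quorums are a necessary condition for reading the latest write rather than a full guarantee of strong consistency.

```python
def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True when every read set shares at least one replica with every write set."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def tolerated_failures(n: int, r: int, w: int) -> int:
    """Replicas that can be down while both reads and writes still reach quorum."""
    return n - max(r, w)


# Three common settings for a replication factor of 3.
for r, w in [(1, 1), (2, 2), (1, 3)]:
    print(f"N=3 R={r} W={w}: overlap={quorums_overlap(3, r, w)}, "
          f"tolerates {tolerated_failures(3, r, w)} replica failure(s)")
```

With N=3, R=1 and W=1 favor latency and availability but allow stale reads, R=2 and W=2 overlap while tolerating one failed replica, and W=3 makes reads cheap at the cost of blocking writes when any replica is down.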