
Why document databases over relational in MongoDB - Why It Works This Way

Overview - Why document databases over relational
What is it?
Document databases store data as flexible, self-contained documents, usually in formats like JSON. Unlike relational databases that organize data in tables with fixed columns, document databases allow each record to have its own unique structure. This makes them easy to adapt when data changes or grows in complexity. They are designed to handle large volumes of diverse data quickly and efficiently.
Why it matters
Document databases address the rigidity of relational schemas, which can slow development when the shape of the data changes often. Storing and querying complex, nested, or evolving data in fixed tables typically means extra tables, joins, and migrations. That overhead makes modern applications such as social media, content management, or real-time analytics slower to build and change.
Where it fits
Before learning this, you should understand basic database concepts like tables, rows, and columns in relational databases. After this, you can explore advanced topics like indexing, querying, and scaling in document databases, as well as comparing NoSQL and SQL systems.
Mental Model
Core Idea
Document databases store data as flexible, self-contained documents that can vary in structure, unlike rigid tables in relational databases.
Think of it like...
Imagine a filing cabinet where each folder can hold papers of any shape or size, instead of a cabinet where every folder must have the same number of papers arranged in fixed slots.
Document Database
┌────────────┐
│ Document 1 │  {"name": "Alice", "age": 30}
└────────────┘
┌────────────┐
│ Document 2 │  {"name": "Bob", "hobbies": ["golf", "chess"]}
└────────────┘
┌────────────┐
│ Document 3 │  {"product": "Book", "price": 12.99, "tags": ["fiction"]}
└────────────┘
Build-Up - 6 Steps
1. Foundation: Basic structure of relational databases
Concept: Relational databases organize data into tables with fixed columns and rows.
In a relational database, data is stored in tables. Each table has columns defining the type of data (like name, age) and rows representing individual records. Every row must follow the same column structure. For example, a 'Users' table might have columns for 'id', 'name', and 'email'.
Result
Data is stored in a strict, uniform format making it easy to enforce rules and relationships.
Understanding the fixed, uniform structure of relational tables helps explain why flexibility can be limited when data needs change.
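To make the fixed structure concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for any relational database. The 'users' table, its columns, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database; the 'users' table and its columns are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
    (1, "Alice", "alice@example.com"),
)

# A row that uses a column the table does not define is rejected.
try:
    conn.execute(
        "INSERT INTO users (id, name, hobbies) VALUES (?, ?, ?)",
        (2, "Bob", "golf"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

rows = conn.execute("SELECT id, name, email FROM users").fetchall()
print(rejected, rows)
```

Every stored row has exactly the declared columns, and the second insert fails because 'hobbies' is not part of the table definition.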
2. Foundation: Introduction to document databases
Concept: Document databases store data as flexible documents, often in JSON-like formats.
Instead of tables, document databases store data as documents. Each document is a self-contained unit with fields and values. Documents can have different fields and nested data. For example, one document might have a 'name' and 'age', while another has 'name' and a list of 'hobbies'.
Result
Data can be stored with varying structures, adapting easily to changes.
Recognizing that documents can differ in structure shows how document databases offer flexibility relational tables lack.
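The same idea can be sketched with plain Python dictionaries standing in for documents. This is not a database, only an illustration of records in one collection carrying different fields; the 'users' list is an assumption.

```python
import json

# Two documents in the same (hypothetical) collection with different fields.
users = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "hobbies": ["golf", "chess"]},
]

# Each document carries its own field names; no shared column list exists.
field_sets = [sorted(doc.keys()) for doc in users]

for doc in users:
    print(json.dumps(doc))
```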
3. Intermediate: Handling evolving data schemas
Before reading on: do you think relational databases or document databases handle changing data structures more easily? Commit to your answer.
Concept: Document databases let new documents take a new shape without altering existing ones, which often avoids downtime and large migrations.
In relational databases, changing the schema (like adding a new column) often requires altering tables and migrating data, which can be slow and risky on large tables. Document databases let you add new fields to some documents without affecting others. This means you can evolve your data model as your application grows without interrupting service, provided the application code can read both the old and the new document shapes.
Result
Applications can adapt faster to new requirements without costly database changes.
Knowing that document databases support flexible schemas explains why they speed up development and reduce maintenance.
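A small sketch of what this looks like from the application side, again with plain dictionaries standing in for stored documents. The 'newsletter' field and the helper name are hypothetical; the point is that older documents stay untouched while the reading code supplies a default.

```python
# Documents written before the change have no 'newsletter' field.
old_docs = [{"name": "Alice", "age": 30}]

# A document written after the change simply includes the new field.
new_doc = {"name": "Carol", "age": 41, "newsletter": True}

collection = old_docs + [new_doc]

def wants_newsletter(doc):
    # Older documents lack the field, so fall back to a default.
    return doc.get("newsletter", False)

subscribers = [d["name"] for d in collection if wants_newsletter(d)]
print(subscribers)
```

No existing document was rewritten, but the reader has to tolerate both shapes, which is the cost of skipping a migration.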
4. Intermediate: Querying nested and complex data
Before reading on: do you think querying nested data is simpler in relational or document databases? Commit to your answer.
Concept: Document databases natively support querying nested and complex data structures.
Document databases store nested data like arrays and objects inside documents. You can query these nested fields directly without needing to join multiple tables. For example, you can find all users who have 'chess' as a hobby inside their nested 'hobbies' list with a simple query.
Result
Queries on complex data are more straightforward, and often faster when everything a query needs lives in a single document.
Understanding native support for nested data clarifies why document databases are preferred for rich, hierarchical information.
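The hobby lookup described above can be sketched in plain Python. In the MongoDB shell the comparable filters would be `db.users.find({ hobbies: "chess" })` and, for embedded objects, dot notation such as `{ "address.city": "Oslo" }`. The helpers below only imitate that matching on in-memory dictionaries; their names and the sample data are assumptions.

```python
users = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "hobbies": ["golf", "chess"]},
    {"name": "Dana", "hobbies": ["hiking"], "address": {"city": "Oslo"}},
]

def has_hobby(doc, hobby):
    # Matches when the array field contains the value; a missing field never matches.
    return hobby in doc.get("hobbies", [])

def get_path(doc, dotted_path):
    """Follow a dotted path such as 'address.city'; return None if any part is missing."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current

chess_players = [d["name"] for d in users if has_hobby(d, "chess")]
oslo_residents = [d["name"] for d in users if get_path(d, "address.city") == "Oslo"]
print(chess_players, oslo_residents)
```

No second table or join is involved: the nested values are read from the same record that holds the user's name.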
5. Advanced: Scaling and performance differences
Before reading on: do you think document or relational databases scale better horizontally? Commit to your answer.
Concept: Document databases are designed to scale out across many servers.
Relational databases often scale by making a single server more powerful (vertical scaling), which has limits. Document databases are built to distribute data across multiple servers (horizontal scaling) by sharding documents. Because each document is self-contained, it can live on one shard without cross-server joins, which lets these systems handle very large datasets and high traffic with less application-level work.
Result
Applications can grow to serve millions of users without major database redesign.
Knowing the scaling strengths of document databases explains their popularity in big, fast-growing applications.
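As a rough illustration of sharding, the sketch below routes documents to one of three shards by hashing a key. This is a toy routing function, not the algorithm any particular database uses; the shard count, the '_id' values, and the function name are assumptions.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    # Hash the key so documents spread across shards regardless of key order.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count

SHARD_COUNT = 3
names = ["Alice", "Bob", "Carol", "Dan", "Eve", "Faythe"]
docs = [{"_id": f"user-{n}", "name": name} for n, name in enumerate(names)]

# Each whole document lands on exactly one shard.
shards = {i: [] for i in range(SHARD_COUNT)}
for doc in docs:
    shards[shard_for(doc["_id"], SHARD_COUNT)].append(doc)

for shard_id, members in shards.items():
    print(shard_id, [d["_id"] for d in members])
```

A lookup by '_id' only needs to contact the shard that `shard_for` returns. Simple modulo routing like this reshuffles most keys when the shard count changes, which is one reason production systems use more elaborate schemes.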
6. Expert: Trade-offs and consistency models
Before reading on: do you think document databases always guarantee strict consistency like relational databases? Commit to your answer.
Concept: Document databases are often deployed with relaxed consistency settings that trade strict consistency for availability and performance.
Relational databases typically guarantee strong consistency, meaning every reader sees a change as soon as it is committed. Distributed document databases may relax this to eventual consistency, where an update reaches all replicas only after a short delay. In MongoDB, reads go to the primary by default, so stale reads mainly arise when an application chooses to read from secondaries. This trade-off can improve performance and uptime but requires developers to handle temporary differences between replicas.
Result
Applications can be faster and more available but need to manage data consistency carefully.
Understanding consistency trade-offs is crucial for designing reliable applications using document databases.
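The effect of replication lag can be imitated with a toy model of one primary and one secondary. The class below is purely hypothetical and skips everything a real replica set does (elections, acknowledgements, ordering guarantees); it only shows why a read from a lagging copy can be stale.

```python
class ToyReplicaSet:
    """Toy model: writes hit the primary first and reach the secondary later."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self._pending = []  # writes not yet copied to the secondary

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Copy all outstanding writes to the secondary, in order.
        for key, value in self._pending:
            self.secondary[key] = value
        self._pending.clear()

    def read(self, key, from_secondary=False):
        source = self.secondary if from_secondary else self.primary
        return source.get(key)

rs = ToyReplicaSet()
rs.write("alice.age", 31)

stale = rs.read("alice.age", from_secondary=True)      # secondary has not caught up
fresh = rs.read("alice.age")                           # primary already has the write
rs.replicate()
converged = rs.read("alice.age", from_secondary=True)  # copies now agree
print(stale, fresh, converged)
```

Reading from the primary gives the latest value at the cost of concentrating load there; reading from the secondary spreads load but may briefly return old data.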
Under the Hood
Document databases store each record as a single document, often in JSON or BSON format, which keeps related data together. Embedding related information inside a document reduces the need for joins. Internally, each document is identified by a unique key (the _id field in MongoDB), and documents can be distributed across servers using sharding. The database engine speeds up queries by indexing fields inside documents, including nested ones, and supports flexible schemas by not enforcing a fixed structure unless validation rules are configured.
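The indexing idea in the paragraph above can be sketched as a lookup table from field value to document keys. Real engines typically use B-tree style structures rather than a Python dictionary, so treat this as a conceptual model only; the sample documents and the function name are assumptions.

```python
from collections import defaultdict

# Documents keyed by a unique identifier, as a stand-in for a collection.
docs = {
    1: {"name": "Alice", "age": 30},
    2: {"name": "Bob", "hobbies": ["golf", "chess"]},
    3: {"name": "Alice", "age": 25},
}

def build_index(documents, field):
    # Map each value of the field to the ids of the documents that hold it.
    index = defaultdict(list)
    for doc_id, doc in documents.items():
        if field in doc:  # documents without the field are simply left out
            index[doc[field]].append(doc_id)
    return index

name_index = build_index(docs, "name")
age_index = build_index(docs, "age")

# A lookup by name consults the index instead of scanning every document.
alice_ids = name_index["Alice"]
print(alice_ids, dict(age_index))
```

Because documents may omit a field, an index on that field covers only the documents that have it, as the 'age' index shows for document 2.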
Why designed this way?
Document databases were designed to address the limitations of relational databases in handling modern application needs like rapid development, flexible data models, and horizontal scaling. Traditional relational models were too rigid and costly to evolve, especially with large, diverse datasets. By storing data as self-contained documents, these databases allow developers to work more naturally with complex, nested data and scale systems more easily.
┌───────────────────┐       ┌───────────────┐       ┌───────────────┐
│ Document 1        │──────▶│ Shard 1       │       │ Index on name │
│ {"name": "Alice"} │       │               │       └───────────────┘
└───────────────────┘       └───────────────┘

┌───────────────────┐       ┌───────────────┐       ┌───────────────┐
│ Document 2        │──────▶│ Shard 2       │       │ Index on age  │
│ {"name": "Bob"}   │       │               │       └───────────────┘
└───────────────────┘       └───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do document databases require a fixed schema like relational databases? Commit to yes or no.
Common Belief: Document databases require a fixed schema just like relational databases.
Reality: Document databases allow each document to have its own structure without a fixed schema.
Why it matters: Believing in fixed schemas limits the use of document databases and leads to unnecessary schema migrations.
Quick: Do document databases always guarantee immediate consistency? Commit to yes or no.
Common Belief: Document databases always provide the same strong consistency guarantees as relational databases.
Reality: Many document databases use eventual consistency to improve performance and availability, which means data may not be immediately consistent everywhere.
Why it matters: Assuming strong consistency can cause bugs if applications rely on immediate data accuracy.
Quick: Is querying nested data more complex in document databases than relational? Commit to yes or no.
Common Belief: Querying nested or complex data is harder in document databases because of their flexible structure.
Reality: Document databases natively support querying nested data, often making such queries simpler than in relational databases.
Why it matters: Misunderstanding query capabilities can lead to choosing less suitable databases and more complex code.
Quick: Do document databases scale vertically better than relational databases? Commit to yes or no.
Common Belief: Document databases scale vertically better than relational databases.
Reality: Document databases are designed to scale horizontally across many servers, unlike relational databases which often rely on vertical scaling.
Why it matters: Wrong assumptions about scaling can cause poor architecture decisions and performance bottlenecks.
Expert Zone
1. Document databases often allow embedding related data inside documents or referencing other documents; choosing between embedding and referencing affects performance and consistency.
2. Indexing strategies in document databases can be complex because indexes can be created on nested fields and arrays, requiring careful planning for query optimization.
3. Some document databases support multi-document transactions (MongoDB has offered them since version 4.0), but these are usually more limited or less performant than relational transactions, which affects how complex operations are designed.
When NOT to use
Document databases are not ideal when strict ACID transactions across many related entities are required, such as in banking systems. In such cases, relational databases or NewSQL systems that provide strong consistency and support complex joins are better choices.
Production Patterns
In production, document databases are often used for content management, user profiles, and event logging where data structures vary and scale is important. Developers use schema validation rules, indexing on common query fields, and sharding to optimize performance and reliability.
Connections
JSON data format
Document databases store data in JSON-like formats, directly building on JSON's flexible structure.
Understanding JSON helps grasp how document databases represent and query complex, nested data naturally.
Distributed systems
Document databases often use distributed architectures to shard data across servers for scalability.
Knowing distributed system principles clarifies how document databases achieve high availability and horizontal scaling.
Object-oriented programming
Document databases map well to objects in programming languages, storing objects as documents.
Recognizing this connection helps developers design data models that align closely with application code, reducing impedance mismatch.
Common Pitfalls
#1 Trying to enforce a fixed schema in a document database by manually validating every document on the client side.
Wrong approach: Manually checking each document's fields in application code before inserting into the database.
Correct approach: Use the database's built-in schema validation features to enforce rules efficiently and consistently.
Root cause: Not knowing that document databases offer built-in validation alongside flexible schemas leads to redundant, error-prone client-side checks.
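For MongoDB specifically, built-in validation is expressed as a $jsonSchema validator attached to a collection. The sketch below only builds the command document as a Python dictionary and prints it; nothing is sent to a server. The 'users' collection and its fields are assumptions, and with a driver such as PyMongo a document like this would normally be passed to the database's command method.

```python
import json

# Validation rules the server would apply to every insert and update.
user_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],
        "properties": {
            "name": {"bsonType": "string"},
            "email": {"bsonType": "string"},
            "hobbies": {"bsonType": "array", "items": {"bsonType": "string"}},
        },
    }
}

# collMod attaches the validator to an existing collection (hypothetical 'users').
coll_mod_command = {
    "collMod": "users",
    "validator": user_validator,
    "validationLevel": "strict",   # check all inserts and updates
    "validationAction": "error",   # reject documents that fail the rules
}

print(json.dumps(coll_mod_command, indent=2))
```

Fields not listed under "properties" remain allowed, so the collection keeps its flexibility while the required fields are enforced in one place.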
#2 Modeling all relationships by embedding documents without considering data duplication or update complexity.
Wrong approach: Embedding full user profiles inside every order document to avoid joins.
Correct approach: Use referencing for large or frequently changing related data to avoid duplication and maintain consistency.
Root cause: Lack of understanding of trade-offs between embedding and referencing causes inefficient data models.
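The duplication problem can be shown with plain dictionaries. The collections, identifiers, and email addresses below are made up for illustration; the sketch compares what happens to embedded copies and to references when the user's profile changes.

```python
users = {"u1": {"name": "Alice", "email": "alice@example.com"}}

# Embedding: each order carries its own copy of the user profile.
embedded_orders = [
    {"_id": "o1", "total": 20, "user": dict(users["u1"])},
    {"_id": "o2", "total": 35, "user": dict(users["u1"])},
]

# Referencing: each order stores only the user's key.
referenced_orders = [
    {"_id": "o1", "total": 20, "user_id": "u1"},
    {"_id": "o2", "total": 35, "user_id": "u1"},
]

# The user changes their email: a single update in the users collection.
users["u1"]["email"] = "alice@new.example.com"

# Embedded copies are now out of date until every order is rewritten.
stale_copies = sum(
    1 for order in embedded_orders if order["user"]["email"] != users["u1"]["email"]
)

# Referenced orders pick up the change through a second lookup.
resolved_emails = [users[order["user_id"]]["email"] for order in referenced_orders]
print(stale_copies, resolved_emails)
```

Embedding saves the extra lookup when reading an order, while referencing keeps one authoritative copy; which cost matters more depends on how often the related data changes.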
#3 Assuming strong consistency and designing application logic that depends on immediate data synchronization.
Wrong approach: Relying on reading immediately updated data after a write in an eventually consistent document database.
Correct approach: Design application logic to handle eventual consistency or use transactions where supported.
Root cause: Confusing consistency models leads to subtle bugs and data anomalies.
Key Takeaways
Document databases store data as flexible, self-contained documents, allowing varied structures unlike fixed relational tables.
They enable faster development by supporting evolving schemas without costly migrations.
Native support for nested and complex data makes querying rich data simpler and more efficient.
Designed for horizontal scaling, document databases handle large, distributed workloads better than many relational systems.
Understanding trade-offs in consistency and data modeling is essential to use document databases effectively in production.