
Firestore collections and documents in GCP - Deep Dive

Overview - Firestore collections and documents
What is it?
Firestore is a cloud database that stores data in collections and documents. Collections are like folders that hold documents, and documents are like files that store data as key-value pairs. This structure helps organize data in a way that is easy to find and update. Firestore automatically scales and keeps data synced across devices.
Why it matters
Collections and documents give every piece of data a predictable place, so related data can be grouped together and fetched directly, even as your app grows. Because Firestore handles scaling, apps can stay responsive and serve more users without extra operational work from developers.
Where it fits
Before learning Firestore collections and documents, you should understand basic database concepts like tables and records. After this, you can learn about Firestore queries, security rules, and data modeling for complex apps.
Mental Model
Core Idea
Firestore organizes data as collections of documents, where each document holds data fields and can link to subcollections, creating a flexible, nested structure.
Think of it like...
Imagine a filing cabinet where each drawer is a collection, and inside each drawer are folders (documents) holding sheets of paper (data fields). A folder can also have its own smaller drawers (subcollections), each holding more folders, for more detailed data.
Firestore Structure:

┌───────────────┐
│ Collection    │
│  (Drawer)     │
└──────┬────────┘
       │ contains
       ▼
┌───────────────┐
│ Document      │
│  (Folder)     │
│ ┌───────────┐ │
│ │ Fields    │ │
│ │ (Data)    │ │
│ └───────────┘ │
│       │       │
│       ▼       │
│ Subcollection │
│  (Smaller     │
│   Drawer)     │
└───────────────┘
Build-Up - 7 Steps
1
Foundation: Understanding Firestore Basics
Concept: Firestore stores data in a NoSQL format using collections and documents instead of tables and rows.
Firestore is a cloud database where data is stored in documents. Each document is a set of key-value pairs, like a small JSON object. Documents are grouped inside collections. Collections do not store data directly but hold documents. This structure is flexible and easy to scale.
Result
You can store and retrieve data organized in collections and documents instead of traditional tables.
Understanding that Firestore uses collections and documents instead of tables helps you think differently about data organization in cloud databases.
2
Foundation: Documents as Data Containers
Concept: Documents hold the actual data fields and can contain simple or complex data types.
A document in Firestore contains fields with values like strings, numbers, booleans, arrays, or even nested objects. Each document has a unique ID within its collection. Documents are independent and can be created, updated, or deleted without affecting others.
Result
You can create documents with data fields and uniquely identify them within collections.
Knowing that documents are self-contained data units helps you design your data for easy access and updates.
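To make the idea of a document as a set of typed fields concrete, here is a minimal sketch in plain JavaScript. It does not call Firestore: `userDoc` and `describeFields` are hypothetical names invented for this illustration, and the type labels are informal ones that loosely follow the value types listed above.

```javascript
// A document's data, written as a plain JavaScript object.
// Field names and values are made up for illustration.
const userDoc = {
  name: "Ada",                            // string
  age: 36,                                // number
  isAdmin: false,                         // boolean
  tags: ["early-adopter", "beta"],        // array
  address: { city: "London", zip: "N1" }, // nested object (a "map")
};

// Label each top-level field with an informal type name.
function describeFields(data) {
  const result = {};
  for (const [key, value] of Object.entries(data)) {
    if (value === null) result[key] = "null";
    else if (Array.isArray(value)) result[key] = "array";
    else if (typeof value === "object") result[key] = "map";
    else result[key] = typeof value; // "string", "number", "boolean"
  }
  return result;
}

console.log(describeFields(userDoc));
```

An object like `userDoc` is the kind of value you would pass when writing a document; the document ID is kept separately and is not one of the fields.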
3
Intermediate: Collections as Groupings of Documents
Concept: Collections group related documents and provide a way to organize data hierarchically.
Collections are containers for documents. You can think of them as categories or groups. For example, a collection named 'users' might hold documents for each user. Collections do not store data themselves but organize documents. You can create multiple collections at the root or nested inside documents.
Result
You can organize your data logically by grouping documents into collections.
Understanding collections as logical groupings helps you plan your database structure for clarity and performance.
4
Intermediate: Subcollections for Nested Data
🤔 Before reading on: do you think documents can contain other documents directly, or only collections can hold documents? Commit to your answer.
Concept: Documents can have subcollections, allowing nested data structures without embedding all data in one document.
Firestore allows documents to have subcollections, which are collections nested inside a document. This lets you model complex relationships, like a document in the 'users' collection having an 'orders' subcollection. Subcollections are independent collections and can have their own documents and subcollections.
Result
You can create nested data structures that are easy to query and update separately.
Knowing that subcollections exist prevents the mistake of putting too much data in one document and helps keep data organized and scalable.
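The nesting described in steps 3 and 4 can be sketched with a toy in-memory model. This is only an illustration of the shape of the hierarchy, not how Firestore stores data and not its SDK: `ToyCollection` and `ToyDocument` are hypothetical classes, although their `doc()`, `collection()`, and `set()` methods are named to echo the calls used elsewhere in this article.

```javascript
// Toy in-memory model (not how Firestore stores data):
// a collection maps document IDs to documents; a document holds
// fields plus a map of named subcollections.
class ToyCollection {
  constructor() {
    this.docs = new Map();
  }
  // Return the document with this ID, creating it if missing.
  doc(id) {
    if (!this.docs.has(id)) this.docs.set(id, new ToyDocument());
    return this.docs.get(id);
  }
}

class ToyDocument {
  constructor() {
    this.fields = {};
    this.subcollections = new Map();
  }
  // Replace this document's fields.
  set(fields) {
    this.fields = { ...fields };
    return this;
  }
  // Return the named subcollection, creating it if missing.
  collection(name) {
    if (!this.subcollections.has(name)) {
      this.subcollections.set(name, new ToyCollection());
    }
    return this.subcollections.get(name);
  }
}

const users = new ToyCollection();
const user = users.doc("user123").set({ name: "Ada" });
user.collection("orders").doc("order1").set({ total: 42 });
user.collection("orders").doc("order2").set({ total: 7 });

// The user's own fields stay small; orders live beside them, not inside.
console.log(user.fields);                         // { name: 'Ada' }
console.log(user.collection("orders").docs.size); // 2
```

Note that a document never holds another document directly: the only way down the tree is through a named collection.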
5
Intermediate: Document IDs and Paths
🤔 Before reading on: do you think document IDs are assigned automatically or must always be set by the user? Commit to your answer.
Concept: Each document has a unique ID within its collection, and documents are accessed by their full path including collections and document IDs.
Documents have IDs that can be set manually or generated automatically by Firestore. The full path to a document includes the collection name and document ID, like 'users/user123'. This path is used to read, write, or listen to the document. Paths help locate data precisely in the database.
Result
You can uniquely identify and access any document using its path and ID.
Understanding document paths and IDs is key to accessing and managing data correctly in Firestore.
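Because path segments alternate between collection IDs and document IDs, the number of segments tells you what a path points at: an odd count names a collection, an even count names a document. The helper below is a small sketch of that rule in plain JavaScript; `parsePath` is a hypothetical function written for this example, not part of any Firestore SDK.

```javascript
// Classify a slash-separated Firestore-style path.
// Segments alternate collection ID, document ID, collection ID, ...
// so an odd count names a collection and an even count names a document.
function parsePath(path) {
  const segments = path.split("/").filter((s) => s.length > 0);
  if (segments.length === 0) {
    throw new Error("path must contain at least one segment");
  }
  const kind = segments.length % 2 === 0 ? "document" : "collection";
  return {
    kind,
    id: segments[segments.length - 1],
    // Path of the containing document or collection (null at the root).
    parent: segments.length > 1 ? segments.slice(0, -1).join("/") : null,
  };
}

console.log(parsePath("users").kind);                       // collection
console.log(parsePath("users/user123").kind);               // document
console.log(parsePath("users/user123/orders").kind);        // collection
console.log(parsePath("users/user123/orders/order1").kind); // document
```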
6
Advanced: Data Modeling with Collections and Documents
🤔 Before reading on: do you think embedding all related data in one document is better than using subcollections? Commit to your answer.
Concept: Choosing when to embed data in documents or use subcollections affects performance, scalability, and complexity.
Firestore limits a document to 1 MiB (1,048,576 bytes), so large or unbounded data should be split into subcollections. Embedding small, bounded related data in a document makes reads fast, because one read returns everything, but causes problems if the data keeps growing. Subcollections let you query parts of the data independently and keep documents small. Good data modeling balances these tradeoffs.
Result
You design your database to be efficient, scalable, and easy to query.
Knowing how to model data with collections and documents prevents performance problems and supports app growth.
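One way to reason about the embed-or-split decision is a rough size check. The sketch below is an assumption-laden heuristic, not Firestore's real size accounting: it uses the byte length of the JSON encoding as a stand-in, and `approxBytes`, `shouldEmbed`, and the 50% headroom default are hypothetical choices made for this example.

```javascript
const MAX_DOC_BYTES = 1024 * 1024; // 1 MiB document limit

// Rough proxy for document size: bytes of the JSON encoding.
// Firestore's real accounting differs, so treat this as an estimate only.
function approxBytes(data) {
  return new TextEncoder().encode(JSON.stringify(data)).length;
}

// Hypothetical rule of thumb: embed only while the estimate stays
// under a chosen fraction of the limit, leaving room to grow.
function shouldEmbed(data, headroom = 0.5) {
  return approxBytes(data) < MAX_DOC_BYTES * headroom;
}

const smallProfile = { bio: "Hello", links: ["https://example.com"] };
const manyPosts = {
  posts: Array.from({ length: 20000 }, (_, i) => ({
    title: `Post ${i}`,
    body: "x".repeat(100),
  })),
};

console.log(shouldEmbed(smallProfile)); // true
console.log(shouldEmbed(manyPosts));    // false
```

A size estimate is only half of the decision: data that grows without bound, such as posts or orders, is usually a subcollection candidate even while it is still small.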
7
Expert: Consistency and Transactions in Firestore
🤔 Before reading on: do you think separate writes to documents in different collections are applied together, all or nothing, automatically? Commit to your answer.
Concept: Firestore applies each single-document write atomically and offers transactions and batched writes to update multiple documents atomically.
A write to a single document is atomic: readers see either the old version or the new one, and reads served by the Firestore backend are strongly consistent, so a read after a successful write returns the latest data. Separate writes to different documents are independent, though, so one can succeed while another fails. When several documents across collections or subcollections must change together, use a transaction or a batched write. Both commit atomically, so either all changes succeed or none do. This is crucial for maintaining data integrity in complex operations.
Result
You can safely update multiple related documents without risking partial updates.
Understanding Firestore's consistency model and transactions helps avoid subtle bugs and data corruption in production.
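The all-or-nothing behavior can be illustrated with a toy batch over an in-memory `Map`. This is a sketch of the idea only: `ToyBatch` is a hypothetical class, not the Firestore SDK, and it assumes (as a simplification) that the only way an update can fail is a missing target document.

```javascript
// Toy all-or-nothing batch over an in-memory Map (not the Firestore SDK).
// Keys are document paths; values are plain field objects.
class ToyBatch {
  constructor(store) {
    this.store = store;
    this.ops = [];
  }
  // Queue a field update; nothing is written until commit().
  update(path, fields) {
    this.ops.push({ path, fields });
    return this;
  }
  // Validate every queued update first, then apply them all.
  commit() {
    for (const { path } of this.ops) {
      if (!this.store.has(path)) {
        throw new Error(`no document at ${path}; nothing was written`);
      }
    }
    for (const { path, fields } of this.ops) {
      this.store.set(path, { ...this.store.get(path), ...fields });
    }
  }
}

const store = new Map([["users/user1", { balance: 0 }]]);

// The second path does not exist, so the whole batch is rejected.
const batch = new ToyBatch(store)
  .update("users/user1", { balance: 100 })
  .update("accounts/acc1", { balance: 200 });
try {
  batch.commit();
} catch (err) {
  console.log(err.message);
}
console.log(store.get("users/user1")); // { balance: 0 }
```

Had the two updates been issued as separate writes, the first would have gone through and left `users/user1` changed while `accounts/acc1` was never updated.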
Under the Hood
Firestore stores data in a distributed cloud system where collections and documents are logical containers. Each document is stored as a separate entity with its own metadata and data fields. The system uses document paths to locate data quickly. Subcollections are stored as separate collections linked to parent documents. Firestore uses indexes to speed up queries and supports real-time synchronization by listening to document changes.
Why designed this way?
Firestore was designed to be flexible and scalable for mobile and web apps. Using collections and documents instead of tables allows dynamic schemas and nested data. This design supports offline use, real-time updates, and easy scaling without complex joins. Alternatives like relational databases were less suited for these needs, so Firestore chose a document model for simplicity and performance.
Firestore Internal Structure:

┌───────────────┐
│ Client App    │
└──────┬────────┘
       │ Reads/Writes
       ▼
┌───────────────┐
│ Firestore API │
└──────┬────────┘
       │ Maps paths
       ▼
┌───────────────┐
│ Distributed   │
│ Storage Nodes │
│ ┌───────────┐ │
│ │ Documents │ │
│ │ & Indexes │ │
│ └───────────┘ │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Do you think Firestore collections store data directly, or only documents do? Commit to your answer.
Common Belief: Collections store data directly like tables in a database.
Reality: Collections only hold documents; they do not store data themselves.
Why it matters: Thinking collections store data can lead to confusion when queries return documents, not collections, causing errors in data handling.
Quick: Do you think documents can contain other documents directly? Commit to your answer.
Common Belief: Documents can contain other documents inside them.
Reality: Documents cannot contain documents directly; only collections can hold documents, including subcollections inside documents.
Why it matters: Misunderstanding this leads to trying to nest documents inside documents, which Firestore does not support, causing design and query problems.
Quick: Do you think Firestore automatically handles large nested data inside one document? Commit to your answer.
Common Belief: You can store unlimited nested data inside a single document.
Reality: Documents have a size limit of 1 MiB (1,048,576 bytes); large or growing data should be split into subcollections.
Why it matters: Ignoring document size limits causes write failures and poor app performance.
Quick: Do you think Firestore keeps updates to multiple documents consistent with each other without transactions? Commit to your answer.
Common Belief: Updates to several documents are applied together automatically, even across collections.
Reality: Each single-document write is atomic, but separate writes are independent; updating multiple documents as one unit requires a transaction or a batched write.
Why it matters: Assuming automatic all-or-nothing behavior can leave partial updates and inconsistent data in multi-document operations.
Expert Zone
1
Subcollections are independent collections and can have their own security rules and indexes, allowing fine-grained access control.
2
Firestore's automatic indexing can be customized or disabled per field to optimize query performance and cost.
3
Document IDs can be chosen to encode meaningful information or use random IDs to avoid hotspots in distributed storage.
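The third point can be illustrated with a small sketch that contrasts sequential IDs with random ones. Firestore's SDKs generate IDs for you when you do not supply one, so `randomId` here is a hypothetical helper for illustration only; the alphabet and the default length of 20 are assumptions chosen for the example.

```javascript
const ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

// Build a random ID of the given length.
// Math.random() is fine for spreading keys in a demo but is not
// cryptographically secure.
function randomId(length = 20) {
  let id = "";
  for (let i = 0; i < length; i++) {
    id += ALPHABET[Math.floor(Math.random() * ALPHABET.length)];
  }
  return id;
}

// Sequential IDs sort next to each other; random IDs do not.
const sequential = [1, 2, 3].map((n) => `order-${String(n).padStart(6, "0")}`);
const random = [1, 2, 3].map(() => randomId());
console.log(sequential);
console.log(random);
```

IDs that sort close together and are written at a high rate tend to land in the same region of storage, which is the hotspot the random scheme avoids.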
When NOT to use
Firestore collections and documents are not ideal for complex relational data requiring multi-table joins or heavy transactions. In such cases, a relational database like Cloud SQL is better. Also, for very large datasets with complex analytics, BigQuery or other data warehouses are more suitable.
Production Patterns
In production, developers rely on Firestore reads being shallow: fetching a document does not fetch its subcollections, so large subcollections are read only when they are needed. They design data models to keep documents small and use transactions for critical multi-document updates. Security rules are applied at collection and document levels to protect data. Offline support relies on the client SDKs caching the documents an app has read, so apps listen only to the data they actually need.
Connections
Relational Databases
Firestore collections and documents are a NoSQL alternative to tables and rows in relational databases.
Understanding relational tables helps grasp Firestore's different approach to organizing data without fixed schemas or joins.
JSON Data Format
Firestore documents store data as key-value pairs similar to JSON objects.
Knowing JSON structure helps understand how Firestore documents hold nested and flexible data.
File Systems
Firestore's collections and documents resemble folders and files in a file system hierarchy.
Recognizing this similarity aids in visualizing data organization and navigation in Firestore.
Common Pitfalls
#1 Trying to store large nested data inside one document, exceeding the size limit.
Wrong approach: await db.collection('users').doc('user1').set({ profile: { bio: '...', posts: [/* thousands of posts */] } });
Correct approach: await db.collection('users').doc('user1').set({ profile: { bio: '...' } }); await db.collection('users').doc('user1').collection('posts').add({ /* post data */ });
Root cause: Misunderstanding document size limits and how to use subcollections for large or growing data.
#2 Assuming collections can be queried like documents with data fields.
Wrong approach: const data = db.collection('users').data(); // incorrect, collections have no data() method
Correct approach: const snapshot = await db.collection('users').get(); snapshot.forEach(doc => console.log(doc.data()));
Root cause: Confusing collections as data holders instead of containers for documents.
#3 Updating multiple documents without transactions, causing partial updates.
Wrong approach: await db.collection('users').doc('user1').update({ balance: 100 }); await db.collection('accounts').doc('acc1').update({ balance: 200 }); // if the second write fails, the first has already been applied
Correct approach: const batch = db.batch(); batch.update(db.collection('users').doc('user1'), { balance: 100 }); batch.update(db.collection('accounts').doc('acc1'), { balance: 200 }); await batch.commit();
Root cause: Not using transactions or batched writes for atomic multi-document updates.
Key Takeaways
Firestore organizes data in collections and documents, where collections hold documents and documents store data fields.
Documents have unique IDs and can contain subcollections, enabling flexible and nested data structures.
Understanding document size limits and when to use subcollections is crucial for scalable data modeling.
Each single-document write is atomic and backend reads are strongly consistent, but updating several documents together requires a transaction or a batched write.
Proper use of collections, documents, and transactions ensures efficient, reliable, and secure cloud data storage.