
Generating documentation site in dbt - Deep Dive

Overview - Generating documentation site
What is it?
Generating a documentation site in dbt means creating a website that shows details about your data models, tests, and sources. This site helps you and your team understand how data flows and is transformed in your project. It brings descriptions, lineage graphs, and the tests defined on each model together in one place, which makes your data project easier to use and maintain.
Why it matters
Without a documentation site, understanding complex data projects is hard and slow. Teams waste time guessing what each model does or how data is connected. A documentation site solves this by making all information clear and accessible. This improves trust in data and speeds up collaboration and troubleshooting.
Where it fits
Before generating documentation, you should know how to build dbt models and write tests. After creating the site, you can learn how to deploy it for your team or integrate it with other tools like data catalogs or BI platforms.
Mental Model
Core Idea
Generating a documentation site in dbt turns your project’s metadata and descriptions into a clear, interactive website that explains your data models and their relationships.
Think of it like...
It's like creating a detailed map for a city you built, showing every street, building, and how they connect, so visitors can easily find their way and understand the layout.
┌───────────────────────────────┐
│        dbt Project            │
│ ┌───────────────┐             │
│ │ Models & Tests│             │
│ └──────┬────────┘             │
│        │                      │
│        ▼                      │
│ ┌───────────────┐             │
│ │ Metadata &    │             │
│ │ Descriptions  │             │
│ └──────┬────────┘             │
│        │                      │
│        ▼                      │
│ ┌───────────────┐             │
│ │ Documentation │────────────▶│
│ │ Site (HTML)   │             │
│ └───────────────┘             │
└───────────────────────────────┘
Build-Up - 7 Steps
1
Foundation: What is a dbt documentation site
🤔
Concept: Introducing the idea of a documentation site in dbt and what it contains.
A dbt documentation site is a website generated from your dbt project. It shows your data models, tests, and sources with descriptions and lineage. This site helps everyone understand your data transformations and quality checks in one place.
Result
You understand that dbt can create a website summarizing your data project details.
Knowing that dbt can automatically build a documentation site helps you see how it supports clear communication in data teams.
2
Foundation: How dbt stores documentation info
🤔
Concept: Understanding where dbt keeps the information used to build the documentation site.
dbt reads descriptions and metadata from your project files, such as schema.yml files and the model SQL files they describe. Most dbt commands write this information to manifest.json; dbt docs generate also queries your warehouse for table and column metadata and writes it to catalog.json. These two files are the source for the documentation site.
Result
You see that your written descriptions and tests become part of the documentation data.
Recognizing that documentation is built from your project files encourages writing clear descriptions and tests.
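To see how a description written in schema.yml ends up in the manifest, the sketch below reads a manifest.json file and lists each model with its description. It assumes the usual manifest layout (a top-level `nodes` mapping whose entries carry `resource_type`, `name`, and `description`); the function name `model_descriptions` is a hypothetical helper, not part of dbt.

```python
import json
from pathlib import Path


def model_descriptions(manifest_path):
    """Return {model name: description} for every model in a manifest.json file.

    Hypothetical helper for illustration; assumes the manifest has a top-level
    "nodes" mapping with "resource_type", "name", and "description" fields.
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    return {
        node["name"]: node.get("description") or ""
        for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model"
    }
```

Calling `model_descriptions("target/manifest.json")` after a dbt run should return the same text you wrote in your YAML files, with an empty string for models that have no description yet.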
3
Intermediate: Running the dbt docs generate command
🤔 Before reading on: do you think dbt docs generate creates the website directly or just prepares files? Commit to your answer.
Concept: Learning the command that prepares the documentation site files.
The command dbt docs generate reads your project metadata and writes the documentation site files into the target directory: manifest.json, catalog.json, and an index.html page whose bundled JavaScript and CSS render your models, tests, and lineage in the browser.
Result
You get a folder with all files needed for the documentation site, but it is not yet visible in a browser.
Understanding that dbt docs generate prepares the site files separately from viewing helps you control when and how to share documentation.
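A quick way to confirm that generation worked is to check the target directory for the files the site needs. The sketch below assumes those files are index.html, manifest.json, and catalog.json in a directory named target; `missing_docs_artifacts` is a hypothetical helper name.

```python
from pathlib import Path

# Files this sketch assumes `dbt docs generate` leaves in the target directory.
EXPECTED_ARTIFACTS = ("index.html", "manifest.json", "catalog.json")


def missing_docs_artifacts(target_dir="target"):
    """Return the expected docs files that are absent from target_dir.

    Hypothetical helper for illustration; an empty list means all three
    files are present.
    """
    target = Path(target_dir)
    return [name for name in EXPECTED_ARTIFACTS if not (target / name).is_file()]
```

After running dbt docs generate, `missing_docs_artifacts()` returning an empty list suggests the site files are ready to be served or copied elsewhere.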
4
Intermediate: Viewing the documentation site locally
🤔 Before reading on: do you think dbt docs serve runs a web server or opens a file directly? Commit to your answer.
Concept: How to view the generated documentation site on your computer.
After generating docs, run dbt docs serve. This starts a local web server and opens the documentation site in your browser. You can explore models, tests, and lineage interactively. This helps you check your documentation before sharing.
Result
You see a fully interactive website showing your dbt project details in your browser.
Knowing how to serve docs locally lets you review and improve documentation before publishing.
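Because the generated site is a set of static files, any local HTTP server can display it. The sketch below is not dbt's own implementation; it only illustrates the idea with Python's standard library. The helper name `make_docs_server` and the default port are assumptions for the example.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_docs_server(target_dir="target", port=8080):
    """Build an HTTP server, bound to localhost only, that serves target_dir.

    Hypothetical helper for illustration; it mimics the idea of a local
    preview server and is not how dbt itself is implemented.
    """
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(target_dir))
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_docs_server().serve_forever()` and opening http://localhost:8080 in a browser would show the generated index.html. Binding to 127.0.0.1 keeps the preview reachable only from your own machine.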
5
Intermediate: Adding descriptions and tests for docs
🤔 Before reading on: do you think descriptions affect the docs site or only the code? Commit to your answer.
Concept: How writing descriptions and tests improves the documentation site content.
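Descriptions written in schema.yml files (directly, or through docs blocks in Markdown files that the YAML references) appear in the docs site, alongside the SQL of each model. Tests declared on a model or column are listed with it, showing which data quality checks exist. Clear descriptions and tests make the documentation site more useful and trustworthy for users.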
Result
Your documentation site shows meaningful explanations and the tests attached to each model, making it easier to understand your data.
Recognizing that documentation quality depends on your input motivates writing good descriptions and tests.
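One way to act on this is to measure documentation coverage from the manifest. The sketch below lists models with an empty description and models that no test depends on. It assumes test nodes record the models they check under `depends_on.nodes`; `documentation_gaps` is a hypothetical helper name.

```python
import json
from pathlib import Path


def documentation_gaps(manifest_path):
    """Report models lacking a description and models with no test attached.

    Hypothetical helper for illustration; assumes each test node lists the
    nodes it checks under depends_on["nodes"].
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    nodes = manifest.get("nodes", {})
    models = {uid: n for uid, n in nodes.items() if n.get("resource_type") == "model"}

    # Collect the unique_ids of every node that at least one test depends on.
    tested = set()
    for node in nodes.values():
        if node.get("resource_type") == "test":
            tested.update(node.get("depends_on", {}).get("nodes", []))

    return {
        "undescribed": sorted(
            n["name"] for n in models.values() if not (n.get("description") or "").strip()
        ),
        "untested": sorted(n["name"] for uid, n in models.items() if uid not in tested),
    }
```

A check like this could run in CI so that new models without descriptions or tests are flagged before the docs are published.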
6
Advanced: Customizing documentation site appearance
🤔 Before reading on: do you think dbt allows changing docs site style easily or not? Commit to your answer.
Concept: Exploring ways to customize the look and feel of the documentation site.
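dbt supports a few built-in customizations. A docs block named __overview__ in a Markdown file replaces the default overview page, other docs blocks let you write long-form descriptions in Markdown, and in recent dbt versions the docs config in dbt_project.yml can hide nodes or set their color in the lineage graph. Custom CSS and logos are not a built-in option; teams that want company branding typically post-process the generated index.html.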
Result
You create a documentation site that looks professional and fits your team's style.
Knowing customization options helps you make documentation more engaging and aligned with your team's culture.
7
Expert: Deploying the documentation site for teams
🤔 Before reading on: do you think dbt docs serve is suitable for team sharing or only local use? Commit to your answer.
Concept: How to share the documentation site with your whole team or organization.
dbt docs serve is for local use only. To share docs, you deploy the generated site files to a web server or cloud storage like S3. You can automate this in CI/CD pipelines. This makes docs accessible to everyone anytime, improving collaboration.
Result
Your team can access up-to-date documentation from a shared website without running dbt locally.
Understanding deployment options ensures documentation is not just created but effectively shared and maintained.
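The deployment step can be as small as copying the generated files to a location your web server or storage bucket publishes. The sketch below copies them to a local directory and fails early if any file is missing. It assumes the site consists of index.html, manifest.json, and catalog.json; `publish_docs` and the directory names are hypothetical, and uploading to S3 or similar storage would replace the copy step.

```python
import shutil
from pathlib import Path

# Files this sketch assumes make up the generated docs site.
DOCS_FILES = ("index.html", "manifest.json", "catalog.json")


def publish_docs(target_dir, publish_dir):
    """Copy the generated docs files into publish_dir and return the new paths.

    Hypothetical helper for illustration; raises FileNotFoundError before
    copying anything if one of the expected files is missing.
    """
    source = Path(target_dir)
    destination = Path(publish_dir)

    missing = [name for name in DOCS_FILES if not (source / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing {missing}; run `dbt docs generate` first")

    destination.mkdir(parents=True, exist_ok=True)
    return [Path(shutil.copy2(source / name, destination / name)) for name in DOCS_FILES]
```

In a CI/CD pipeline, a step like this would run right after dbt docs generate so the shared site always reflects the latest merge.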
Under the Hood
When you run dbt docs generate, dbt writes your project metadata to two JSON files: manifest.json, which describes models, tests, sources, and their relationships, and catalog.json, which holds table and column metadata read from the warehouse. It also copies a prebuilt index.html into the target directory. That page is a single-page application: it loads the two JSON files and renders the interactive lineage graphs and searchable model info client-side in the browser.
Why designed this way?
dbt separates generating docs from serving them to keep flexibility. Generating static files means docs can be hosted anywhere without running dbt. Using JSON metadata allows easy extension and integration with other tools. This design balances automation, performance, and ease of sharing.
┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ dbt Project   │──────▶│ Metadata JSON │──────▶│ Prebuilt      │
│ (models, yml) │       │ (manifest.json│       │ index.html    │
└───────────────┘       │  catalog.json)│       └──────┬────────┘
                        └───────────────┘              │
                                                       ▼
                                              ┌─────────────────┐
                                              │ Documentation   │
                                              │ Site (HTML/CSS) │
                                              └─────────────────┘
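The way the two metadata files complement each other can be sketched by merging them for a single model: descriptions come from manifest.json and column types from catalog.json. The example assumes both files key their `nodes` by the same unique_id and that catalog columns carry a `type` field; `column_docs` is a hypothetical helper, and the case-insensitive match is a simplification for warehouses that change the case of column names.

```python
import json
from pathlib import Path


def column_docs(target_dir, unique_id):
    """Merge column descriptions (manifest) with column types (catalog) for one node.

    Hypothetical helper for illustration; assumes both files index nodes by
    the same unique_id, for example "model.my_project.orders".
    """
    target = Path(target_dir)
    manifest = json.loads((target / "manifest.json").read_text(encoding="utf-8"))
    catalog = json.loads((target / "catalog.json").read_text(encoding="utf-8"))

    described = manifest["nodes"][unique_id].get("columns", {})
    cataloged = catalog.get("nodes", {}).get(unique_id, {}).get("columns", {})

    # Match names case-insensitively, since warehouses may fold column case.
    descriptions = {
        name.lower(): col.get("description") or "" for name, col in described.items()
    }
    return [
        {
            "column": name,
            "type": col.get("type"),
            "description": descriptions.get(name.lower(), ""),
        }
        for name, col in cataloged.items()
    ]
```

Columns that exist in the warehouse but were never described in YAML come back with an empty description, which is roughly what a reader sees as a blank cell in the docs site.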
Myth Busters - 4 Common Misconceptions
Quick: Does running dbt docs generate automatically show the docs in your browser? Commit yes or no.
Common Belief: Running dbt docs generate immediately opens the documentation site in a browser.
Reality: dbt docs generate only creates the static files. You must run dbt docs serve to view the site locally.
Why it matters: Expecting docs to open automatically can cause confusion and wasted time searching for the site.
Quick: Do you think the documentation site updates automatically when you change model descriptions? Commit yes or no.
Common Belief: The docs site updates in real-time as you edit model descriptions or tests.
Reality: You must rerun dbt docs generate to update the documentation site with changes.
Why it matters: Not regenerating docs leads to outdated information, causing misunderstandings and errors.
Quick: Is dbt docs serve designed for team-wide sharing of documentation? Commit yes or no.
Common Belief: dbt docs serve is suitable for sharing docs with your whole team.
Reality: dbt docs serve runs a web server on your own machine for local preview; it is not meant for team sharing.
Why it matters: Using dbt docs serve for sharing can cause access issues and confusion among team members.
Quick: Does adding descriptions in SQL files alone guarantee they appear in the docs site? Commit yes or no.
Common Belief: Descriptions in SQL files automatically appear in the documentation site without extra steps.
Reality: Descriptions must be added as description fields in schema.yml files (optionally referencing docs blocks) to appear in the docs site; SQL comments are not picked up as descriptions.
Why it matters: Misplaced descriptions lead to missing or incomplete documentation, reducing its usefulness.
Expert Zone
1
The manifest.json file contains detailed dependency graphs that enable lineage visualization and impact analysis beyond simple documentation.
2
Custom macros can inject dynamic content into documentation, allowing advanced users to automate description updates or include external metadata.
3
Integrating dbt docs with external data catalogs requires understanding the JSON schema and possibly transforming it to match catalog APIs.
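The impact analysis mentioned in the first point can be sketched with a breadth-first walk over the manifest's dependency data. The example assumes the manifest exposes a top-level `child_map` that maps each unique_id to the nodes directly downstream of it; `downstream_nodes` is a hypothetical helper name.

```python
import json
from collections import deque
from pathlib import Path


def downstream_nodes(manifest_path, unique_id):
    """Return every node reachable downstream of unique_id, breadth-first.

    Hypothetical helper for illustration; assumes manifest.json has a
    top-level "child_map" of unique_id -> list of direct children.
    """
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    child_map = manifest.get("child_map", {})

    ordered = []
    visited = {unique_id}
    queue = deque([unique_id])
    while queue:
        current = queue.popleft()
        for child in child_map.get(current, []):
            if child not in visited:
                visited.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered
```

Running this for a model you plan to change gives a list of the models and tests that could be affected, which is the same information the lineage graph shows visually.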
When NOT to use
Generating a documentation site is not ideal for very small projects where simple README files suffice. For real-time collaborative editing or rich interactive docs, consider dedicated documentation platforms like MkDocs or Confluence integrated with dbt metadata exports.
Production Patterns
In production, teams automate docs generation and deployment in CI/CD pipelines, hosting the site on cloud storage or internal web servers. They combine docs with data catalogs and alerting systems to maintain data quality and transparency at scale.
Connections
Data Lineage
The documentation site visualizes data lineage graphs showing dependencies between models.
Understanding how dbt docs show lineage helps grasp the flow of data transformations and their impact.
Static Site Generators
The documentation website is a static site: a prebuilt HTML page that reads JSON metadata in the browser, with no server-side code.
Knowing static site generation principles clarifies why docs are fast, portable, and easy to host.
Software Documentation
Generating docs in dbt parallels creating API docs in software, turning code and comments into user-friendly websites.
Seeing dbt docs as software docs highlights the importance of clear descriptions and automated updates.
Common Pitfalls
#1 Expecting the documentation site to update automatically after editing descriptions.
Wrong approach: Edit schema.yml descriptions and immediately open the old docs site without regenerating (no dbt docs generate run).
Correct approach: After editing descriptions, run:
    dbt docs generate
    dbt docs serve
Root cause: Misunderstanding that docs generation is a separate step from editing project files.
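A simple guard against this pitfall is to compare file timestamps before opening or publishing the site. The sketch below treats the docs as stale when any project file is newer than catalog.json. It assumes catalog.json is only rewritten by dbt docs generate, which makes its modification time a rough proxy for the last docs build; `docs_are_stale` and the file patterns are hypothetical choices.

```python
from pathlib import Path


def docs_are_stale(project_dir, target_dir="target",
                   patterns=("*.sql", "*.yml", "*.yaml", "*.md")):
    """Return True if any project file is newer than the generated catalog.json.

    Hypothetical helper for illustration; assumes catalog.json is rewritten
    only by `dbt docs generate`. Files inside target_dir are ignored.
    """
    project = Path(project_dir)
    target = (project / target_dir).resolve()
    catalog = target / "catalog.json"
    if not catalog.is_file():
        return True  # Docs were never generated.

    generated_at = catalog.stat().st_mtime
    for pattern in patterns:
        for path in project.rglob(pattern):
            if target in path.resolve().parents:
                continue  # Skip compiled output written by dbt itself.
            if path.stat().st_mtime > generated_at:
                return True
    return False
```

When `docs_are_stale(".")` returns True, rerunning dbt docs generate before serving or deploying avoids showing outdated descriptions.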
#2 Trying to share the documentation site by sending the local URL from dbt docs serve.
Wrong approach: Tell teammates to open http://localhost:8080 to see docs served locally.
Correct approach: Deploy the generated docs site folder to a shared web server or cloud storage accessible to the team.
Root cause: Not realizing dbt docs serve runs a local server only accessible on your machine.
#3 Adding descriptions only inside SQL model files without schema.yml entries.
Wrong approach:
    # In model.sql
    -- This is my model
    SELECT * FROM source_table
Correct approach:
    # In schema.yml
    models:
      - name: model_name
        description: "This model calculates sales totals."
Root cause: Not knowing that the dbt docs site pulls descriptions mainly from schema.yml, not SQL comments.
Key Takeaways
Generating a documentation site in dbt transforms your project metadata into an interactive website that explains your data models and tests clearly.
You must run dbt docs generate to create the site files and dbt docs serve to view them locally; these are separate steps.
Good documentation depends on writing clear descriptions and tests in your project files, especially schema.yml.
For team sharing, deploy the generated static site to a web server or cloud storage, as dbt docs serve is only for local use.
Understanding the internal JSON metadata and static site generation helps you customize, integrate, and automate your documentation effectively.