Documenting models in YAML helps explain what your data models do. It makes your work clear for others and yourself later.
Documenting models in YAML in dbt
Introduction
Syntax
version: 2

models:
  - name: model_name
    description: 'Description of the model'
    columns:
      - name: column_name
        description: 'Description of the column'
The YAML file usually starts with version: 2 to specify the schema version.
Indentation defines nesting in YAML. Use spaces, not tabs, and keep to 2 spaces per level for clarity.
Examples
customers model with two columns described.
version: 2

models:
  - name: customers
    description: 'Contains customer details'
    columns:
      - name: customer_id
        description: 'Unique ID for each customer'
      - name: email
        description: 'Customer email address'
sales model with sale ID and amount columns.
version: 2

models:
  - name: sales
    description: 'Sales transactions data'
    columns:
      - name: sale_id
        description: 'Unique sale identifier'
      - name: amount
        description: 'Sale amount in USD'
Sample Program
This YAML documents an orders model with three columns and their descriptions.
version: 2

models:
  - name: orders
    description: 'Table containing order information'
    columns:
      - name: order_id
        description: 'Unique identifier for each order'
      - name: order_date
        description: 'Date when the order was placed'
      - name: customer_id
        description: 'ID of the customer who placed the order'
Important Notes
Always keep your YAML files well-indented to avoid errors.
Descriptions help others understand your data without reading code.
You can also declare tests and tags in the same YAML file, alongside the descriptions.
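As a rough sketch of that last note, the orders model from the sample program could carry a tag and two column tests next to its descriptions. The tag name finance is a made-up example, and the exact keys depend on your dbt version, so treat this as an illustration rather than a reference layout.

```yaml
version: 2

models:
  - name: orders
    description: 'Table containing order information'
    config:
      tags: ['finance']        # hypothetical tag name, chosen for illustration
    columns:
      - name: order_id
        description: 'Unique identifier for each order'
        tests:                 # newer dbt versions also accept the key data_tests
          - unique             # built-in generic test: no duplicate values
          - not_null           # built-in generic test: no missing values
```

The descriptions stay exactly where they were; the tag and tests are extra keys on the same model and column entries.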
Summary
Documenting models in YAML makes your data project clear and easy to use.
Use models and columns sections to add descriptions.
Good documentation helps teamwork and future maintenance.
Practice
1. What is the main purpose of documenting models in YAML in a dbt project?
easy
Solution
Step 1: Understand the role of YAML documentation
YAML files in dbt are used to add metadata like descriptions, not to run code or store data.
Step 2: Identify the benefit of documentation
Adding descriptions for models and columns helps team members understand the data and maintain the project easily.
Final Answer:
To add clear descriptions for models and columns to improve understanding -> Option C
Quick Check:
Documentation purpose = Add descriptions [OK]
Hint: Documentation in YAML means adding descriptions, not code [OK]
Common Mistakes:
- Thinking YAML runs SQL code
- Confusing YAML with data storage
- Ignoring the importance of descriptions
2. Which of the following is the correct way to start documenting a model named orders in a YAML file?
easy
Solution
Step 1: Recall YAML syntax for dbt model documentation
dbt expects a list under models: with each model as a dictionary containing name and description.
Step 2: Match the correct structure
The structure below correctly uses a list with a dictionary having name and description. Other options misuse keys or structure.
models:
  - name: orders
    description: 'Contains order details'
Final Answer:
The structure shown in Step 2 -> Option D
Quick Check:
Model list with name and description = the structure shown in Step 2 [OK]
Hint: Use dash (-) for list items under models in YAML [OK]
Common Mistakes:
- Using singular 'model' instead of 'models'
- Not using dash for list items
- Incorrect indentation or key names
3. Given this YAML snippet documenting a model and its columns:
models:
  - name: customers
    description: 'Customer information'
    columns:
      - name: id
        description: 'Unique customer ID'
      - name: email
        description: 'Customer email address'
What will dbt show as the description for the email column?
medium
Solution
Step 1: Locate the column description in YAML
The email column is listed under columns with its own description key.
Step 2: Identify the description text for the email column
The description for email is 'Customer email address', which dbt will display for that column.
Final Answer:
Customer email address -> Option B
Quick Check:
Column description matches YAML text [OK]
Hint: Column descriptions are under columns > name in YAML [OK]
Common Mistakes:
- Confusing model description with column description
- Missing indentation causing YAML parsing errors
- Assuming no description if not repeated
4. You wrote this YAML to document a model but dbt throws an error:
models:
  - name: sales
    description: 'Sales data'
    columns:
      name: amount
      description: 'Sale amount'
What is the error in this YAML?
medium
Solution
Step 1: Check YAML list syntax for columns
Each column should be a list item with a dash (-) before its dictionary of keys.
Step 2: Identify missing dash in columns
The name and description keys under columns lack the dash, so YAML treats them as keys of columns instead of list items.
Final Answer:
Missing dash (-) before column name and description -> Option A
Quick Check:
List items need dash (-) in YAML [OK]
Hint: Use dash (-) before each column in columns list [OK]
Common Mistakes:
- Forgetting dash for list items
- Misplacing description keys
- Confusing YAML lists and dictionaries
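For reference, a corrected version of the snippet from question 4 might look like the sketch below. The only change is the dash before name: amount, which turns the entry into a list item under columns; the model and column names are the ones from the question.

```yaml
models:
  - name: sales
    description: 'Sales data'
    columns:
      - name: amount               # the dash makes this a list item under columns
        description: 'Sale amount' # aligned with name, so it belongs to the same column
```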
5. You want to document two models, users and transactions, each with columns and descriptions. Which YAML structure correctly documents both models with their columns?
hard
Solution
Step 1: Understand YAML list structure for multiple models
dbt expects models as a list of dictionaries, each with name, description, and columns as a list.
Step 2: Evaluate each option's structure
The structure below correctly uses a list with two model dictionaries, each with proper keys and column lists:
models:
  - name: users
    description: 'User data'
    columns:
      - name: user_id
        description: 'User identifier'
  - name: transactions
    description: 'Transaction data'
    columns:
      - name: transaction_id
        description: 'Transaction identifier'
The other options misuse keys or structure. For example, the following version repeats the name, description, and columns keys at the same level instead of using list items:
models:
  name: users
  description: 'User data'
  columns:
    - name: user_id
      description: 'User identifier'
  name: transactions
  description: 'Transaction data'
  columns:
    - name: transaction_id
      description: 'Transaction identifier'
Final Answer:
The first structure shown in Step 2 -> Option A
Quick Check:
Multiple models as list items with name and columns = the first structure shown in Step 2 [OK]
Hint: List each model with dash (-) and include columns as lists [OK]
Common Mistakes:
- Using model names as keys instead of list items
- Repeating keys at same level
- Not using dash for multiple models
