Documenting models in YAML in dbt - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is the purpose of documenting models in YAML in dbt?
Documenting models in YAML records what each model does, what its columns mean, and any other details a reader needs. This makes it easier for others to understand the models and use them correctly.
beginner
How do you define a model description in a dbt YAML file?
You add a description field to the model's entry in the YAML file, at the same level as its name, to explain what the model represents or does.
beginner
What is the structure to document columns of a model in dbt YAML?
Under the model, you use a columns list. Each column has a name and a description to explain its meaning.
intermediate
Why is it helpful to document tests in the YAML file for dbt models?
Tests listed in the YAML file, such as unique or not_null, show which checks run against the data. Keeping them next to the column descriptions makes data quality expectations visible and easier to maintain.
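As a minimal sketch of what this can look like, the snippet below attaches two of dbt's built-in checks to a column. The model and column names are placeholders, and the file is assumed to be a properties file such as models/schema.yml.

```yaml
# Hypothetical properties file; names are placeholders for illustration.
models:
  - name: customers
    description: 'Contains customer details'
    columns:
      - name: customer_id
        description: 'Unique ID for each customer'
        tests:          # dbt 1.8+ also accepts `data_tests:` for this list
          - unique      # fails if any customer_id appears more than once
          - not_null    # fails if any customer_id is NULL
```

Because the checks sit directly under the column they apply to, a reader sees the description and the quality rules in one place.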
beginner
Give an example of a simple YAML snippet documenting a model and one column.
Example:
models:
  - name: customers
    description: 'Contains customer details'
    columns:
      - name: customer_id
        description: 'Unique ID for each customer'
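When a description needs more than a short phrase, a YAML block scalar is one option. The sketch below is illustrative only: the description text is invented, and it assumes the same customers model as the example above.

```yaml
models:
  - name: customers
    # `>` folds the indented lines below into a single paragraph
    description: >
      One row per customer. This text is a placeholder showing
      how a longer description can span several lines.
    columns:
      - name: customer_id
        description: 'Unique ID for each customer'
```

Quoted single-line strings remain the simpler choice for short descriptions.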
In dbt YAML documentation, where do you write the description of a model?
A. Inside the SQL model file as a comment
B. In a separate markdown file
C. Under the model's name with a 'description' field
D. In the dbt_project.yml file
How do you document columns in a dbt YAML file?
A. Using a 'columns' list with 'name' and 'description' for each column
B. By adding comments in the SQL file
C. By creating a separate CSV file
D. By naming columns in the dbt_project.yml file
What is one benefit of documenting tests in dbt YAML?
A. It automatically fixes data errors
B. It speeds up SQL query execution
C. It replaces the need for SQL models
D. It helps explain data quality checks clearly
Which of these is NOT typically documented in a dbt YAML model file?
A. Data source connection details
B. Column descriptions
C. Model description
D. Tests on columns
What format does dbt use for model documentation?
A. JSON
B. YAML
C. XML
D. Markdown
Explain how to document a dbt model and its columns using YAML.
Think about the YAML structure with models, columns, and descriptions.
Why is documenting tests in the YAML file important for dbt models?
Consider how documentation helps others understand what checks are done.

      Practice

      1. What is the main purpose of documenting models in YAML in a dbt project?
      easy
      A. To write SQL queries inside YAML files
      B. To execute dbt models automatically
      C. To add clear descriptions for models and columns to improve understanding
      D. To store raw data files

      Solution

      1. Step 1: Understand the role of YAML documentation

        YAML files in dbt are used to add metadata like descriptions, not to run code or store data.
      2. Step 2: Identify the benefit of documentation

        Adding descriptions for models and columns helps team members understand the data and maintain the project easily.
      3. Final Answer:

        To add clear descriptions for models and columns to improve understanding -> Option C
      4. Quick Check:

        Documentation purpose = Add descriptions [OK]
      Hint: Documentation in YAML means adding descriptions, not code [OK]
      Common Mistakes:
      • Thinking YAML runs SQL code
      • Confusing YAML with data storage
      • Ignoring the importance of descriptions
      2. Which of the following is the correct way to start documenting a model named orders in a YAML file?
      easy
      A.
        models: orders
        description: 'Contains order details'
      B.
        model:
          name: orders
          description: 'Contains order details'
      C.
        models:
          - orders:
              description: 'Contains order details'
      D.
        models:
          - name: orders
            description: 'Contains order details'

      Solution

      1. Step 1: Recall YAML syntax for dbt model documentation

        dbt expects a list under models: with each model as a dictionary containing name and description.
      2. Step 2: Match the correct structure

        Option D is the only choice that puts a list item (the dash) under models: and gives that item both a name and a description key. Option A has no list item, option B uses the singular key model, and option C uses the model name itself as a key instead of a name field.
      3. Final Answer:

        A models: list whose item has name: orders and a description -> Option D
      4. Quick Check:

        Model list with name and description = Option D [OK]
      Hint: Use dash (-) for list items under models in YAML [OK]
      Common Mistakes:
      • Using singular 'model' instead of 'models'
      • Not using dash for list items
      • Incorrect indentation or key names
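      Laid out across lines, the structure this question asks for looks roughly like the sketch below. The comments are added for illustration and are not part of the quiz.

```yaml
models:                 # top-level key is plural and holds a list
  - name: orders        # the dash starts one list item, i.e. one model
    description: 'Contains order details'   # same indentation as `name`
```

      Each additional model would be another dash at the same indentation as the first.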
      3. Given this YAML snippet documenting a model and its columns:
      models:
        - name: customers
          description: 'Customer information'
          columns:
            - name: id
              description: 'Unique customer ID'
            - name: email
              description: 'Customer email address'
      What will dbt show as the description for the email column?
      medium
      A. Unique customer ID
      B. Customer email address
      C. Customer information
      D. No description

      Solution

      1. Step 1: Locate the column description in YAML

        The email column is listed under columns with its own description key.
      2. Step 2: Identify the description text for the email column

        The description for email is 'Customer email address', which dbt will display for that column.
      3. Final Answer:

        Customer email address -> Option B
      4. Quick Check:

        Column description matches YAML text [OK]
      Hint: Each column's description sits next to its name inside the columns list [OK]
      Common Mistakes:
      • Confusing model description with column description
      • Missing indentation causing YAML parsing errors
      • Assuming no description if not repeated
      4. You wrote this YAML to document a model but dbt throws an error:
      models:
        - name: sales
          description: 'Sales data'
          columns:
            name: amount
            description: 'Sale amount'
      What is the error in this YAML?
      medium
      A. Missing dash (-) before column name and description
      B. Incorrect model name key
      C. Description should be under models, not columns
      D. YAML does not support nested lists

      Solution

      1. Step 1: Check YAML list syntax for columns

        Each column should be a list item with a dash (-) before its dictionary of keys.
      2. Step 2: Identify missing dash in columns

        The name key under columns lacks the dash, so YAML reads name and description as keys of a columns mapping rather than as one item in a list. The fix is a dash before name, with description aligned beneath it.
      3. Final Answer:

        Missing dash (-) before column name and description -> Option A
      4. Quick Check:

        List items need dash (-) in YAML [OK]
      Hint: Use dash (-) before each column in columns list [OK]
      Common Mistakes:
      • Forgetting dash for list items
      • Misplacing description keys
      • Confusing YAML lists and dictionaries
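      A corrected version of the snippet from this question might look like the following sketch; the only change is the dash that turns the column into a list item.

```yaml
models:
  - name: sales
    description: 'Sales data'
    columns:
      - name: amount                # dash added: this starts a list item
        description: 'Sale amount'  # aligned with `name`, so same item
```

      Further columns would each start with their own dash at the same indentation.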
      5. You want to document two models, users and transactions, each with columns and descriptions. Which YAML structure correctly documents both models with their columns?
      hard
      A.
        models:
          - name: users
            description: 'User data'
            columns:
              - name: user_id
                description: 'User identifier'
          - name: transactions
            description: 'Transaction data'
            columns:
              - name: transaction_id
                description: 'Transaction identifier'
      B.
        models:
          users:
            description: 'User data'
            columns:
              user_id: 'User identifier'
          transactions:
            description: 'Transaction data'
            columns:
              transaction_id: 'Transaction identifier'
      C.
        models:
          - users:
              description: 'User data'
              columns:
                - user_id: 'User identifier'
          - transactions:
              description: 'Transaction data'
              columns:
                - transaction_id: 'Transaction identifier'
      D.
        models:
          name: users
          description: 'User data'
          columns:
            - name: user_id
              description: 'User identifier'
          name: transactions
          description: 'Transaction data'
          columns:
            - name: transaction_id
              description: 'Transaction identifier'

      Solution

      1. Step 1: Understand YAML list structure for multiple models

        dbt expects models as a list of dictionaries, each with name, description, and columns as a list.
      2. Step 2: Evaluate each option's structure

        Option A correctly uses a list with two model dictionaries, each with name, description, and its own columns list. Option B uses the model names as mapping keys, option C nests each model under its name instead of using a name key, and option D repeats the name, description, and columns keys at the same level, so the two models collide in a single mapping.
      3. Final Answer:

        Two list items under models:, each with name, description, and a columns list -> Option A
      4. Quick Check:

        Multiple models as list items with name and columns = Option A [OK]
      Hint: List each model with dash (-) and include columns as lists [OK]
      Common Mistakes:
      • Using model names as keys instead of list items
      • Repeating keys at same level
      • Not using dash for multiple models