Documenting models in YAML in dbt - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding YAML structure for dbt model documentation
Which YAML structure correctly documents a dbt model named sales_data with a description and a column named order_id that has its own description?
A
models:
  - sales_data:
      description: "Contains sales order records"
      columns:
        - order_id:
            description: "Unique identifier for each order"
B
models:
  - name: sales_data
    description: "Contains sales order records"
    columns:
      - name: order_id
        description: "Unique identifier for each order"
C
models:
  - name: sales_data
    description: "Contains sales order records"
    columns:
      order_id:
        description: "Unique identifier for each order"
D
models:
  sales_data:
    description: "Contains sales order records"
    columns:
      order_id:
        description: "Unique identifier for each order"
💡 Hint
Remember that in dbt YAML, models are listed as a sequence with - name: and columns are a list of dictionaries with - name:.
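The shape described in the hint can be checked mechanically once the YAML has been loaded. The following is a minimal sketch, not dbt's own validation: `check_models_block` is a hypothetical helper, and the four dictionaries are hand-written equivalents of what a YAML parser such as PyYAML's `safe_load` would return for options A to D.

```python
def check_models_block(doc):
    """Return a list of structural problems in a parsed `models:` block."""
    models = doc.get("models")
    if not isinstance(models, list):
        return ["'models' should be a list, got " + type(models).__name__]
    problems = []
    for i, model in enumerate(models):
        # Each model entry must be a mapping that carries a `name` key.
        if not isinstance(model, dict) or "name" not in model:
            problems.append(f"models[{i}] has no 'name' key")
            continue
        columns = model.get("columns", [])
        if not isinstance(columns, list):
            problems.append(f"{model['name']}: 'columns' should be a list")
            continue
        for j, column in enumerate(columns):
            if not isinstance(column, dict) or "name" not in column:
                problems.append(f"{model['name']}: columns[{j}] has no 'name' key")
    return problems


MODEL_DESC = "Contains sales order records"
COLUMN_DESC = "Unique identifier for each order"

# Parsed equivalents of the four options (assumed parser output).
option_a = {"models": [{"sales_data": {
    "description": MODEL_DESC,
    "columns": [{"order_id": {"description": COLUMN_DESC}}]}}]}
option_b = {"models": [{
    "name": "sales_data", "description": MODEL_DESC,
    "columns": [{"name": "order_id", "description": COLUMN_DESC}]}]}
option_c = {"models": [{
    "name": "sales_data", "description": MODEL_DESC,
    "columns": {"order_id": {"description": COLUMN_DESC}}}]}
option_d = {"models": {"sales_data": {
    "description": MODEL_DESC,
    "columns": {"order_id": {"description": COLUMN_DESC}}}}}

for label, doc in [("A", option_a), ("B", option_b), ("C", option_c), ("D", option_d)]:
    print(label, check_models_block(doc))
```

Only the structure that uses `- name:` for both the model and its columns comes back with an empty problem list.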
Predict Output
intermediate
Output of parsing a dbt model YAML snippet
Given this YAML snippet for a dbt model documentation, what is the value of the description for the column customer_id?
models:
  - name: customer_orders
    description: "Orders placed by customers"
    columns:
      - name: order_id
        description: "Order unique ID"
      - name: customer_id
        description: "Unique customer identifier"
A. null
B. "Order unique ID"
C. "Orders placed by customers"
D. "Unique customer identifier"
💡 Hint
Look for the column named customer_id and read its description.
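The lookup the hint describes can be written as a short function. This is a sketch under assumptions: `column_description` is a hypothetical helper, and `doc` is a hand-written copy of what a YAML parser would return for the snippet above.

```python
def column_description(doc, model_name, column_name):
    """Return the description of one column, or None if it is not documented."""
    for model in doc.get("models") or []:
        if model.get("name") != model_name:
            continue
        # `columns` may be absent or empty, so fall back to an empty list.
        for column in model.get("columns") or []:
            if column.get("name") == column_name:
                return column.get("description")
    return None


# Parsed form of the customer_orders snippet (assumed parser output).
doc = {"models": [{
    "name": "customer_orders",
    "description": "Orders placed by customers",
    "columns": [
        {"name": "order_id", "description": "Order unique ID"},
        {"name": "customer_id", "description": "Unique customer identifier"},
    ],
}]}

print(column_description(doc, "customer_orders", "customer_id"))
```

The model-level description is a sibling of `columns`, so it is never returned for a column lookup.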
🔧 Debug
advanced
Identify the error in this dbt model YAML documentation
What error will occur when dbt tries to parse this YAML documentation for a model?
models:
  - name: product_sales
    description: "Sales data for products"
    columns:
      name: product_id
      description: "ID of the product"
A. KeyError because 'columns' is not a list
B. TypeError because description is not a string
C. SyntaxError due to missing dash before column name
D. No error, YAML is valid
💡 Hint
Check how columns should be listed in dbt YAML documentation.
Data Output
advanced
Number of documented columns in a dbt model YAML
How many columns are documented in this YAML snippet?
models:
  - name: user_activity
    description: "Tracks user actions"
    columns:
      - name: user_id
        description: "User identifier"
      - name: action_type
        description: "Type of action performed"
      - name: timestamp
        description: "When the action occurred"
A. 3
B. 2
C. 1
D. 4
💡 Hint
Count the number of items under the columns list.
🚀 Application
expert
Visualizing model documentation completeness from YAML
You have a YAML file documenting multiple dbt models. You want to create a bar chart showing the number of columns documented per model. Which Python code snippet using PyYAML and matplotlib correctly extracts the data and plots it?
A
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)

models = data['models']
model_names = []
column_counts = []
for model in models:
    model_names.append(model['name'])
    column_counts.append(len(model['columns']))

plt.plot(model_names, column_counts)
plt.xlabel('Model Name')
plt.ylabel('Number of Columns')
plt.title('Columns Documented per Model')
plt.show()
B
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.safe_load(f)

model_names = []
column_counts = []
for model in data['models']:
    model_names.append(model['name'])
    column_counts.append(len(model['columns']))

plt.barh(model_names, column_counts)
plt.xlabel('Number of Columns')
plt.ylabel('Model Name')
plt.title('Columns Documented per Model')
plt.show()
C
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.safe_load(f)

models = data['models']
model_names = []
column_counts = []
for model in models:
    model_names.append(model['name'])
    column_counts.append(len(model.get('columns', [])))

plt.bar(model_names, column_counts)
plt.xlabel('Model Name')
plt.ylabel('Number of Columns')
plt.title('Columns Documented per Model')
plt.show()
D
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.safe_load(f)

models = data['models']
model_names = []
column_counts = []
for model in models:
    model_names.append(model['name'])
    column_counts.append(len(model['columns']))

plt.scatter(model_names, column_counts)
plt.xlabel('Model Name')
plt.ylabel('Number of Columns')
plt.title('Columns Documented per Model')
plt.show()
💡 Hint
Use safe_load for YAML parsing and bar chart for count visualization. Handle missing columns with get().
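One edge case the hint does not cover: if a model has a `columns:` key with nothing under it, a YAML parser loads the value as `None`, and `model.get('columns', [])` returns that `None` rather than the default. The sketch below shows a slightly more defensive count using `or []`. The helper name `column_counts` and the model names are assumptions for illustration, and the text bars only stand in for the matplotlib chart so the example needs no third-party packages.

```python
def column_counts(doc):
    """Map each model name to the number of columns documented for it."""
    counts = {}
    for model in doc.get("models") or []:
        # `or []` covers both a missing key and an empty `columns:` (None).
        counts[model["name"]] = len(model.get("columns") or [])
    return counts


# Hand-written stand-in for a parsed models.yml (hypothetical models).
doc = {"models": [
    {"name": "users", "columns": [{"name": "user_id"}, {"name": "email"}]},
    {"name": "events", "columns": None},  # `columns:` left empty in the YAML
    {"name": "payments"},                 # no `columns` key at all
]}

counts = column_counts(doc)
for name, count in counts.items():
    # Plain-text bar per model; plt.bar(list(counts), list(counts.values()))
    # would draw the same data with matplotlib.
    print(f"{name:<10} {'#' * count} {count}")
```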

Practice

1. What is the main purpose of documenting models in YAML in a dbt project?
easy
A. To write SQL queries inside YAML files
B. To execute dbt models automatically
C. To add clear descriptions for models and columns to improve understanding
D. To store raw data files

Solution

  1. Step 1: Understand the role of YAML documentation

    YAML files in dbt are used to add metadata like descriptions, not to run code or store data.
  2. Step 2: Identify the benefit of documentation

    Adding descriptions for models and columns helps team members understand the data and maintain the project easily.
  3. Final Answer:

    To add clear descriptions for models and columns to improve understanding -> Option C
  4. Quick Check:

    Documentation purpose = Add descriptions [OK]
Hint: Documentation in YAML means adding descriptions, not code [OK]
Common Mistakes:
  • Thinking YAML runs SQL code
  • Confusing YAML with data storage
  • Ignoring the importance of descriptions
2. Which of the following is the correct way to start documenting a model named orders in a YAML file?
easy
A.
models:
  orders
  description: 'Contains order details'
B.
model:
  name: orders
  description: 'Contains order details'
C.
models:
  - orders:
      description: 'Contains order details'
D.
models:
  - name: orders
    description: 'Contains order details'

Solution

  1. Step 1: Recall YAML syntax for dbt model documentation

    dbt expects a list under models: with each model as a dictionary containing name and description.
  2. Step 2: Match the correct structure

    Option D uses a list under models: whose single item is a dictionary with name and description keys. Option A gives the model name as a bare value, Option B uses the singular key model: without a list, and Option C uses the model name itself as a key.
  3. Final Answer:

    A models: list with one item carrying name: orders and its description -> Option D
  4. Quick Check:

    List item with name and description under models: = Option D [OK]
Hint: Use dash (-) for list items under models in YAML [OK]
Common Mistakes:
  • Using singular 'model' instead of 'models'
  • Not using dash for list items
  • Incorrect indentation or key names
3. Given this YAML snippet documenting a model and its columns:
models:
  - name: customers
    description: 'Customer information'
    columns:
      - name: id
        description: 'Unique customer ID'
      - name: email
        description: 'Customer email address'
What will dbt show as the description for the email column?
medium
A. Unique customer ID
B. Customer email address
C. Customer information
D. No description

Solution

  1. Step 1: Locate the column description in YAML

    The email column is listed under columns with its own description key.
  2. Step 2: Identify the description text for the email column

    The description for email is 'Customer email address', which dbt will display for that column.
  3. Final Answer:

    Customer email address -> Option B
  4. Quick Check:

    Column description matches YAML text [OK]
Hint: Column descriptions are under columns > name in YAML [OK]
Common Mistakes:
  • Confusing model description with column description
  • Missing indentation causing YAML parsing errors
  • Assuming no description if not repeated
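Because each column carries its own `description` key, a column without one simply has no description; nothing is inherited from the model or from neighbouring columns. A small sketch of how undocumented columns could be listed, assuming the YAML has already been parsed into Python objects. `undocumented_columns` is a hypothetical helper, and the `signup_date` column is added here only for illustration.

```python
def undocumented_columns(doc):
    """Return (model, column) pairs whose description is missing or blank."""
    missing = []
    for model in doc.get("models") or []:
        for column in model.get("columns") or []:
            description = column.get("description")
            if not isinstance(description, str) or not description.strip():
                missing.append((model["name"], column["name"]))
    return missing


# Parsed form of the customers snippet, plus one hypothetical extra column.
doc = {"models": [{
    "name": "customers",
    "description": "Customer information",
    "columns": [
        {"name": "id", "description": "Unique customer ID"},
        {"name": "email", "description": "Customer email address"},
        {"name": "signup_date"},  # assumption: not in the original snippet
    ],
}]}

print(undocumented_columns(doc))
```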
4. You wrote this YAML to document a model but dbt throws an error:
models:
  - name: sales
    description: 'Sales data'
    columns:
      name: amount
      description: 'Sale amount'
What is the error in this YAML?
medium
A. Missing dash (-) before column name and description
B. Incorrect model name key
C. Description should be under models, not columns
D. YAML does not support nested lists

Solution

  1. Step 1: Check YAML list syntax for columns

    Each column should be a list item with a dash (-) before its dictionary of keys.
  2. Step 2: Identify missing dash in columns

    The name and description keys under columns lack the dash, so YAML parses columns as a single mapping with the keys name and description rather than as a list of column entries, which is the shape dbt expects.
  3. Final Answer:

    Missing dash (-) before column name and description -> Option A
  4. Quick Check:

    List items need dash (-) in YAML [OK]
Hint: Use dash (-) before each column in columns list [OK]
Common Mistakes:
  • Forgetting dash for list items
  • Misplacing description keys
  • Confusing YAML lists and dictionaries
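The difference between the broken and the corrected snippet is easiest to see in the parsed data: without the dash, `columns` is one dictionary; with it, `columns` is a list containing that dictionary. The sketch below is a hypothetical helper for a home-grown lint script, not something dbt does; the actual fix is still to add the dash in the YAML file.

```python
def columns_as_list(columns):
    """Normalize a parsed `columns` value to a list of column dictionaries."""
    if columns is None:
        return []
    if isinstance(columns, list):
        return columns
    if isinstance(columns, dict) and "name" in columns:
        # A lone mapping with `name` is the usual sign of a forgotten dash.
        return [columns]
    raise TypeError(f"unexpected shape for 'columns': {type(columns).__name__}")


# What the parser sees without the dash (assumed parser output).
broken = {"name": "amount", "description": "Sale amount"}
# What it sees once `- name: amount` is used.
fixed = [{"name": "amount", "description": "Sale amount"}]

print(columns_as_list(broken) == fixed)
```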
5. You want to document two models, users and transactions, each with columns and descriptions. Which YAML structure correctly documents both models with their columns?
hard
A.
models:
  - name: users
    description: 'User data'
    columns:
      - name: user_id
        description: 'User identifier'
  - name: transactions
    description: 'Transaction data'
    columns:
      - name: transaction_id
        description: 'Transaction identifier'
B.
models:
  users:
    description: 'User data'
    columns:
      user_id: 'User identifier'
  transactions:
    description: 'Transaction data'
    columns:
      transaction_id: 'Transaction identifier'
C.
models:
  - users:
      description: 'User data'
      columns:
        - user_id: 'User identifier'
  - transactions:
      description: 'Transaction data'
      columns:
        - transaction_id: 'Transaction identifier'
D.
models:
  name: users
  description: 'User data'
  columns:
    - name: user_id
      description: 'User identifier'
  name: transactions
  description: 'Transaction data'
  columns:
    - name: transaction_id
      description: 'Transaction identifier'

Solution

  1. Step 1: Understand YAML list structure for multiple models

    dbt expects models as a list of dictionaries, each with name, description, and columns as a list.
  2. Step 2: Evaluate each option's structure

    Option A correctly uses a list with two model dictionaries, each with name, description, and its own columns list. Options B and C use the model names as keys instead of name: entries, and Option D repeats the name, description, and columns keys inside a single mapping rather than starting a new list item for each model.
  3. Final Answer:

    A list of two models, each introduced by - name: and carrying its own columns list -> Option A
  4. Quick Check:

    Multiple models as list items with name and columns = Option A [OK]
Hint: List each model with dash (-) and include columns as lists [OK]
Common Mistakes:
  • Using model names as keys instead of list items
  • Repeating keys at same level
  • Not using dash for multiple models
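To make the list-of-models layout concrete, the sketch below renders a `models:` block from plain Python data, one `- name:` item per model and one per column. It is an illustration only: `render_models_yaml` is a hypothetical helper, it assumes model and column names are simple identifiers that need no quoting, and it uses `json.dumps` to produce double-quoted description strings.

```python
import json


def render_models_yaml(models):
    """Render (name, description, columns) tuples as a dbt-style models block."""
    lines = ["models:"]
    for name, description, columns in models:
        # Each model starts a new list item with a dash.
        lines.append(f"  - name: {name}")
        lines.append(f"    description: {json.dumps(description)}")
        if columns:
            lines.append("    columns:")
            for column_name, column_description in columns:
                # Each column is its own list item under `columns:`.
                lines.append(f"      - name: {column_name}")
                lines.append(f"        description: {json.dumps(column_description)}")
    return "\n".join(lines) + "\n"


text = render_models_yaml([
    ("users", "User data", [("user_id", "User identifier")]),
    ("transactions", "Transaction data",
     [("transaction_id", "Transaction identifier")]),
])
print(text)
```

Generating the block this way makes the two levels of dashes explicit: one at the model level and one at the column level.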