
Documenting models in YAML in dbt - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding YAML structure for dbt model documentation
Which YAML structure correctly documents a dbt model named sales_data with a description and a column named order_id that has its own description?
A
models:
  - sales_data:
      description: "Contains sales order records"
      columns:
        - order_id:
            description: "Unique identifier for each order"
B
models:
  - name: sales_data
    description: "Contains sales order records"
    columns:
      - name: order_id
        description: "Unique identifier for each order"
C
models:
  - name: sales_data
    description: "Contains sales order records"
    columns:
      order_id:
        description: "Unique identifier for each order"
D
models:
  sales_data:
    description: "Contains sales order records"
    columns:
      order_id:
        description: "Unique identifier for each order"
💡 Hint
Remember that in dbt YAML, models is a list whose entries each begin with - name:, and columns is likewise a list of entries that each begin with - name:.
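One way to test a candidate structure is to parse it and inspect the resulting shapes. The sketch below assumes PyYAML is installed. The helper follows_dbt_shape is a hypothetical name introduced here for illustration; it only checks the list-of-name-entries layout the hint describes, not everything dbt itself validates.

```python
import yaml

# A properties snippet written in the list-of-entries layout from the hint.
SNIPPET = """
models:
  - name: sales_data
    description: "Contains sales order records"
    columns:
      - name: order_id
        description: "Unique identifier for each order"
"""


def follows_dbt_shape(doc):
    """Hypothetical check: models and columns are lists of dicts with a 'name' key."""
    if not isinstance(doc, dict):
        return False
    models = doc.get("models")
    if not isinstance(models, list):
        return False
    for model in models:
        if not isinstance(model, dict) or "name" not in model:
            return False
        columns = model.get("columns", [])
        if not isinstance(columns, list):
            return False
        if not all(isinstance(col, dict) and "name" in col for col in columns):
            return False
    return True


parsed = yaml.safe_load(SNIPPET)
shape_ok = follows_dbt_shape(parsed)
print(shape_ok)
```

Running the same helper on a snippet that keys models or columns by name, rather than listing them, returns False.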
Predict Output
intermediate
Output of parsing a dbt model YAML snippet
Given this YAML snippet documenting a dbt model, what is the value of the description for the column customer_id?
models:
  - name: customer_orders
    description: "Orders placed by customers"
    columns:
      - name: order_id
        description: "Order unique ID"
      - name: customer_id
        description: "Unique customer identifier"
A
null
B
"Order unique ID"
C
"Orders placed by customers"
D
"Unique customer identifier"
💡 Hint
Look for the column named customer_id and read its description.
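The lookup the hint describes can be sketched in Python, assuming PyYAML is installed. The function column_description is a hypothetical helper written for this example, not part of dbt or PyYAML.

```python
import yaml

SNIPPET = """
models:
  - name: customer_orders
    description: "Orders placed by customers"
    columns:
      - name: order_id
        description: "Order unique ID"
      - name: customer_id
        description: "Unique customer identifier"
"""


def column_description(doc, model_name, column_name):
    """Return a column's description, or None if the column is not documented."""
    for model in doc.get("models", []):
        if model.get("name") != model_name:
            continue
        # Each entry under columns is a dict with its own name and description.
        for column in model.get("columns", []):
            if column.get("name") == column_name:
                return column.get("description")
    return None


doc = yaml.safe_load(SNIPPET)
customer_desc = column_description(doc, "customer_orders", "customer_id")
print(customer_desc)
```

The model-level description and the descriptions of other columns are separate keys, so they never leak into the result for customer_id.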
🔧 Debug
advanced
Identify the error in this dbt model YAML documentation
What error will occur when dbt tries to parse this YAML documentation for a model?
models:
  - name: product_sales
    description: "Sales data for products"
    columns:
      name: product_id
      description: "ID of the product"
A
KeyError because 'columns' is not a list
B
TypeError because description is not a string
C
SyntaxError due to missing dash before column name
D
No error, YAML is valid
💡 Hint
Check how columns should be listed in dbt YAML documentation.
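It can help to separate what a YAML parser sees from what dbt expects. The sketch below, which assumes PyYAML is installed, only shows the parser's view of the snippet and of a version with the dash added. How dbt reports the problem depends on the dbt version and is not reproduced here.

```python
import yaml

SNIPPET = """
models:
  - name: product_sales
    description: "Sales data for products"
    columns:
      name: product_id
      description: "ID of the product"
"""

# The snippet is syntactically valid YAML, so safe_load does not raise.
doc = yaml.safe_load(SNIPPET)
columns = doc["models"][0]["columns"]
columns_type = type(columns).__name__
print(columns_type)  # the parser produced a mapping, not a list

# With a dash before name, the same keys become a one-item list.
FIXED = """
models:
  - name: product_sales
    description: "Sales data for products"
    columns:
      - name: product_id
        description: "ID of the product"
"""

fixed_columns = yaml.safe_load(FIXED)["models"][0]["columns"]
print(type(fixed_columns).__name__, len(fixed_columns))
```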
Data Output
advanced
Number of documented columns in a dbt model YAML
How many columns are documented in this YAML snippet?
models:
  - name: user_activity
    description: "Tracks user actions"
    columns:
      - name: user_id
        description: "User identifier"
      - name: action_type
        description: "Type of action performed"
      - name: timestamp
        description: "When the action occurred"
A
3
B
2
C
1
D
4
💡 Hint
Count the number of items under the columns list.
🚀 Application
expert
Visualizing model documentation completeness from YAML
You have a YAML file documenting multiple dbt models. You want to create a bar chart showing the number of columns documented per model. Which Python code snippet using PyYAML and matplotlib correctly extracts the data and plots it?
A
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.load(f, Loader=yaml.SafeLoader)

models = data['models']
model_names = []
column_counts = []
for model in models:
    model_names.append(model['name'])
    column_counts.append(len(model['columns']))

plt.plot(model_names, column_counts)
plt.xlabel('Model Name')
plt.ylabel('Number of Columns')
plt.title('Columns Documented per Model')
plt.show()
B
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.safe_load(f)

model_names = []
column_counts = []
for model in data['models']:
    model_names.append(model['name'])
    column_counts.append(len(model['columns']))

plt.barh(model_names, column_counts)
plt.xlabel('Number of Columns')
plt.ylabel('Model Name')
plt.title('Columns Documented per Model')
plt.show()
C
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.safe_load(f)

models = data['models']
model_names = []
column_counts = []
for model in models:
    model_names.append(model['name'])
    column_counts.append(len(model.get('columns', [])))

plt.bar(model_names, column_counts)
plt.xlabel('Model Name')
plt.ylabel('Number of Columns')
plt.title('Columns Documented per Model')
plt.show()
D
import yaml
import matplotlib.pyplot as plt

with open('models.yml') as f:
    data = yaml.safe_load(f)

models = data['models']
model_names = []
column_counts = []
for model in models:
    model_names.append(model['name'])
    column_counts.append(len(model['columns']))

plt.scatter(model_names, column_counts)
plt.xlabel('Model Name')
plt.ylabel('Number of Columns')
plt.title('Columns Documented per Model')
plt.show()
💡 Hint
Use safe_load for YAML parsing and a bar chart to visualize the counts. Handle models that have no columns key with get().
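The difference between indexing and get() only shows up when a model has no columns key. The sketch below assumes PyYAML is installed and leaves out the plotting step. The model names orders and staging_events are made up for the example.

```python
import yaml

# Hypothetical file contents: the second model documents no columns.
SNIPPET = """
models:
  - name: orders
    columns:
      - name: order_id
      - name: customer_id
  - name: staging_events
    description: "No columns documented yet"
"""

doc = yaml.safe_load(SNIPPET)

# get() with a default counts a model without a columns key as zero columns.
counts = {m["name"]: len(m.get("columns", [])) for m in doc["models"]}
print(counts)

# Direct indexing raises KeyError on the model that has no columns key.
try:
    [len(m["columns"]) for m in doc["models"]]
    direct_indexing_failed = False
except KeyError:
    direct_indexing_failed = True
print(direct_indexing_failed)
```

If a file contains a bare columns: key with no entries, the value parses as None rather than a list, so m.get("columns") or [] is a slightly more defensive variant.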