Building a DAG of models in dbt - Mini Project: Build & Apply
Create a model file named base_customers.sql with a SQL query that selects id and name from a table called raw_customers. Write the exact SQL: select id, name from raw_customers. This model is the starting point. It just selects data from the raw source table.
Create a second model file named active_customers.sql. Use the ref() function to select all columns from the base_customers model. Write the exact SQL: select * from {{ ref('base_customers') }}. This model depends on the base model. The ref() function tells dbt about this dependency.
Update the active_customers.sql model to select only customers with id greater than 100. Use the exact SQL: select * from {{ ref('base_customers') }} where id > 100. Adding a filter shows how you can transform data in dependent models.
Run dbt ls --select +active_customers. The + prefix tells dbt to include upstream models, so this command lists active_customers together with the models it depends on.
Practice
What does a DAG represent in dbt?
Solution
Step 1: Understand what DAG means in dbt context
A DAG (Directed Acyclic Graph) shows how models are connected by dependencies.
Step 2: Identify the role of DAG in dbt
dbt uses the DAG to know which models to run first based on dependencies.
Final Answer:
The order in which models depend on each other -> Option C
Quick Check:
DAG = model dependency order [OK]
- Confusing DAG with SQL syntax
- Thinking DAG lists all tables
- Mixing DAG with dbt config files
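To make the "acyclic" part of the definition concrete, here is a minimal Python sketch. It is not dbt's own implementation; it only assumes that dependencies can be written as a dictionary and uses the standard-library graphlib module to check for cycles. The circular mapping is a hypothetical broken project added purely for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def is_valid_dag(dependencies):
    """Return True if the dependency mapping contains no cycles."""
    try:
        # Consuming static_order() raises CycleError when a cycle exists.
        list(TopologicalSorter(dependencies).static_order())
        return True
    except CycleError:
        return False


# Each key is a model; each value is the set of models it depends on.
valid = {
    "base_customers": set(),
    "active_customers": {"base_customers"},
}

# Hypothetical broken project: the two models reference each other.
circular = {
    "base_customers": {"active_customers"},
    "active_customers": {"base_customers"},
}

print(is_valid_dag(valid))     # True
print(is_valid_dag(circular))  # False
```

A graph with a cycle has no valid run order, which is why the dependency graph has to be acyclic.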
Which of the following is the correct way to reference another model in a dbt SQL file?
SELECT * FROM ___
Solution
Step 1: Recall the syntax for referencing models in dbt
dbt uses the function ref() with the model name as a quoted string inside the parentheses. In a model file the call is wrapped in Jinja braces: {{ ref('model_name') }}.
Step 2: Check each option for correct syntax
ref('model_name') passes the model name as a quoted string, which is correct; the other options have syntax errors or wrong quotes.
Final Answer:
ref('model_name') -> Option B
Quick Check:
Use ref('model_name') with quotes [OK]
- Omitting quotes around model name
- Using wrong quote types
- Using colons or other symbols
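As a rough illustration of why the quotes matter, the Python sketch below pulls model names out of ref() calls in a SQL string. It is a simplification: dbt renders the Jinja template rather than scanning with a regular expression, and the pattern here (REF_PATTERN, a name made up for this example) only handles the single-argument form of ref().

```python
import re

# Matches {{ ref('name') }} or {{ ref("name") }}; the opening and
# closing quote must be the same character.
REF_PATTERN = re.compile(r"""\{\{\s*ref\(\s*(['"])([^'"]+)\1\s*\)\s*\}\}""")


def find_refs(sql):
    """Return the model names referenced through quoted ref() calls."""
    return [match.group(2) for match in REF_PATTERN.finditer(sql)]


print(find_refs("SELECT * FROM {{ ref('model_name') }}"))  # ['model_name']
print(find_refs("SELECT * FROM {{ ref(model_name) }}"))    # []
```

The unquoted call is not recognized as a reference at all, which mirrors the common mistakes listed above.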
Given these two models, what is the order dbt will run them?
-- model_a.sql
SELECT * FROM source_table
-- model_b.sql
SELECT * FROM {{ ref('model_a') }}
Solution
Step 1: Identify dependencies from ref()
model_b references model_a using ref(), so model_b depends on model_a.
Step 2: Determine run order based on dependencies
dbt runs model_a first, then model_b to ensure data is ready.
Final Answer:
model_a runs first, then model_b -> Option A
Quick Check:
Dependency order = model_a before model_b [OK]
- Assuming ref() means reverse dependency
- Thinking models run simultaneously
- Confusing circular dependency errors
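The reasoning above can be sketched in a few lines of Python. This is an assumption-laden toy, not how dbt works internally: the two models are held as strings in a dictionary, dependencies are found with a simple regular expression for single-quoted ref() calls, and graphlib produces the order. The models are deliberately listed with model_b first to show that listing order does not decide run order.

```python
import re
from graphlib import TopologicalSorter

# Model name -> SQL text, listed "backwards" on purpose.
models = {
    "model_b": "SELECT * FROM {{ ref('model_a') }}",
    "model_a": "SELECT * FROM source_table",
}


def run_order(models):
    """Order models so that every model comes after the models it refs."""
    dependencies = {
        name: set(re.findall(r"ref\(\s*'([^']+)'\s*\)", sql))
        for name, sql in models.items()
    }
    return list(TopologicalSorter(dependencies).static_order())


print(run_order(models))  # ['model_a', 'model_b']
```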
What is wrong with this dbt model code snippet?
SELECT * FROM {{ ref(model_a) }}
Solution
Step 1: Check syntax of ref() usage
ref() requires the model name as a string with quotes inside the parentheses.
Step 2: Identify the error in the code snippet
model_a is not quoted, so Jinja reads it as a variable rather than a string, and the model fails during dbt compilation.
Final Answer:
Missing quotes around model name in ref() -> Option D
Quick Check:
ref('model_name') needs quotes [OK]
- Forgetting quotes around model names
- Thinking ref() can't be in SELECT
- Assuming case sensitivity causes error
You have three models: model_x, model_y, and model_z. model_y references model_x, and model_z references both model_x and model_y. Which of the following is the correct order dbt will run these models?
Solution
Step 1: Analyze dependencies among models
model_y depends on model_x; model_z depends on both model_x and model_y.
Step 2: Determine run order respecting dependencies
model_x runs first (no dependencies), then model_y (depends on model_x), then model_z (depends on both).
Final Answer:
model_x, model_y, model_z -> Option A
Quick Check:
Run order respects dependencies [OK]
- Running dependent models before their dependencies
- Ignoring multiple dependencies
- Assuming any order works if models reference each other
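One way to check this answer by hand is to group the models into batches, where each batch depends only on earlier batches. The Python sketch below does that with the standard-library graphlib module. The function name run_levels and the dictionary layout are choices made for this example, and the code illustrates the ordering idea rather than dbt's actual scheduler.

```python
from graphlib import TopologicalSorter


def run_levels(dependencies):
    """Group models into batches; each batch depends only on earlier ones."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    levels = []
    while sorter.is_active():
        # Models whose dependencies have all been marked as done.
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


# Each key is a model; each value is the set of models it references.
dependencies = {
    "model_x": set(),
    "model_y": {"model_x"},
    "model_z": {"model_x", "model_y"},
}

print(run_levels(dependencies))  # [['model_x'], ['model_y'], ['model_z']]
```

Because model_z needs both of the others and model_y needs model_x, every batch holds exactly one model, so model_x, model_y, model_z is the only valid order.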
