ref() function for model dependencies in dbt - Time & Space Complexity
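In dbt, the ref() function lets one model depend on another: it resolves to the referenced model's relation and records a dependency between the two models. Understanding how these dependencies affect execution time is important.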
We want to know how the time to build models grows as the number of dependencies increases.
Analyze the time complexity of the following dbt model code using ref().
```sql
select *
from {{ ref('base_model') }}
where id in (
    select id from {{ ref('dependent_model') }}
)
```
This model selects rows from base_model and keeps only those whose id appears in dependent_model, so it depends on two other models through ref().
Look for repeated work in model building and data fetching.
- Primary operation: Running SQL queries for each referenced model.
- How many times: Once per model in the dependency graph: each upstream model is built once, plus the current model.
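The counting above can be sketched outside dbt with a small Python model of the dependency graph. This is only an illustration under stated assumptions: `final_model` is a hypothetical name for the model in the example, and Python's `graphlib.TopologicalSorter` stands in for dbt's own scheduler, which is not shown here.

```python
from graphlib import TopologicalSorter

# Map each model to the models it references with ref().
# "final_model" is a hypothetical name for the model in the example.
deps = {
    "base_model": set(),
    "dependent_model": set(),
    "final_model": {"base_model", "dependent_model"},
}

# A topological order places every model after the models it depends on.
build_order = list(TopologicalSorter(deps).static_order())

# One query per model in the graph: two upstream models plus the current one.
queries_run = len(build_order)
print(build_order, queries_run)
```

The current model always comes last in the order, because both of its upstream models have to exist before its query can run.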
As you add more models that depend on others, the total work grows.
| Number of Models (n) | Approx. Operations |
|---|---|
| 3 | 3 queries |
| 10 | 10 queries |
| 100 | 100 queries |
Pattern observation: Each model adds one query to run, so the number of queries grows linearly. This counts queries only, not the amount of data each query processes.
Time Complexity: O(n)
This means the time to build grows in direct proportion to the number of models in the dependency graph, assuming each model's query costs roughly the same.
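A rough way to check the linear pattern is to generate dependency chains of different sizes and count the builds. The sketch below assumes a simple chain in which each model references the previous one; the helper `count_builds` and the `model_i` names are hypothetical and not part of dbt.

```python
from graphlib import TopologicalSorter


def count_builds(n: int) -> int:
    """Count queries for a chain of n models, each referencing the previous one."""
    # model_0 has no dependencies; model_i references model_{i-1}.
    deps = {
        f"model_{i}": ({f"model_{i - 1}"} if i > 0 else set())
        for i in range(n)
    }
    # Each model appears exactly once in the build order, so one query per model.
    return sum(1 for _ in TopologicalSorter(deps).static_order())


for n in (3, 10, 100):
    print(n, count_builds(n))
```

Under these assumptions the count matches the number of models for every size tried, which is the O(n) behavior described above.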
[X] Wrong: "Using ref() does not add any extra time because it just points to models."
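[OK] Correct: ref() itself only resolves a model name and records a dependency, but every referenced model has to be built before the current one can run. Setting aside ephemeral models, which are inlined as CTEs, dbt runs each model's SQL once per build no matter how many ref() calls point to it, so more models in the dependency graph mean more queries and more time.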
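To separate the number of ref() calls from the number of model builds, the sketch below counts both for a small graph. The model names are hypothetical, and it assumes every model is materialized (for example as a view or table) rather than ephemeral, so each one is built once per run.

```python
from graphlib import TopologicalSorter

# Five hypothetical reports that all reference the same staging model.
deps = {"stg_orders": set()}
for i in range(5):
    deps[f"report_{i}"] = {"stg_orders"}

# Each ref() call is an edge in the graph; each model is a node.
ref_calls = sum(len(upstream) for upstream in deps.values())
builds = len(list(TopologicalSorter(deps).static_order()))

print(ref_calls, builds)
```

Here `stg_orders` is referenced five times but appears only once in the build order, so build time in this sketch follows the number of models rather than the number of ref() calls.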
Knowing how dependencies affect build time helps you design efficient data pipelines and explain your choices clearly.
What if we changed from many small models with ref() to one big model without dependencies? How would the time complexity change?