What is dbt - Complexity Analysis
We want to understand how the time it takes to run a dbt project grows as we add more models. Analyze the time complexity of the following dbt project and its run command.
In a real dbt project each model is its own SQL file under `models/`; dbt infers the dependency graph from the `ref()` calls:

```sql
-- models/model_a.sql
SELECT * FROM source_table

-- models/model_b.sql
SELECT * FROM {{ ref('model_a') }}

-- models/model_c.sql
SELECT * FROM {{ ref('model_b') }}
```

```shell
# Build all models in dependency order
$ dbt run
```
This project defines three models in a linear dependency chain: model_a → model_b → model_c. Look at what repeats when dbt runs the project.
- Primary operation: executing each model's SQL query against the warehouse.
- How many times: once per model, in dependency (topological) order.
As you add more models, dbt issues more queries, one after another.
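The sequential behavior described above can be sketched with a toy scheduler (a sketch with illustrative names, not dbt's actual code; real dbt builds the DAG by parsing `ref()` calls):

```python
# Toy sketch: run each model exactly once, only after its
# dependencies have run. Model names mirror the example project.
deps = {
    "model_a": [],
    "model_b": ["model_a"],
    "model_c": ["model_b"],
}

def run_order(deps):
    """Return a topological order: each model after its dependencies."""
    order, done = [], set()

    def visit(model):
        if model in done:
            return
        for dep in deps[model]:
            visit(dep)          # build dependencies first
        done.add(model)
        order.append(model)

    for model in deps:
        visit(model)
    return order

print(run_order(deps))  # → ['model_a', 'model_b', 'model_c']
```

Each model appears exactly once in the order, which is why the query count below tracks the model count.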
| Input Size (n models) | Approx. Operations (queries run) |
|---|---|
| 3 | 3 |
| 10 | 10 |
| 100 | 100 |
Pattern observation: The number of operations grows linearly with the number of models (one query per model).
Time Complexity: O(n)
This means if you double the number of models, the time to run roughly doubles.
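A quick check of the doubling claim, assuming (as a simplification) that each model's query takes roughly constant time. This sketch generates a chain of n dependent models and counts the queries a sequential run would issue:

```python
# Sketch: count the queries issued for a chain of n dependent models,
# where model_i depends on model_{i-1}. One query per model.
def simulate_run(n):
    executed = 0
    built = set()
    for i in range(n):
        # the dependency must already be built before we run this model
        assert i == 0 or f"model_{i-1}" in built
        built.add(f"model_{i}")
        executed += 1  # one SQL query per model
    return executed

print([simulate_run(n) for n in (3, 10, 100)])  # → [3, 10, 100], matching the table
print(simulate_run(200) == 2 * simulate_run(100))  # → True: doubling models doubles queries
```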
[X] Wrong: "Running more models will take the same time because dbt runs them all at once."
[OK] Correct: dbt runs models one by one following dependencies, so more models mean more queries and more time.
Understanding how dbt runs models helps you explain project scaling and performance in real data workflows.
"What if dbt could run independent models in parallel? How would that change the time complexity?"
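One way to explore that question (a sketch, not dbt's actual scheduler): group the DAG into "waves" of models whose dependencies are already built. With enough threads, wall-clock time scales with the number of waves, i.e. the depth of the DAG, while total work stays O(n) queries. A strict chain has depth n, so parallelism doesn't help it; independent models collapse into few waves.

```python
# Sketch: schedule a model DAG in parallel "waves".
# With unlimited workers, wall-clock time ~ number of waves (DAG depth).
def waves(deps):
    remaining = dict(deps)
    built, schedule = set(), []
    while remaining:
        # models whose dependencies are all built can run together
        ready = [m for m, ds in remaining.items() if all(d in built for d in ds)]
        schedule.append(sorted(ready))
        built.update(ready)
        for m in ready:
            del remaining[m]
    return schedule

chain = {"a": [], "b": ["a"], "c": ["b"]}                        # linear chain
fan_out = {"src": [], "x": ["src"], "y": ["src"], "z": ["src"]}  # independent models

print(len(waves(chain)))    # → 3 waves: depth equals n, no speedup possible
print(len(waves(fan_out)))  # → 2 waves: x, y, z run together after src
```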