
Building a DAG of models in dbt - Time & Space Complexity

Time Complexity: Building a DAG of models
O(n)
Understanding Time Complexity

When building a DAG of models in dbt, we want to understand how the time to run all models grows as we add more models.

We ask: How does the total work increase when the number of models grows?

Scenario Under Consideration

Analyze the time complexity of the following dbt model dependencies.


-- model_a.sql
select * from source_table

-- model_b.sql
select * from {{ ref('model_a') }}

-- model_c.sql
select * from {{ ref('model_b') }}

-- model_d.sql
select * from {{ ref('model_a') }}

-- model_e.sql
select * from {{ ref('model_c') }} join {{ ref('model_d') }} on ...
    

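Each ref() call declares a dependency on another model. Together, the five models form a Directed Acyclic Graph (DAG): model_a feeds model_b and model_d, model_b feeds model_c, and model_e joins model_c and model_d.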
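To make the graph concrete, here is a minimal sketch that writes the same five dependencies as a Python dictionary and asks the standard library for a valid build order. It illustrates the idea of ordering models by their dependencies; it is not dbt's actual implementation, and the `deps` dictionary is a hand-written stand-in for what dbt derives from the ref() calls.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key is a model; each value is the set of models it selects from via ref().
# source_table is a raw source rather than a model, so it is left out of the graph.
deps = {
    "model_a": set(),
    "model_b": {"model_a"},
    "model_c": {"model_b"},
    "model_d": {"model_a"},
    "model_e": {"model_c", "model_d"},
}

# static_order() yields every node only after all of its dependencies.
build_order = list(TopologicalSorter(deps).static_order())
print(build_order)
```

Any order it prints starts with model_a and ends with model_e; the relative position of independent models such as model_b and model_d may vary.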

Identify Repeating Operations

Look at how dbt runs models based on dependencies.

  • Primary operation: Running each model once, after all of its dependencies have finished.
  • How many times: Each model runs exactly one time per dbt run.

How Execution Grows With Input

As you add more models, the total number of model runs grows in proportion to the number of models.

Input Size (n)    Approx. Operations
10 models         10 runs
100 models        100 runs
1000 models       1000 runs

Pattern observation: The total work grows linearly as you add more models.
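The numbers in the table can be reproduced with a small simulation. The sketch below assumes a hypothetical chain of models (model_0, model_1, ...) where each one depends on the previous one, and counts one "run" per model visited in dependency order. The function name `count_runs` and the chain shape are illustrative assumptions, not part of dbt.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def count_runs(n):
    """Build a hypothetical chain of n models and count how many runs occur."""
    deps = {
        f"model_{i}": ({f"model_{i - 1}"} if i > 0 else set())
        for i in range(n)
    }
    runs = 0
    for _model in TopologicalSorter(deps).static_order():
        runs += 1  # stand-in for executing the model's SQL once
    return runs


for n in (10, 100, 1000):
    print(n, "models ->", count_runs(n), "runs")
```

The run count equals the model count in every case, which is the linear pattern described above.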

Final Time Complexity

Time Complexity: O(n)

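This means the total time to build all models grows in direct proportion to the number of models, assuming each model takes roughly the same time to run. Working out the build order adds work proportional to the number of models plus the number of ref() dependencies between them.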

Common Mistake

[X] Wrong: "Running one model means running all its dependencies multiple times."

[OK] Correct: Within a single run, dbt builds each model once, and downstream models select from the relation that was already built, so dependencies are not rerun repeatedly.
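The difference between the mistaken picture and the correct one can be shown by counting runs on the example graph. This is a sketch under stated assumptions: `naive_runs` models the wrong mental picture (rebuilding every dependency each time it is needed), `once_runs` skips models that are already built, and both function names are hypothetical rather than anything dbt exposes.

```python
# The example DAG: each model maps to the models it references.
deps = {
    "model_a": [],
    "model_b": ["model_a"],
    "model_c": ["model_b"],
    "model_d": ["model_a"],
    "model_e": ["model_c", "model_d"],
}


def naive_runs(model, counts):
    """The mistaken picture: rebuild every dependency on each request."""
    for parent in deps[model]:
        naive_runs(parent, counts)
    counts[model] = counts.get(model, 0) + 1


def once_runs(model, counts):
    """Build each model at most once by skipping models already built."""
    if model in counts:
        return
    for parent in deps[model]:
        once_runs(parent, counts)
    counts[model] = 1


naive, once = {}, {}
naive_runs("model_e", naive)
once_runs("model_e", once)
print(naive["model_a"], once["model_a"])  # prints: 2 1
```

In the naive version model_a runs twice because both model_c's chain and model_d reach it; with the "already built" check it runs once, which keeps the total at one run per model.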

Interview Connect

Understanding how work grows with model count helps you design efficient data pipelines and explain your approach clearly.

Self-Check

"What if some models depend on many others and run slower? How would that affect the overall time complexity?"