Model dependencies and parallelism in dbt - Step-by-Step Execution

Concept Flow - Model dependencies and parallelism
Start: Define models
Identify dependencies
Build dependency graph
Find models with no dependencies
Run independent models in parallel
Wait for dependent models
Run dependent models
Repeat until all models built
End: All models built
This flow shows how dbt finds which models depend on others, then runs independent models at the same time, waiting to run dependent models after their dependencies finish.
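The flow above can be sketched with Python's standard-library `graphlib` module. This is a minimal illustration of the idea, not dbt's internal scheduler; the `depends_on` dictionary and the `build_waves` helper are names chosen for this example.

```python
from graphlib import TopologicalSorter

# Each model maps to the set of models it depends on (its parents).
depends_on = {
    "customers": set(),
    "orders": {"customers"},
    "order_items": {"orders"},
    "products": set(),
}

def build_waves(graph):
    """Group models into waves; models in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # models whose parents are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves

waves = build_waves(depends_on)
print(waves)  # [['customers', 'products'], ['orders'], ['order_items']]
```

Each inner list is one pass through the "find models with no unmet dependencies, run them, repeat" loop.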
Execution Sample
dbt
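# Illustrative pseudo-config only: dbt does not read model dependencies from YAML.
# In a real project, dbt infers them from ref() calls in each model's SQL.
models:
  - name: customers
    depends_on: []
  - name: orders
    depends_on: [customers]
  - name: order_items
    depends_on: [orders]
  - name: products
    depends_on: []
This sample describes four models and their dependencies: 'orders' depends on 'customers', 'order_items' depends on 'orders', and 'customers' and 'products' have no dependencies. The depends_on key is shorthand for this walkthrough. In an actual dbt project, a model declares a dependency by selecting from {{ ref('other_model') }} in its SQL.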
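In an actual dbt project, dependencies come from `ref()` calls inside each model's SQL rather than from a list in YAML. The sketch below shows the idea with a regular expression. It is a simplification, since dbt renders the Jinja template rather than pattern-matching it, and the SQL bodies, table names, and the `infer_dependencies` helper are all hypothetical.

```python
import re

# Hypothetical model SQL keyed by model name; raw.* tables and columns are made up.
model_sql = {
    "customers": "select * from raw.customers",
    "products": "select * from raw.products",
    "orders": "select o.* from raw.orders o join {{ ref('customers') }} c on o.customer_id = c.id",
    "order_items": "select i.* from raw.items i join {{ ref('orders') }} o on i.order_id = o.id",
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")

def infer_dependencies(sql_by_model):
    """Return {model: sorted list of models it references via ref()}."""
    return {
        name: sorted(set(REF_PATTERN.findall(sql)))
        for name, sql in sql_by_model.items()
    }

inferred = infer_dependencies(model_sql)
print(inferred)
```

The result is the same dependency map as the sample above: 'orders' points at 'customers', 'order_items' points at 'orders', and the other two models have empty lists.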
Execution Table
| Step | Action | Models Run | Dependencies Met? | Parallelism | Notes |
|------|--------|------------|-------------------|-------------|-------|
| 1 | Identify models with no dependencies | customers, products | Yes | Run both in parallel | Start with independent models |
| 2 | Wait for customers and products to finish | customers, products | Yes | Waiting | Dependent models wait |
| 3 | Run orders (depends on customers) | orders | customers done | Single run | Orders can run now |
| 4 | Wait for orders to finish | orders | Yes | Waiting | order_items waits |
| 5 | Run order_items (depends on orders) | order_items | orders done | Single run | Final dependent model |
| 6 | All models built | - | - | - | Process complete |
💡 All models have been run respecting dependencies and maximizing parallelism.
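The table simplifies execution into discrete steps. In practice, dbt limits concurrency with its `threads` setting (set in profiles.yml or with the `--threads` flag), and a model can start as soon as its own dependencies finish, so 'orders' does not need to wait for 'products'. The sketch below imitates that behavior with a thread pool. The `run_model` and `run_all` functions and the sleep durations are assumptions made for illustration, not dbt code.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter

depends_on = {
    "customers": set(),
    "orders": {"customers"},
    "order_items": {"orders"},
    "products": set(),
}

# Made-up durations in seconds; products is deliberately the slowest model.
durations = {"customers": 0.05, "products": 0.30, "orders": 0.05, "order_items": 0.05}

def run_model(name):
    time.sleep(durations[name])  # stand-in for executing the model's SQL
    return name

def run_all(graph, threads):
    """Run every model, starting each one as soon as its parents are done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    finished, in_flight = [], set()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        while sorter.is_active():
            for name in sorter.get_ready():
                in_flight.add(pool.submit(run_model, name))
            # Block until at least one running model completes.
            done, in_flight = wait(in_flight, return_when=FIRST_COMPLETED)
            for future in done:
                name = future.result()
                finished.append(name)
                sorter.done(name)  # may unlock dependent models
    return finished

order = run_all(depends_on, threads=4)
print(order)
```

With these durations, 'orders' and 'order_items' will typically finish before 'products' does. With `threads=1` the same code runs one model at a time while still respecting the dependency order.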
Variable Tracker
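| Variable | Start | After Step 1 | After Step 3 | After Step 5 | Final |
|----------|-------|--------------|--------------|--------------|-------|
| customers_status | not run | running | done | done | done |
| products_status | not run | running | done | done | done |
| orders_status | not run | not run | running | done | done |
| order_items_status | not run | not run | not run | running | done |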
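The status columns can be reproduced with a small state machine that moves each model from "not run" to "running" to "done". This is a sketch of the walkthrough's simplified step model, where the odd-numbered steps start models and the even-numbered steps wait. The function names are hypothetical.

```python
depends_on = {
    "customers": [],
    "orders": ["customers"],
    "order_items": ["orders"],
    "products": [],
}

status = {name: "not run" for name in depends_on}
snapshots = {"Start": dict(status)}

def start_ready_models():
    """Move every unstarted model whose parents are all done to 'running'."""
    ready = [
        name for name, parents in depends_on.items()
        if status[name] == "not run" and all(status[p] == "done" for p in parents)
    ]
    for name in ready:
        status[name] = "running"

def finish_running_models():
    """Mark every running model as done (the 'wait' steps in the table)."""
    for name in list(status):
        if status[name] == "running":
            status[name] = "done"

step = 1
while any(state != "done" for state in status.values()):
    start_ready_models()
    snapshots[f"After Step {step}"] = dict(status)
    finish_running_models()
    step += 2  # even-numbered steps in the table are the waiting steps

snapshots["Final"] = dict(status)
for label, snapshot in snapshots.items():
    print(label, snapshot)
```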
Key Moments - 3 Insights
Why can 'customers' and 'products' run at the same time?
Neither model depends on the other or on anything else, so they can safely run in parallel, as shown in step 1 of the execution table.
Why must 'orders' wait until 'customers' finishes?
'orders' depends on 'customers', so it can only run after 'customers' is done. Step 3 shows 'orders' starting once that dependency is met.
What happens if a model has multiple dependencies?
It waits until every one of its dependencies is done before running, which keeps the data correct. This is the same rule that makes 'order_items' wait for 'orders' in step 5, applied to each dependency in turn.
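The multiple-dependency rule can be written as a one-line check. The sketch below adds a hypothetical fifth model, 'order_summary', which is not part of the example above, to show a model that needs two parents. The `can_run` helper is also an assumption for illustration.

```python
depends_on = {
    "customers": [],
    "products": [],
    "orders": ["customers"],
    "order_items": ["orders"],
    "order_summary": ["orders", "products"],  # hypothetical model with two parents
}

def can_run(model, done):
    """A model may start only when all of its dependencies are in `done`."""
    return set(depends_on[model]) <= set(done)

# 'orders' is finished but 'products' is not, so 'order_summary' must wait.
print(can_run("order_summary", done={"customers", "orders"}))              # False
# Once both parents are finished, it can start.
print(can_run("order_summary", done={"customers", "orders", "products"}))  # True
```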
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, which models run in parallel at step 1?
A. customers and products
B. orders and order_items
C. orders only
D. order_items only
💡 Hint
Check the 'Models Run' and 'Parallelism' columns in step 1 of the execution_table.
At which step does 'orders' start running?
A. Step 5
B. Step 1
C. Step 3
D. Step 2
💡 Hint
Look at the 'Action' and 'Models Run' columns in the execution_table to find when 'orders' runs.
If 'products' depended on 'customers', how would step 1 change?
A. 'orders' would run at step 1
B. Only 'customers' would run at step 1
C. 'products' would run before 'customers'
D. No models would run at step 1
💡 Hint
Consider dependencies and which models have none at step 1 in the execution_table.
Concept Snapshot
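dbt runs models in an order that respects their dependencies.
Models with no unmet dependencies can run in parallel, up to the configured threads limit.
Dependent models wait until all of their dependencies finish.
This speeds up builds and keeps data correct.
Declare every dependency with ref() so dbt can build an accurate graph and run independent models in parallel.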
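To see why parallelism speeds up a build, compare the total of all run times with the longest dependency chain. The durations below are hypothetical, and the calculation assumes enough threads that no model ever waits for a free worker.

```python
depends_on = {
    "customers": [],
    "products": [],
    "orders": ["customers"],
    "order_items": ["orders"],
}

# Hypothetical run times in seconds; real values depend on your data and warehouse.
durations = {"customers": 30, "products": 40, "orders": 60, "order_items": 20}

def finish_time(model):
    """Earliest finish time if the model starts as soon as its parents are done."""
    start = max((finish_time(parent) for parent in depends_on[model]), default=0)
    return start + durations[model]

serial_seconds = sum(durations.values())                        # one model at a time
parallel_seconds = max(finish_time(model) for model in depends_on)  # longest chain

print(serial_seconds, parallel_seconds)  # 150 110
```

Here the chain customers, orders, order_items takes 30 + 60 + 20 = 110 seconds, and 'products' fits entirely inside that window, so the parallel build saves its 40 seconds.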
Full Transcript
In dbt, models can depend on other models. The system builds a graph of these dependencies. Models without dependencies run first and can run at the same time, or in parallel. Models that depend on others wait until those finish. This process repeats until all models are built. For example, 'customers' and 'products' run together because they have no dependencies. 'orders' waits for 'customers' to finish. 'order_items' waits for 'orders'. This approach speeds up the build while ensuring data is ready when needed.