
One model per source table rule in dbt - Time & Space Complexity

Time Complexity (one model per source table rule): O(n)
Understanding Time Complexity

When using dbt, we often create one model for each source table. This helps keep things clear and organized.

We want to understand how the time to run these models grows as the number of source tables, n, increases.

Scenario Under Consideration

Analyze the time complexity of this dbt project structure.


-- models/customer.sql
select * from {{ source('sales', 'customer') }}

-- models/orders.sql
select * from {{ source('sales', 'orders') }}

-- models/products.sql
select * from {{ source('inventory', 'products') }}

-- models/payments.sql
select * from {{ source('finance', 'payments') }}

This project has one model for each source table, each simply selecting from its source.
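The one-model-per-source layout above is mechanical enough to sketch in a few lines of Python. This is only an illustration, not part of dbt itself: the `render_models` helper and the `SOURCES` mapping are hypothetical names, and the source and table names are copied from the scenario.

```python
def render_models(sources):
    """Return {model_path: sql_text}, one entry per (source, table) pair.

    `sources` maps a source name to its list of table names. This helper is
    hypothetical and only mirrors the project layout shown in the scenario.
    """
    models = {}
    for source_name, tables in sources.items():
        for table in tables:
            # %-formatting leaves the Jinja double braces untouched.
            sql = "select * from {{ source('%s', '%s') }}\n" % (source_name, table)
            models["models/%s.sql" % table] = sql
    return models


# Source and table names taken from the scenario above.
SOURCES = {
    "sales": ["customer", "orders"],
    "inventory": ["products"],
    "finance": ["payments"],
}

MODELS = render_models(SOURCES)

for path, sql in MODELS.items():
    print("-- " + path)
    print(sql)
```

Each added table adds exactly one entry to `MODELS`, which is the one-to-one relationship the rest of the analysis relies on.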

Identify Repeating Operations

Each model runs a query that reads one source table.

  • Primary operation: Running a query on a source table.
  • How many times: Once per source table (one model per source).

How Execution Grows With Input

As the number of source tables grows, the number of models grows the same way.

Number of Source Tables (n)    Number of Models Run
10                             10
100                            100
1000                           1000

Pattern observation: The total work grows directly with the number of source tables.
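The pattern can be reproduced with a small Python sketch. It assumes, purely for illustration, that every model takes the same fixed time to run; `seconds_per_model` is a made-up constant, and real models vary with table size and warehouse load.

```python
def models_run(num_source_tables):
    """One model per source table, so one query per table."""
    return sum(1 for _ in range(num_source_tables))


def total_seconds(num_source_tables, seconds_per_model=2.0):
    # seconds_per_model is an assumed constant, not a measured value.
    return models_run(num_source_tables) * seconds_per_model


# (n, models run, estimated total seconds) for the sizes discussed above.
GROWTH = [(n, models_run(n), total_seconds(n)) for n in (10, 100, 1000)]

for n, count, seconds in GROWTH:
    print(n, count, seconds)
```

Multiplying n by ten multiplies both the model count and the estimated total time by ten, which is what linear growth means here.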

Final Time Complexity

Time Complexity: O(n)

This means the total time to run all models grows linearly with the number of source tables, assuming each table takes roughly the same time to process.

Common Mistake

[X] Wrong: "Adding more source tables won't affect total run time much because each model runs independently."

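[OK] Correct: Independent models can run in parallel, but every model still queries its source table once, so the total work grows with the number of tables. Parallelism only divides the wall-clock time by a constant factor.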
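A rough way to check this is to model parallel execution. dbt can run independent models concurrently up to its configured `threads` value. The sketch below assumes models with no dependencies on each other and an identical, made-up run time per model; `wall_clock_seconds` is a hypothetical helper, not a dbt function.

```python
import math


def wall_clock_seconds(num_models, threads, seconds_per_model=1.0):
    """Estimate wall-clock time when independent models run in batches.

    Assumes every model takes the same time (seconds_per_model is an
    illustrative constant) and that `threads` models run at once.
    """
    batches = math.ceil(num_models / threads)
    return batches * seconds_per_model


for n in (10, 100, 1000):
    print(n, wall_clock_seconds(n, threads=4))
```

With 4 threads, going from 100 to 1000 models still multiplies the estimated wall-clock time by ten. More threads lower the constant, but the growth stays linear in n for any fixed thread count.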

Interview Connect

Understanding how your dbt project scales with more source tables shows you can plan for growth and keep your data pipeline efficient.

Self-Check

"What if we combined multiple source tables into fewer models? How would that change the time complexity?"