
Why models are the core of dbt - Performance Analysis

Understanding Time Complexity

We want to understand how the work done by dbt models grows as data size increases.

How does the time to build models change when the input data grows?

Scenario Under Consideration

Analyze the time complexity of the SQL in this simple dbt model.


-- models/my_model.sql
select
  user_id,
  count(*) as total_orders
from {{ ref('raw_orders') }}
group by user_id
    

This model groups raw orders by user and counts orders per user.
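A minimal way to see what this model computes is to run equivalent SQL against an in-memory SQLite database. This is a sketch, not how dbt executes models: the sample rows are made up, and dbt would resolve {{ ref('raw_orders') }} to the real warehouse table at compile time.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("create table raw_orders (order_id integer, user_id integer)")
conn.executemany(
    "insert into raw_orders values (?, ?)",
    [(1, 101), (2, 101), (3, 102), (4, 101), (5, 102)],
)

# Same shape as the model, with ref('raw_orders') replaced by the table name.
rows = conn.execute(
    "select user_id, count(*) as total_orders "
    "from raw_orders group by user_id order by user_id"
).fetchall()
print(rows)  # [(101, 3), (102, 2)]
```

User 101 appears in three orders and user 102 in two, so the aggregation collapses five input rows into two output rows.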

Identify Repeating Operations

Look at what repeats as data grows.

  • Primary operation: Scanning all rows in the raw_orders table.
  • How many times: Once over all input rows to group and count.
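The single pass the bullets describe can be sketched in plain Python: every input row is touched exactly once while a running count is kept per user. The row data here is hypothetical.

```python
from collections import Counter

# Hypothetical input rows: (order_id, user_id) pairs.
raw_orders = [(1, 101), (2, 101), (3, 102), (4, 101), (5, 102)]

counts = Counter()
rows_scanned = 0
for _order_id, user_id in raw_orders:
    rows_scanned += 1     # one unit of work per input row
    counts[user_id] += 1  # constant-time update per row

print(rows_scanned)   # 5: every row scanned exactly once
print(dict(counts))   # {101: 3, 102: 2}
```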

How Execution Grows With Input

As the number of rows in raw_orders grows, the work grows in direct proportion: each additional row must be scanned and counted exactly once.

Input Size (n)    Approx. Operations
10                10 scans and counts
100               100 scans and counts
1000              1000 scans and counts

Pattern observation: The work grows directly with the number of rows.
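One way to check this pattern directly is to count operations for growing n, using the same one-pass group-and-count. The synthetic order data below (n orders spread over 10 users) is invented for illustration; the n values match the table above.

```python
from collections import Counter

def build_model(raw_orders):
    """One-pass group-and-count; returns (counts, rows_scanned)."""
    counts, scanned = Counter(), 0
    for _order_id, user_id in raw_orders:
        scanned += 1          # one unit of work per row
        counts[user_id] += 1
    return counts, scanned

for n in (10, 100, 1000):
    orders = [(i, i % 10) for i in range(n)]  # synthetic orders, 10 users
    _, scanned = build_model(orders)
    print(n, scanned)  # scanned grows 1:1 with n
```

Operations track n exactly, which is the linear O(n) growth the table shows.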

Final Time Complexity

Time Complexity: O(n)

This means the time to build the model grows linearly with the input data size.

Common Mistake

[X] Wrong: "The model runs instantly no matter how big the data is."

[OK] Correct: The model must process every row, so more data means more work and more time.

Interview Connect

Knowing how model build time grows helps you explain performance and plan data workflows confidently.

Self-Check

"What if the model joined two large tables instead of one? How would the time complexity change?"