
Creating your first model in dbt - Performance & Efficiency

Understanding Time Complexity

When creating your first model in dbt, it is helpful to understand how your model's run time grows as the input data grows.

In other words, we want to know how the work the data warehouse does on dbt's behalf changes when the input data size changes.

Scenario Under Consideration

Analyze the time complexity of the following dbt model SQL code.


select
  customer_id,
  count(order_id) as total_orders
from {{ ref('raw_orders') }}
group by customer_id

This code creates a model that counts orders per customer from a raw orders table.

Identify Repeating Operations

Look for the operations that repeat as the input grows. A SQL model has no explicit loops or recursion; the repeated work is the table scan and the per-row aggregation.

  • Primary operation: Scanning all rows in the raw_orders table.
  • How many times: Once over all rows to group and count orders per customer.
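The single pass described above can be sketched in Python. This is a toy simulation of what the database engine does when it executes the GROUP BY, using hypothetical rows, not anything dbt itself runs:

```python
from collections import defaultdict

def count_orders_per_customer(rows):
    """Simulate the GROUP BY: one pass over all rows, so O(n)."""
    totals = defaultdict(int)  # customer_id -> total_orders
    for row in rows:           # each row is visited exactly once
        totals[row["customer_id"]] += 1
    return dict(totals)

# Hypothetical raw_orders rows for illustration
raw_orders = [
    {"order_id": 1, "customer_id": "a"},
    {"order_id": 2, "customer_id": "b"},
    {"order_id": 3, "customer_id": "a"},
]
print(count_orders_per_customer(raw_orders))  # {'a': 2, 'b': 1}
```

Because the loop body does a constant amount of work per row, the total work scales with the number of rows.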
How Execution Grows With Input

As the number of orders grows, the time to scan and group them grows roughly in direct proportion.

Input Size (n) | Approx. Operations
10             | About 10 rows scanned and grouped
100            | About 100 rows scanned and grouped
1000           | About 1000 rows scanned and grouped

Pattern observation: The work grows in direct proportion to the number of rows.
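A quick way to see the pattern in the table above is to count the row visits directly. This is toy instrumentation of the single-pass grouping, not a database benchmark:

```python
def grouped_row_visits(n):
    """Count how many rows a single-pass GROUP BY touches for n input rows."""
    visits = 0
    totals = {}
    for i in range(n):             # n synthetic orders
        customer = i % 10          # 10 hypothetical customers
        totals[customer] = totals.get(customer, 0) + 1
        visits += 1                # one visit per row, no re-scans
    return visits

for n in (10, 100, 1000):
    print(n, grouped_row_visits(n))  # visits grow in lockstep with n
```

Doubling n doubles the count of visits, which is exactly what linear, O(n), growth means.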

Final Time Complexity

Time Complexity: O(n)

This means the time to run the model grows roughly in direct proportion to the number of rows in the input table.

Common Mistake

[X] Wrong: "Grouping data is instant and does not depend on input size."

[OK] Correct: Grouping requires looking at each row to count, so it takes more time as data grows.

Interview Connect

Understanding how data size affects model run time helps you write efficient dbt models and explain your reasoning clearly.

Self-Check

"What if we added a filter to only include recent orders? How would that change the time complexity?"