0
0
dbtdata~5 mins

Query profiling and optimization in dbt - Time & Space Complexity

Choose your learning style9 modes available
Time Complexity: Query profiling and optimization
O(n)
Understanding Time Complexity

When we profile and optimize queries in dbt, we want to know how the time to run a query changes as the data grows.

We ask: How does the query's work increase when the input data gets bigger?

Scenario Under Consideration

Analyze the time complexity of the following dbt SQL model.


select
  user_id,
  count(*) as total_orders
from {{ ref('orders') }}
where order_date >= '2023-01-01'
group by user_id
    

This query counts orders per user since the start of 2023.

Identify Repeating Operations

Look for repeated work inside the query.

  • Primary operation: Scanning all rows in the orders table after filtering by date.
  • How many times: Once per row that matches the date filter.
How Execution Grows With Input

As the number of orders grows, the query must check more rows and group them.

Input Size (n)Approx. Operations
10,000 ordersAbout 10,000 row checks and grouping steps
100,000 ordersAbout 100,000 row checks and grouping steps
1,000,000 ordersAbout 1,000,000 row checks and grouping steps

Pattern observation: The work grows roughly in direct proportion to the number of rows.

Final Time Complexity

Time Complexity: O(n)

This means the query time grows linearly as the number of orders increases.

Common Mistake

[X] Wrong: "Adding a WHERE filter always makes the query run in constant time."

[OK] Correct: The filter reduces rows but the query still scans all matching rows, so time grows with filtered data size.

Interview Connect

Understanding how query time grows helps you write efficient dbt models and explain your choices clearly in interviews.

Self-Check

What if we added an index on order_date? How would the time complexity change?