Testing model outputs in dbt - Time & Space Complexity
When testing model outputs in dbt, we want to know how long our tests take as the data grows. The question: how does test execution time scale with the number of rows?
Analyze the time complexity of the following dbt test code.
```sql
-- Singular dbt test: check for nulls in a column.
-- dbt marks a test as failed when its query returns one or more rows,
-- so HAVING keeps the result set empty when there are no nulls;
-- without it, the single count(*) row would make the test always fail.
select
    count(*) as null_count
from {{ ref('my_model') }}
where important_column is null
having count(*) > 0
```
This test counts how many rows have nulls in a specific column of a model.
Look for repeated work in the test query.
- Primary operation: scanning every row of the model table.
- How many times: the query runs once per test invocation, but that single run examines all n rows.
The test reads every row to count nulls, so more rows mean more work.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 row checks |
| 100 | 100 row checks |
| 1000 | 1000 row checks |
Pattern observation: The work grows directly with the number of rows.
Time Complexity: O(n)
This means the test time grows linearly as the data size grows.
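To make the linear growth concrete, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for your warehouse (an assumption), and the table name `my_model`, the column `important_column`, and the one-null-in-ten data distribution are hypothetical, chosen to mirror the dbt example:

```python
import sqlite3

def build_table(n):
    # Hypothetical data: every 10th row has a NULL important_column.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table my_model (id integer, important_column integer)")
    conn.executemany(
        "insert into my_model values (?, ?)",
        ((i, None if i % 10 == 0 else i) for i in range(n)),
    )
    return conn

def null_count(conn):
    # Same shape as the dbt test: count rows where the column is NULL.
    # The engine must inspect every row to answer this, so work is O(n).
    return conn.execute(
        "select count(*) as null_count from my_model "
        "where important_column is null"
    ).fetchone()[0]

for n in (10, 100, 1000):
    conn = build_table(n)
    # EXPLAIN QUERY PLAN reports a full-table SCAN regardless of n.
    plan = conn.execute(
        "explain query plan select count(*) from my_model "
        "where important_column is null"
    ).fetchall()
    print(n, null_count(conn), plan)
```

The query plan is the same at every size: one scan over the whole table, so doubling the rows doubles the work.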
[X] Wrong: "The test only checks a few rows, so it runs in constant time."
[OK] Correct: The test must scan every row to count nulls, so its runtime grows linearly with the size of the data.
Understanding how test queries scale helps you write efficient checks and explain their impact clearly in real projects.
"What if the test checked for nulls only in a small indexed subset of rows? How would the time complexity change?"