Custom singular tests in dbt - Time & Space Complexity
When we write custom singular tests in dbt, we want to know how their runtime changes as our data grows. The question: how does test execution time scale with the number of rows in the model?
Let's analyze the time complexity of the following dbt custom singular test (a one-off `.sql` file saved in your project's `tests/` directory).
```sql
-- tests/assert_some_column_not_null.sql
-- A singular test fails when its query returns one or more rows.
select count(*) as error_count
from {{ ref('my_model') }}  -- 'my_model' is a placeholder model name
where some_column is null
having count(*) > 0  -- emit a row only when null values exist
```
This test scans the model and counts the rows with a null value in a specific column; the `having` clause makes the query return a row (and the test fail) only when at least one null is found. Note that singular tests reference a concrete model with `{{ ref(...) }}` — the `{{ model }}` variable is only available inside custom generic tests.
Look for repeated actions in the test query.
- Primary operation: Scanning all rows in the model to check the column value.
- How many times: Once for each row in the model.
The test checks every row exactly once, so the work grows at the same rate as the row count.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The number of operations grows directly with the number of rows.
Time Complexity: O(n)
This means the test takes longer in direct proportion to the number of rows it checks.
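One way to sanity-check the linear-scan claim is to inspect the query plan for the compiled test. A sketch, assuming a Postgres-style warehouse and that dbt compiles the model reference to a relation named `analytics.my_model` (a placeholder):

```sql
-- Hypothetical: dbt replaces the model reference with a concrete relation
-- (here analytics.my_model) before the test runs. EXPLAIN syntax varies
-- by warehouse; this is the Postgres form.
explain
select count(*) as error_count
from analytics.my_model
where some_column is null;
-- A "Seq Scan" node in the plan confirms every row is visited once.
```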
[X] Wrong: "The test runs instantly no matter how big the data is."
[OK] Correct: Because the test looks at every row, more rows mean more work and more time.
Understanding how tests scale with data size helps you write efficient checks and shows you think about performance in real projects.
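As a sketch of one such efficiency lever (assumptions: the model has a `loaded_at` timestamp column, and the warehouse supports `dateadd`, as Snowflake and Redshift do), you can bound the scan to recently loaded rows so each run checks a window rather than the whole table:

```sql
-- Only rows from the last 3 days are scanned, so the effective n per run
-- is the window size, not the full table. Column names are placeholders.
select count(*) as error_count
from {{ ref('my_model') }}
where some_column is null
  and loaded_at >= dateadd('day', -3, current_date)
having count(*) > 0
```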
"What if the test checked two columns instead of one? How would the time complexity change?"
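A sketch of that two-column variant (column names are placeholders): a second predicate adds a constant amount of work per row, so the query is still a single pass over the model and the complexity remains O(n).

```sql
-- One scan, two checks per row: still linear in the number of rows.
select count(*) as error_count
from {{ ref('my_model') }}
where some_column is null
   or other_column is null
having count(*) > 0
```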