Naming conventions at scale in dbt - Time & Space Complexity
When a dbt project has many models, naming conventions keep it organized and make models quick to find.
We want to understand how the effort of enforcing those names grows as the number of models increases.
Analyze the time complexity of the following dbt naming convention check.
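    {# Guard with `execute`: the graph is only fully populated at execution time. #}
    {% if execute %}
      {% for model in graph.nodes.values() %}
        {% if model.resource_type == 'model' and not model.name.startswith('stg_') %}
          {{ exceptions.raise_compiler_error("Model name must start with stg_: " ~ model.name) }}
        {% endif %}
      {% endfor %}
    {% endif %}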
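This code loops over every node in the project graph and, for each node that is a model, checks whether its name starts with the 'stg_' prefix. The first name that breaks the rule raises a compiler error and stops the run.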
Look for repeated actions in the code.
- Primary operation: The prefix check performed inside the loop over all models in the project.
- How many times: Once for each model, so n times for n models.
As the number of models grows, the checks grow too.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 checks |
| 100 | 100 checks |
| 1000 | 1000 checks |
Pattern observation: The number of checks grows directly with the number of models.
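The counts in the table can be reproduced with a small simulation. This is a minimal Python sketch, not dbt code: the function `count_prefix_checks` and the synthetic model names are hypothetical, and it assumes every name passes so the loop never stops early.

```python
def count_prefix_checks(model_names, prefix="stg_"):
    """Return how many prefix checks run when every name is inspected."""
    checks = 0
    for name in model_names:
        checks += 1  # exactly one check per model
        if not name.startswith(prefix):
            # Mirrors the compiler error in the dbt snippet: stop at the first bad name.
            raise ValueError(f"Model name must start with {prefix}: {name}")
    return checks


for n in (10, 100, 1000):
    # Hypothetical, well-named models: stg_model_0, stg_model_1, ...
    names = [f"stg_model_{i}" for i in range(n)]
    print(n, count_prefix_checks(names))
```

Each printed pair shows the same number twice, matching the one-check-per-model pattern in the table.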
Time Complexity: O(n)
Each check compares a short, fixed-length prefix, so it takes constant time. With one such check per model, the time to check naming grows in a straight line as you add more models.
[X] Wrong: "Checking names only once is enough, no matter how many models there are."
[OK] Correct: Each model needs its own check, so the total work grows with the number of models.
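A short illustration of why a single check is not enough, written as a Python sketch rather than dbt code. The function names and the model names are hypothetical and chosen only for this example.

```python
def first_name_only_ok(model_names, prefix="stg_"):
    """Flawed shortcut: inspect only the first name, whatever the project size."""
    return model_names[0].startswith(prefix)


def every_name_ok(model_names, prefix="stg_"):
    """Full check: up to one prefix test per model, so n tests for n models."""
    return all(name.startswith(prefix) for name in model_names)


names = ["stg_orders", "stg_customers", "payments"]  # the last name breaks the rule
print(first_name_only_ok(names))  # True: the shortcut misses the violation
print(every_name_ok(names))       # False: the full check finds it
```

The shortcut does constant work, but it only validates one model, so it cannot guarantee the convention holds across the project.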
Understanding how checks scale helps you design dbt projects that stay manageable as they grow.
"What if we grouped models by folder and checked naming only once per folder? How would the time complexity change?"