Creating your own dbt package - Performance & Efficiency

Time Complexity: Creating your own dbt package
O(n)
Understanding Time Complexity

When creating your own dbt package, it's important to understand how the time to run your models grows as you add more models or data.

We want to know how the execution time changes when the package size or data size increases.

Scenario Under Consideration

Analyze the time complexity of the following dbt package structure.

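-- models/my_package/model_a.sql
-- Base model: reads every row from the raw table.
-- (dbt model files should not end with a semicolon.)
select * from source_table

-- models/my_package/model_b.sql
-- Filters the output of model_a.
select * from {{ ref('model_a') }} where condition = 'value'

-- models/my_package/model_c.sql
-- Joins the output of model_b to another table; aliases make the join key unambiguous.
select *
from {{ ref('model_b') }} as b
join another_table as t on b.key = t.key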

This package has three models where each model depends on the previous one, building on the data step by step.
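The run order implied by these ref() calls can be sketched in Python. This is only an illustration of dependency ordering, not dbt's actual scheduler; the `dependencies` dictionary is a hand-written assumption that mirrors the three models above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each model maps to the set of models it references via ref().
# Hand-written to mirror the example package; dbt derives this itself.
dependencies = {
    "model_a": set(),          # reads only from source_table
    "model_b": {"model_a"},
    "model_c": {"model_b"},
}

# A topological sort lists every model after the models it depends on.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)  # ['model_a', 'model_b', 'model_c']
```

Because each model depends on exactly one earlier model, only one valid order exists for this chain.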

Identify Repeating Operations

Look at what repeats when dbt runs this package.

  • Primary operation: Running each model's SQL query once in order.
  • How many times: Once per model, so three times here.

How Execution Grows With Input

As you add more models to a dependency chain like this one, dbt has to run them one after another, because each model waits for the model it references.

Input Size (models) | Approx. Operations (queries run)
3                   | 3
10                  | 10
100                 | 100

Pattern observation: The total work grows directly with the number of models you have.
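The counts in the table can be reproduced with a small Python sketch. It assumes a linear chain in which every model runs exactly one query; `queries_for_chain` is a hypothetical helper written for this illustration and is not part of dbt.

```python
def queries_for_chain(n_models: int) -> int:
    """Count the queries run for a linear chain of n_models models."""
    queries = 0
    for _ in range(n_models):
        queries += 1  # each model's SQL is executed exactly once
    return queries


for n in (3, 10, 100):
    print(n, queries_for_chain(n))
# prints:
# 3 3
# 10 10
# 100 100
```

The loop body does a constant amount of work per model, which is what makes the total grow in step with the model count.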

Final Time Complexity

Time Complexity: O(n)

This means the time to run your package grows linearly with the number of models you include, assuming each model takes roughly the same time to run.

Common Mistake

[X] Wrong: "Adding more models won't affect run time much because dbt runs them fast."

[OK] Correct: Each model runs a query, so more models mean more queries and longer total run time.

Interview Connect

Understanding how your dbt package scales helps you design efficient data workflows and shows you can think about performance as your projects grow.

Self-Check

"What if your models had multiple dependencies and some ran in parallel? How would that affect the time complexity?"
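One way to explore this question is to count how many sequential rounds a dependency graph needs when every model whose dependencies are finished runs at the same time. The sketch below is a simplified model with unlimited parallelism, not dbt's real scheduler; `count_waves` and the `fan_out` graph are hypothetical names introduced for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def count_waves(dependencies: dict[str, set[str]]) -> int:
    """Rounds needed if every ready model runs in parallel with the others."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = 0
    while sorter.is_active():
        ready = sorter.get_ready()  # models whose dependencies are all done
        sorter.done(*ready)         # treat the whole batch as one parallel round
        waves += 1
    return waves


# The linear chain from the example package.
chain = {"model_a": set(), "model_b": {"model_a"}, "model_c": {"model_b"}}
# Hypothetical variant: model_b and model_c both depend only on model_a.
fan_out = {"model_a": set(), "model_b": {"model_a"}, "model_c": {"model_a"}}

print(count_waves(chain))    # 3
print(count_waves(fan_out))  # 2
```

Both graphs still run three queries in total, but the second needs fewer sequential rounds because two of its models do not wait on each other.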