dbtdata~5 mins

Multi-source fan-in patterns in dbt - Time & Space Complexity

Choose your learning style9 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Time Complexity: Multi-source fan-in patterns

O(n)

Understanding Time Complexity

When combining data from many sources into one place, it is important to know how the work grows as sources increase.

We ask: How does the time to combine data change when we add more sources?

Scenario Under Consideration

Analyze the time complexity of the following dbt model combining multiple sources.


with source1 as (
  select * from {{ source('db1', 'table1') }}
),
source2 as (
  select * from {{ source('db2', 'table2') }}
),
source3 as (
  select * from {{ source('db3', 'table3') }}
)

select * from source1
union all
select * from source2
union all
select * from source3

This code reads from three different sources and combines all rows into one result.

Identify Repeating Operations

Look for repeated work in the code.

Primary operation: Reading and scanning each source table.
How many times: Once per source, here 3 times, but could be more if more sources added.

How Execution Grows With Input

As the number of sources grows, the total work grows by adding the work for each source.

Number of Sources (n)	Approx. Operations
3	3 times reading source tables
10	10 times reading source tables
100	100 times reading source tables

Pattern observation: The work grows linearly as you add more sources.

Final Time Complexity

Time Complexity: O(n)

This means the time to combine data grows directly in proportion to the number of sources.

Common Mistake

[X] Wrong: "Adding more sources won't increase time much because they run in parallel."

[OK] Correct: Even if sources run in parallel, total work still adds up, and combining results takes time too.

Interview Connect

Understanding how combining many data sources affects time helps you design efficient data pipelines and explain your choices clearly.

Self-Check

"What if we replaced union all with join operations? How would the time complexity change?"