0
0
dbtdata~10 mins

Multi-source fan-in patterns in dbt - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - Multi-source fan-in patterns
Source Table A
Transform A
Fan-in Join
Source Table B
Transform B
Final Output Table
Data flows from multiple source tables through transformations, then joins together in a fan-in pattern to create a combined output.
Execution Sample
dbt
with a as (
  select id, value from source_a
), b as (
  select id, value from source_b
)
select a.id, a.value, b.value as b_value
from a
join b on a.id = b.id
This SQL code combines data from two sources by joining on a common id.
Execution Table
StepActionEvaluationResult
1Read source_aTable source_a loadedRows: 3 (id=1,2,3)
2Read source_bTable source_b loadedRows: 3 (id=2,3,4)
3Transform aSelect id, valuea: [(1,10), (2,20), (3,30)]
4Transform bSelect id, valueb: [(2,200), (3,300), (4,400)]
5Join a and b on idMatch ids in bothJoined rows: [(2,20,200), (3,30,300)]
6Output final tableCombined dataFinal rows: 2 with columns (id, value, b_value)
💡 Join only includes ids present in both sources, so id=1 and id=4 are excluded.
Variable Tracker
VariableStartAfter Step 3After Step 4After Step 5Final
aempty[(1,10), (2,20), (3,30)][(1,10), (2,20), (3,30)][(2,20), (3,30)][(2,20), (3,30)]
bemptyempty[(2,200), (3,300), (4,400)][(2,200), (3,300)][(2,200), (3,300)]
joinedemptyemptyempty[(2,20,200), (3,30,300)][(2,20,200), (3,30,300)]
Key Moments - 2 Insights
Why are some ids missing from the final output?
Because the join only includes ids present in both tables, ids like 1 (only in a) and 4 (only in b) are excluded as shown in execution_table step 5.
What happens if one source has more rows than the other?
Only matching rows based on the join condition appear in the final output, so extra rows without matches are dropped, as seen in variable_tracker for 'joined' after step 5.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at step 5, how many rows are in the joined result?
A2
B3
C4
D1
💡 Hint
Check the 'Joined rows' column in execution_table step 5.
According to variable_tracker, what is the value of variable 'a' after step 3?
A[(2,200), (3,300), (4,400)]
Bempty
C[(1,10), (2,20), (3,30)]
D[(2,20), (3,30)]
💡 Hint
Look at the row for 'a' under 'After Step 3' in variable_tracker.
If source_b had an extra row with id=5, how would the final output change?
AFinal output would include id=5
BFinal output would remain the same
CFinal output would include all ids from source_b
DFinal output would include all ids from source_a
💡 Hint
Recall the join only includes ids present in both sources as shown in execution_table step 5.
Concept Snapshot
Multi-source fan-in pattern:
- Extract data from multiple sources
- Transform each source separately
- Join transformed data on common keys
- Result is combined dataset with matching records only
- Useful for merging related data from different tables
Full Transcript
This visual execution shows how multi-source fan-in patterns work in dbt. We start by reading two source tables, source_a and source_b. Each is transformed by selecting relevant columns. Then, these transformed tables are joined on a common id column. The join keeps only rows where ids match in both tables, excluding unmatched rows. Variables 'a' and 'b' hold transformed data, and 'joined' holds the combined result. This pattern helps combine data from multiple sources into one clean output table.