0
0
dbtdata~30 mins

Multi-source fan-in patterns in dbt - Mini Project: Build & Apply

Choose your learning style9 modes available
Multi-source Fan-in Patterns with dbt
📖 Scenario: You work as a data analyst at a retail company. You have sales data coming from two different sources: online_sales and store_sales. Your goal is to combine these two sources into a single table that shows total sales per product.
🎯 Goal: Build a dbt model that uses multi-source fan-in patterns to combine online_sales and store_sales tables into one table showing total sales per product.
📋 What You'll Learn
Create source definitions for online_sales and store_sales
Create a staging model for each source
Create a final model that combines both staging models using UNION ALL
Aggregate total sales per product in the final model
💡 Why This Matters
🌍 Real World
Retail companies often have sales data from multiple channels. Combining these sources helps get a complete view of product performance.
💼 Career
Data analysts and engineers use multi-source fan-in patterns in dbt to build clean, combined datasets for reporting and analysis.
Progress0 / 4 steps
1
Define sources for online_sales and store_sales
Create a sources.yml file with source definitions for online_sales and store_sales tables under the schema raw. Use source names online and store respectively.
dbt
Need a hint?

Use the sources key and define two sources named online and store with their respective tables.

2
Create staging models for each source
Create two staging models: stg_online_sales.sql and stg_store_sales.sql. In each, select all columns from their respective sources using the source() function. For example, in stg_online_sales.sql, select all from source('online', 'online_sales').
dbt
Need a hint?

Use the source() function inside your SQL models to select all data from each source table.

3
Create a final model combining both staging models
Create a model called final_sales.sql that combines stg_online_sales and stg_store_sales using UNION ALL. Select product_id and sales_amount from both staging models.
dbt
Need a hint?

Use UNION ALL to combine the two staging models. Use ref() to refer to the staging models.

4
Aggregate total sales per product in the final model
Modify final_sales.sql to aggregate total sales per product_id. Use a SELECT statement that sums sales_amount grouped by product_id from the union of both staging models.
dbt
Need a hint?

Use a subquery to combine both staging models with UNION ALL, then group by product_id and sum sales_amount.