source() function for raw tables in dbt - Step-by-Step Execution

Concept Flow - source() function for raw tables
1. Define source in schema.yml
2. Call source() in model SQL
3. dbt compiles SQL with source reference
4. Query raw table in warehouse
5. Return raw table data for model processing
The source() function links a dbt model to a raw table declared in your source configuration (a YAML file such as schema.yml), so the model can query the raw data without hardcoding where the table lives.
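As a minimal sketch, the source used throughout this page could be declared as follows. The file location models/schema.yml and the description text are assumptions for illustration; only the source name raw_data and the table name users come from the example on this page.

```yaml
# models/schema.yml (assumed location; dbt reads .yml files under the models directory)
version: 2

sources:
  - name: raw_data                # first argument to source()
    description: "Raw tables loaded into the warehouse"   # illustrative text
    tables:
      - name: users               # second argument to source()
```

Because no schema is set here, dbt uses the source name raw_data as the schema, which is why source('raw_data', 'users') points at raw_data.users.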
Execution Sample
dbt
select * from {{ source('raw_data', 'users') }}
where signup_date > '2023-01-01'
This code selects all users from the raw_data.users table who signed up after January 1, 2023.
Execution Table
| Step | Action | Input | Output | Notes |
|---|---|---|---|---|
| 1 | Read source() call | source('raw_data', 'users') | Reference to raw_data.users table | dbt identifies source config |
| 2 | Compile SQL | select * from {{ source(...) }} where signup_date > '2023-01-01' | select * from raw_data.users where signup_date > '2023-01-01' | source() replaced with the table's qualified name; database prefix and quoting vary by adapter |
| 3 | Run query in warehouse | select * from raw_data.users where signup_date > '2023-01-01' | Rows of users with signup_date > 2023-01-01 | Raw data fetched for model |
| 4 | Return data to model | Raw rows | Result set that the model materializes (for example as a view or table) | Model can now transform raw data |
| 5 | End | No more steps | Execution complete | Query finished successfully |
💡 Query runs until all matching rows from raw_data.users are returned
Variable Tracker
| Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final |
|---|---|---|---|---|---|
| source_ref | undefined | raw_data.users | raw_data.users | raw_data.users | raw_data.users |
| compiled_sql | undefined | undefined | select * from raw_data.users where signup_date > '2023-01-01' | select * from raw_data.users where signup_date > '2023-01-01' | select * from raw_data.users where signup_date > '2023-01-01' |
| query_result | undefined | undefined | undefined | rows matching condition | rows matching condition |
Key Moments - 3 Insights
Why does source() need the source name and table name?
Because source() uses these two names to find the exact raw table defined in your schema.yml, as shown in Step 1 of the Execution Table, where source('raw_data', 'users') resolves to raw_data.users.
What happens if you forget to define the source in schema.yml?
dbt raises a compilation error because source() cannot resolve the raw table, so the process stops by Step 2 of the Execution Table and no query reaches the warehouse.
Does source() run the query or just reference the table?
source() only creates a reference to the raw table; the actual query runs later in the warehouse, as shown in Step 3 of the Execution Table.
Visual Quiz - 3 Questions
Test your understanding
Look at the Execution Table. What does source('raw_data', 'users') return at Step 1?
A. A reference to the raw_data.users table
B. The actual data rows from users
C. An error message
D. Compiled SQL query
💡 Hint: Check the Output column of Step 1 in the Execution Table.
At which step does dbt replace source() with the actual table name?
A. Step 1
B. Step 2
C. Step 3
D. Step 4
💡 Hint: Look at the Action and Output columns of Step 2 in the Execution Table.
If the source is not defined in schema.yml, what will happen during execution?
A. dbt will compile SQL successfully
B. The query will run but return no data
C. dbt will fail to compile SQL
D. The query will run on a default table
💡 Hint: Refer to the Key Moments answer about a missing source definition.
Concept Snapshot
source() function in dbt:
- Use source('source_name', 'table_name') to reference raw tables
- source() links to tables declared under sources: in a YAML file such as schema.yml
- dbt replaces source() with the table's qualified name during compilation
- Enables querying raw data directly in models
- Prevents hardcoding raw table names
- Essential for data lineage and documentation
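The point about not hardcoding raw table names can be illustrated with a source whose physical location differs from the names passed to source(). In this sketch the values analytics_raw, app_public, and app_users are hypothetical and do not come from the example on this page; database, schema, and identifier are standard dbt source properties.

```yaml
# Sketch of a source whose warehouse location differs from its dbt names
version: 2

sources:
  - name: raw_data
    database: analytics_raw       # hypothetical database name
    schema: app_public            # hypothetical schema; defaults to the source name if omitted
    tables:
      - name: users
        identifier: app_users     # hypothetical physical table name; defaults to the table name
```

Models keep calling source('raw_data', 'users'). If the raw table moves, only this file changes, and under these assumed values the call would compile to something like analytics_raw.app_public.app_users, with quoting depending on the adapter.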
Full Transcript
The source() function in dbt is used to reference raw tables defined in your source configuration file, usually schema.yml. When you write source('raw_data', 'users') in your model SQL, dbt looks up the raw_data source and users table name. During compilation, dbt replaces the source() call with the actual raw table name like raw_data.users. Then the compiled SQL runs in your data warehouse, fetching raw data rows. This process allows you to write models that depend on raw tables without hardcoding table names, improving maintainability and documentation. If the source is not defined, dbt will fail to compile the SQL. The execution steps show how source() is resolved, compiled, and executed step-by-step.