The source() function helps you tell dbt where your raw data tables are. It connects your project to the original data so you can use it safely and clearly.
source() function for raw tables in dbt
Start learning this pattern below
Jump into concepts and practice - no test required
or
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Introduction
Syntax
dbt
source('source_name', 'table_name')
source_name is the name you gave to your data source in your sources.yml file.
table_name is the raw table name inside that source.
Examples
customers table inside the raw_data source.dbt
select * from {{ source('raw_data', 'customers') }}
id and name columns from the orders table in the sales_db source, filtering for completed orders.dbt
select id, name from {{ source('sales_db', 'orders') }} where status = 'completed'
Sample Program
This example shows how to define a source called raw_data with a table customers in the sources.yml file. Then, in a dbt model, it selects the id and email of active customers from that source.
dbt
/* In your sources.yml file */ version: 2 sources: - name: raw_data schema: raw_data tables: - name: customers /* In your dbt model SQL file */ select id, email from {{ source('raw_data', 'customers') }} where active = true
Important Notes
Always define your sources in sources.yml before using source() in your models.
Using source() helps dbt track data lineage and improves documentation.
Summary
source() connects your dbt models to raw tables safely.
It uses names from your sources.yml file for clarity and maintenance.
It helps you write cleaner, more understandable SQL in dbt projects.
Practice
1. What is the main purpose of the
source() function in dbt?easy
Solution
Step 1: Understand the role of
Thesource()source()function is used to safely reference raw tables that are defined in thesources.ymlfile.Step 2: Differentiate from other dbt functions
It does not create or delete tables, nor run scripts. It only connects models to existing raw data tables.Final Answer:
To reference raw tables defined in thesources.ymlfile -> Option DQuick Check:
source()connects to raw tables [OK]
Hint: Remember:
source() links to raw tables only [OK]Common Mistakes:
- Thinking
source()creates tables - Confusing
source()with model creation - Assuming
source()runs scripts
2. Which of the following is the correct syntax to reference a raw table named
customers in the source named raw_data using source() in a dbt model?easy
Solution
Step 1: Recall
The correct syntax uses two string arguments: the source name and the table name, both in quotes, wrapped insource()function syntax{{ }}.Step 2: Check each option
The valid syntax isselect * from {{ source('raw_data', 'customers') }}. Dot notation, unquoted arguments, and bracket notation are all invalid in dbt.Final Answer:
select * from {{ source('raw_data', 'customers') }} -> Option BQuick Check:
{{ source() }}with quoted arguments [OK]
Hint: Always use quotes around source and table names [OK]
Common Mistakes:
- Omitting quotes around source or table names
- Using dot or bracket notation instead of function call
- Passing variables without quotes
3. Given the following dbt model SQL code:
What does this query do?
select id, name from {{ source('sales_db', 'customers') }} where active = trueWhat does this query do?
medium
Solution
Step 1: Understand the
The code references the raw tablesource()usagecustomersinside the sourcesales_db.Step 2: Analyze the SQL query
The query selectsidandnamecolumns whereactiveis true, filtering active customers.Final Answer:
Selects all customers from thecustomerstable in thesales_dbsource whereactiveis true -> Option AQuick Check:
source()reads raw tables, query filters active customers [OK]
Hint: Look for
source() to identify raw table references [OK]Common Mistakes:
- Thinking
source()creates or deletes tables - Confusing source with models
- Ignoring the WHERE clause filtering
4. You wrote this dbt model code:
but dbt throws an error:
select * from source('marketing', 'leads')but dbt throws an error:
Compilation Error: 'source' is undefined. What is the most likely cause?medium
Solution
Step 1: Check dbt Jinja syntax
dbt requires Jinja functions likesource()to be inside double curly braces:{{ source(...) }}.Step 2: Understand the error message
The error sayssourceis undefined, meaning dbt treats it as plain SQL, not a Jinja function.Final Answer:
You forgot to wrapsource()in double curly braces {{ }} -> Option CQuick Check:
Use{{ source() }}to call source function [OK]
Hint: Always use {{ }} around dbt functions like source() [OK]
Common Mistakes:
- Writing source() without {{ }}
- Assuming quotes type causes error
- Ignoring missing source definition errors
5. You want to create a dbt model that selects only customers from the
raw_data source's customers table who joined after 2023-01-01. Which of the following is the correct way to write this using source()?hard
Solution
Step 1: Use correct
Thesource()syntax with Jinja bracessource()function must be inside{{ }}and both source and table names must be strings in quotes.Step 2: Use correct date format in SQL condition
The date value must be a string in quotes to compare properly in SQL.Final Answer:
select * from {{ source('raw_data', 'customers') }} where join_date > '2023-01-01' -> Option AQuick Check:
Correct syntax and date string format [OK]
Hint: Wrap source() in {{ }}, use quotes for strings and dates [OK]
Common Mistakes:
- Missing {{ }} around source()
- Not quoting table name or date string
- Using unquoted date causing SQL error
