Recall & Review
beginner
What is the purpose of the source() function in dbt?
The source() function in dbt is used to reference raw tables or external data sources in your project. It helps you track and document where your data comes from.
beginner
How do you use the source() function in a dbt model?
You use source('source_name', 'table_name') inside your SQL to refer to a raw table defined in your sources.yml file. This makes your code clear and maintainable.
beginner
What file do you configure to define sources for the source() function?
You define sources under the sources: key of a YAML file in your dbt project, conventionally named sources.yml. Each entry lists a source name and its raw tables, along with optional metadata such as descriptions.
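As a rough illustration of how the declaration and the model fit together, the sketch below shows a model file whose leading Jinja comment reproduces an assumed source declaration. The file path models/sources.yml, the schema raw, and the columns id and name are hypothetical choices for this example, not requirements.

```sql
{#
  Assumed declaration, e.g. in models/sources.yml (hypothetical names):

  version: 2
  sources:
    - name: raw_data        # first argument to source()
      schema: raw           # defaults to the source name when omitted
      tables:
        - name: customers   # second argument to source()
#}

-- The model refers to the raw table by its declared names,
-- not by its physical location in the warehouse.
select
    id,
    name
from {{ source('raw_data', 'customers') }}
```

dbt strips the Jinja comment at compile time. In a real project the YAML lives in its own file rather than inside the model.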
intermediate
Why is using source() better than hardcoding raw table names?
Using source() lets dbt track data lineage and document raw inputs. It also keeps each raw table's name and location in one place, the source definition, so a rename or move means editing one file rather than every model that reads the table.
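To make the contrast concrete, here is a minimal sketch of the same query written both ways. It assumes a source named raw_data that points at a schema called raw; both names are illustrative.

```sql
-- Hardcoded: the physical name is repeated in every model that reads the table,
-- and dbt has no record that this model depends on a raw input.
-- select * from raw.customers

-- With source(): dbt resolves the reference from the source declaration at
-- compile time and adds the raw table to the lineage graph.
select *
from {{ source('raw_data', 'customers') }}
```

Under the stated assumption, the compiled query reads from a relation such as raw.customers. The exact quoting and any database prefix depend on the adapter and the source configuration.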
intermediate
Can source() be used to reference tables outside your dbt project?
Yes, source() can reference external raw tables as long as they are defined in your sources.yml file and accessible by your data warehouse.
What does source('raw', 'customers') do in dbt?
A. References the 'customers' table from the 'raw' source defined in sources.yml
B. Creates a new table called 'customers' in the 'raw' schema
C. Deletes the 'customers' table from the 'raw' source
D. Runs a transformation on the 'customers' table
The source() function references an existing raw table defined in sources.yml. It does not create, delete, or transform tables.
Where do you define the sources used by the source() function?
A. In the models/ folder
B. In the sources.yml file
C. In the dbt_project.yml file
D. In the SQL model files
Sources are defined in sources.yml files to list raw tables and their metadata.
Why should you use source() instead of hardcoding table names?
A. To improve data lineage and documentation
B. To speed up query execution
C. To automatically create tables
D. To encrypt data
Using source() helps track where data comes from and documents your sources clearly.
Can source() reference tables outside your dbt project?
A. No, only tables inside the project
B. Only if you use a special plugin
C. Yes, if defined in sources.yml and accessible
D. Only for views, not tables
source() can reference any table accessible by your warehouse if defined properly.
What is the correct syntax to reference a source table named 'orders' in source 'raw_data'?
A. source('orders', 'raw_data')
B. ref('orders', 'raw_data')
C. ref('raw_data', 'orders')
D. source('raw_data', 'orders')
The syntax is source('source_name', 'table_name'), so source('raw_data', 'orders') is correct.
Explain how the source() function helps manage raw tables in dbt projects.
Think about how you keep track of where your data comes from.
Describe the steps to set up and use the source() function for a new raw table.
Start from configuration, then usage, then benefits.
Practice
1. What is the main purpose of the source() function in dbt?
easy
A. To create new tables in the database
B. To run Python scripts inside dbt models
C. To delete raw tables from the database
D. To reference raw tables defined in the sources.yml file
Solution
Step 1: Understand the role of source()
The source() function is used to safely reference raw tables that are defined in the sources.yml file.
Step 2: Differentiate from other dbt functions
It does not create or delete tables, nor run scripts. It only connects models to existing raw data tables.
Final Answer:
To reference raw tables defined in the sources.yml file -> Option D
Quick Check:
source() connects to raw tables [OK]
Hint: Remember: source() links to raw tables only [OK]
Common Mistakes:
Thinking source() creates tables
Confusing source() with model creation
Assuming source() runs scripts
2. Which of the following is the correct syntax to reference a raw table named customers in the source named raw_data using source() in a dbt model?
easy
A. select * from source.raw_data.customers
B. select * from {{ source('raw_data', 'customers') }}
C. select * from source['raw_data']['customers']
D. select * from source(raw_data, customers)
Solution
Step 1: Recall source() function syntax
The correct syntax uses two string arguments: the source name and the table name, both in quotes, wrapped in {{ }}.
Step 2: Check each option
The valid syntax is select * from {{ source('raw_data', 'customers') }}. Dot notation, unquoted arguments, and bracket notation are all invalid in dbt.
Final Answer:
select * from {{ source('raw_data', 'customers') }} -> Option B
Quick Check:
{{ source() }} with quoted arguments [OK]
Hint: Always use quotes around source and table names [OK]
Common Mistakes:
Omitting quotes around source or table names
Using dot or bracket notation instead of function call
Passing variables without quotes
3. Given the following dbt model SQL code:
select id, name from {{ source('sales_db', 'customers') }} where active = true
What does this query do?
medium
A. Selects all customers from the customers table in the sales_db source where active is true
B. Creates a new table named customers in sales_db
C. Deletes inactive customers from the customers table
D. Selects all customers from a model named sales_db
Solution
Step 1: Understand the source() usage
The code references the raw table customers inside the source sales_db.
Step 2: Analyze the SQL query
The query selects the id and name columns and keeps only the rows where active is true.
Final Answer:
Selects all customers from the customers table in the sales_db source where active is true -> Option A
Quick Check:
source() reads raw tables, query filters active customers [OK]
Hint: Look for source() to identify raw table references [OK]
Common Mistakes:
Thinking source() creates or deletes tables
Confusing source with models
Ignoring the WHERE clause filtering
4. You wrote this dbt model code:
select * from source('marketing', 'leads')
but the run fails with a database error because the warehouse does not recognize source(...). What is the most likely cause?
medium
A. You used single quotes instead of double quotes inside source()
B. The table leads does not exist in the database
C. You forgot to wrap source() in double curly braces {{ }}
D. The marketing source is not defined in sources.yml
Solution
Step 1: Check dbt Jinja syntax
dbt only evaluates source() as a Jinja function when it appears inside double curly braces: {{ source(...) }}.
Step 2: Understand the error message
Without the braces, dbt leaves source('marketing', 'leads') in the compiled SQL as plain text, so the warehouse receives a call to a function it does not know.
Final Answer:
You forgot to wrap source() in double curly braces {{ }} -> Option C
Quick Check:
Use {{ source() }} to call source function [OK]
Hint: Always use {{ }} around dbt functions like source() [OK]
Common Mistakes:
Writing source() without {{ }}
Assuming quotes type causes error
Ignoring missing source definition errors
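The difference between the failing and working forms can be sketched as follows, reusing the marketing source and leads table from the question. The commented-out line is the broken version.

```sql
-- Not rendered: with no Jinja delimiters, dbt passes the text through unchanged,
-- so the warehouse receives a literal call to a function named source.
-- select * from source('marketing', 'leads')

-- Rendered: dbt replaces the expression with the relation declared for the source.
select *
from {{ source('marketing', 'leads') }}
```

The working form still requires the marketing source and its leads table to be declared in a sources YAML file. If they are missing, dbt reports that problem at compile time instead.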
5. You want to create a dbt model that selects only customers from the raw_data source's customers table who joined after 2023-01-01. Which of the following is the correct way to write this using source()?
hard
A. select * from {{ source('raw_data', 'customers') }} where join_date > '2023-01-01'
B. select * from source('raw_data', 'customers') where join_date > '2023-01-01'
C. select * from {{ source('raw_data', customers) }} where join_date > '2023-01-01'
D. select * from {{ source('raw_data', 'customers') }} where join_date > 2023-01-01
Solution
Step 1: Use correct source() syntax with Jinja braces
The source() function must be inside {{ }} and both source and table names must be strings in quotes.
Step 2: Use correct date format in SQL condition
The date must be written as a quoted string literal. Unquoted, 2023-01-01 is parsed as integer subtraction (2023 - 1 - 1), not as a date.
Final Answer:
select * from {{ source('raw_data', 'customers') }} where join_date > '2023-01-01' -> Option A
Quick Check:
Correct syntax and date string format [OK]
Hint: Wrap source() in {{ }}, use quotes for strings and dates [OK]
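Put together, the model from question 5 might look like the sketch below. The column name join_date comes from the question; whether a plain quoted string is accepted as a date depends on the warehouse, so treat the literal format as an assumption.

```sql
select *
from {{ source('raw_data', 'customers') }}
-- Quoted: a string literal the warehouse can compare against join_date.
-- Left unquoted, the same characters would be read as 2023 minus 1 minus 1.
where join_date > '2023-01-01'
```

Some dialects also accept or prefer an explicit typed literal such as date '2023-01-01'.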