Recall & Review
beginner
What is the purpose of configuring sources in YAML in dbt?
Configuring sources in YAML in dbt helps define where your raw data lives. It tells dbt which tables or files to use as inputs for your transformations.
beginner
In a dbt source configuration YAML, what key is used to list the tables or files?
The key tables is used to list the tables or files under a source in the YAML configuration.
intermediate
How do you specify the database and schema for a source in dbt YAML?
Set the database and schema keys on the source entry, at the same level as its name, to tell dbt where to find the source data.
beginner
What is the benefit of adding descriptions to sources and tables in YAML?
Adding descriptions helps document your data sources clearly. It makes it easier for anyone reading the project to understand what each source and table represents.
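As a minimal sketch of where descriptions can sit, the fragment below puts one on the source and one on a table. The names raw_data, analytics_db, public, and users are borrowed from the example card in this section; the description wording is invented purely for illustration.

```yaml
sources:
  - name: raw_data
    # Source-level description (illustrative wording)
    description: Raw tables loaded from the application database
    database: analytics_db
    schema: public
    tables:
      - name: users
        # Table-level description (illustrative wording)
        description: One row per registered user
```

Both descriptions are plain strings, so they can be edited without touching any model SQL.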
beginner
Show a simple example of a source configuration in YAML for a source named 'raw_data' with one table 'users'.
Example:
  sources:
    - name: raw_data
      database: analytics_db
      schema: public
      tables:
        - name: users
          description: 'User information table'
Which key in a dbt YAML source config lists the tables?
A. tables
B. columns
C. sources
D. models
The tables key lists the tables under a source.
Where do you specify the schema for a source in dbt YAML?
A. Under the source name, using the schema key
B. Inside each table definition
C. In the dbt_project.yml file
D. In the model SQL files
The schema key is set under the source name to tell dbt where to find the source.
Why add descriptions to sources and tables in YAML?
A. To change table names
B. To speed up data loading
C. To define SQL queries
D. To improve documentation and clarity
Descriptions help document the data sources for better understanding.
What is the top-level key used to define sources in a dbt YAML file?
A. models
B. sources
C. tables
D. schemas
The top-level key sources holds all source definitions.
In dbt, what does a source configuration NOT include?
A. Database and schema location
B. List of tables
C. SQL transformation logic
D. Descriptions
SQL transformation logic is written in model files, not in source YAML configurations.
Explain how to configure a source in dbt using YAML. Include keys you would use and why.
Think about how you tell dbt where your raw data lives and what tables it includes.
Describe the benefits of documenting sources and tables with descriptions in your YAML configuration.
Why is it good to add notes about your data?
Practice
1. What is the main purpose of configuring sources in a dbt YAML file?
easy
A. To write SQL queries for data transformation
B. To tell dbt where to find raw data tables
C. To create dashboards for data visualization
D. To schedule dbt runs automatically
Solution
Step 1: Understand the role of source configuration
Source configuration in dbt YAML files defines where raw data tables are located in the database.
Step 2: Differentiate from other dbt tasks
Writing SQL queries and scheduling runs are done elsewhere, not in source YAML files.
Final Answer:
To tell dbt where to find raw data tables -> Option B
Quick Check:
Source config = raw data location [OK]
Hint: Sources define raw table locations in YAML [OK]
Common Mistakes:
Confusing source config with SQL model code
Thinking sources schedule runs
Assuming sources create visualizations
2. Which of the following is the correct syntax to define a source in a dbt YAML file?
easy
A. source:
     name: raw_data
     table:
       - customers
B. sources:
     name: raw_data
     tables:
       - customers
C. sources:
     - name: raw_data
       tables:
         - name: customers
D. source:
     - raw_data:
         tables:
           - customers
Solution
Step 1: Recall correct YAML source structure
The correct syntax uses 'sources' as a list with 'name' and nested 'tables' list, each with a 'name'.
Step 2: Compare options to syntax
Option C matches the correct indentation and keys exactly:
  sources:
    - name: raw_data
      tables:
        - name: customers
A. 'warn_after' and 'error_after' counts are reversed
B. The indentation under 'freshness' is incorrect
C. The 'error_after' period should be less than 'warn_after'
D. The 'period' values must be singular strings
Solution
Step 1: Understand dbt freshness period syntax
dbt freshness requires singular 'period' values like 'hour', 'day', 'minute'. Plural forms ('hours', 'days') are invalid and cause errors.
Step 2: Check the YAML periods
'period: hours' and 'period: days' use plural, which dbt does not recognize.
Step 3: Rule out other options
Option A is wrong: the counts are logical (warn after 12 hours, error after 1 day, which is 24 hours). Option B is wrong: the indentation is correct. Option C is wrong: the error_after window should be longer than the warn_after window, not shorter.
Final Answer:
The 'period' values must be singular strings -> Option D
Quick Check:
period: hour/day (singular only) [OK]
Hint: dbt freshness periods must be singular (hour, day) [OK]
Common Mistakes:
Using plural periods ('hours', 'days')
Incorrect YAML indentation
Thinking error_after time should be shorter than warn_after
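For reference, a freshness block that follows the rules in this solution might look like the sketch below. The 12-hour and 1-day thresholds come from the explanation above; the source and table names are reused from earlier examples, and the loaded_at_field column name _loaded_at is an assumption, not something taken from the original question.

```yaml
sources:
  - name: raw_data
    database: analytics_db
    schema: public
    # Assumed timestamp column; replace with the column your loader writes.
    loaded_at_field: _loaded_at
    freshness:
      warn_after:
        count: 12
        period: hour   # singular, not "hours"
      error_after:
        count: 1
        period: day    # singular, not "days"
    tables:
      - name: users
```

Depending on the dbt version, these keys may instead be expected under a config block on the source, so check the documentation for the version in use.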
5. You want to add a test to ensure the 'email' column in the 'users' table source is never null. Which YAML snippet correctly adds this test?
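A minimal sketch of one way such a test can be declared is shown below. The source name raw_data is an assumption carried over from earlier examples; only the users table and the email column come from the question itself.

```yaml
sources:
  - name: raw_data        # assumed source name, reused from earlier examples
    tables:
      - name: users
        columns:
          - name: email
            tests:
              - not_null  # built-in dbt generic test
```

Recent dbt releases also accept data_tests as the key name in place of tests; which spelling is preferred depends on the dbt version in use.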