We use YAML to tell dbt where to find our raw data tables. This helps dbt understand and organize the data before we work with it.
Configuring sources in YAML in dbt
Introduction
Define sources in YAML in any of these situations:
When you want to tell dbt about the raw tables in your database.
When you need to document your data sources clearly for your team.
When you want to track freshness or test the quality of your source data.
When you want to reference raw tables in your dbt models through the source() function instead of hard-coded table names.
Syntax
dbt
version: 2
sources:
  - name: source_name
    database: your_database
    schema: your_schema
    tables:
      - name: table_name
        description: 'Description of the table'
        freshness:
          warn_after:
            count: 24
            period: hour
        tests:
          - unique:
              column_name: id
          - not_null:
              column_name: id
The version: 2 line is required in dbt versions before 1.5. Later versions treat it as optional.
Indentation defines the nesting in YAML. Use 2 spaces per level and never tabs.
Examples
This example defines a source named sales_db with one table, customers.
dbt
version: 2
sources:
  - name: sales_db
    database: analytics
    schema: raw
    tables:
      - name: customers
        description: 'Customer details table'
This example adds freshness settings to warn if data is older than 12 hours.
dbt
version: 2
sources:
  - name: marketing_data
    database: marketing
    schema: public
    tables:
      - name: campaigns
        freshness:
          warn_after:
            count: 12
            period: hour
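To calculate freshness, dbt generally needs to know which timestamp column records when a row was loaded. This is set with loaded_at_field, although some adapters can fall back to warehouse metadata. The sketch below extends the campaigns example with that field and an error_after threshold. The column name _loaded_at and the 24-hour error threshold are assumptions for illustration, and newer dbt versions may expect these keys under a config block.

```yaml
version: 2

sources:
  - name: marketing_data
    database: marketing
    schema: public
    tables:
      - name: campaigns
        # Hypothetical timestamp column; use the column your loader populates.
        loaded_at_field: _loaded_at
        freshness:
          # Warn when the newest row is older than 12 hours.
          warn_after:
            count: 12
            period: hour
          # Assumed threshold: report an error when the newest row is older than 24 hours.
          error_after:
            count: 24
            period: hour
```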
This example adds tests to check that the transactions table has unique and non-null values.
dbt
version: 2
sources:
  - name: finance
    database: finance_db
    schema: reports
    tables:
      - name: transactions
        tests:
          - unique:
              column_name: id
          - not_null:
              column_name: id
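The same checks can also be declared on the column itself, which is the form most dbt projects use. This is a minimal sketch of that alternative layout for the finance source above, not a required change. Recent dbt versions also accept data_tests as the key name in place of tests.

```yaml
version: 2

sources:
  - name: finance
    database: finance_db
    schema: reports
    tables:
      - name: transactions
        columns:
          # Tests listed under a column apply to that column,
          # so no column_name argument is needed.
          - name: id
            tests:
              - unique
              - not_null
```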
Sample Program
This YAML config tells dbt about the orders table in the ecommerce source. It includes a description, a freshness check that warns if data is older than 6 hours, and tests that require the id column to be unique and non-null.
dbt
version: 2
sources:
  - name: ecommerce
    database: analytics_db
    schema: raw_data
    tables:
      - name: orders
        description: 'Raw orders data from ecommerce platform'
        freshness:
          warn_after:
            count: 6
            period: hour
        tests:
          - unique:
              column_name: id
          - not_null:
              column_name: id
Important Notes
Always keep your YAML files well-indented to avoid errors.
Use descriptive names and descriptions to help your team understand the data.
Run dbt source freshness to check your freshness rules and dbt test to run the tests on your sources.
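As an illustration of the note on descriptions, a description can be attached to the source, to each table, and to individual columns. The sketch below reuses the ecommerce source. The source-level and column-level description texts are assumptions made for the example.

```yaml
version: 2

sources:
  - name: ecommerce
    # Assumed wording; describe where the data comes from.
    description: 'Raw tables loaded from the ecommerce platform'
    database: analytics_db
    schema: raw_data
    tables:
      - name: orders
        description: 'Raw orders data from ecommerce platform'
        columns:
          - name: id
            # Assumed meaning of the column, for illustration only.
            description: 'Unique identifier for each order'
```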
Summary
YAML files tell dbt where to find raw data tables.
You can add descriptions, freshness rules, and tests to sources.
Proper source configuration helps keep data organized and reliable.