dbt helps you organize and run your data transformations easily. It uses SQL for queries, Jinja to add logic, and YAML to manage settings.
How dbt works (SQL + Jinja + YAML)
Introduction
You want to build clean, reusable data models from raw data.
You need to automate running SQL queries with dynamic parts.
You want to document and test your data models clearly.
You want to manage data sources and model configurations in one place.
You want to track dependencies between data transformations.
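Dependency tracking comes from dbt's ref() function: when one model selects from another through ref(), dbt records the dependency and builds the upstream model first. A minimal sketch, assuming a hypothetical downstream model file named models/active_user_emails.sql and an existing model called users:

```sql
-- models/active_user_emails.sql (hypothetical file name, for illustration only)
-- ref('users') resolves to the relation built by the 'users' model
-- and tells dbt that this model depends on it.
SELECT id, email
FROM {{ ref('users') }}
```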
Syntax
dbt
model.sql   -- SQL code with Jinja templating
models.yml  # YAML file for configuration and documentation
SQL files contain your data transformation queries.
Jinja lets you add variables, loops, and conditions inside SQL.
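Variables and conditions appear in the examples on this page; a loop works in a similar way. A minimal sketch, assuming a hypothetical payments table with order_id, payment_method, and amount columns (none of these come from the original tutorial):

```sql
-- Hypothetical model: one summed column per payment method.
{% set methods = ['card', 'cash', 'transfer'] %}
SELECT
    order_id,
    {% for m in methods %}
    SUM(CASE WHEN payment_method = '{{ m }}' THEN amount ELSE 0 END) AS {{ m }}_amount{% if not loop.last %},{% endif %}
    {% endfor %}
FROM payments
GROUP BY order_id
```

The loop.last check leaves the comma off the final generated column so the rendered SQL stays valid.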
Examples
This SQL uses Jinja to read from the 'users' table of a source named 'raw'.
dbt
-- model.sql
SELECT * FROM {{ source('raw', 'users') }} WHERE active = true
Here, Jinja sets a variable 'cutoff' and uses it inside the SQL query.
dbt
{% set cutoff = '2023-01-01' %}
SELECT * FROM sales WHERE sale_date >= '{{ cutoff }}'
This YAML documents the 'users' model and its columns.
dbt
# models.yml
version: 2
models:
  - name: users
    description: 'User details table'
    columns:
      - name: id
        description: 'User ID'
      - name: active
        description: 'If user is active'
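The same YAML file can also declare tests on columns. A sketch, assuming the built-in unique and not_null generic tests suit the id column; recent dbt versions also accept the key data_tests in place of tests:

```yaml
# models.yml
version: 2
models:
  - name: users
    columns:
      - name: id
        description: 'User ID'
        tests:          # assumed choice of tests, for illustration
          - unique
          - not_null
```

Running dbt test then checks these conditions against the built model.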
Sample Program
This example shows a dbt model SQL file using Jinja to filter active users. The YAML file documents the model and its columns.
dbt
-- models/users.sql
{% set active_only = true %}
SELECT id, name, email
FROM {{ source('raw', 'users') }}
{% if active_only %} WHERE active = true {% endif %}
# models.yml
version: 2
models:
  - name: users
    description: 'Filtered active users'
    columns:
      - name: id
        description: 'User ID'
      - name: name
        description: 'User name'
      - name: email
        description: 'User email address'
Output
Success
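The model above calls source('raw', 'users'), which dbt resolves through a source declared in YAML. A minimal sketch of such a declaration, assuming the file name models/sources.yml and that the source name raw is also the schema name (both are assumptions, not part of the original example):

```yaml
# models/sources.yml (hypothetical file name)
version: 2
sources:
  - name: raw          # first argument of source()
    schema: raw        # assumed; dbt falls back to the source name if omitted
    tables:
      - name: users    # second argument of source()
```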
Important Notes
dbt renders the Jinja first, producing plain SQL, and only then sends that compiled SQL to your data warehouse.
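YAML files hold documentation, test definitions, and configuration. They contain no SQL themselves, although dbt turns the tests declared there into queries when you run dbt test.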
Using Jinja makes your SQL flexible and reusable.
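To make the first note concrete, here is roughly what dbt would send to the warehouse for the sample users model once the Jinja has been rendered. This is only a sketch: the exact relation name and quoting depend on your source definition and adapter, and raw.users is assumed here.

```sql
-- Approximate compiled SQL for models/users.sql with active_only = true
-- (whitespace tidied; raw.users is an assumed relation name)
SELECT id, name, email
FROM raw.users
WHERE active = true
```

You can inspect the actual rendered SQL for your project with dbt compile, which writes it under the target/ directory.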
Summary
dbt combines SQL, Jinja, and YAML to build, run, and document data models.
SQL writes the queries, Jinja adds logic, and YAML manages configuration and documentation.
This makes data transformation easier, clearer, and more maintainable.