How dbt works (SQL + Jinja + YAML)

Introduction

dbt helps you organize and run data transformations inside your data warehouse. You write the queries in SQL, add logic with Jinja templating, and keep configuration, documentation, and tests in YAML.

You want to build clean, reusable data models from raw data.
You need to automate running SQL queries with dynamic parts.
You want to document and test your data models clearly.
You want to manage data sources and model configurations in one place.
You want to track dependencies between data transformations.
Syntax
dbt
model.sql
-- SQL code with Jinja templating

models.yml
# YAML file for configuration and documentation

Each SQL file defines one model, written as a single SELECT statement that holds your transformation query.

Jinja lets you add variables, loops, and conditions inside SQL.
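
Variables and conditions appear in the examples that follow; a loop is easiest to see in a small sketch. The model below is a hypothetical illustration: the file name, the 'payments' table in the 'raw' source, and the column names are assumptions, not part of this tutorial's project.

```sql
-- models/payments_by_method.sql (hypothetical model)
{% set payment_methods = ['card', 'bank_transfer', 'gift_card'] %}

SELECT
    order_id,
    {# One SUM column is generated per payment method #}
    {% for method in payment_methods %}
    SUM(CASE WHEN payment_method = '{{ method }}' THEN amount ELSE 0 END) AS {{ method }}_amount{% if not loop.last %},{% endif %}
    {% endfor %}
FROM {{ source('raw', 'payments') }}
GROUP BY order_id
```

After rendering, the loop disappears and the query contains three plain SUM(...) columns. Adding a method to the list adds a column without touching the rest of the SQL.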

Examples
This SQL uses the dbt source() function inside Jinja to read the 'users' table from a source named 'raw'. The source must be declared in a YAML file, which maps it to an actual schema.
dbt
-- model.sql
SELECT * FROM {{ source('raw', 'users') }} WHERE active = true
Here, Jinja sets a variable 'cutoff' and uses it inside the SQL query.
dbt
{% set cutoff = '2023-01-01' %}
SELECT * FROM sales WHERE sale_date >= '{{ cutoff }}'
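A value set with {% set %} is fixed in the file. As a hedged variation, dbt's var() function reads a project variable and accepts a default as its second argument, so the cutoff could be supplied at run time instead. The variable name 'cutoff' is carried over from the example above; the 'sales' table is the same one used there.

```sql
-- Falls back to '2023-01-01' when no cutoff variable is supplied
SELECT * FROM sales WHERE sale_date >= '{{ var("cutoff", "2023-01-01") }}'
```

Running `dbt run --vars '{"cutoff": "2024-01-01"}'` would then override the default for that run.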
This YAML documents the 'users' model and its columns.
dbt
# models.yml
version: 2
models:
  - name: users
    description: 'User details table'
    columns:
      - name: id
        description: 'User ID'
      - name: active
        description: 'If user is active'
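The examples above read from a source or a hard-coded table. To track dependencies between models, dbt provides ref(), which points one model at another and lets dbt work out the run order. The sketch below assumes the 'users' model documented in the YAML above exists, and invents a downstream model name, 'active_user_ids', purely for illustration.

```sql
-- models/active_user_ids.sql (hypothetical downstream model)
SELECT id
FROM {{ ref('users') }}
WHERE active = true
```

Because this file reaches 'users' through ref(), dbt builds 'users' first and then this model.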
Sample Program

This example shows a dbt model SQL file that uses Jinja to filter active users, followed by a separate YAML file that documents the model and its columns.

dbt
-- models/users.sql
{% set active_only = true %}
SELECT id, name, email
FROM {{ source('raw', 'users') }}
{% if active_only %} WHERE active = true {% endif %}

# models.yml (separate file)
version: 2
models:
  - name: users
    description: 'Filtered active users'
    columns:
      - name: id
        description: 'User ID'
      - name: name
        description: 'User name'
      - name: email
        description: 'User email address'
Output
Success
Important Notes

dbt renders the Jinja first to produce plain SQL, then runs that SQL against your data warehouse.

YAML files hold documentation, tests, and configuration. They contain no SQL themselves, although dbt turns the tests declared in them into queries that it runs.

Using Jinja makes your SQL flexible and reusable.
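
To make the first note concrete, here is roughly what the sample model models/users.sql would look like after dbt renders the Jinja. This is a sketch under an assumption: the 'raw' source is taken to resolve to a schema called 'raw', whereas the real relation name depends on how the source is declared and on the warehouse adapter's quoting.

```sql
-- Approximate compiled output of models/users.sql (whitespace tidied)
SELECT id, name, email
FROM raw.users
WHERE active = true
```

The {% set %} and {% if %} tags leave no trace in the compiled query; only the SQL they selected remains. Running `dbt compile` writes this rendered SQL under the project's target/ directory, so it can be inspected before anything runs.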

Summary

dbt combines SQL, Jinja, and YAML to build, run, and document data models.

SQL expresses the queries, Jinja adds logic, and YAML holds configuration, documentation, and tests.

This makes data transformation easier, clearer, and more maintainable.