How dbt works (SQL + Jinja + YAML) - Visual Walkthrough

Concept Flow - How dbt works (SQL + Jinja + YAML)
Start: User writes YAML config
User writes SQL model with Jinja
dbt parses YAML for metadata
dbt renders SQL by processing Jinja
dbt compiles final SQL query
dbt runs SQL on database
Results stored as tables/views
User queries transformed data
dbt uses YAML for config, SQL with Jinja for models, compiles SQL, runs it on the database, and stores results.
Execution Sample
dbt
version: 2
models:
  - name: customers
    description: 'Customer data'

-- customers.sql
select * from raw.customers where active = true
This example pairs a YAML config for the customers model with a SQL file that selects active customers. The SQL contains no Jinja, so rendering leaves it unchanged.
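For contrast, here is a minimal sketch of the same model with one Jinja expression in it. It assumes that a source named raw with a table customers has been declared in one of the project's YAML files; that declaration is an assumption and is not part of the sample above. source() is dbt's built-in function for referring to such tables.

```sql
-- customers.sql (sketch; assumes a source 'raw' with table 'customers'
-- is declared in one of the project's YAML files)
select *
from {{ source('raw', 'customers') }}
where active = true
```

When dbt renders this model, the expression is replaced by the table's full name, typically something like raw.customers. The exact database prefix and quoting depend on the adapter and on the source definition.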
Execution Table
Step | Action | Input | Output/Result
1 | Read YAML config | version: 2 models: - name: customers description: 'Customer data' | Metadata loaded: model 'customers' with description
2 | Read SQL model with Jinja | select * from raw.customers where active = true | SQL template loaded
3 | Render Jinja in SQL | SQL template | Final SQL: select * from raw.customers where active = true
4 | Compile SQL | Final SQL | Compiled SQL ready for execution
5 | Run SQL on database | Compiled SQL | Data extracted: active customers
6 | Store results | Query results | Table/view 'customers' created/updated
7 | User queries transformed data | Table/view 'customers' | User gets filtered active customer data
💡 All steps complete, transformed data ready for use
Variable Tracker
Variable | Start | After Step 1 | After Step 3 | After Step 5 | Final
YAML config | empty | Loaded with model metadata | Loaded | Loaded | Loaded
SQL template | empty | Loaded | Rendered to final SQL | Rendered | Rendered
Compiled SQL | empty | empty | Compiled SQL ready | Executed SQL | Executed SQL
Database table/view | none | none | none | Created/updated | Available for queries
Key Moments - 3 Insights
Why does dbt use YAML files alongside SQL?
YAML files hold metadata and configuration like model names and descriptions, which dbt reads first (see execution_table step 1). This separates config from SQL logic.
What happens when dbt processes Jinja in SQL?
dbt replaces Jinja placeholders with actual values or logic before running SQL (see execution_table step 3). This lets you write dynamic SQL.
How does dbt store the results of SQL execution?
After running the compiled SQL on the database (step 5), dbt saves the results as tables or views (step 6) so you can query transformed data easily.
Visual Quiz - 3 Questions
Test your understanding
According to the execution table, what is the output after step 3?
A. Final SQL with Jinja placeholders
B. Raw YAML config loaded
C. Final SQL with Jinja rendered
D. Data stored in database
💡 Hint
Check the 'Output/Result' column for step 3 in the execution_table
At which step does dbt run the SQL query on the database?
A. Step 4
B. Step 5
C. Step 2
D. Step 6
💡 Hint
Look for the step mentioning 'Run SQL on database' in the execution_table
If the project's required YAML file (dbt_project.yml) is missing, what will happen in the execution flow?
A. dbt will not load model metadata, causing errors before SQL runs
B. dbt will run SQL without any changes
C. dbt will skip SQL rendering
D. dbt will store results without running SQL
💡 Hint
Refer to step 1 in execution_table where YAML config is loaded
Concept Snapshot
dbt workflow:
1. Write YAML for model config
2. Write SQL with Jinja templates
3. dbt reads YAML metadata
4. dbt renders SQL by processing Jinja
5. dbt compiles and runs SQL on DB
6. Results saved as tables/views
7. Query transformed data easily
Full Transcript
dbt works by combining YAML, SQL, and Jinja. First, you write YAML files to configure your models, giving names and descriptions. Then, you write SQL files that can include Jinja templates for dynamic parts. dbt reads the YAML to get metadata, then processes the SQL files by rendering the Jinja templates into final SQL queries. It compiles these queries and runs them on your database. The results are stored as tables or views. Finally, you can query these transformed tables easily. This flow separates configuration, logic, and execution cleanly.

Practice

1. What is the main role of Jinja in dbt projects?
easy
A. To add logic and dynamic behavior to SQL queries
B. To write raw SQL queries without any modification
C. To manage configuration and documentation files
D. To execute the SQL queries on the database

Solution

  1. Step 1: Understand Jinja's purpose in dbt

    Jinja is a templating language that allows adding logic like loops and conditions inside SQL files.
  2. Step 2: Differentiate roles of SQL, Jinja, and YAML

    SQL writes queries, YAML manages configs/docs, and Jinja adds dynamic logic to SQL.
  3. Final Answer:

    To add logic and dynamic behavior to SQL queries -> Option A
  4. Quick Check:

    Jinja = logic in SQL [OK]
Hint: Jinja = logic inside SQL, YAML = configs/docs [OK]
Common Mistakes:
  • Confusing Jinja with YAML for configs
  • Thinking Jinja executes SQL queries
  • Assuming Jinja writes raw SQL without changes
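To illustrate the loops mentioned in the solution above, the sketch below uses a Jinja for loop to generate one aggregate column per value. The status column and its three values are hypothetical and were chosen only for illustration; raw.customers is borrowed from the walkthrough sample.

```sql
{% set statuses = ['active', 'paused', 'closed'] %}

select
    {% for s in statuses %}
    sum(case when status = '{{ s }}' then 1 else 0 end) as {{ s }}_count{% if not loop.last %},{% endif %}
    {% endfor %}
from raw.customers
```

After rendering, the database sees three plain sum(...) columns (active_count, paused_count, closed_count). The loop.last check keeps a trailing comma off the final column.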
2. Which of the following is the correct way to use a Jinja variable inside a dbt SQL model?
easy
A. SELECT * FROM var('table_name')
B. SELECT * FROM {{ var('table_name') }}
C. SELECT * FROM {% var('table_name') %}
D. SELECT * FROM [[ var('table_name') ]]

Solution

  1. Step 1: Recall Jinja syntax for variables

    Jinja variables are inserted using double curly braces {{ }} around expressions.
  2. Step 2: Identify correct syntax for var function

    The correct syntax is {{ var('variable_name') }} to access a variable in dbt.
  3. Final Answer:

    SELECT * FROM {{ var('table_name') }} -> Option B
  4. Quick Check:

    Jinja variables use {{ }} [OK]
Hint: Use {{ var('name') }} to insert variables in SQL [OK]
Common Mistakes:
  • Using single curly braces or wrong brackets
  • Confusing Jinja tags {% %} with variable insertion {{ }}
  • Using square brackets instead of curly braces
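A minimal sketch of the same idea with a fallback value: var() accepts an optional second argument that is used when the variable is not defined. The variable name table_name comes from the question; the fallback raw.customers is an assumption chosen for illustration.

```sql
-- Hypothetical model; 'table_name' is an assumed project variable.
-- The second argument is the default used when the variable is not set.
select *
from {{ var('table_name', 'raw.customers') }}
```

The variable can be set under vars: in dbt_project.yml or overridden at run time with the --vars command-line option. For tables that dbt itself builds or tracks, ref() or source() is usually the better choice because it records the dependency.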
3. Given this dbt model SQL code, what will be the output SQL after rendering?
SELECT
  user_id,
  {% if var('include_email', false) %}
    email,
  {% endif %}
  created_at
FROM users

Assuming the variable include_email is set to true in dbt_project.yml.
medium
A. SELECT user_id, true, created_at FROM users
B. SELECT user_id, created_at FROM users
C. Syntax error due to misplaced Jinja
D. SELECT user_id, email, created_at FROM users

Solution

  1. Step 1: Check the value of the variable include_email

    The variable include_email is true, so the if condition passes and the email column is included.
  2. Step 2: Render the SQL with the if block included

    The SQL will have user_id, email, and created_at columns selected from users.
  3. Final Answer:

    SELECT user_id, email, created_at FROM users -> Option D
  4. Quick Check:

    include_email true means email included [OK]
Hint: If var true, include block inside {% if %} [OK]
Common Mistakes:
  • Ignoring the variable value and excluding email
  • Thinking Jinja syntax causes SQL errors
  • Confusing variable default values
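Rendered as written, the model in question 3 leaves blank lines where the Jinja tags were. The variant below is a sketch of Jinja's whitespace control: a minus sign placed just inside the opening tag delimiter strips the whitespace before that tag. The model is otherwise unchanged.

```sql
select
  user_id,
  {%- if var('include_email', false) %}
  email,
  {%- endif %}
  created_at
from users
```

The selected columns are the same as before; only the layout of the compiled SQL is tidier, which helps when reading compiled files while debugging.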
4. You wrote this YAML config in your dbt project:
models:
  my_project:
    +materialized: table
      users:
        +tags: ['important']

Why does dbt raise an error when running?
medium
A. Because the indentation for 'users' is incorrect under 'my_project'
B. Because '+materialized' cannot be set in YAML
C. Because tags must be a string, not a list
D. Because 'models' key is missing

Solution

  1. Step 1: Check YAML indentation rules for dbt configs

    In dbt, model configs under a project must be indented properly; 'users' should be at the same level as '+materialized'.
  2. Step 2: Identify the indentation error

    'users' is indented deeper than '+materialized', which already holds the value 'table', so YAML cannot read 'users' as a sibling key under 'my_project' and the config is invalid.
  3. Final Answer:

    Because the indentation for 'users' is incorrect under 'my_project' -> Option A
  4. Quick Check:

    YAML indentation matters for nested configs [OK]
Hint: Check YAML indentation carefully for nested configs [OK]
Common Mistakes:
  • Ignoring YAML indentation importance
  • Thinking '+materialized' is invalid syntax
  • Assuming tags cannot be lists
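dbt also lets a model set its own configuration inside the SQL file with the config() macro, which avoids nested YAML for one-off settings. A minimal sketch, assuming a hypothetical model file users.sql and a hypothetical upstream table raw.users:

```sql
-- users.sql (hypothetical model file)
-- Sets the same two options as the YAML in the question, for this model only.
{{ config(materialized='table', tags=['important']) }}

select *
from raw.users
```

Settings shared by many models are usually still easier to maintain in dbt_project.yml; the in-file form suits options that apply to a single model.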
5. You want to create a dbt model that selects only active users from a table, but the 'active' flag is stored in a YAML config. Which approach correctly combines SQL, Jinja, and YAML to achieve this?
hard
A. Use Jinja to read YAML directly inside SQL without defining variables
B. Write WHERE active = true directly in SQL without YAML or Jinja
C. Define 'active_flag: true' in YAML, then use WHERE active = {{ var('active_flag') }} in SQL with Jinja
D. Set 'active_flag' in YAML but forget to use Jinja in SQL, so filter is missing

Solution

  1. Step 1: Store the filter value in YAML as a variable

    Define 'active_flag: true' under the vars: key in dbt_project.yml so that var() can read it.
  2. Step 2: Use Jinja to insert the variable in SQL WHERE clause

    Use WHERE active = {{ var('active_flag') }} so the SQL filters active users dynamically.
  3. Final Answer:

    Define 'active_flag: true' in YAML, then use WHERE active = {{ var('active_flag') }} in SQL with Jinja -> Option C
  4. Quick Check:

    YAML vars + Jinja in SQL = dynamic filters [OK]
Hint: Use YAML vars + Jinja {{ var() }} in SQL WHERE [OK]
Common Mistakes:
  • Hardcoding filter in SQL ignoring YAML
  • Not using Jinja to insert YAML vars
  • Trying to read YAML directly in SQL without var()
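Option C can be sketched end to end as follows. The table raw.users is hypothetical, and the YAML is shown as a SQL comment so the example stays in one file.

```sql
-- Assumes dbt_project.yml contains:
--   vars:
--     active_flag: true
select *
from raw.users
where active = {{ var('active_flag') }}
```

The YAML boolean is rendered as the text True in the compiled SQL. Most SQL dialects accept that spelling because keywords are case-insensitive, but it is worth checking the compiled output on your own database.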