0
0
dbtdata~15 mins

For loops for dynamic SQL in dbt - Deep Dive

Choose your learning style9 modes available
Overview - For loops for dynamic SQL
What is it?
For loops for dynamic SQL in dbt allow you to write SQL code that repeats or changes based on a list or set of values. Instead of writing the same SQL many times, you use a loop to generate parts of the query automatically. This helps create flexible and reusable SQL models that adapt to different data or conditions.
Why it matters
Without for loops, you would have to write repetitive SQL code manually, which is slow and error-prone. For loops save time and reduce mistakes by automating repetitive tasks. They make your data models easier to maintain and update, especially when working with many similar tables or columns.
Where it fits
Before learning for loops in dbt, you should understand basic SQL and how dbt models work. After mastering for loops, you can explore more advanced dbt features like macros, Jinja templating, and dynamic model generation.
Mental Model
Core Idea
A for loop in dynamic SQL repeats or creates SQL code blocks automatically by cycling through a list of items.
Think of it like...
Using a for loop in SQL is like using a cookie cutter to make many cookies of the same shape instead of shaping each cookie by hand.
┌───────────────┐
│   For Loop    │
├───────────────┤
│ List of items │
│ (e.g., tables)│
└──────┬────────┘
       │
       ▼
┌─────────────────────────────┐
│ Repeat SQL block for each    │
│ item, replacing placeholders │
│ with current item            │
└─────────────────────────────┘
       │
       ▼
┌─────────────────────────────┐
│ Combined SQL with all parts  │
└─────────────────────────────┘
Build-Up - 7 Steps
1
FoundationUnderstanding basic SQL repetition
🤔
Concept: Learn how repeating similar SQL code manually works and why it is inefficient.
Imagine you want to select data from three similar tables: sales_2019, sales_2020, sales_2021. Without loops, you write three separate SELECT statements and combine them with UNION ALL. This is repetitive and hard to maintain if you add more years.
Result
You get a long SQL query with repeated code blocks for each table.
Understanding manual repetition shows why automating with loops saves time and reduces errors.
2
FoundationIntroduction to Jinja templating in dbt
🤔
Concept: Learn how dbt uses Jinja to add programming features like loops inside SQL files.
dbt lets you write SQL files with Jinja code inside {% %} blocks. For example, {% for year in [2019, 2020, 2021] %} lets you loop over years. Inside the loop, you can write SQL that changes based on the current year.
Result
You can write one SQL template that generates multiple similar queries automatically.
Knowing Jinja templating is key to using for loops in dbt for dynamic SQL generation.
3
IntermediateWriting a simple for loop for dynamic SQL
🤔Before reading on: do you think a for loop can generate multiple SELECT statements combined with UNION ALL automatically? Commit to your answer.
Concept: Use a for loop in Jinja to create repeated SQL blocks and combine them.
Example: {% raw %} {% set years = [2019, 2020, 2021] %} {% for year in years %} SELECT * FROM sales_{{ year }} {% if not loop.last %}UNION ALL{% endif %} {% endfor %} {% endraw %} This code loops over years, creates a SELECT for each sales table, and adds UNION ALL except after the last one.
Result
The generated SQL selects from all sales tables combined with UNION ALL.
Using loop control like 'loop.last' helps build correct SQL syntax dynamically.
4
IntermediateUsing for loops with variables and filters
🤔Before reading on: can you filter a list inside a for loop to skip some items? Commit to your answer.
Concept: Add logic inside loops to include or exclude items dynamically.
You can filter lists before looping or use if conditions inside loops. For example: {% raw %} {% set years = [2019, 2020, 2021, 2022] %} {% for year in years if year != 2020 %} SELECT * FROM sales_{{ year }} {% if not loop.last %}UNION ALL{% endif %} {% endfor %} {% endraw %} This skips the year 2020 table.
Result
The SQL excludes sales_2020 but includes others.
Filtering inside loops lets you customize dynamic SQL for complex conditions.
5
IntermediateLooping over complex data structures
🤔Before reading on: do you think you can loop over a list of dictionaries to generate SQL? Commit to your answer.
Concept: Use loops to iterate over lists of dictionaries or objects to create richer SQL code.
Example: {% raw %} {% set tables = [ {'name': 'sales_2019', 'region': 'north'}, {'name': 'sales_2020', 'region': 'south'} ] %} {% for table in tables %} SELECT * FROM {{ table.name }} WHERE region = '{{ table.region }}' {% if not loop.last %}UNION ALL{% endif %} {% endfor %} {% endraw %} This generates SQL selecting from each table with a region filter.
Result
SQL queries are customized per table with correct filters.
Looping over complex data allows dynamic SQL to adapt to multiple parameters.
6
AdvancedCombining for loops with macros for reuse
🤔Before reading on: can you use a macro with a for loop inside to generate SQL dynamically? Commit to your answer.
Concept: Create reusable macros that include for loops to generate dynamic SQL in multiple places.
Define a macro in dbt: {% raw %} {% macro union_sales_tables(years) %} {% for year in years %} SELECT * FROM sales_{{ year }} {% if not loop.last %}UNION ALL{% endif %} {% endfor %} {% endmacro %} -- Use macro in model: {{ union_sales_tables([2019, 2020, 2021]) }} {% endraw %} This lets you reuse the loop logic anywhere.
Result
Dynamic SQL generation is modular and maintainable.
Combining loops with macros scales dynamic SQL across projects.
7
ExpertHandling edge cases and performance in loops
🤔Before reading on: do you think generating very large SQL with loops can cause performance or readability issues? Commit to your answer.
Concept: Understand limits of dynamic SQL loops and how to optimize or avoid pitfalls.
Generating huge SQL with many loop iterations can slow down compilation and make debugging hard. Use filters to limit items, break queries into smaller models, or use incremental models. Also, watch for SQL syntax errors from missing separators or wrong loop logic.
Result
Better performing, readable, and maintainable dynamic SQL code.
Knowing loop limits prevents production issues and improves dbt project health.
Under the Hood
dbt uses the Jinja templating engine to process SQL files before sending them to the database. When it encounters a for loop, it runs the loop in Python, generating repeated SQL code by replacing variables with current loop values. This generated SQL is then compiled into a final query that the database executes.
Why designed this way?
dbt separates templating from SQL execution to keep SQL clean and database-agnostic. Using Jinja for loops allows flexible code generation without changing SQL syntax. This design lets users write DRY (Don't Repeat Yourself) SQL and handle complex logic outside the database.
┌───────────────┐
│ dbt Model SQL │
│ with Jinja    │
└──────┬────────┘
       │
       ▼
┌─────────────────────┐
│ Jinja Template Engine│
│ - Runs for loops     │
│ - Replaces variables │
└──────┬──────────────┘
       │
       ▼
┌─────────────────────┐
│ Generated SQL Query  │
│ (no Jinja code)     │
└──────┬──────────────┘
       │
       ▼
┌─────────────────────┐
│ Database Executes SQL│
└─────────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does a for loop in dbt run inside the database or during compilation? Commit to your answer.
Common Belief:For loops in dbt run inside the database engine as part of SQL execution.
Tap to reveal reality
Reality:For loops run during dbt's compilation step using Jinja in Python, generating SQL before sending it to the database.
Why it matters:Thinking loops run in the database leads to confusion about performance and debugging, since errors happen at compile time, not runtime.
Quick: Can you use any Python code inside dbt for loops? Commit to your answer.
Common Belief:You can write full Python code inside dbt for loops to manipulate data.
Tap to reveal reality
Reality:dbt uses Jinja templating, which supports limited Python-like syntax but not full Python code or libraries.
Why it matters:Expecting full Python causes errors and frustration; understanding Jinja limits helps write valid dynamic SQL.
Quick: Does using for loops always make SQL faster? Commit to your answer.
Common Belief:Using for loops in dbt always improves SQL query performance.
Tap to reveal reality
Reality:For loops only affect SQL generation, not query execution speed. Poorly generated SQL can even hurt performance.
Why it matters:Assuming loops improve runtime leads to ignoring query optimization and database tuning.
Quick: Can you safely omit separators like UNION ALL inside for loops? Commit to your answer.
Common Belief:You can omit separators between repeated SQL blocks in loops without issues.
Tap to reveal reality
Reality:Omitting separators causes invalid SQL syntax and runtime errors.
Why it matters:Missing separators is a common source of bugs in dynamic SQL generation.
Expert Zone
1
Loop variables like 'loop.index' and 'loop.last' provide fine control over SQL formatting and prevent syntax errors.
2
Using macros with for loops enables parameterized, reusable SQL templates that scale across large dbt projects.
3
Filtering or slicing lists before looping avoids generating overly large SQL and improves compilation speed.
When NOT to use
Avoid for loops when the dynamic SQL logic is simple or static; writing explicit SQL is clearer. For very complex logic, consider using database views, stored procedures, or external ETL tools instead of heavy templating.
Production Patterns
Professionals use for loops inside macros to generate union queries across many tables or partitions. They combine loops with incremental models to process data in chunks. Also, loops help build dynamic column lists for pivot or aggregation queries.
Connections
Programming loops (for/while)
For loops in dbt are a templating version of programming loops that repeat code blocks.
Understanding programming loops helps grasp how dbt repeats SQL code dynamically.
Template engines in web development
dbt's use of Jinja for loops is similar to how web templates generate repeated HTML elements.
Knowing web templating clarifies how dynamic SQL is generated before execution.
Manufacturing assembly lines
For loops automate repetitive tasks like assembly lines produce many identical products efficiently.
Seeing loops as automation tools highlights their role in reducing manual work and errors.
Common Pitfalls
#1Missing UNION ALL between repeated SELECT statements.
Wrong approach:{% raw %} {% for year in [2019, 2020, 2021] %} SELECT * FROM sales_{{ year }} {% endfor %} {% endraw %}
Correct approach:{% raw %} {% for year in [2019, 2020, 2021] %} SELECT * FROM sales_{{ year }} {% if not loop.last %}UNION ALL{% endif %} {% endfor %} {% endraw %}
Root cause:Not adding separators causes invalid SQL; forgetting to check loop position leads to syntax errors.
#2Trying to run Python code inside dbt SQL files directly.
Wrong approach:{% raw %} {% for i in range(5): %} SELECT {{ i }} {% endfor %} {% endraw %}
Correct approach:{% raw %} {% for i in range(5) %} SELECT {{ i }} {% endfor %} {% endraw %}
Root cause:Jinja syntax differs from Python; using Python syntax causes template errors.
#3Generating very large SQL with many loop iterations without filtering.
Wrong approach:{% raw %} {% for i in range(1000) %} SELECT * FROM sales_{{ i }} UNION ALL {% endfor %} {% endraw %}
Correct approach:{% raw %} {% set years = range(2010, 2021) %} {% for year in years if year % 2 == 0 %} SELECT * FROM sales_{{ year }} {% if not loop.last %}UNION ALL{% endif %} {% endfor %} {% endraw %}
Root cause:Not limiting loop size causes slow compilation and hard-to-debug SQL.
Key Takeaways
For loops in dbt use Jinja templating to generate repeated SQL code automatically.
They save time and reduce errors by avoiding manual repetition of similar SQL blocks.
Loop controls like 'loop.last' are essential to produce valid SQL syntax with separators.
Combining loops with macros creates reusable and scalable dynamic SQL templates.
Understanding the compilation process helps avoid common mistakes and performance issues.