ref() function for model dependencies in dbt - Deep Dive

Overview - ref() function for model dependencies

What is it?

The ref() function in dbt is a way to tell your project that one model depends on another. It creates a link between models so dbt knows the order to run them. This helps build complex data pipelines by managing dependencies automatically. It also makes your code easier to read and maintain.

Why it matters

Without ref(), you would have to manage the run order of your models by hand and write full table names everywhere, which is error-prone and hard to update. ref() solves this by tracking dependencies and generating the correct SQL references. dbt then builds models in the right order, and downstream models pick up changes to an upstream model the next time they are built.

Where it fits

Before learning ref(), you should understand basic SQL and how dbt models work. After mastering ref(), you can learn about advanced dbt features like macros, snapshots, and testing. ref() is a foundational concept that connects your models and enables dbt's powerful dependency management.

Mental Model

Core Idea

ref() is a function that links one dbt model to another, telling dbt which models depend on each other and in what order to run them.

Think of it like...

Imagine building a LEGO castle where each piece must be placed in a specific order. ref() is like the instruction manual that tells you which piece to put next so the castle stands strong.

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  model_a    │────▶│  model_b    │────▶│  model_c    │
└─────────────┘     └─────────────┘     └─────────────┘
       ▲                  ▲                  ▲
       │                  │                  │
    ref('model_a')     ref('model_b')     ref('model_c')

Build-Up - 6 Steps

Foundation: Understanding dbt Models

Concept: Learn what a dbt model is and how it represents a SQL query that creates a table or view.

In dbt, a model is a SQL file that defines a dataset. When you run dbt, it runs these SQL queries to build tables or views in your database. Models are the building blocks of your data pipeline.

Result

You can create simple tables or views by writing SQL files in your dbt project.

Knowing what a model is helps you understand what ref() will link together.

Foundation: Basic SQL References in dbt

Intermediate: Using ref() to Link Models

Intermediate: How ref() Manages Model Dependencies

Advanced: ref() and Environment Awareness

Expert: ref() in Complex Dependency Graphs

Under the Hood

ref() is a function available in dbt's Jinja context. During compilation, dbt replaces each ref() call with the fully qualified name of the referenced model's table or view. dbt also builds a dependency graph by parsing all ref() calls across models, then orders model execution accordingly. The graph must be a directed acyclic graph (DAG): if dbt finds a circular dependency, it reports an error instead of running. Because ref() is resolved against the active target, the same model compiles to different schema and database names in different environments.
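To make the compile-time substitution concrete, here is a minimal Python sketch. It is a toy model, not dbt's actual implementation: the TARGETS dictionary, the compile_sql helper, and the schema names dbt_alice and marts are assumptions made up for illustration, and a regular expression stands in for dbt's Jinja rendering.

```python
import re

# Hypothetical target settings (assumption); in a real project these
# would come from the connection profile for each environment.
TARGETS = {
    "dev": {"database": "analytics", "schema": "dbt_alice"},
    "prod": {"database": "analytics", "schema": "marts"},
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def compile_sql(raw_sql, target_name):
    """Replace every ref() call with a relation name for the given target."""
    target = TARGETS[target_name]

    def to_relation(match):
        model_name = match.group(1)
        return f'{target["database"]}.{target["schema"]}.{model_name}'

    return REF_PATTERN.sub(to_relation, raw_sql)


raw = "select * from {{ ref('sales_data') }}"
dev_sql = compile_sql(raw, "dev")
prod_sql = compile_sql(raw, "prod")
print(dev_sql)   # select * from analytics.dbt_alice.sales_data
print(prod_sql)  # select * from analytics.marts.sales_data
```

The model file never changes between environments; only the target does. Nothing in this sketch runs the referenced model's query, which mirrors the point that ref() yields a name rather than data.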

Why designed this way?

ref() was designed to solve the problem of managing complex dependencies in SQL-based data pipelines. Hardcoding table names was error-prone and inflexible. By using ref(), dbt can automate dependency tracking, environment management, and model ordering, making pipelines more reliable and easier to maintain.

┌───────────────┐
│  Model Files  │
└──────┬────────┘
       │ parse ref() calls
       ▼
┌─────────────────────┐
│ Dependency Graph    │
│ (DAG of models)     │
└──────┬──────────────┘
       │ topological sort
       ▼
┌─────────────────────┐
│ Ordered Model Runs  │
└──────┬──────────────┘
       │ during compilation
       ▼
┌─────────────────────┐
│ SQL with replaced   │
│ ref() calls         │
└─────────────────────┘
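The first steps of the flow in the diagram (parse ref() calls, build the graph, sort it) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the MODELS dictionary stands in for model files, the SQL bodies are invented, and a regular expression replaces dbt's real parser. The topological sort uses graphlib from the Python standard library (Python 3.9 or later).

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Stand-in for model files: model name -> raw SQL (contents are made up).
MODELS = {
    "model_a": "select 1 as id",
    "model_b": "select id from {{ ref('model_a') }}",
    "model_c": "select id from {{ ref('model_b') }}",
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def build_graph(models):
    """Map each model to the set of models it references via ref()."""
    return {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}


graph = build_graph(MODELS)

# static_order() yields every node after all of its dependencies.
run_order = list(TopologicalSorter(graph).static_order())
print(run_order)  # ['model_a', 'model_b', 'model_c']
```

The order comes entirely from the ref() calls inside the SQL; nothing else declares that model_b must wait for model_a.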

Myth Busters - 3 Common Misconceptions

Quick: Does ref() execute the referenced model's SQL immediately when called? Commit to yes or no.

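Common Belief: ref() runs the referenced model's SQL query immediately and returns its data.

Reality: ref() only resolves to the referenced model's table name during compilation. It does not execute any SQL; the referenced model is built separately, earlier in the run order.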

Quick: Can you use ref() to reference models outside your current dbt project? Commit to yes or no.

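Common Belief: ref() can reference any table in the database, even if it's not part of the dbt project.

Reality: ref() only resolves models that dbt knows about. Tables and views that are not managed by dbt are referenced with source() instead.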

Quick: Does ref() automatically handle schema changes in referenced models? Commit to yes or no.

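Common Belief: ref() automatically updates your SQL if the schema or columns of the referenced model change.

Reality: ref() only resolves the name of the referenced table or view. If a referenced model renames or drops a column, downstream SQL that uses that column still has to be updated by hand.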

Expert Zone

ref() calls are resolved during compilation, not runtime, which means dynamic SQL generation depends on the compilation context.

Using ref() inside macros or hooks requires careful handling because the context may differ, affecting how dependencies are tracked.

ref() can reference models from installed dbt packages by using the two-argument form, ref('package_name', 'model_name'). This requires the package to be declared as a dependency of the project and an understanding of package namespaces.

When NOT to use

ref() should not be used to reference external tables or views not managed by dbt; instead, use source() for external data. Also, avoid using ref() in raw SQL outside dbt models, as it requires compilation context.

Production Patterns

In production, teams use ref() to build modular, reusable models that form a clear dependency graph. This enables incremental builds, testing, and documentation generation. Complex projects often combine ref() with source() and macros to manage both internal and external data dependencies.

Connections

Directed Acyclic Graph (DAG)

ref() builds a DAG of model dependencies, similar to how DAGs represent workflows in other fields.

Understanding DAGs from project management or computer science helps grasp how dbt orders model runs without cycles.

Makefile Dependency Management

ref() functions like dependencies in a Makefile, where targets depend on other files to build in order.

Knowing how Makefiles track dependencies clarifies how ref() automates build order in data pipelines.

Software Package Imports

ref() is like importing modules in programming, where one module depends on another to function.

Seeing ref() as an import mechanism helps understand modularity and dependency resolution in dbt projects.

Common Pitfalls

#1 Referencing models with hardcoded table names instead of ref()

Wrong approach: SELECT * FROM analytics.sales_data;

Correct approach: SELECT * FROM {{ ref('sales_data') }};

Root cause: Not understanding that hardcoding table names breaks dependency tracking and environment flexibility.
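A short Python sketch shows why the hardcoded version breaks dependency tracking. It assumes, as a simplification, that dependencies are discovered purely by scanning for ref() calls; the dependencies helper is hypothetical and a regular expression stands in for dbt's parser.

```python
import re

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def dependencies(sql):
    """Return the model names that this SQL declares through ref()."""
    return set(REF_PATTERN.findall(sql))


hardcoded = "SELECT * FROM analytics.sales_data"
with_ref = "SELECT * FROM {{ ref('sales_data') }}"

print(dependencies(hardcoded))  # set()
print(dependencies(with_ref))   # {'sales_data'}
```

The hardcoded query may still run against the database, but it contributes no edge to the graph, so nothing guarantees that sales_data is built first.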

#2 Using ref() to reference tables outside the dbt project

Wrong approach: SELECT * FROM {{ ref('external_table') }};

Correct approach: SELECT * FROM {{ source('external_schema', 'external_table') }};

Root cause: Confusing ref() with source(), leading to compilation errors.
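The difference between the two lookups can be modeled in Python. This is a toy sketch, not dbt's code: the MODELS and SOURCES registries, the CompilationError class, the error messages, and the returned name formats are all assumptions chosen for illustration.

```python
# Hypothetical project contents (assumptions for this sketch).
MODELS = {"sales_data"}                              # models defined in the project
SOURCES = {("external_schema", "external_table")}    # declared external tables


class CompilationError(Exception):
    """Illustrative error type; dbt's real error class and wording differ."""


def ref(model_name):
    """Resolve a model name, failing if the project has no such model."""
    if model_name not in MODELS:
        raise CompilationError(f"model '{model_name}' was not found")
    return f"analytics.{model_name}"


def source(source_name, table_name):
    """Resolve a declared external table, failing if it was never declared."""
    if (source_name, table_name) not in SOURCES:
        raise CompilationError(f"source '{source_name}.{table_name}' was not found")
    return f"{source_name}.{table_name}"


print(source("external_schema", "external_table"))  # external_schema.external_table

try:
    ref("external_table")
except CompilationError as err:
    print(f"compilation failed: {err}")
```

Each function looks in a different registry, which is why passing an external table name to ref() fails even though the table exists in the database.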

#3 Creating circular dependencies with ref() calls

Wrong approach: model_a.sql contains {{ ref('model_b') }} and model_b.sql contains {{ ref('model_a') }}

Correct approach: Refactor models to remove circular references, e.g., combine logic or create intermediate models.

Root cause: Not realizing ref() builds a DAG that cannot have cycles.
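Cycle detection on such a graph can be sketched with graphlib from the Python standard library (Python 3.9 or later). This illustrates the general idea only; it is not how dbt itself reports cycles, and the find_cycle helper and the shared_base model name are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def find_cycle(graph):
    """Return the list of nodes forming a cycle, or None if the graph is a DAG."""
    try:
        list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # graphlib stores the detected cycle as the second element of args.
        return exc.args[1]
    return None


# model_a references model_b and model_b references model_a.
circular = {"model_a": {"model_b"}, "model_b": {"model_a"}}

# One possible refactor: move the shared logic into an intermediate model.
refactored = {
    "shared_base": set(),
    "model_a": {"shared_base"},
    "model_b": {"shared_base"},
}

print(find_cycle(circular))    # a list that starts and ends with the same model
print(find_cycle(refactored))  # None
```

No valid run order exists for the circular graph, so the only fix is to change the models, for example by extracting the logic both models need into a third one.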

Key Takeaways

ref() is the core function in dbt that links models and manages dependencies automatically.

Using ref() instead of hardcoded table names makes your SQL flexible and environment-aware.

ref() builds a directed acyclic graph of models, ensuring correct build order and preventing circular dependencies.

ref() resolves to a table name at compile time; it does not execute the referenced model's SQL.

Understanding ref() is essential for building scalable, maintainable, and robust data pipelines with dbt.

Practice

1. What is the main purpose of the ref() function in dbt? (easy)

A. To create new database users

B. To write raw SQL queries inside dbt models

C. To link models and define dependencies between them

D. To schedule dbt runs automatically

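Solution

Step 1: Understand the role of ref(). It links one model to another so that dbt knows the dependency and the build order.

Step 2: Identify what ref() does not do. It does not create database users, it is not a way to write raw SQL, and it does not schedule runs, which rules out A, B, and D.

Final Answer: C. To link models and define dependencies between them.

Quick Check: ref() links models and defines dependencies.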
