Building a DAG of models in dbt - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is a DAG in the context of dbt?
In dbt, the DAG (Directed Acyclic Graph) is the dependency graph of the project's models. dbt uses it to run each model only after the models it depends on have run, and the graph never contains loops.
beginner
How does dbt know the dependencies between models?
dbt infers dependencies from the ref() calls inside each model. When a model selects from {{ ref('other_model') }}, dbt records that the model depends on other_model.
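To make the idea concrete, here is a minimal Python sketch of how ref() calls could be turned into a dependency map. It is an illustration only, not dbt's actual parser: the regular expression is a simplification of real Jinja parsing, and the model names `stg_orders` and `orders_summary` are hypothetical.

```python
import re

# Simplified pattern for {{ ref('name') }} or {{ ref("name") }}.
# Real dbt renders Jinja properly; this regex is only an approximation.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")

def find_refs(sql: str) -> set[str]:
    """Return the model names referenced via ref() in one model's SQL."""
    return set(REF_PATTERN.findall(sql))

# Hypothetical model files, keyed by model name.
models = {
    "stg_orders": "SELECT * FROM raw_orders",
    "orders_summary": "SELECT COUNT(*) AS n FROM {{ ref('stg_orders') }}",
}

# Map each model to the set of models it depends on.
dependencies = {name: find_refs(sql) for name, sql in models.items()}
print(dependencies)
```

Here `orders_summary` maps to `{'stg_orders'}`, while `stg_orders` maps to an empty set because it contains no ref() call.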
intermediate
Why is it important that the DAG is acyclic?
The DAG must be acyclic (no loops) because if models depend on each other in a circle, dbt cannot decide which to run first. This would cause an error and stop the build.
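The problem with a cycle can be reproduced with Python's standard `graphlib` module, which performs the same kind of topological ordering. This is a sketch of the general idea, not dbt's own code, and the two mutually dependent models are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical models that reference each other in a circle.
dependencies = {
    "model_a": {"model_b"},  # model_a uses ref('model_b')
    "model_b": {"model_a"},  # model_b uses ref('model_a')
}

try:
    order = list(TopologicalSorter(dependencies).static_order())
except CycleError as err:
    # No model can go first, so no valid run order exists.
    order = None
    print("No valid run order:", err.args[0])
```

Neither model can be placed first, so the sorter raises an error instead of returning an order.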
beginner
What command can you use to visualize the DAG of your dbt project?
You can use dbt docs generate to create documentation and dbt docs serve to open a web page that shows the DAG graphically.
intermediate
How does building a DAG help in managing complex data transformations?
A DAG makes the run order explicit, so each model is built only after its upstream models. That prevents a model from reading inputs that do not exist yet, and the dependency graph shows where to look when a downstream model breaks.
What does the ref() function do in dbt?
A. Creates a new table
B. Runs a SQL query
C. Deletes a model
D. Defines a dependency on another model
Why can't a DAG have cycles?
A. Because cycles reduce data quality
B. Because dbt cannot determine a valid run order when models depend on each other in a circle
C. Because cycles speed up execution too much
D. Because cycles make the graph look messy
Which command helps you see the DAG visually in dbt?
A. dbt docs serve
B. dbt run
C. dbt test
D. dbt clean
If model A uses ref('model_b'), what does this mean?
A. Model A and B run independently
B. Model B depends on model A
C. Model A depends on model B
D. Model A deletes model B
What is the main benefit of building a DAG of models?
A. Ensures models run in the correct order
B. Makes models run faster
C. Reduces the size of data
D. Automatically fixes errors
Explain how dbt uses the DAG to run models in the right order.
Think about how one model can depend on another and how dbt figures out the order.
Describe why having cycles in your DAG causes problems in dbt.
Imagine trying to run models that depend on each other in a circle.

      Practice

      1.

      What does a DAG represent in dbt?

      easy
      A. The configuration settings for dbt profiles
      B. The syntax rules for writing SQL queries
      C. The order in which models depend on each other
      D. The list of all tables in the database

      Solution

      1. Step 1: Understand what DAG means in dbt context

        A DAG (Directed Acyclic Graph) shows how models are connected by dependencies.
      2. Step 2: Identify the role of DAG in dbt

        dbt uses the DAG to know which models to run first based on dependencies.
      3. Final Answer:

        The order in which models depend on each other -> Option C
      4. Quick Check:

        DAG = model dependency order [OK]
      Hint: DAG shows model dependencies and run order [OK]
      Common Mistakes:
      • Confusing DAG with SQL syntax
      • Thinking DAG lists all tables
      • Mixing DAG with dbt config files
      2.

      Which of the following is the correct way to reference another model in a dbt SQL file?

      SELECT * FROM ___
      easy
      A. ref(model_name)
      B. ref('model_name')
      C. 'ref(model_name)'
      D. ref:"model_name"

      Solution

      1. Step 1: Recall the syntax for referencing models in dbt

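        dbt uses the function ref() with the model name as a quoted string. In a model file, the call is wrapped in Jinja braces: {{ ref('model_name') }}.
      2. Step 2: Check each option for correct syntax

        Option B, ref('model_name'), passes the model name as a quoted string, which is correct. Option A drops the quotes, option C quotes the whole call, and option D is not valid function-call syntax.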
      3. Final Answer:

        ref('model_name') -> Option B
      4. Quick Check:

        Use ref('model_name') with quotes [OK]
      Hint: Use ref('model_name') with quotes and parentheses [OK]
      Common Mistakes:
      • Omitting quotes around model name
      • Using wrong quote types
      • Using colons or other symbols
      3.

      Given these two models, what is the order dbt will run them?

      -- model_a.sql
      SELECT * FROM source_table
      
      -- model_b.sql
      SELECT * FROM {{ ref('model_a') }}
      medium
      A. model_a runs first, then model_b
      B. model_b runs first, then model_a
      C. Both run simultaneously
      D. dbt will error due to circular dependency

      Solution

      1. Step 1: Identify dependencies from ref()

        model_b references model_a using ref(), so model_b depends on model_a.
      2. Step 2: Determine run order based on dependencies

        dbt runs model_a first, then model_b to ensure data is ready.
      3. Final Answer:

        model_a runs first, then model_b -> Option A
      4. Quick Check:

        Dependency order = model_a before model_b [OK]
      Hint: Models run in dependency order: referenced first [OK]
      Common Mistakes:
      • Assuming ref() means reverse dependency
      • Thinking models run simultaneously
      • Confusing circular dependency errors
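The ordering in this question can be checked with a short Python sketch that uses the standard `graphlib` module. It only mirrors the dependency logic described in the solution; it is not how dbt is implemented internally.

```python
from graphlib import TopologicalSorter

# Each key maps a model to the set of models it references with ref().
dependencies = {
    "model_a": set(),        # selects from a raw table, no ref()
    "model_b": {"model_a"},  # SELECT * FROM {{ ref('model_a') }}
}

# static_order() yields every node after all of its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)  # ['model_a', 'model_b']
```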
      4.

      What is wrong with this dbt model code snippet?

      SELECT * FROM {{ ref(model_a) }}
      medium
      A. Model name should be uppercase
      B. ref() cannot be used inside SELECT
      C. Missing FROM keyword
      D. Missing quotes around model name in ref()

      Solution

      1. Step 1: Check syntax of ref() usage

        ref() requires the model name as a string with quotes inside the parentheses.
      2. Step 2: Identify the error in the code snippet

        model_a is not quoted, so Jinja treats it as a variable name rather than a string, and dbt cannot resolve the reference when it compiles the model.
      3. Final Answer:

        Missing quotes around model name in ref() -> Option D
      4. Quick Check:

        ref('model_name') needs quotes [OK]
      Hint: Always put model names in quotes inside ref() [OK]
      Common Mistakes:
      • Forgetting quotes around model names
      • Thinking ref() can't be in SELECT
      • Assuming case sensitivity causes error
      5.

      You have three models: model_x, model_y, and model_z. model_y references model_x, and model_z references both model_x and model_y. Which of the following is the correct order dbt will run these models?

      hard
      A. model_x, model_y, model_z
      B. model_y, model_x, model_z
      C. model_z, model_y, model_x
      D. model_x, model_z, model_y

      Solution

      1. Step 1: Analyze dependencies among models

        model_y depends on model_x; model_z depends on both model_x and model_y.
      2. Step 2: Determine run order respecting dependencies

        model_x runs first (no dependencies), then model_y (depends on model_x), then model_z (depends on both).
      3. Final Answer:

        model_x, model_y, model_z -> Option A
      4. Quick Check:

        Run order respects dependencies [OK]
      Hint: Run models so dependencies are built before dependents [OK]
      Common Mistakes:
      • Running dependent models before their dependencies
      • Ignoring multiple dependencies
      • Assuming any order works if models reference each other
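One way to verify the answer is to test each option against the dependencies directly. The helper `respects_dependencies` below is a hypothetical function written for this illustration; it is not part of dbt.

```python
def respects_dependencies(order, dependencies):
    """Return True if every model appears after all models it references."""
    position = {model: i for i, model in enumerate(order)}
    return all(
        position[upstream] < position[model]
        for model, upstreams in dependencies.items()
        for upstream in upstreams
    )

# Dependencies as stated in the question.
dependencies = {
    "model_x": set(),
    "model_y": {"model_x"},
    "model_z": {"model_x", "model_y"},
}

# The four answer options.
options = {
    "A": ["model_x", "model_y", "model_z"],
    "B": ["model_y", "model_x", "model_z"],
    "C": ["model_z", "model_y", "model_x"],
    "D": ["model_x", "model_z", "model_y"],
}

valid = [label for label, order in options.items()
         if respects_dependencies(order, dependencies)]
print(valid)  # ['A']
```

Only option A places every model after the models it references, which matches the solution above.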