
Orchestrating dbt with Airflow

Introduction

Airflow can run dbt commands automatically, on a schedule and in a fixed order, so data stays fresh and organized without manual runs. It is a good fit when:

You want to update your data models every day without manual work.
You need to run tests on your data after it changes.
You want to run multiple dbt commands in a specific order.
You want to get alerts if something goes wrong in your data pipeline.
You want to combine dbt with other data tasks like loading or cleaning.
Syntax
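from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

# A DAG groups tasks and tells Airflow when to run them.
with DAG(
    'dbt_dag',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',  # run once a day
    catchup=False,               # do not backfill runs missed since start_date
) as dag:
    # Each BashOperator task runs one shell command.
    run_dbt = BashOperator(
        task_id='run_dbt',
        bash_command='dbt run --profiles-dir /path/to/profiles'
    )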

Use BashOperator to run dbt commands as shell commands.

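Set schedule_interval to control how often dbt runs. It accepts presets such as '@daily' or a cron expression. Airflow 2.4 and later also accept the newer schedule parameter, which replaces it.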
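
Because bash_command is an ordinary string, it can be assembled in Python before it reaches BashOperator. The sketch below is one possible way to do that. build_dbt_command is a hypothetical helper, not part of dbt or Airflow, and the paths are placeholders. It relies on dbt's --profiles-dir and --project-dir flags and uses shlex.join so that paths containing spaces survive the shell.

```python
import shlex


def build_dbt_command(subcommand, profiles_dir=None, project_dir=None):
    """Return a shell-safe dbt command string (hypothetical helper)."""
    parts = ["dbt", subcommand]
    if profiles_dir is not None:
        parts += ["--profiles-dir", profiles_dir]
    if project_dir is not None:
        parts += ["--project-dir", project_dir]
    # shlex.join quotes any argument that contains spaces or shell metacharacters.
    return shlex.join(parts)


# Placeholder paths; replace them with the real locations on the Airflow worker.
command = build_dbt_command(
    "run",
    profiles_dir="/path/to/profiles",
    project_dir="/path/to/my project",
)
print(command)
```

The resulting string can then be passed as bash_command=command when the BashOperator is created.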

Examples
Runs all dbt models once.
run_dbt = BashOperator(
    task_id='dbt_run',
    bash_command='dbt run'
)
Runs tests on your dbt models to check data quality.
test_dbt = BashOperator(
    task_id='dbt_test',
    bash_command='dbt test'
)
The >> operator sets the order: test_dbt starts only after run_dbt has finished successfully.
run_dbt >> test_dbt
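
To see why longer chains such as a >> b >> c also work, here is a toy model of the >> operator. ToyTask is a made-up stand-in, not Airflow's operator class. It only mimics how Airflow's >> behaves: the right-hand task is registered as downstream of the left-hand one, and the expression evaluates to the right-hand task.

```python
class ToyTask:
    """Made-up stand-in for an Airflow task; only models the >> operator."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []  # tasks that must wait for this one

    def __rshift__(self, other):
        # Record the dependency, then return the right-hand task so that
        # a >> b >> c is evaluated as (a >> b) >> c, which becomes b >> c.
        self.downstream.append(other)
        return other


run_dbt = ToyTask("dbt_run")
test_dbt = ToyTask("dbt_test")
run_dbt >> test_dbt

print([task.task_id for task in run_dbt.downstream])
```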
Sample Program

This Airflow DAG runs dbt models first, then runs tests to check data quality. It runs every day automatically.

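from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime

with DAG(
    'dbt_workflow',
    start_date=datetime(2024, 1, 1),
    schedule_interval='@daily',  # run once a day
    catchup=False,               # do not backfill runs missed since start_date
) as dag:
    # Step 1: build the dbt models.
    dbt_run = BashOperator(
        task_id='dbt_run',
        bash_command='dbt run --profiles-dir /home/user/.dbt'
    )

    # Step 2: test the models that were just built.
    dbt_test = BashOperator(
        task_id='dbt_test',
        bash_command='dbt test --profiles-dir /home/user/.dbt'
    )

    # Run the tests only after the models have been built successfully.
    dbt_run >> dbt_test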
Important Notes

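Make sure Airflow can find your dbt profiles by setting --profiles-dir correctly. dbt must also be able to find the project itself: BashOperator runs commands in a temporary directory by default, so pass --project-dir or set the operator's cwd argument to the project folder.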

You can add more tasks like dbt seed or dbt snapshot as needed.
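
One possible way to add such steps is to describe them as data and create the tasks in a loop. The sketch below assumes the order seed, run, snapshot, test, which is a common choice rather than a requirement. dbt_task_specs is a hypothetical helper, and the profiles path is the one used in the sample program.

```python
import shlex

# Assumed order: load seed files, build models, capture snapshots, then test.
DBT_STEPS = ["seed", "run", "snapshot", "test"]


def dbt_task_specs(profiles_dir):
    """Return keyword arguments for one BashOperator per dbt step, in order."""
    quoted_dir = shlex.quote(profiles_dir)
    return [
        {
            "task_id": f"dbt_{step}",
            "bash_command": f"dbt {step} --profiles-dir {quoted_dir}",
        }
        for step in DBT_STEPS
    ]


specs = dbt_task_specs("/home/user/.dbt")
for spec in specs:
    print(spec["task_id"], "->", spec["bash_command"])

# Inside a `with DAG(...)` block (requires Airflow), the specs could be used as:
#     tasks = [BashOperator(**spec) for spec in specs]
#     for upstream, downstream in zip(tasks, tasks[1:]):
#         upstream >> downstream
```

Keeping the step list in one place means a new dbt command only needs to be added to DBT_STEPS at the right position.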

Use Airflow's UI to monitor your dbt runs and see logs if something fails.
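
For alerts, Airflow can call a Python function whenever a task fails, through the on_failure_callback argument. The sketch below shows only the callback side. notify_failure and format_failure_message are hypothetical names, and print stands in for whatever channel (email, chat, paging) a team actually uses. It assumes the context dictionary that Airflow passes in contains task_instance and, when available, exception.

```python
from types import SimpleNamespace


def format_failure_message(context):
    """Build a short alert text from the context Airflow passes to callbacks."""
    ti = context["task_instance"]
    error = context.get("exception")
    return f"dbt pipeline failure: task {ti.task_id} in DAG {ti.dag_id} failed: {error}"


def notify_failure(context):
    # Placeholder delivery: replace print with an email or chat integration.
    print(format_failure_message(context))


# Stand-in context for a local dry run; Airflow supplies the real one.
fake_context = {
    "task_instance": SimpleNamespace(task_id="dbt_test", dag_id="dbt_workflow"),
    "exception": RuntimeError("dbt test exited with code 1"),
}
notify_failure(fake_context)

# In a DAG (requires Airflow), the callback could be attached to every task with:
#     DAG(..., default_args={"on_failure_callback": notify_failure})
```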

Summary

Airflow helps run dbt commands automatically and in order.

Use BashOperator to run dbt commands inside Airflow tasks.

Set task dependencies so dbt tests run after models finish.