
Integration with dbt and Airflow in Snowflake - Commands & Configuration

Introduction
To automate data transformations and workflows, you can connect dbt with Airflow. This lets you run your dbt models in Snowflake on a schedule without manual steps.
Use this pattern when you want to:
- Run dbt models automatically every day without manual commands.
- Orchestrate multiple data tasks in order, such as loading data and then transforming it.
- Monitor and retry failed data jobs easily.
- Keep your Snowflake data warehouse updated with fresh transformed data.
- Combine SQL-based transformations (dbt) with workflow automation (Airflow).
Config File - dbt_airflow_dag.py
from airflow import DAG
from airflow.operators.bash import BashOperator
from datetime import datetime, timedelta

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2024, 1, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG(
    'dbt_snowflake_workflow',
    default_args=default_args,
    description='Run dbt models in Snowflake using Airflow',
    schedule_interval='0 2 * * *',  # daily at 2 AM
    catchup=False,
)

run_dbt = BashOperator(
    task_id='run_dbt_models',
    bash_command='cd /opt/dbt_project && dbt run --profiles-dir .',
    dag=dag,
)

run_dbt  # single task, so there are no dependencies to declare

This Airflow DAG file defines a workflow to run dbt models daily at 2 AM.

default_args sets the owner, start date, and retry behavior for all tasks.

schedule_interval controls when the DAG runs.

BashOperator runs the dbt command in the project directory.
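The bash_command points dbt at a profiles.yml in the project directory via --profiles-dir . — so a Snowflake profile must exist there. As a rough sketch (the account, user, role, database, warehouse, and schema names below are placeholders, not from the original), it might look like:

```yaml
# profiles.yml — hypothetical Snowflake profile; all values are placeholders
dbt_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: your_account_id
      user: your_user
      password: "{{ env_var('SNOWFLAKE_PASSWORD') }}"  # read from the environment, never hard-coded
      role: TRANSFORMER
      database: ANALYTICS
      warehouse: TRANSFORMING
      schema: dbt_dev
      threads: 1
```

The profile name (dbt_project here) must match the profile key in your dbt_project.yml, and the Airflow worker needs the referenced environment variable set.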

Commands
Check that Airflow recognizes the new DAG file and lists it.
Terminal
airflow dags list
Expected Output
dbt_snowflake_workflow
example_bash_operator
example_branch_dop_operator
example_subdag_operator
Manually start the dbt workflow DAG to test it runs correctly.
Terminal
airflow dags trigger dbt_snowflake_workflow
Expected Output
Created <DagRun dbt_snowflake_workflow @ 2024-06-01 10:00:00: manual__2024-06-01T10:00:00+00:00, externally triggered: True>
List all tasks in the dbt workflow DAG to verify the task names.
Terminal
airflow tasks list dbt_snowflake_workflow
Expected Output
run_dbt_models
Run the dbt task manually for the given date to check it executes without errors.
Terminal
airflow tasks test dbt_snowflake_workflow run_dbt_models 2024-06-01
Expected Output
[2024-06-01 10:05:00,000] {bash.py:123} INFO - Running command: cd /opt/dbt_project && dbt run --profiles-dir .
Running with dbt=1.5.0
Found 5 models, 0 tests, 0 snapshots, 0 analyses, 123 macros, 0 operations, 0 seed files, 0 sources
17:05:00 | Concurrency: 1 threads (target='dev')
17:05:00 |
17:05:00 | 1 of 5 START table model my_model................................ [RUN]
17:05:05 | 1 of 5 OK created table model my_model............................ [OK in 5.0s]
Completed successfully
Done. PASS=5 WARN=0 ERROR=0 SKIP=0 TOTAL=5
Key Concept

If you remember nothing else from this pattern, remember: Airflow runs dbt commands on schedule to automate Snowflake data transformations.

Common Mistakes
Not setting the correct working directory in the BashOperator bash_command.
dbt commands fail if run outside the project folder because they can't find the dbt project files.
Use 'cd /opt/dbt_project && dbt run --profiles-dir .' to ensure the command runs in the right folder.
Forgetting to set 'catchup=False' in the DAG schedule.
Airflow tries to run all past missed schedules, causing many unwanted runs.
Set 'catchup=False' to run only from the current schedule forward.
Not verifying the DAG is loaded with 'airflow dags list' before triggering.
Triggering a DAG that Airflow has not loaded will fail with an error or cause confusion.
Always check the DAG list to confirm your DAG is active.
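To make the catchup pitfall concrete, here is a stdlib-only sketch (the dates are illustrative, not from the original) of how many runs a daily DAG would backfill if catchup were left at True:

```python
from datetime import date

# Hypothetical example: a daily DAG with start_date Jan 1, 2024,
# first deployed on Jun 1, 2024. With catchup=True, Airflow creates
# one run per missed daily interval; with catchup=False it starts fresh.
start_date = date(2024, 1, 1)
deploy_date = date(2024, 6, 1)

missed_daily_runs = (deploy_date - start_date).days
print(missed_daily_runs)  # 152 backfilled runs if catchup=True, 0 if False
```

Each of those backfilled runs would execute dbt run against Snowflake, so a forgotten catchup=False can mean dozens of redundant warehouse queries on first deploy.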
Summary
Create an Airflow DAG file that runs dbt commands in the correct project directory.
Use Airflow CLI commands to list, trigger, and test the DAG and its tasks.
Set proper DAG arguments like schedule and catchup to control execution timing.