Snowflake · Cloud · ~20 mins

Integration with dbt and Airflow in Snowflake - Practice Problems & Coding Challenges

Challenge - 5 Problems
Problem 1 · Service Behavior · Intermediate
Understanding Airflow DAG Scheduling with dbt Tasks

You have an Airflow DAG that runs dbt models on Snowflake every day at midnight. The DAG has three tasks: extract, transform, and load. The transform task runs dbt models using the dbt run command. What will happen if the extract task fails?

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

dag = DAG('daily_dbt', schedule_interval='0 0 * * *',
          start_date=datetime(2024, 1, 1), catchup=False)
extract = BashOperator(task_id='extract', bash_command='python extract.py', dag=dag)
transform = BashOperator(task_id='transform', bash_command='dbt run --profiles-dir ./profiles', dag=dag)
load = BashOperator(task_id='load', bash_command='python load.py', dag=dag)
extract >> transform >> load
A. All tasks will run in parallel, ignoring dependencies.
B. The transform task will not run, because Airflow's default trigger rule requires all upstream tasks to succeed.
C. The load task will run, but the transform task will be skipped.
D. The transform task will run regardless of the extract task's status.
💡 Hint

Think about how Airflow handles task dependencies and failures.
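The behavior the question probes can be sketched without Airflow at all. The toy simulation below (plain Python, not the real Airflow scheduler) mimics the default all_success trigger rule: a task runs only when every upstream task succeeded; otherwise it is marked upstream_failed, and that state cascades to its own downstreams.

```python
# Toy model of Airflow's default "all_success" trigger rule.
def run_dag(deps, results):
    # deps: task -> list of upstream tasks
    # results: task -> True/False outcome if the task actually gets to run
    states = {}

    def state(task):
        if task in states:
            return states[task]
        upstream = [state(u) for u in deps.get(task, [])]
        if any(s != "success" for s in upstream):
            # Any non-successful upstream blocks the task entirely.
            states[task] = "upstream_failed"
        else:
            states[task] = "success" if results.get(task, True) else "failed"
        return states[task]

    for t in deps:
        state(t)
    return states

# extract >> transform >> load, with extract failing:
states = run_dag(
    {"extract": [], "transform": ["extract"], "load": ["transform"]},
    {"extract": False},
)
# transform never runs (upstream_failed), and neither does load.
```

The `dbt run` in the transform task therefore never executes for that DAG run, which is exactly what protects Snowflake from transforming stale or missing raw data.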

Problem 2 · Architecture · Intermediate
Designing a Reliable dbt and Airflow Integration on Snowflake

You want to build a data pipeline using Airflow to orchestrate dbt models on Snowflake. Which architecture ensures that dbt models only run after the raw data is fully loaded into Snowflake?

A. Use Airflow to run dbt models and load data in parallel to save time.
B. Schedule dbt models to run at a fixed time daily without checking the data load status.
C. Create an Airflow DAG where the load task runs first, then the transform task runs the dbt models, with explicit task dependencies.
D. Run dbt models manually after loading data into Snowflake.
💡 Hint

Think about how to enforce order in task execution.
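"Explicit task dependencies" boil down to a topological ordering of tasks. This stdlib-only sketch (the task names load_raw, dbt_run, and publish are hypothetical, and graphlib stands in for the Airflow scheduler) shows the execution order such a DAG enforces:

```python
from graphlib import TopologicalSorter

# Task -> set of upstream tasks it must wait for.
# Airflow's `load_raw >> dbt_run >> publish` expresses the same constraints.
deps = {
    "load_raw": set(),
    "dbt_run": {"load_raw"},
    "publish": {"dbt_run"},
}

# static_order() yields tasks only after all their upstreams.
order = list(TopologicalSorter(deps).static_order())
```

However the worker pool is sized, dbt_run can never start before load_raw finishes, which is the guarantee the question asks for.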

Problem 3 · Security · Advanced
Securing Credentials for dbt and Airflow Integration

You need to securely manage Snowflake credentials used by dbt in an Airflow environment. Which approach follows best security practices?

A. Store Snowflake credentials in Airflow Variables encrypted with a key, and reference them in dbt profiles.
B. Hardcode Snowflake credentials directly in the dbt profiles.yml file checked into Git.
C. Pass Snowflake credentials as plain-text environment variables in Airflow tasks.
D. Use the same Snowflake user credentials for all Airflow and dbt tasks, without rotation.
💡 Hint

Consider encryption and avoiding hardcoding secrets.
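A minimal sketch of the idea behind the secure option: the secret reaches the task through its runtime environment (Airflow can inject it from encrypted Variables or a secrets backend), and the code fails loudly rather than fall back to any hardcoded value. All names and values here are illustrative.

```python
import os

def get_secret(name: str) -> str:
    """Fetch a credential that the orchestrator must inject at runtime."""
    value = os.environ.get(name)
    if not value:
        # No hardcoded fallback: a missing secret is a hard error,
        # never a silently-used default committed to Git.
        raise RuntimeError(f"{name} must be injected by the orchestrator, not hardcoded")
    return value

# Stand-in for Airflow injecting the value into the task's environment:
os.environ["SNOWFLAKE_PASSWORD"] = "injected-at-runtime"

conn_kwargs = {"user": "dbt_user", "password": get_secret("SNOWFLAKE_PASSWORD")}
```

The credential thus never appears in profiles.yml, in Git, or in the DAG file itself, only in the encrypted store and the task's process environment.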

Problem 4 · Configuration · Advanced
Configuring dbt Profiles for Airflow Execution on Snowflake

You want to run dbt models from Airflow using the dbt run command. Which profiles.yml configuration snippet correctly sets up a Snowflake connection for Airflow?

profiles.yml snippet:
Option A:
my_profile:
  target: prod
  outputs:
    prod:
      type: snowflake
      account: 'xy12345'
      user: '{{ var.value.snowflake_user }}'
      password: '{{ var.value.snowflake_password }}'
      role: 'ANALYST'
      database: 'MY_DB'
      warehouse: 'MY_WH'
      schema: 'PUBLIC'
      threads: 1
      client_session_keep_alive: false
Option B:
my_profile:
  target: prod
  outputs:
    prod:
      type: snowflake
      account: 'xy12345'
      user: 'airflow'
      password: 'password123'
      role: 'ANALYST'
      database: 'MY_DB'
      warehouse: 'MY_WH'
      schema: 'PUBLIC'
      threads: 4
      client_session_keep_alive: true
Option C:
my_profile:
  target: prod
  outputs:
    prod:
      type: snowflake
      account: 'xy12345'
      user: 'dbt_user'
      password: '{{ var.value.snowflake_password }}'
      role: 'ANALYST'
      database: 'MY_DB'
      warehouse: 'MY_WH'
      schema: 'PUBLIC'
      threads: 1
      client_session_keep_alive: true
Option D:
my_profile:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: 'xy12345'
      user: '{{ env_var("SNOWFLAKE_USER") }}'
      password: '{{ env_var("SNOWFLAKE_PASSWORD") }}'
      role: 'ANALYST'
      database: 'MY_DB'
      warehouse: 'MY_WH'
      schema: 'PUBLIC'
      threads: 2
      client_session_keep_alive: false
💡 Hint

Consider how Airflow passes environment variables securely to tasks.
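How dbt resolves env_var() is the crux here: the value is read from the process environment that Airflow hands to the dbt task (for example via an operator's environment settings), so no secret is stored on disk. The snippet below is a simplified stand-in for that substitution, not dbt's actual Jinja renderer, and the variable names and values are hypothetical.

```python
import os
import re

# Values the orchestrator would inject into the dbt task's environment:
os.environ["SNOWFLAKE_USER"] = "airflow_svc"

profile_line = "user: '{{ env_var(\"SNOWFLAKE_USER\") }}'"

# Simplified stand-in for dbt resolving env_var("NAME") from the environment:
rendered = re.sub(
    r'\{\{\s*env_var\("([^"]+)"\)\s*\}\}',
    lambda m: os.environ[m.group(1)],
    profile_line,
)
```

By contrast, `{{ var.value.snowflake_user }}` is Airflow's own template syntax; dbt does not render it when it reads profiles.yml, which is why the env_var()-based profile is the one that works from an Airflow task.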

Problem 5 · Best Practice · Expert
Optimizing Airflow and dbt Pipeline for Snowflake Cost Efficiency

You want to optimize your Airflow-orchestrated dbt pipeline on Snowflake to reduce compute costs without sacrificing data freshness. Which strategy is best?

A. Use Snowflake warehouses with auto-suspend enabled, and use Airflow sensors so dbt runs only when new data arrives.
B. Keep Snowflake warehouses running 24/7 to avoid startup delays, and schedule dbt runs every 15 minutes.
C. Run all dbt models daily regardless of data changes to ensure freshness.
D. Use the largest Snowflake warehouse size to speed up dbt runs and reduce runtime.
💡 Hint

Think about balancing cost and data freshness with scheduling and warehouse settings.
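A back-of-the-envelope comparison shows why auto-suspend plus event-driven scheduling wins. The credit rate and the ~2 hours of daily dbt activity below are illustrative assumptions, not real Snowflake pricing:

```python
# Hypothetical rates: a small warehouse billed only while running.
CREDITS_PER_HOUR = 1.0
HOURS_PER_DAY = 24

def daily_credits(active_hours, always_on=False):
    """Credits consumed per day: 24h if the warehouse never suspends,
    otherwise only the hours it is actually active."""
    billed = HOURS_PER_DAY if always_on else active_hours
    return billed * CREDITS_PER_HOUR

always_on = daily_credits(2, always_on=True)   # warehouse kept running 24/7
auto_suspend = daily_credits(2)                # suspends between dbt runs
```

With dbt active only a couple of hours a day, the always-on warehouse bills roughly 12x more for the same work, while sensors keep freshness intact by triggering runs exactly when new data lands.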