dbtdata~20 mins

Source freshness checks in dbt - Practice Problems & Coding Challenges

Choose your learning style10 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Challenge - 5 Problems

🎖️

Source Freshness Master

Get all challenges correct to earn this badge!

Test your skills under time pressure!

❓ Predict Output

intermediate

2:00remaining

Output of freshness check query

Given the following SQL query used in a dbt source freshness check, what will be the output if the source table's max timestamp is '2024-06-01 10:00:00' and the current timestamp is '2024-06-01 12:00:00'?

dbt

SELECT
  MAX(updated_at) AS last_loaded,
  CURRENT_TIMESTAMP AS check_time,
  TIMESTAMPDIFF(MINUTE, MAX(updated_at), CURRENT_TIMESTAMP) AS minutes_behind
FROM source_schema.source_table;

ASyntaxError: TIMESTAMPDIFF function not found

B{"last_loaded": "2024-06-01 10:00:00", "check_time": "2024-06-01 12:00:00", "minutes_behind": 120}

C{"last_loaded": null, "check_time": "2024-06-01 12:00:00", "minutes_behind": null}

D{"last_loaded": "2024-06-01 12:00:00", "check_time": "2024-06-01 10:00:00", "minutes_behind": -120}

Attempts:

2 left

🧠 Conceptual

intermediate

1:30remaining

Understanding freshness thresholds

In dbt source freshness checks, what does the 'warn_after' threshold represent?

AThe maximum allowed time difference before a warning is triggered

BThe exact time when the source data was last updated

CThe minimum time difference required to mark data as fresh

DThe time interval to refresh the source table automatically

Attempts:

2 left

🔧 Debug

advanced

2:30remaining

Identify the error in freshness SQL

This SQL snippet is used in a dbt freshness check but raises an error. What is the cause? SELECT MAX(last_updated) AS last_loaded, CURRENT_TIMESTAMP AS check_time, DATEDIFF(MINUTES, MAX(last_updated), CURRENT_TIMESTAMP) AS minutes_behind FROM source_schema.source_table;

AMAX function cannot be used on timestamp columns

BCURRENT_TIMESTAMP is not valid SQL syntax

CDATEDIFF function does not accept 'MINUTES' as a unit

DMissing GROUP BY clause causes error

Attempts:

2 left

❓ data_output

advanced

2:00remaining

Result of freshness check with null timestamps

What will be the output of this freshness check query if the source table has no rows (empty)? SELECT MAX(updated_at) AS last_loaded, CURRENT_TIMESTAMP AS check_time, TIMESTAMPDIFF(MINUTE, MAX(updated_at), CURRENT_TIMESTAMP) AS minutes_behind FROM source_schema.source_table;

ARuntimeError: Cannot compute TIMESTAMPDIFF with null values

B{"last_loaded": null, "check_time": "2024-06-01 12:00:00", "minutes_behind": 0}

C{"last_loaded": "1970-01-01 00:00:00", "check_time": "2024-06-01 12:00:00", "minutes_behind": 0}

D{"last_loaded": null, "check_time": "2024-06-01 12:00:00", "minutes_behind": null}

Attempts:

2 left

🚀 Application

expert

3:00remaining

Designing a freshness check for multiple sources

You have three source tables with different update frequencies. You want to configure dbt freshness checks with these thresholds: - source A: warn if data is older than 30 minutes, error if older than 1 hour - source B: warn if data is older than 2 hours, error if older than 4 hours - source C: warn if data is older than 1 day, error if older than 2 days Which YAML configuration snippet correctly sets these freshness checks?

dbt

version: 2
sources:
  - name: source_a
    freshness:
      warn_after: {count: 30, period: minute}
      error_after: {count: 1, period: hour}
  - name: source_b
    freshness:
      warn_after: {count: 2, period: hour}
      error_after: {count: 4, period: hour}
  - name: source_c
    freshness:
      warn_after: {count: 1, period: day}
      error_after: {count: 2, period: day}

version: 2
sources:
  - name: source_a
    freshness:
      warn_after: {count: 30, period: minute}
      error_after: {count: 1, period: hour}
  - name: source_b
    freshness:
      warn_after: {count: 2, period: hour}
      error_after: {count: 4, period: hour}
  - name: source_c
    freshness:
      warn_after: {count: 1, period: day}
      error_after: {count: 2, period: day}

version: 2
sources:
  - name: source_a
    freshness:
      warn_after: {count: 1, period: hour}
      error_after: {count: 30, period: minute}
  - name: source_b
    freshness:
      warn_after: {count: 4, period: hour}
      error_after: {count: 2, period: hour}
  - name: source_c
    freshness:
      warn_after: {count: 2, period: day}
      error_after: {count: 1, period: day}

version: 2
sources:
  - name: source_a
    freshness:
      warn_after: {count: 30, period: hour}
      error_after: {count: 1, period: minute}
  - name: source_b
    freshness:
      warn_after: {count: 2, period: minute}
      error_after: {count: 4, period: minute}
  - name: source_c
    freshness:
      warn_after: {count: 1, period: hour}
      error_after: {count: 2, period: day}

version: 2
sources:
  - name: source_a
    freshness:
      warn_after: {count: 1, period: minute}
      error_after: {count: 30, period: hour}
  - name: source_b
    freshness:
      warn_after: {count: 2, period: day}
      error_after: {count: 4, period: day}
  - name: source_c
    freshness:
      warn_after: {count: 1, period: hour}
      error_after: {count: 2, period: hour}

Attempts:

2 left

Practice

(1/5)

1. What is the main purpose of source freshness checks in dbt?

easy

A. To track how recent the data in your source tables is

B. To create new tables from raw data

C. To optimize SQL query performance

D. To schedule dbt runs automatically

Source freshness checks in dbt - Practice Problems & Coding Challenges

Start learning this pattern below

Practice

Solution

Step 1: Understand the role of freshness checks

Step 2: Compare options to the purpose

Final Answer:

Quick Check:

Solution

Step 1: Recall correct YAML syntax for freshness

Step 2: Match options to syntax

Final Answer:

Quick Check:

Solution

Step 1: Calculate data age from last loaded timestamp

Step 2: Compare data age to thresholds

Final Answer:

Quick Check:

Solution

Step 1: Check period values in freshness YAML

Step 2: Identify error cause

Final Answer:

Quick Check:

Solution

Step 1: Identify correct period and count values

Step 2: Check warn_after and error_after order

Step 3: Validate options

Final Answer:

Quick Check: