Why sources define raw data contracts in dbt - Challenge Your Understanding

Challenge - 5 Problems

🧠 Conceptual

intermediate

Purpose of Raw Data Contracts in Sources

Why do data teams define raw data contracts when setting up sources in dbt?

A. To replace the need for data testing in later stages

B. To speed up the data loading process by skipping validations

C. To automatically generate visualizations from raw data

D. To ensure the incoming data meets expected formats and quality before transformations

❓ Predict Output

intermediate

Output of dbt Source Freshness Check

Given this dbt source freshness configuration, what will be the output status if the source data was last updated 3 hours ago?

sources:
  - name: sales_data
    freshness:
      warn_after:
        count: 2
        period: hour
      error_after:
        count: 4
        period: hour

A. Status: warn (data is older than 2 hours but less than 4 hours)

B. Status: pass (data is fresh within 2 hours)

C. Status: error (data is older than 4 hours)

D. Status: unknown (freshness not configured properly)
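
For context, in most setups dbt also needs to know which column records when a row was loaded before it can measure the age of the data. The sketch below shows one way the snippet in this question could sit inside a complete source file. The `_loaded_at` column and the `orders` table are hypothetical names added for illustration and are not part of the question.

```yaml
version: 2

sources:
  - name: sales_data
    # Assumption: every table in this source has a timestamp column
    # named _loaded_at that records when each row arrived.
    loaded_at_field: _loaded_at
    freshness:
      warn_after:
        count: 2
        period: hour
      error_after:
        count: 4
        period: hour
    tables:
      - name: orders   # hypothetical table name
```

Running `dbt source freshness` compares the most recent `loaded_at_field` value with the current time and reports each table's status against the two thresholds.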

❓ Data Output

advanced

Result of Source Schema Validation

Consider a source defined with a raw data contract expecting columns id (integer) and date (date). If the actual source data has an extra column status (string), what will be the result of the schema validation in dbt?

A. Validation fails because the status column type is incorrect

B. Validation fails due to the unexpected extra column

C. Validation passes because extra columns are allowed by default

D. Validation passes only if the status column is nullable

🔧 Debug

advanced

Debugging a Failed Source Contract Test

A dbt source test for a raw data contract fails with the error "Column 'user_id' contains null values". The contract expects user_id to be non-nullable. Which option correctly fixes the issue?

A. Remove the user_id column from the contract

B. Update the source data to remove or fill nulls in user_id

C. Change the contract to allow user_id to be nullable

D. Ignore the test failure and proceed with transformations
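
For context, a non-null expectation like the one in this question is usually declared as a not_null test on the source column. A minimal sketch, assuming a hypothetical source `app_db` with a `users` table (neither name appears in the question):

```yaml
version: 2

sources:
  - name: app_db          # hypothetical source name
    tables:
      - name: users       # hypothetical table name
        columns:
          - name: user_id
            tests:
              - not_null  # fails if any row has a null user_id
```

With this file in place, `dbt test --select source:app_db` would run the test against the raw table. In recent dbt versions the same list can also be written under the `data_tests:` key.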

🚀 Application

expert

Designing a Raw Data Contract for a New Source

You are adding a new source table orders to your dbt project. The raw data contract must ensure the following:

order_id is unique and non-null
order_date is non-null and recent (within last 30 days)
customer_id is non-null

Which dbt source test configurations correctly enforce these rules?

A. unique test on order_id, not_null tests on order_id, order_date, and customer_id, and a custom freshness test on order_date

B. unique test on order_id and customer_id, no freshness test

C. not_null tests on all columns and a unique test on customer_id

D. Only a freshness test on order_date and not_null on order_id

Practice

(1/5)

1. Why do we define raw data contracts in dbt sources?

easy

A. To set clear expectations for the raw data coming into the system

B. To speed up the data loading process

C. To automatically fix data errors

D. To create visual reports from raw data

Solution

Step 1: Understand the purpose of raw data contracts

Step 2: Identify the main benefit in dbt context

Final Answer:

Quick Check:

Solution

Step 1: Recall dbt source YAML structure

Step 2: Match correct indentation and keys

Final Answer:

Quick Check:

Solution

Step 1: Understand the 'not_null' test in dbt

Step 2: Predict test behavior on null data

Final Answer:

Quick Check:

Solution

Step 1: Check YAML syntax for tests

Step 2: Identify the error in tests format

Final Answer:

Quick Check:

Solution

Step 1: Identify required tests for 'order_id'

Step 2: Define tests for 'order_date'

Step 3: Combine tests in source YAML

Final Answer:

Quick Check:
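
The three steps of the last solution (tests for order_id, tests for order_date, then combining them in the source YAML) can be sketched as follows. This is one possible layout, not the only valid one. The source name `shop` is hypothetical, and the sketch assumes order_date is stored as a timestamp so that it can serve as `loaded_at_field`.

```yaml
version: 2

sources:
  - name: shop                      # hypothetical source name
    tables:
      - name: orders
        # Assumption: order_date is a timestamp; a plain date column
        # may need a cast expression here instead.
        loaded_at_field: order_date
        freshness:
          error_after: {count: 30, period: day}
        columns:
          - name: order_id
            tests:
              - unique
              - not_null
          - name: order_date
            tests:
              - not_null
          - name: customer_id
            tests:
              - not_null
```

Note that a freshness check looks only at the most recent order_date value, so it confirms that new orders are still arriving within 30 days. It does not assert that every row is recent.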