Why sources define raw data contracts in dbt - Challenge Your Understanding

Challenge - 5 Problems
🧠 Conceptual
intermediate
Purpose of Raw Data Contracts in Sources

Why do data teams define raw data contracts when setting up sources in dbt?

A. To replace the need for data testing in later stages
B. To speed up the data loading process by skipping validations
C. To automatically generate visualizations from raw data
D. To ensure the incoming data meets expected formats and quality before transformations
💡 Hint

Think about how contracts help keep data reliable and consistent.

Predict Output
intermediate
Output of dbt Source Freshness Check

Given this dbt source freshness configuration, what will be the output status if the source data was last updated 3 hours ago?

sources:
  - name: sales_data
    freshness:
      warn_after:
        count: 2
        period: hour
      error_after:
        count: 4
        period: hour
A. Status: warn (data is older than 2 hours but less than 4 hours old)
B. Status: pass (data is fresh, within 2 hours)
C. Status: error (data is older than 4 hours)
D. Status: unknown (freshness not configured properly)
💡 Hint

Compare the last update time with the warn and error thresholds.
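
For context, a fuller version of the configuration in this question might look like the sketch below. The table name orders and the timestamp column _loaded_at are assumptions for illustration and are not part of the original question. dbt typically relies on a loaded_at_field to measure how old the data is, and recent dbt versions may expect these keys under a config block instead.

```yaml
version: 2

sources:
  - name: sales_data
    # Assumed timestamp column; not given in the original question.
    loaded_at_field: _loaded_at
    freshness:
      # Warn once the newest record is more than 2 hours old.
      warn_after:
        count: 2
        period: hour
      # Error once the newest record is more than 4 hours old.
      error_after:
        count: 4
        period: hour
    tables:
      # Hypothetical table name for illustration.
      - name: orders
```

Running dbt source freshness evaluates the age of the newest _loaded_at value against both thresholds and reports one status per table.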

Data Output
advanced
Result of Source Schema Validation

Consider a source defined with a raw data contract expecting columns id (integer) and date (date). If the actual source data has an extra column status (string), what will be the result of the schema validation in dbt?

A. Validation fails because the status column type is incorrect
B. Validation fails due to the unexpected extra column
C. Validation passes because extra columns are allowed by default
D. Validation passes only if the status column is nullable
💡 Hint

Think about whether dbt schema tests reject extra columns by default.
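
As a point of reference, declaring the two expected columns on a source might look like the sketch below. The source name raw and the table name events are hypothetical. In dbt's documented behavior, the columns list on a source serves as documentation and as a place to attach tests; enforced contracts are a separate feature that applies to models.

```yaml
version: 2

sources:
  - name: raw              # hypothetical source name
    tables:
      - name: events       # hypothetical table name
        columns:
          # Only the columns the team cares about are listed here.
          - name: id
            data_type: integer   # informational on a source
          - name: date
            data_type: date      # informational on a source
```

Any check beyond this, such as rejecting undeclared columns, would have to be added explicitly as a test.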

🔧 Debug
advanced
Debugging a Failed Source Contract Test

A dbt source test for a raw data contract fails with the error: Column 'user_id' contains null values. The contract expects user_id to be non-nullable. Which option correctly fixes the issue?

A. Remove the user_id column from the contract
B. Update the source data to remove or fill nulls in user_id
C. Change the contract to allow user_id to be nullable
D. Ignore the test failure and proceed with transformations
💡 Hint

Consider data quality versus contract expectations.

🚀 Application
expert
Designing a Raw Data Contract for a New Source

You are adding a new source table orders to your dbt project. The raw data contract must ensure the following:

  • order_id is unique and non-null
  • order_date is non-null and recent (within last 30 days)
  • customer_id is non-null

Which dbt source test configurations correctly enforce these rules?

A. unique test on order_id, not_null tests on order_id, order_date, and customer_id, and a custom freshness test on order_date
B. unique test on order_id and customer_id, no freshness test
C. not_null tests on all columns and a unique test on customer_id
D. Only a freshness test on order_date and not_null on order_id
💡 Hint

Think about which tests enforce uniqueness, non-null, and freshness.
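
The three kinds of rule in this question (uniqueness, non-null, recency) could be expressed on a source roughly as in the sketch below. The source name shop is hypothetical, and using order_date as the loaded_at_field is an assumption: dbt's built-in freshness check looks at the newest value in that column, not at every row, and a date column may need a cast to a timestamp on some warehouses.

```yaml
version: 2

sources:
  - name: shop                     # hypothetical source name
    tables:
      - name: orders
        # Assumption: order_date doubles as the freshness timestamp.
        loaded_at_field: order_date
        freshness:
          # Error if the newest order_date is more than 30 days old.
          error_after:
            count: 30
            period: day
        columns:
          - name: order_id
            tests:                 # newer dbt versions also accept data_tests
              - unique
              - not_null
          - name: order_date
            tests:
              - not_null
          - name: customer_id
            tests:
              - not_null
```

With a file like this, dbt test would run the unique and not_null checks, while dbt source freshness would evaluate the 30-day threshold separately.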