Why do data teams define raw data contracts when setting up sources in dbt?
Think about how contracts help keep data reliable and consistent.
Raw data contracts define expectations on the source data's structure and quality. This helps catch issues early and ensures transformations work with clean, predictable data.
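As an illustration, a raw data contract on a source is usually written as column-level tests in the source's YAML file. The sketch below is a minimal example, not a prescribed layout: the source name raw_app, the schema raw, the table events, and its columns are all hypothetical. It uses the tests: key; newer dbt versions also accept data_tests: for the same purpose.

```yaml
version: 2

sources:
  - name: raw_app              # hypothetical source name
    schema: raw                # hypothetical schema holding the raw tables
    tables:
      - name: events           # hypothetical table
        columns:
          - name: event_id
            description: Primary key of the raw event
            tests:
              - unique         # structure: exactly one row per event
              - not_null       # quality: the key must always be present
          - name: occurred_at
            tests:
              - not_null       # quality: every event needs a timestamp
```

Running dbt test with a selector such as --select source:raw_app executes these tests against the raw table before any downstream model relies on it.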
Given this dbt source freshness configuration, what will be the output status if the source data was last updated 3 hours ago?
sources:
  - name: sales_data
    freshness:
      warn_after:
        count: 2
        period: hour
      error_after:
        count: 4
        period: hour
Compare the last update time with the warn and error thresholds.
The data was last updated 3 hours ago, which is more than the warn_after threshold of 2 hours but less than the error_after threshold of 4 hours, so the status is warn.
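For context, dbt can only compute the age of a source if it knows which column records the load time. The sketch below shows one way the same thresholds might sit in a complete source definition. The column _loaded_at and the table transactions are assumptions made for illustration, and the exact placement of freshness settings can differ between dbt versions.

```yaml
version: 2

sources:
  - name: sales_data
    loaded_at_field: _loaded_at    # hypothetical timestamp column
    freshness:
      # age <= 2h       -> pass
      # 2h < age <= 4h  -> warn   (a 3-hour-old load lands here)
      # age > 4h        -> error
      warn_after: {count: 2, period: hour}
      error_after: {count: 4, period: hour}
    tables:
      - name: transactions         # hypothetical table; inherits the settings above
```

The check itself is run with the dbt source freshness command, which compares the latest value of the loaded_at_field with the current time.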
Consider a source defined with a raw data contract expecting columns id (integer) and date (date). If the actual source data has an extra column status (string), what will be the result of the schema validation in dbt?
Think about whether dbt schema tests reject extra columns by default.
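dbt source tests evaluate only the columns they are attached to, so the declared columns id and date are checked while the undeclared status column is ignored. The validation therefore passes: an extra column causes a failure only if a test explicitly checks for it.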
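A minimal sketch of such a source definition follows. The source name raw_data and the table name records are hypothetical, and the choice of not_null as the attached test is an assumption. The data_type entries here act as documentation of the expected types; verifying types on a source would need an explicit test.

```yaml
version: 2

sources:
  - name: raw_data               # hypothetical source name
    tables:
      - name: records            # hypothetical table name
        columns:
          - name: id
            data_type: integer   # documents the expected type
            tests:
              - not_null
          - name: date
            data_type: date
            tests:
              - not_null
        # The status column exists in the warehouse table but is not
        # declared here, so no test touches it and nothing fails.
```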
A dbt source test for a raw data contract fails with the error: Column 'user_id' contains null values. The contract expects user_id to be non-nullable. Which option correctly fixes the issue?
Consider data quality versus contract expectations.
The contract expects user_id to never be null. The correct fix is to clean the source data to meet this expectation, not to weaken the contract or ignore the problem.
You are adding a new source table orders to your dbt project. The raw data contract must ensure the following:
- order_id is unique and non-null
- order_date is non-null and recent (within last 30 days)
- customer_id is non-null
Which dbt source test configurations correctly enforce these rules?
Think about which tests enforce uniqueness, non-null, and freshness.
The unique and not_null tests enforce integrity on order_id, and not_null covers order_date and customer_id. A source freshness check with a 30-day threshold verifies that the data is recent. Only option A covers all of these requirements.
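One configuration that would satisfy all three rules is sketched below. It is an illustration under stated assumptions rather than a reproduction of any answer option: the source name shop is hypothetical, and using order_date as the loaded_at_field assumes the column can be compared as a timestamp in the warehouse.

```yaml
version: 2

sources:
  - name: shop                     # hypothetical source name
    tables:
      - name: orders
        loaded_at_field: order_date   # assumes a timestamp-compatible column
        freshness:
          error_after: {count: 30, period: day}   # newest order must be within 30 days
        columns:
          - name: order_id
            tests:
              - unique
              - not_null
          - name: order_date
            tests:
              - not_null
          - name: customer_id
            tests:
              - not_null
```

Note that the freshness check looks at the most recent order_date in the table, not at every row. The column tests run with dbt test, while the freshness threshold is evaluated separately by dbt source freshness.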