What if a simple test could save you hours of painful data cleanup?
Why Use Built-in Tests (unique, not_null, accepted_values, relationships) in dbt? - Purpose & Use Cases
Imagine you have a huge spreadsheet with thousands of rows of sales data. You want to make sure every order ID is unique, no customer ID is missing, and all product categories are valid. Doing this by scanning rows one by one or writing complex manual checks is exhausting and easy to mess up.
Manually checking data quality is slow and error-prone. You might miss duplicates or forget to check some columns. It's hard to keep track of all rules, and fixing errors after analysis wastes time and causes wrong decisions.
Built-in tests in dbt let you declare these checks in a few lines of YAML and run them all with the dbt test command. They verify uniqueness, flag missing values, enforce allowed values, and check relationships between tables. This saves time, reduces mistakes, and keeps your data trustworthy.
Manual check (SQL):
SELECT order_id, COUNT(*) FROM sales GROUP BY order_id HAVING COUNT(*) > 1;
Built-in test (dbt YAML):
tests:
  - unique:
      column_name: order_id
With built-in tests, you can confidently trust your data and focus on insights instead of hunting errors.
A retail company uses dbt built-in tests to ensure every transaction has a unique ID, no customer info is missing, and product categories match the approved list before running sales reports.
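The retail scenario above could be declared roughly as follows. This is a minimal sketch, not the company's actual configuration: the file path, the model name sales, the column names, and the category values are all assumptions chosen for illustration.

```yaml
# models/schema.yml (hypothetical path and model name)
version: 2

models:
  - name: sales
    columns:
      - name: order_id          # every transaction has a unique ID
        tests:
          - unique
          - not_null
      - name: customer_id       # no customer info is missing
        tests:
          - not_null
      - name: product_category  # categories match the approved list
        tests:
          - accepted_values:
              values: ['electronics', 'clothing', 'grocery']  # illustrative values only
```

Running dbt test executes every test declared this way and reports each one as a pass or a failure. Recent dbt releases also accept data_tests: as the key name in place of tests:, so check which form your version expects.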
Manual data checks are slow and risky.
Built-in tests automate common data validations.
This leads to faster, more reliable data analysis.
Practice
What does the unique test in dbt check for in a column?
Solution
Step 1: Understand the purpose of the unique test
The unique test ensures that each value in the specified column appears only once, meaning no duplicates.
Step 2: Compare with other test types
The other tests do different jobs: not_null checks for missing values, accepted_values checks for allowed values, and relationships checks for foreign key matches.
Final Answer:
It checks that all values in the column are different with no duplicates. -> Option D
Quick Check:
unique test = no duplicates [OK]
Common Mistakes:
- Confusing unique with the not_null test
- Thinking unique checks accepted values
- Mixing up unique with the relationships test
Which option correctly adds a not_null test on the column user_id in a dbt model YAML file?
Solution
Step 1: Recall YAML structure for dbt tests
Tests are added under the columns list. Each column has a name and a tests list containing test names.
Step 2: Identify correct indentation and keys
The correct option is:
columns:
  - name: user_id
    tests:
      - not_null
It uses 'name' for the column and 'tests' as a list containing '- not_null'. The other options have wrong keys or structure.
Final Answer:
The columns block shown in Step 2. -> Option A
Quick Check:
YAML tests under columns with name and tests list [OK]
Common Mistakes:
- Using 'test' instead of 'tests'
- Incorrect indentation breaking YAML
- Placing tests outside the columns section
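To see where the columns block from this question sits in a complete properties file, here is a small sketch. The file path and the model name users are assumptions; only the columns, name, and tests keys come from the question itself.

```yaml
# models/schema.yml (hypothetical path)
version: 2

models:
  - name: users            # assumed model name
    columns:
      - name: user_id
        tests:             # plural key; each test is a list item
          - not_null
```

The indentation matters: tests must be nested under the column entry it applies to, at the same level as name.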
Given this test configuration:
columns:
  - name: status
    tests:
      - accepted_values:
          values: ['active', 'inactive', 'pending']
What happens if the status column contains the value 'deleted' when you run dbt test?
Solution
Step 1: Understand accepted_values test behavior
The accepted_values test checks that every value in the column is in the specified list.
Step 2: Check if 'deleted' is in the list
'deleted' is not in ['active', 'inactive', 'pending'], so the test will fail.
Final Answer:
The test fails because 'deleted' is not in the accepted values list. -> Option B
Quick Check:
accepted_values rejects values outside list [OK]
Common Mistakes:
- Assuming the test passes if the value is a string
- Confusing accepted_values with not_null
- Thinking the test skips unknown values
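If 'deleted' turns out to be a legitimate status and not bad data, one way to make the test pass is to extend the list. Whether that is the right fix depends on your own data rules, so treat this as a sketch of the option and not a recommendation.

```yaml
columns:
  - name: status
    tests:
      - accepted_values:
          # 'deleted' added on the assumption that it is a valid status
          values: ['active', 'inactive', 'pending', 'deleted']
```

If 'deleted' is not a valid status, leave the list as it is and fix the offending rows upstream instead.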
You wrote this relationships test:
columns:
  - name: order_id
    tests:
      - relationships:
          to: ref('orders')
But running dbt test gives an error. What is the most likely cause?
Solution
Step 1: Understand relationships test syntax
The relationships test requires both 'to' (target table) and 'field' (target column).
Step 2: Identify the error cause
The YAML is missing the 'field' key, causing a configuration error when running dbt test.
Final Answer:
The 'field' key is missing in the relationships test. -> Option A
Quick Check:
relationships 'to' + 'field' required [OK]
Common Mistakes:
- Passing a bare table name to 'to' instead of ref('orders')
- Omitting the 'field' key
- Assuming 'field' must match the tested column's name
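A corrected version of the test from this question might look like the following. The target column name id is an assumption, since the question does not say which column in orders holds the key; substitute the real one.

```yaml
columns:
  - name: order_id
    tests:
      - relationships:
          to: ref('orders')   # the model that holds the referenced key
          field: id           # assumed column name in orders
```

With both keys present, the test fails only when an order_id value has no matching row in the target column.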
You want to ensure that the customer_id column in your orders model is unique, not null, and only contains values that exist in the customers table's id column. Which combination of built-in tests should you add in your YAML?
Solution
Step 1: Identify tests for uniqueness and non-null
Use 'unique' to ensure no duplicates and 'not_null' to prevent missing values.
Step 2: Ensure foreign key relationship
Use the 'relationships' test with 'to' pointing at the customers model, written ref('customers'), and 'field' set to 'id' so that every customer_id must exist there.
Step 3: Verify other options
The other options misuse accepted_values or mix concepts incorrectly.
Final Answer:
- unique
- not_null
- relationships:
    to: ref('customers')
    field: id
-> Option C
Quick Check:
unique + not_null + relationships = correct tests [OK]
Common Mistakes:
- Using accepted_values to check for nulls or uniqueness
- Misconfiguring the relationships test
- Missing one of the required tests
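Put together inside a properties file, the three tests might look like this. The file path is an assumption, and 'to' is written as ref('customers'), which is the form dbt's documentation uses to point a relationships test at another model.

```yaml
# models/schema.yml (hypothetical path)
version: 2

models:
  - name: orders
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
          - relationships:
              to: ref('customers')
              field: id
```

Each entry is reported as its own test result, so a failure tells you which of the three rules the data broke.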
