
Built-in tests (unique, not_null, accepted_values, relationships) in dbt - Mini Project: Build & Apply

Using dbt Built-in Tests for Data Quality
📖 Scenario: You work as a data analyst for an online store. You want to make sure your sales data is clean and reliable before using it for reports. You will use dbt built-in tests to check your data quality.
🎯 Goal: Learn how to apply dbt built-in tests: unique, not_null, accepted_values, and relationships to a sales data model.
📋 What You'll Learn
Create a dbt model with sales data
Add a unique test on order_id
Add a not_null test on customer_id
Add an accepted_values test on order_status
Add a relationships test between customer_id in the sales model and the id column in the customers model
💡 Why This Matters
🌍 Real World
Data analysts and engineers use dbt tests to catch data errors early and maintain trust in reports.
💼 Career
Knowing dbt built-in tests is essential for roles involving data modeling, data quality, and analytics engineering.
1
Create sales data model
Create a dbt model file called sales.sql with a SELECT statement that returns these columns: order_id, customer_id, order_status. Use the exact column names.
Hint

Use select order_id, customer_id, order_status from raw.sales_data.

2
Add unique and not_null tests
In your schema.yml file, add a unique test on order_id and a not_null test on customer_id for the sales model.
Hint

Use tests: - unique under order_id and tests: - not_null under customer_id.
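The hint above flattens the YAML onto one line. Expanded, it might look like the sketch below. The file location (for example models/schema.yml) and the `version: 2` header are assumptions based on common dbt project layouts, not something this exercise specifies. Recent dbt versions also accept `data_tests:` in place of `tests:`; the sketch keeps the key used in this exercise.

```yaml
# schema.yml (assumed to sit next to sales.sql in the models directory)
version: 2

models:
  - name: sales            # must match the model file name sales.sql
    columns:
      - name: order_id
        tests:
          - unique         # fails if any order_id appears more than once
      - name: customer_id
        tests:
          - not_null       # fails if any customer_id is null
```

Running `dbt test` after saving the file should execute both tests against the built sales model.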

3
Add accepted_values test
In the schema.yml file, add an accepted_values test on order_status with accepted values ['pending', 'shipped', 'delivered', 'cancelled'] for the sales model.
Hint

Use accepted_values: values: ['pending', 'shipped', 'delivered', 'cancelled'] under order_status.
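Written out with indentation, the accepted_values entry from the hint could look like this sketch. Only the order_status column is shown; in a real file it would sit alongside the other columns of the sales model, and the `version: 2` header is again an assumption.

```yaml
version: 2

models:
  - name: sales
    columns:
      - name: order_status
        tests:
          - accepted_values:
              # any value outside this list makes the test fail
              values: ['pending', 'shipped', 'delivered', 'cancelled']
```

Note that `values` is nested one level deeper than `accepted_values`, because it is an argument to the test rather than a separate test.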

4
Add relationships test
In the schema.yml file, add a relationships test on customer_id in the sales model. It should reference the id column in the customers model.
Hint

Use relationships: to: ref('customers') field: id under customer_id.
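As a sketch of the hint above, the relationships test can be listed under customer_id next to the not_null test from step 2. It assumes a customers model with an id column already exists in the project, as the step describes.

```yaml
version: 2

models:
  - name: sales
    columns:
      - name: customer_id
        tests:
          - not_null
          - relationships:
              to: ref('customers')   # the parent model to look values up in
              field: id              # the column in customers to match against
```

The test fails for every customer_id in sales that has no matching id in customers.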

Practice

1. What does the built-in unique test in dbt check for in a column?
easy
A. It checks that the column has no missing (null) values.
B. It checks that the column values exist in another table's column.
C. It checks that the column values match a predefined list of accepted values.
D. It checks that all values in the column are different with no duplicates.

Solution

  1. Step 1: Understand the purpose of the unique test

    The unique test ensures that each value in the specified column appears only once, meaning no duplicates.
  2. Step 2: Compare with other test types

    The other tests do different jobs: not_null checks for missing values, accepted_values checks values against an allowed list, and relationships checks that values match a column in another model.
  3. Final Answer:

    It checks that all values in the column are different with no duplicates. -> Option D
  4. Quick Check:

    unique test = no duplicates [OK]
Hint: Unique means no duplicates allowed in the column [OK]
Common Mistakes:
  • Confusing unique with not_null test
  • Thinking unique checks accepted values
  • Mixing unique with relationships test
2. Which of the following is the correct syntax to add a not_null test on the column user_id in a dbt model YAML file?
easy
A. columns: - name: user_id tests: - not_null
B. columns: - user_id: tests: - not_null
C. tests: - not_null: user_id
D. columns: - name: user_id test: not_null

Solution

  1. Step 1: Recall YAML structure for dbt tests

    Tests are added under the columns list. Each column entry has a name key and a tests key, and tests holds a list of test names.
  2. Step 2: Identify correct indentation and keys

    columns: - name: user_id tests: - not_null correctly uses 'name' for the column and 'tests' as a list with '- not_null'. Other options have wrong keys or structure.
  3. Final Answer:

    columns: - name: user_id tests: - not_null -> Option A
  4. Quick Check:

    YAML tests under columns with name and tests list [OK]
Hint: Use 'name' and 'tests' keys with proper indentation [OK]
Common Mistakes:
  • Using 'test' instead of 'tests'
  • Incorrect indentation breaking YAML
  • Placing tests outside columns section
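The answer options above are flattened onto one line, which hides the indentation that YAML depends on. A minimal sketch of option A with its structure restored:

```yaml
columns:
  - name: user_id      # 'name' identifies the column
    tests:             # 'tests' (plural) holds a list
      - not_null       # each test is a list item
```

This fragment would sit under a model entry in the YAML file. The list dash before `name` and the deeper indent of `- not_null` are what make it parse as intended.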
3. Given this YAML snippet in a dbt model:
columns:
  - name: status
    tests:
      - accepted_values:
          values: ['active', 'inactive', 'pending']
What happens if the status column contains the value 'deleted' when you run dbt test?
medium
A. The test passes because 'deleted' is a valid string.
B. The test fails because 'deleted' is not in the accepted values list.
C. The test is skipped because accepted_values only checks for nulls.
D. The test throws a syntax error due to incorrect YAML.

Solution

  1. Step 1: Understand accepted_values test behavior

    The accepted_values test checks if all column values are within the specified list.
  2. Step 2: Check if 'deleted' is in the list

    'deleted' is not in ['active', 'inactive', 'pending'], so the test will fail.
  3. Final Answer:

    The test fails because 'deleted' is not in the accepted values list. -> Option B
  4. Quick Check:

    accepted_values rejects values outside list [OK]
Hint: Accepted_values fails if any value is outside the list [OK]
Common Mistakes:
  • Assuming test passes if value is a string
  • Confusing accepted_values with not_null
  • Thinking test skips unknown values
4. You wrote this test in your dbt model YAML:
columns:
  - name: order_id
    tests:
      - relationships:
          to: ref('orders')
But running dbt test gives an error. What is the most likely cause?
medium
A. The 'field' key is missing in the relationships test.
B. The 'to' value should be a string, not a ref function.
C. The relationships test requires the 'field' to be the same as the column name.
D. The 'to' value must be a table name string, not a ref function.

Solution

  1. Step 1: Understand relationships test syntax

    The relationships test requires both 'to' (target table) and 'field' (target column).
  2. Step 2: Identify the error cause

    The YAML is missing the 'field' key, causing a configuration error when running dbt test.
  3. Final Answer:

    The 'field' key is missing in the relationships test. -> Option A
  4. Quick Check:

    relationships 'to' + 'field' required [OK]
Hint: relationships test requires 'to' and 'field' keys [OK]
Common Mistakes:
  • Believing ref() cannot be used in the 'to' value (ref('model_name') is the expected form)
  • Omitting the 'field' key
  • Assuming 'field' must match column name
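A sketch of the repaired test is shown below. The question does not say which column in orders should be matched, so `order_id` as the target field is an assumption made only for illustration.

```yaml
columns:
  - name: order_id
    tests:
      - relationships:
          to: ref('orders')
          field: order_id   # assumed target column; use the actual key column of orders
```

The `field` value names a column in the target model, so it can differ from the name of the column being tested.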
5. You want to ensure that the customer_id column in your orders model is unique, not null, and only contains values that exist in the customers table's id column. Which combination of built-in tests should you add in your YAML?
hard
A. - not_null - accepted_values: values: [unique] - relationships: to: ref('customers') field: id
B. - unique - accepted_values: values: [not null] - relationships: to: ref('customers') field: id
C. - unique - not_null - relationships: to: ref('customers') field: id
D. - unique - not_null - accepted_values: values: [customer_id]

Solution

  1. Step 1: Identify tests for uniqueness and non-null

    Use 'unique' to ensure no duplicates and 'not_null' to prevent missing values.
  2. Step 2: Ensure foreign key relationship

    Use the 'relationships' test with 'to' set to ref('customers') and 'field' set to 'id' to check that every customer_id exists in the customers model.
  3. Step 3: Verify other options

    Options A, B, and D misuse accepted_values: it compares column values against a literal list, so it cannot check uniqueness, nulls, or existence in another table.
  4. Final Answer:

    - unique - not_null - relationships: to: ref('customers') field: id -> Option C
  5. Quick Check:

    unique + not_null + relationships = correct tests [OK]
Hint: Combine unique, not_null, and relationships for full check [OK]
Common Mistakes:
  • Using accepted_values to check null or uniqueness
  • Misconfiguring relationships test
  • Missing one of the required tests
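The three tests from the correct answer can be sketched in full YAML as follows. The target is written as ref('customers'), the same form the step 4 hint of the mini project uses; the `version: 2` header and the model name orders follow the question and common dbt conventions rather than anything stated as required.

```yaml
version: 2

models:
  - name: orders
    columns:
      - name: customer_id
        tests:
          - unique                   # no duplicate customer_id values
          - not_null                 # no missing customer_id values
          - relationships:
              to: ref('customers')   # parent model
              field: id              # column in customers that must contain each customer_id
```

Each list item is an independent test, so `dbt test` reports a separate pass or fail for each of the three.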