0
0
dbtdata~30 mins

Built-in tests (unique, not_null, accepted_values, relationships) in dbt - Mini Project: Build & Apply

Choose your learning style9 modes available
Using dbt Built-in Tests for Data Quality
📖 Scenario: You work as a data analyst for an online store. You want to make sure your sales data is clean and reliable before using it for reports. You will use dbt built-in tests to check your data quality.
🎯 Goal: Learn how to apply dbt built-in tests: unique, not_null, accepted_values, and relationships to a sales data model.
📋 What You'll Learn
Create a dbt model with sales data
Add a unique test on order_id
Add a not_null test on customer_id
Add an accepted_values test on order_status
Add a relationships test between customer_id in sales and customers table
💡 Why This Matters
🌍 Real World
Data analysts and engineers use dbt tests to catch data errors early and maintain trust in reports.
💼 Career
Knowing dbt built-in tests is essential for roles involving data modeling, data quality, and analytics engineering.
Progress0 / 4 steps
1
Create sales data model
Create a dbt model file called sales.sql with a SELECT statement that returns these columns: order_id, customer_id, order_status. Use the exact column names.
dbt
Need a hint?

Use select order_id, customer_id, order_status from raw.sales_data.

2
Add unique and not_null tests
In your schema.yml file, add a unique test on order_id and a not_null test on customer_id for the sales model.
dbt
Need a hint?

Use tests: - unique under order_id and tests: - not_null under customer_id.

3
Add accepted_values test
In the schema.yml file, add an accepted_values test on order_status with accepted values ['pending', 'shipped', 'delivered', 'cancelled'] for the sales model.
dbt
Need a hint?

Use accepted_values: values: ['pending', 'shipped', 'delivered', 'cancelled'] under order_status.

4
Add relationships test
In the schema.yml file, add a relationships test on customer_id in the sales model. It should reference the id column in the customers model.
dbt
Need a hint?

Use relationships: to: ref('customers') field: id under customer_id.