dbtdata~15 mins

Built-in tests (unique, not_null, accepted_values, relationships) in dbt - Deep Dive

Choose your learning style10 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Overview - Built-in tests (unique, not_null, accepted_values, relationships)

What is it?

Built-in tests in dbt are pre-made checks that help you find problems in your data automatically. They check if values are unique, not missing, within allowed options, or if relationships between tables are correct. These tests run every time you build your data models to keep data trustworthy. They save time by catching errors early without writing complex code.

Why it matters

Without these tests, bad data can silently cause wrong decisions, wasted effort, and lost trust. Built-in tests make it easy to spot data issues before they reach reports or dashboards. This means teams can rely on data confidently and fix problems quickly, improving overall data quality and business outcomes.

Where it fits

Before using built-in tests, you should understand basic SQL and how dbt models work. After mastering these tests, you can learn custom tests and advanced data quality frameworks. This topic fits in the data validation and quality assurance part of the data engineering and analytics workflow.

Mental Model

Core Idea

Built-in tests are automatic data quality checks that verify key rules about your data to keep it accurate and reliable.

Think of it like...

Imagine a factory quality inspector who checks every product for defects like missing parts, duplicates, or wrong sizes before shipping. Built-in tests do the same for your data, catching problems early.

┌───────────────┐      ┌───────────────┐      ┌───────────────┐
│   Data Table  │─────▶│ Built-in Test │─────▶│ Test Result   │
│ (your data)   │      │ (unique,      │      │ (pass/fail)   │
│               │      │  not_null,    │      │               │
│               │      │  accepted_    │      │               │
│               │      │  values,      │      │               │
│               │      │  relationships)│      │               │
└───────────────┘      └───────────────┘      └───────────────┘

Build-Up - 6 Steps

FoundationUnderstanding Unique Test Basics

Concept: The unique test checks if all values in a column are different, with no repeats.

In dbt, the unique test ensures that a column meant to identify records uniquely, like an ID, has no duplicates. You add it by specifying the column in your model's schema file under tests: unique. When run, dbt queries the data to find any repeated values.

Result

If duplicates exist, the test fails and shows which values repeat. If all values are unique, the test passes.

Understanding uniqueness is key because duplicate IDs or keys can cause wrong joins and data confusion downstream.

FoundationGrasping Not Null Test Purpose

IntermediateUsing Accepted Values Test

IntermediateExploring Relationships Test

AdvancedCombining Multiple Built-in Tests

ExpertOptimizing Built-in Tests for Large Data

Under the Hood

Built-in tests in dbt are SQL queries generated from simple YAML configurations. When you run dbt test, it compiles these configurations into SQL that checks conditions like duplicates, nulls, or foreign key matches. The database executes these queries and returns results indicating pass or fail. dbt then summarizes these results for you.

Why designed this way?

dbt uses SQL because it works directly with your data warehouse, leveraging its power and avoiding data movement. The YAML config makes tests easy to write and maintain without complex coding. This design balances simplicity for users and efficiency in execution.

┌───────────────┐
│ YAML Test     │
│ Configuration │
└──────┬────────┘
       │ compiles to
┌──────▼────────┐
│ SQL Query     │
│ (test logic)  │
└──────┬────────┘
       │ runs on
┌──────▼────────┐
│ Data Warehouse│
│ (your data)   │
└──────┬────────┘
       │ returns
┌──────▼────────┐
│ Test Results  │
│ (pass/fail)   │
└───────────────┘

Myth Busters - 3 Common Misconceptions

Quick: Do you think the unique test checks for null values as duplicates? Commit yes or no.

Common Belief:The unique test treats null values as duplicates and fails if multiple nulls exist.

Tap to reveal reality

Quick: Do you think accepted_values test can check numeric ranges? Commit yes or no.

Common Belief:Accepted_values test can validate numeric ranges like 'between 1 and 10'.

Tap to reveal reality

Quick: Do you think relationships test ensures referential integrity like database foreign keys? Commit yes or no.

Common Belief:Relationships test enforces the same strict referential integrity as database foreign keys.

Tap to reveal reality

Expert Zone

Built-in tests generate SQL that can behave differently depending on your database's null handling and indexing, affecting test results and performance.

Tests run in the context of your dbt model's compiled SQL, so changes in model logic can affect test outcomes unexpectedly.

Combining tests with custom SQL or macros allows flexible validation beyond built-in capabilities, but requires careful design to maintain clarity.

When NOT to use

Built-in tests are limited to simple checks and may not suit complex validation logic like pattern matching or cross-dataset consistency. In such cases, use custom tests or external data quality tools like Great Expectations or Deequ.

Production Patterns

Teams integrate built-in tests into CI/CD pipelines to automatically validate data on every change. They combine tests with alerting systems to notify data engineers immediately on failures, enabling fast issue resolution and maintaining trust.

Connections

Data Validation

Built-in tests are a practical implementation of data validation principles.

Understanding general data validation helps grasp why built-in tests focus on uniqueness, completeness, and correctness.

Database Constraints

Built-in tests mimic some database constraints but run as queries rather than enforced rules.

Knowing database constraints clarifies the limits of built-in tests and when to rely on database-level enforcement.

Quality Control in Manufacturing

Built-in tests serve a similar role as quality control checks in factories, ensuring products meet standards before shipping.

Seeing data tests as quality control highlights their role in preventing defects and maintaining trust.

Common Pitfalls

#1Applying unique test on columns that allow nulls without understanding null behavior.

Wrong approach:tests: - unique: column_name: user_id where: user_id IS NOT NULL

Correct approach:tests: - unique: column_name: user_id

Root cause:Misunderstanding that unique test treats nulls as distinct, so filtering nulls is unnecessary and may hide issues.

#2Using accepted_values test to check numeric ranges instead of exact values.

Wrong approach:tests: - accepted_values: column_name: age values: [18, 19, 20, 21, 22, 23, 24, 25]

Correct approach:Use a custom test with SQL condition: age BETWEEN 18 AND 25

Root cause:Confusing accepted_values as a range checker rather than a list membership test.

#3Assuming relationships test enforces referential integrity continuously.

Wrong approach:Relying only on dbt relationships test without database foreign keys.

Correct approach:Add foreign key constraints in the database alongside dbt relationships tests.

Root cause:Not realizing that dbt tests run on demand and do not prevent bad data insertion.

Key Takeaways

Built-in tests in dbt are simple, reusable checks that help keep your data accurate and trustworthy.

They cover common data quality rules like uniqueness, completeness, allowed values, and relationships between tables.

These tests run as SQL queries generated from easy YAML configurations, making them accessible without deep coding.

Understanding their behavior and limits helps you apply them effectively and avoid common mistakes.

Combining built-in tests with database constraints and custom tests creates a strong data quality framework in production.

Practice

(1/5)

1. What does the built-in unique test in dbt check for in a column?

easy

A. It checks that the column has no missing (null) values.

B. It checks that the column values exist in another table's column.

C. It checks that the column values match a predefined list of accepted values.

D. It checks that all values in the column are different with no duplicates.

Built-in tests (unique, not_null, accepted_values, relationships) in dbt - Deep Dive

Start learning this pattern below

Practice

Solution

Step 1: Understand the purpose of the unique test

Step 2: Compare with other test types

Final Answer:

Quick Check:

Solution

Step 1: Recall YAML structure for dbt tests

Step 2: Identify correct indentation and keys

Final Answer:

Quick Check:

Solution

Step 1: Understand accepted_values test behavior

Step 2: Check if 'deleted' is in the list

Final Answer:

Quick Check:

Solution

Step 1: Understand relationships test syntax

Step 2: Identify the error cause

Final Answer:

Quick Check:

Solution

Step 1: Identify tests for uniqueness and non-null

Step 2: Ensure foreign key relationship

Step 3: Verify other options

Final Answer:

Quick Check: