Schema Test vs Data Test in dbt: Key Differences and Usage
Schema tests check the structure and constraints of your data, such as uniqueness or the absence of null values, while data tests validate the actual data content using custom SQL queries. Schema tests are simpler and reusable, whereas data tests offer more flexibility for complex validations.
Quick Comparison
This table summarizes the main differences between schema tests and data tests in dbt.
| Aspect | Schema Test | Data Test |
|---|---|---|
| Purpose | Validate table structure and constraints | Validate data content with custom logic |
| Implementation | YAML configuration with built-in tests | Custom SQL query files |
| Complexity | Simple and reusable | Flexible and complex |
| Examples | Unique, Not Null, Relationships | Business rules, data ranges, custom conditions |
| Failure Output | Fails if constraints violated | Fails if SQL query returns rows |
| Reusability | High, one test definition can be applied to many models and columns | Low, specific to test case |
Key Differences
Schema tests in dbt (called generic tests in current dbt documentation) are reusable tests that check the shape and integrity of your data model. They are declared in YAML files alongside your models, and the built-in ones are unique, not_null, accepted_values, and relationships. They run whenever you invoke dbt test or dbt build, and they fail if the data violates the declared constraint.
Data tests (called singular tests in current dbt documentation), on the other hand, are custom SQL queries that you write to validate specific conditions that schema tests cannot cover. Each query selects the rows that violate the rule, so the test fails whenever the query returns any rows. This lets you express complex business logic or data quality rules. Data tests take more effort to write but provide greater flexibility.
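The "fails if the query returns rows" convention can be tried outside dbt. The following is a minimal sketch, not dbt itself: it uses Python's built-in sqlite3 module as a stand-in warehouse, and the orders table, its columns, and the "amount must not be negative" rule are all hypothetical.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (order_id integer, amount real)")
conn.executemany(
    "insert into orders values (?, ?)",
    [(1, 25.0), (2, -5.0), (3, 40.0)],
)

# The test query selects rows that break the rule "amount must not be negative".
# In a dbt project this SELECT would sit in its own .sql file and would
# reference the model with {{ ref('orders') }} instead of a literal table name.
FAILING_ROWS_SQL = "select order_id, amount from orders where amount < 0"

failing_rows = conn.execute(FAILING_ROWS_SQL).fetchall()

# Zero rows means the test passes; any returned row counts as a failure.
test_passed = len(failing_rows) == 0
print("passed" if test_passed else f"failed with {len(failing_rows)} row(s)")
```

Writing the query so that it returns the offending rows, rather than a true or false value, also means a failed test shows you exactly which records to investigate.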
In summary, schema tests are best for standard, structural validations, while data tests handle detailed content checks that need custom SQL logic.
Code Comparison
Here is an example of a schema test checking that the email column in a users model is unique and not null.
version: 2

models:
  - name: users
    columns:
      - name: email
        tests:
          - unique
          - not_null
Data Test Equivalent
The equivalent data test to check for duplicate or null emails uses a SQL query that returns rows violating the rule.
select email
from {{ ref('users') }}
where email is null
   or email in (
       select email
       from {{ ref('users') }}
       group by email
       having count(*) > 1
   )
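To see what this query reports, here is a small sketch that runs the same logic against made-up data using Python's sqlite3 module. The sample rows and the id column are assumptions added for readability, and the table is referenced by its plain name because `{{ ref(...) }}` is only resolved by dbt.

```python
import sqlite3

# Made-up users: one duplicated email and one missing email.
conn = sqlite3.connect(":memory:")
conn.execute("create table users (id integer, email text)")
conn.executemany(
    "insert into users values (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),  # duplicate of id 1
        (4, None),             # null email
    ],
)

# Same rule as the data test: return every row whose email is null
# or appears more than once. The id column is selected only for readability.
VIOLATIONS_SQL = """
select id, email
from users
where email is null
   or email in (
       select email from users group by email having count(*) > 1
   )
order by id
"""

violations = conn.execute(VIOLATIONS_SQL).fetchall()
for row in violations:
    print(row)
```

Both copies of the duplicated email and the row with the missing email come back, while the one valid row does not, so a data test built on this query would fail on this data.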
When to Use Which
Choose schema tests when you want quick, standard checks on your data model like uniqueness or null constraints. They are easy to write and maintain, perfect for common data quality rules.
Choose data tests when you need to validate complex business logic or data conditions that schema tests cannot express. Use them for custom rules, cross-table checks, or detailed content validation.
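As an illustration of a cross-table rule that the built-in schema tests do not express, the sketch below checks that the payments recorded for each order add up to the order amount. It is a hedged example only: the orders and payments tables, their columns, and the rule itself are hypothetical, and sqlite3 stands in for the warehouse. In dbt, the SELECT would be saved as a data test and would use `{{ ref(...) }}` for both models.

```python
import sqlite3

# Hypothetical tables; amounts are stored in integer cents to avoid float rounding.
conn = sqlite3.connect(":memory:")
conn.executescript("""
create table orders (order_id integer, amount_cents integer);
create table payments (order_id integer, amount_cents integer);
insert into orders values (1, 1000), (2, 2500), (3, 400);
insert into payments values (1, 1000), (2, 1000), (2, 1000);
""")

# Return one row per order whose total payments differ from the order amount.
# Orders with no payments at all are treated as having paid 0.
MISMATCH_SQL = """
select
    o.order_id,
    o.amount_cents as order_cents,
    coalesce(sum(p.amount_cents), 0) as paid_cents
from orders o
left join payments p on p.order_id = o.order_id
group by o.order_id, o.amount_cents
having o.amount_cents != coalesce(sum(p.amount_cents), 0)
order by o.order_id
"""

mismatches = conn.execute(MISMATCH_SQL).fetchall()
for order_id, order_cents, paid_cents in mismatches:
    print(f"order {order_id}: expected {order_cents}, paid {paid_cents}")
```

A rule like this spans two models and needs aggregation, which is why it fits a custom SQL test better than a YAML declaration on a single column.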