
Schema Test vs Data Test in dbt: Key Differences and Usage

In dbt, schema tests check constraints on your models, such as uniqueness or the absence of null values, while data tests validate the data content using custom SQL queries. Schema tests are simpler and reusable, whereas data tests offer more flexibility for complex validations. Recent dbt documentation calls these generic tests and singular tests, respectively.

Quick Comparison

This table summarizes the main differences between schema tests and data tests in dbt.

Aspect | Schema Test | Data Test
Purpose | Validate table structure and constraints | Validate data content with custom logic
Implementation | YAML configuration with built-in tests | Custom SQL query files
Complexity | Simple and reusable | Flexible and complex
Examples | Unique, Not Null, Relationships | Business rules, data ranges, custom conditions
Failure Output | Fails if constraints violated | Fails if SQL query returns rows
Reusability | High, defined once per model | Low, specific to test case

Key Differences

Schema tests in dbt are predefined tests that check the integrity of your data model. They are declared in YAML files alongside your models; the built-in ones are unique, not_null, accepted_values, and relationships. They run whenever you invoke dbt test and fail if the data violates these constraints.

On the other hand, data tests are custom SQL queries that you write to validate conditions that schema tests cannot cover. Each query selects the rows that violate the rule, so the test fails when it returns any rows and passes when it returns none. This lets you encode complex business logic or data quality rules. Data tests require more effort but provide greater flexibility.
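
As a minimal sketch of the "return the failing rows" pattern, the data test below checks a simple data-range rule. The orders model, its order_id and amount columns, the file name, and the rule itself are all hypothetical and chosen only for illustration.

```sql
-- tests/assert_order_amount_not_negative.sql (hypothetical file name)
-- Business rule (assumed for illustration): an order amount is never negative.
-- The query selects the violating rows, so any returned row fails the test.
select
    order_id,
    amount
from {{ ref('orders') }}  -- `orders` is a hypothetical model
where amount < 0
```

Writing the query so that it selects the bad rows, rather than counting the good ones, also means the failing records are right there to inspect when the test breaks.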

In summary, schema tests are best for standard, structural validations, while data tests handle detailed content checks that need custom SQL logic.


Code Comparison

Here is an example of a schema test checking that the email column in a users model is unique and not null.

yaml
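version: 2
models:
  - name: users
    columns:
      - name: email
        # Built-in tests; dbt 1.8+ also accepts `data_tests:` as the key name.
        tests:
          - unique    # fails if any email appears more than once
          - not_null  # fails if any email is null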
Output
Runs tests that fail if any email is duplicated or null in the users table.

Data Test Equivalent

The equivalent data test to check for duplicate or null emails uses a SQL query that returns rows violating the rule.

sql
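-- Returns every row whose email is null or duplicated.
-- ref() lets dbt resolve the users model and track the dependency.
select email
from {{ ref('users') }}
where email is null
   or email in (
       select email
       from {{ ref('users') }}
       group by email
       having count(*) > 1
   )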
Output
Returns rows with null or duplicate emails; test fails if any rows are returned.

When to Use Which

Choose schema tests when you want quick, standard checks on your data model like uniqueness or null constraints. They are easy to write and maintain, perfect for common data quality rules.

Choose data tests when you need to validate complex business logic or data conditions that schema tests cannot express. Use them for custom rules, cross-table checks, or detailed content validation.
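
A cross-table check is one case where a data test is a natural fit. The sketch below assumes two hypothetical models, orders (with order_id and total_amount) and payments (with order_id and amount), and an assumed rule that each order's total equals the sum of its payments. None of these names come from a real project.

```sql
-- tests/assert_order_total_matches_payments.sql (hypothetical file name)
-- Assumed rule: an order's total_amount equals the sum of its payments.
with payment_totals as (
    select
        order_id,
        sum(amount) as paid_amount
    from {{ ref('payments') }}  -- hypothetical model
    group by order_id
)

select
    o.order_id,
    o.total_amount,
    coalesce(p.paid_amount, 0) as paid_amount
from {{ ref('orders') }} as o  -- hypothetical model
left join payment_totals as p
    on p.order_id = o.order_id
-- Orders with no payments are treated as paid_amount = 0.
where o.total_amount <> coalesce(p.paid_amount, 0)
```

The left join keeps orders that have no payments at all, so they are compared against zero instead of silently dropping out of the check.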

Key Takeaways

Schema tests validate data structure and constraints using simple YAML declarations.
Data tests use custom SQL queries for flexible, complex data validations.
Schema tests are reusable and easy to maintain; data tests require more effort but offer more power.
Use schema tests for common integrity checks and data tests for specific business rules.
Both test types help ensure data quality but serve different validation needs.