How to Use dbt_expectations Package for Data Testing in dbt
The
dbt_expectations package adds pre-built data quality tests to your dbt models using simple macros. You install it, add it as a dependency, then call its macros in your schema.yml files to validate your data with tests like expect_column_values_to_not_be_null or expect_table_row_count_to_be_between.Syntax
The dbt_expectations package provides macros that you use inside your schema.yml files to define tests on your models or sources. Each test macro starts with expect_ and takes parameters like column, min_value, max_value, or value_set depending on the test.
Example syntax for a test in schema.yml:
- dbt_expectations.expect_column_values_to_not_be_null:tests that a column has no nulls.- dbt_expectations.expect_column_values_to_be_in_set:tests that column values belong to a specific set.
You add these tests under the tests: key for a model or source.
yaml
models/my_model/schema.yml: version: 2 models: - name: my_model columns: - name: id tests: - dbt_expectations.expect_column_values_to_not_be_null - dbt_expectations.expect_column_values_to_be_in_set: value_set: ['A', 'B', 'C']
Example
This example shows how to add the dbt_expectations package to your packages.yml, install it, and use a test macro in your model's schema.yml file.
It demonstrates testing that the status column in orders model has only allowed values.
yaml
packages.yml:
packages:
- package: calogica/dbt_expectations
version: [">=0.7.0", "<0.8.0"]
# After adding, run:
# dbt deps
models/orders/schema.yml:
version: 2
models:
- name: orders
columns:
- name: status
tests:
- dbt_expectations.expect_column_values_to_be_in_set:
value_set: ['pending', 'shipped', 'delivered', 'cancelled']Output
Running dbt tests will check if all values in 'status' column are within the allowed set ['pending', 'shipped', 'delivered', 'cancelled']. If any value is outside this set, the test fails.
Common Pitfalls
Common mistakes when using dbt_expectations include:
- Not adding the package to
packages.ymland runningdbt deps. - Using incorrect test macro names or missing required parameters.
- Placing tests outside the
columnssection or misnaming columns. - Expecting tests to run without running
dbt test.
Always check the macro names and parameters in the official dbt_expectations docs.
yaml
Incorrect test usage example: models/my_model/schema.yml: version: 2 models: - name: my_model columns: - name: id tests: - dbt_expectations.expect_column_values_to_be_in_set # Missing parameters Correct usage: models/my_model/schema.yml: version: 2 models: - name: my_model columns: - name: id tests: - dbt_expectations.expect_column_values_to_be_in_set: value_set: [1, 2, 3]
Quick Reference
| Test Macro | Purpose | Key Parameters |
|---|---|---|
| expect_column_values_to_not_be_null | Check column has no null values | column |
| expect_column_values_to_be_in_set | Check column values belong to a set | column, value_set |
| expect_table_row_count_to_be_between | Check row count is within range | min_value, max_value |
| expect_column_values_to_be_unique | Check column values are unique | column |
| expect_column_mean_to_be_between | Check mean of column values | column, min_value, max_value |
Key Takeaways
Add dbt_expectations to your packages.yml and run dbt deps before using its tests.
Use dbt_expectations macros inside schema.yml under tests for columns or models.
Provide required parameters like value_set or min_value depending on the test macro.
Run dbt test to execute these expectation tests and validate your data quality.
Check official docs for correct macro names and parameters to avoid errors.