How to Use Unique Test in dbt for Data Quality
In dbt, use the built-in
unique test to check that a column contains only unique values. Add this test in your model's YAML file under the tests key for the column you want to validate.Syntax
The unique test in dbt is applied in your model's schema YAML file. You specify the model name, then the columns, and under each column, list unique as a test.
This tells dbt to check that the column has no duplicate values.
yaml
version: 2
models:
- name: your_model_name
columns:
- name: column_to_test
tests:
- uniqueExample
This example shows how to add a unique test on the user_id column in a model called users. When you run dbt test, dbt will check that user_id has no duplicates.
yaml
version: 2
models:
- name: users
columns:
- name: user_id
tests:
- uniqueOutput
Running with dbt=1.4.6
Found 1 model, 1 test
1 of 1 START test unique_users_user_id................... [RUN]
1 of 1 PASS unique_users_user_id.......................... [PASS in 0.12s]
All tests passed!
Common Pitfalls
- Missing YAML indentation: YAML is sensitive to spaces; incorrect indentation can cause errors.
- Testing wrong column name: Ensure the column name matches exactly the column in your model.
- Not running
dbt testafter adding tests: Adding tests alone does not check data; you must rundbt test.
yaml
version: 2
models:
- name: users
columns:
- name: user_id
tests:
- unique
# Wrong indentation example (will cause error):
# models:
# - name: users
# columns:
# - name: user_id
# tests:
# - uniqueQuick Reference
| Element | Description |
|---|---|
| version | YAML version, usually 2 |
| models | List of models to test |
| name | Model name as defined in dbt |
| columns | List of columns in the model |
| tests | List of tests to run on the column |
| unique | Test to ensure column values are unique |
Key Takeaways
Add the unique test in your model's YAML file under the column's tests section.
Run dbt test to execute the unique test and validate data uniqueness.
Ensure correct YAML indentation to avoid syntax errors.
Match the column name exactly as it appears in your model.
The unique test helps catch duplicate values that can cause data issues.