What Is a Schema Test in dbt: Definition and Usage
A schema test in dbt (called a generic test in recent dbt documentation) checks the structure and quality of your data by validating columns against rules such as uniqueness or non-null values. It catches data issues early by testing a model's columns against conditions you define.
How It Works
Think of a schema test in dbt like a checklist you create for your data columns. Just like you might check if all the ingredients are in a recipe before cooking, schema tests check if your data columns meet certain rules. For example, you can test if a column has no missing values or if all values are unique.
When you run the dbt test command (or dbt build, which runs models and tests together), dbt executes these tests against your data tables. If any data breaks a rule, dbt reports which test failed, and the test name identifies the model and column involved. This helps you find and fix data problems quickly, keeping your data trustworthy.
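To make the idea concrete, here is a minimal Python sketch of what a uniqueness check and a not-null check boil down to: a query that returns the rows breaking the rule, where zero rows means the test passes. It uses an in-memory SQLite table as a stand-in for a warehouse table. The sample rows, the failing_rows helper, and the result labels are assumptions made for illustration, and the queries only approximate the shape of what dbt runs; this is not dbt's own code.

```python
import sqlite3

# Hypothetical stand-in for a dbt model named "users". In a real project
# this table would live in your data warehouse, not in SQLite.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "c@example.com")],
)


def failing_rows(sql):
    """Run a test query and return the rows that break the rule."""
    return conn.execute(sql).fetchall()


# Uniqueness check: find values of id that appear more than once.
unique_id_failures = failing_rows(
    "SELECT id, COUNT(*) AS n_records FROM users "
    "WHERE id IS NOT NULL GROUP BY id HAVING COUNT(*) > 1"
)

# Not-null check: find rows where email is missing.
not_null_email_failures = failing_rows(
    "SELECT id FROM users WHERE email IS NULL"
)

# A test passes only when its query returns no rows.
# The labels below are illustrative names for the two checks.
results = {
    "unique_users_id": unique_id_failures,
    "not_null_users_email": not_null_email_failures,
}
for name, rows in results.items():
    status = "PASS" if not rows else f"FAIL ({len(rows)} failing result(s))"
    print(f"{name}: {status}")
```

With the sample rows above, both checks fail: the id value 2 appears twice, and one row has no email. Treating "any returned row" as a failure keeps each rule expressible as a single query.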
Example
This example adds schema tests to a dbt model's YAML file to check that the id column is unique and never null, and that the email column has no null values.
version: 2

models:
  - name: users
    columns:
      - name: id
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - not_null

When to Use
Use schema tests whenever you want to make sure your data columns follow important rules. For example, if you have a user ID column, you want it to be unique so you don’t mix up users. Or if you have an email column, you want to make sure it’s never empty.
Schema tests are great for catching data problems early in your data pipeline. They help teams trust their data and avoid errors in reports or dashboards caused by bad data.
Key Points
- Schema tests check data column rules like uniqueness and non-null values.
- They run when you invoke dbt test or dbt build, not as part of dbt run.
- Failures identify which test failed, and therefore which model and column broke the rule.
- They help keep data clean and reliable.