
What Is a Schema Test in dbt: Definition and Usage

A schema test in dbt is a way to check the quality of your data by validating a model's columns against rules such as uniqueness or non-null values. You declare the rules in a YAML file, and dbt checks them against the data whenever you run the tests, which helps catch data issues early. Recent dbt versions call these generic tests (or generic data tests), but the idea is the same.
⚙️

How It Works

Think of a schema test in dbt like a checklist you create for your data columns. Just like you might check if all the ingredients are in a recipe before cooking, schema tests check if your data columns meet certain rules. For example, you can test if a column has no missing values or if all values are unique.

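These tests run when you execute dbt test, or dbt build, which builds your models and tests them in one pass. The dbt run command on its own builds models without testing them. If any rows break a rule, dbt reports which test failed, and the test's name (for example, not_null_users_email) identifies the rule, model, and column involved. This helps you find and fix data problems quickly, keeping your data trustworthy.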

💻

Example

This example shows how to add schema tests to a dbt model to check that the id column is unique and never null, and that the email column has no null values.

yaml
version: 2
models:
  - name: users
    columns:
      - name: id
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - not_null
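Output (simplified summary; the real dbt log is formatted differently)
Running 3 tests for model users: - unique (id): PASSED - not_null (id): PASSED - not_null (email): PASSED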
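
The same checks can also be written with a newer key name. The sketch below assumes dbt 1.8 or later, where data_tests: was introduced as the preferred name for the tests: key. The older tests: key is still accepted, so treat this as an optional variant rather than a required change.

```yaml
version: 2

models:
  - name: users
    columns:
      - name: id
        data_tests:        # same meaning as tests: in older projects
          - unique
          - not_null
      - name: email
        data_tests:
          - not_null
```

If your project runs an older dbt version, keep the tests: spelling shown in the original example.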
🎯

When to Use

Use schema tests whenever you want to make sure your data columns follow important rules. For example, if you have a user ID column, you want it to be unique so you don’t mix up users. Or if you have an email column, you want to make sure it’s never empty.
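
Beyond unique and not_null, dbt ships two more built-in tests: accepted_values and relationships. The sketch below is illustrative only. The orders model, its order_id, status, and user_id columns, and the list of status values are hypothetical names chosen for this example. Only the users model and its id column come from the example above. Exact argument layout can vary between dbt versions, so check the documentation for the version you run.

```yaml
version: 2

models:
  - name: orders                # hypothetical model
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: status            # hypothetical column
        tests:
          - accepted_values:    # fails if any row holds a value outside this list
              values: ['placed', 'shipped', 'completed', 'returned']
      - name: user_id           # hypothetical foreign key
        tests:
          - relationships:      # fails if a user_id has no matching users.id
              to: ref('users')
              field: id
```

The relationships test covers the "don't mix up users" case from the other side: every order must point at a user that actually exists.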

Schema tests are great for catching data problems early in your data pipeline. They help teams trust their data and avoid errors in reports or dashboards caused by bad data.
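
How strictly a failure is treated can be set per test. By default a failing test is an error, and with dbt build the models downstream of the failing one are skipped. The sketch below, which reuses the users model from the example and assumes you would rather be warned than blocked on missing emails, downgrades one test to a warning.

```yaml
version: 2

models:
  - name: users
    columns:
      - name: id
        tests:
          - unique              # default severity: error
          - not_null
      - name: email
        tests:
          - not_null:
              config:
                severity: warn  # report failing rows as a warning instead of an error
```

Keeping identifier checks as errors and softer rules as warnings is one possible split, not a recommendation from the original example.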

Key Points

  • Schema tests check data column rules like uniqueness and non-null values.
  • They run when you execute dbt test or dbt build, not dbt run alone.
  • A failed test names the rule, model, and column whose data broke it.
  • They help keep data clean and reliable.

Key Takeaways

Schema tests in dbt validate data columns against rules like unique and not null.
They run when you execute dbt test or dbt build, catching data issues early.
Use schema tests to ensure data quality and trust in your analytics.
Failures provide clear feedback on which rule, model, and column the data broke.
Schema tests are simple to add and maintain in your dbt project.