dbtdata~15 mins

Custom singular tests in dbt - Deep Dive

Choose your learning style10 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Start learning this pattern below

Jump into concepts and practice - no test required

Recommended

Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong

Overview - Custom singular tests

What is it?

Custom singular tests in dbt are user-defined checks that validate specific conditions on your data. Unlike generic tests that check for common issues like uniqueness or nulls, singular tests let you write custom SQL queries to verify any rule you want. They return a pass or fail based on whether the query returns any rows. This helps ensure your data meets precise business or quality requirements.

Why it matters

Without custom singular tests, you might miss important data problems unique to your project. They let you catch errors early by enforcing rules that generic tests can't cover. This improves trust in your data and prevents bad decisions based on faulty information. Imagine relying on data reports that silently contain errors because no test checked those specific cases.

Where it fits

Before learning custom singular tests, you should understand basic dbt tests like uniqueness and not_null. After mastering singular tests, you can explore advanced testing strategies like test macros and automated test suites. This topic fits in the data quality and validation part of the dbt learning path.

Mental Model

Core Idea

A custom singular test is a special SQL query that returns rows only when data breaks a specific rule, signaling a test failure.

Think of it like...

It's like a smoke detector set to sense a very specific kind of smoke that only happens when a particular appliance malfunctions, alerting you only to that exact problem.

┌─────────────────────────────┐
│      Custom Singular Test    │
├─────────────────────────────┤
│ 1. Write SQL query checking  │
│    a specific data rule      │
│                             │
│ 2. Run query on dataset      │
│                             │
│ 3. If query returns rows →   │
│    test FAILS (problem found)│
│    else → test PASSES        │
└─────────────────────────────┘

Build-Up - 7 Steps

FoundationUnderstanding basic dbt tests

Concept: Learn what dbt tests are and how they check data quality with simple built-in tests.

dbt provides built-in tests like unique and not_null that check common data issues. For example, a unique test ensures no duplicate values in a column. These tests run SQL queries behind the scenes and fail if they find any problem rows.

Result

You can run dbt tests and see which columns or tables have data issues.

Knowing how basic tests work helps you understand why custom tests are needed for rules that built-in tests can't cover.

FoundationWhat makes a test singular in dbt

IntermediateWriting your first custom singular test

IntermediateIntegrating singular tests in dbt project

IntermediateUsing variables and macros in singular tests

AdvancedHandling test failures and error messages

ExpertAdvanced patterns and pitfalls in singular tests

Under the Hood

When you run a custom singular test, dbt executes the SQL query you wrote against your data warehouse. The query returns zero or more rows. dbt interprets zero rows as a passing test and any rows as a failure. Internally, dbt collects the query results and reports them in the test output. This mechanism leverages the database's query engine to perform the validation.

Why designed this way?

This design lets users write any SQL logic needed to check complex or unique data rules without being limited to predefined test types. It uses the database's power directly, avoiding extra layers or custom code. The simplicity of 'rows means failure' makes test results easy to interpret and consistent.

┌───────────────┐      ┌───────────────┐      ┌───────────────────────────┐
│ Custom SQL    │─────▶│ Database      │─────▶│ Result Rows              │
│ Singular Test │      │ Query Engine  │      │ (0 = pass, >0 = fail)    │
└───────────────┘      └───────────────┘      └───────────────────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Does a singular test pass when its query returns rows? Commit yes or no.

Common Belief:If the test query returns rows, the test passes because it found data.

Tap to reveal reality

Quick: Can singular tests be declared in schema.yml like generic tests? Commit yes or no.

Common Belief:Singular tests are added in schema.yml just like built-in tests.

Tap to reveal reality

Quick: Do singular tests always run fast regardless of query complexity? Commit yes or no.

Common Belief:Singular tests are lightweight and won't affect performance much.

Tap to reveal reality

Quick: Can singular tests only check one row at a time? Commit yes or no.

Common Belief:Singular tests only check single rows or simple conditions.

Tap to reveal reality

Expert Zone

Singular tests can leverage dbt's Jinja context to dynamically adapt to environments or model changes, making tests more maintainable.

Because singular tests return rows on failure, selecting meaningful columns and messages is crucial for quick debugging in large datasets.

Singular tests can be combined with hooks or run as part of CI/CD pipelines to enforce data quality gates before deployment.

When NOT to use

Avoid singular tests for very simple checks like uniqueness or nulls where built-in generic tests are more efficient and standardized. Also, do not use singular tests for extremely large datasets without optimization, as they can cause performance issues. Instead, use incremental tests or sampling strategies.

Production Patterns

In production, teams use singular tests to enforce business rules like 'no orders with negative amounts' or 'all active users must have verified emails.' They integrate these tests into automated pipelines, failing builds if tests fail, ensuring only clean data moves downstream.

Connections

Unit Testing in Software Development

Both involve writing specific checks that pass or fail based on conditions.

Understanding singular tests as data unit tests helps grasp their role in catching precise errors early, just like code unit tests catch bugs.

Database Constraints

Singular tests complement database constraints by checking rules that constraints cannot enforce.

Knowing database constraints clarifies why singular tests are needed for complex or business-specific validations beyond schema rules.

Quality Control in Manufacturing

Both use targeted inspections to detect defects that general checks might miss.

Seeing singular tests as quality control steps helps appreciate their role in maintaining high data standards.

Common Pitfalls

#1Writing a singular test query that returns rows when data is correct.

Wrong approach:SELECT * FROM {{ ref('orders') }} WHERE amount >= 0

Correct approach:SELECT * FROM {{ ref('orders') }} WHERE amount < 0

Root cause:Misunderstanding that singular tests fail when rows are returned, so the query must select only bad data.

#2Declaring singular tests inside schema.yml like generic tests.

Wrong approach:tests: - singular_test_name

Correct approach:Place singular test SQL files in the tests directory and run with 'dbt test'.

Root cause:Confusing singular tests with generic tests and their different integration methods.

#3Writing complex joins in singular tests without optimization causing slow runs.

Wrong approach:SELECT * FROM big_table a JOIN big_table b ON a.id = b.ref_id WHERE a.status = 'bad'

Correct approach:Limit data scanned or pre-aggregate in models before testing, or add filters to reduce rows.

Root cause:Not considering query performance impact of test SQL on large datasets.

Key Takeaways

Custom singular tests let you write any SQL query that returns rows only when data breaks a specific rule, signaling failure.

They are separate from generic tests and run as standalone SQL files, not declared in schema.yml.

Writing effective singular tests requires understanding that zero rows means pass and any rows mean fail.

Using macros and clear error messages in singular tests improves maintainability and debugging.

Performance matters: inefficient singular test queries can slow down your dbt runs and cause flaky failures.

Practice

(1/5)

1. What is the main purpose of a custom singular test in dbt?

easy

A. To automatically generate documentation for your models

B. To write your own SQL query that checks data quality and returns rows only if there are issues

C. To schedule dbt runs at specific times

D. To create new tables from existing data

Custom singular tests in dbt - Deep Dive

Start learning this pattern below

Practice

Solution

Step 1: Understand the role of custom singular tests

Step 2: Compare options with this definition

Final Answer:

Quick Check:

Solution

Step 1: Recall the schema.yml syntax for custom singular tests

Step 2: Match options to this syntax

Final Answer:

Quick Check:

Solution

Step 1: Understand the test SQL logic

Step 2: Analyze the data condition

Final Answer:

Quick Check:

Solution

Step 1: Identify causes of SQL syntax errors

Step 2: Evaluate options for syntax error causes

Final Answer:

Quick Check:

Solution

Step 1: Understand the test goal

Step 2: Choose the SQL that returns rows with NULL emails

Final Answer:

Quick Check: