Why Custom singular tests in dbt? - Purpose & Use Cases

The Big Idea

What if you could catch hidden data problems with just one simple test?

The Scenario

Imagine you have a big data table and you want to check if a specific rule is true, like "Are all user emails unique?" or "Is the total sales number above zero?" Doing this by looking at raw data or writing many manual queries can be confusing and slow.

The Problem

Manually writing queries for each check takes a lot of time and can easily lead to mistakes. You might forget to check some important rules or write inconsistent queries. It's hard to keep track of all these checks as your data grows.

The Solution

Custom singular tests let you write one clear SQL query that checks exactly the rule you care about. The query lives in your dbt project and runs every time you run dbt test, so you catch problems early without repeating manual work.

Before vs After
Before
SELECT COUNT(*) FROM users WHERE email IS NULL OR email IN (SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1);
After
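-- saved once as tests/unique_email_check.sql; dbt test runs it and fails if any row comes back
SELECT email, COUNT(*) AS occurrences FROM {{ ref('users') }} GROUP BY email HAVING email IS NULL OR COUNT(*) > 1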
What It Enables

It makes data quality checks simple, consistent, and automatic, so you can trust your data without spending hours on manual queries.

Real Life Example

A company uses a custom singular test to ensure their daily sales data always has positive totals before running reports, preventing wrong business decisions.
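
The sales check described above could be written as a singular test along these lines. This is a minimal sketch: the model name daily_sales, the columns sale_date and total_sales, and the file name are assumptions for illustration, not details from the example company.

```sql
-- tests/assert_daily_sales_positive.sql (hypothetical file name)
-- Returns one row per day whose total is zero, negative, or missing.
-- dbt treats any returned row as a failure, so an empty result means the test passes.
select
    sale_date,
    total_sales
from {{ ref('daily_sales') }}
where total_sales <= 0
   or total_sales is null
```

Running dbt test before the reporting step would then stop the pipeline whenever a bad daily total appears.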

Key Takeaways

Manual checks are slow and error-prone.

Custom singular tests automate and simplify data validation.

They help keep data trustworthy and save time.

Practice

1. What is the main purpose of a custom singular test in dbt?
easy
A. To automatically generate documentation for your models
B. To write your own SQL query that checks data quality and returns rows only if there are issues
C. To schedule dbt runs at specific times
D. To create new tables from existing data

Solution

  1. Step 1: Understand the role of custom singular tests

    Custom singular tests are SQL queries that check data quality by returning rows only when problems exist.
  2. Step 2: Compare options with this definition

    Only option B describes writing a SQL query that returns rows when there are data issues, which matches the purpose of custom singular tests.
  3. Final Answer:

    To write your own SQL query that checks data quality and returns rows only if there are issues -> Option B
  4. Quick Check:

    Custom singular test = SQL check returning problem rows [OK]
Hint: Custom singular tests return rows only when data has problems [OK]
Common Mistakes:
  • Confusing tests with documentation generation
  • Thinking tests create tables
  • Assuming tests schedule runs
2. Which of the following is the correct way to reference a custom test by name in your schema.yml file?
easy
A. tests: - my_custom_test.sql
B. tests: - my_custom_test: sql: my_custom_test.sql
C. tests: - name: my_custom_test test: my_custom_test
D. tests: - my_custom_test

Solution

  1. Step 1: Recall the schema.yml syntax for referencing tests

    Tests are referenced by name (without .sql) in the tests list of schema.yml. This applies to generic tests, including custom ones you define yourself. A singular test saved as a .sql file in the tests directory needs no schema.yml entry; dbt test picks it up automatically.
  2. Step 2: Match options to this syntax

    tests: - my_custom_test correctly references the test by name. Other options use incorrect structure, extra keys, or include .sql.
  3. Final Answer:

    tests: - my_custom_test -> Option D
  4. Quick Check:

    schema.yml test syntax = - test_name_without_sql [OK]
Hint: Reference tests by name (no .sql) in tests: list [OK]
Common Mistakes:
  • Using 'name' or 'test' keys
  • Including .sql extension
  • Using map/dict structure
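
For context, a name listed under tests: in schema.yml resolves to a generic test, which dbt defines with a test block. The following is a minimal sketch, assuming my_custom_test is meant to flag non-positive values and is listed under a column's tests; that rule and the file location are illustrative assumptions, not part of the question.

```sql
-- tests/generic/my_custom_test.sql (one common location for generic tests)
-- `model` and `column_name` are filled in by dbt from the schema.yml entry.
{% test my_custom_test(model, column_name) %}

select *
from {{ model }}
where {{ column_name }} <= 0

{% endtest %}
```

A singular test, by contrast, is a plain SELECT in its own .sql file with the model name written directly into the query.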
3. Given the following custom singular test SQL in tests/check_positive_values.sql:
SELECT * FROM {{ ref('orders') }} WHERE amount <= 0
What will be the output if all amounts in the orders table are positive?
medium
A. An empty result with zero rows
B. A table with all rows where amount is less than or equal to zero
C. An error because of invalid SQL syntax
D. A count of rows with amount less than or equal to zero

Solution

  1. Step 1: Understand the test SQL logic

    The test selects rows where amount is less than or equal to zero.
  2. Step 2: Analyze the data condition

    If all amounts are positive, no rows satisfy the condition, so the query returns zero rows.
  3. Final Answer:

    An empty result with zero rows -> Option A
  4. Quick Check:

    All positive amounts means zero rows returned [OK]
Hint: No matching rows means test passes with empty output [OK]
Common Mistakes:
  • Expecting a count instead of rows
  • Thinking it returns all rows
  • Assuming SQL syntax error
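
To see why an empty result counts as a pass, it helps to look at roughly how dbt evaluates a test query. The following is a simplified approximation of the compiled SQL, not the exact statement dbt generates; the schema name analytics and the alias test_query are illustrative assumptions.

```sql
-- Simplified sketch: dbt wraps the test query in a subquery and counts its rows.
-- By this point the ref() call has been resolved to a real table name;
-- analytics.orders is a hypothetical example of that resolved name.
select count(*) as failures
from (
    select * from analytics.orders where amount <= 0
) as test_query
-- failures = 0 means the test passes; failures > 0 means it fails
```

This also explains why a test written with COUNT(*) misbehaves: the inner query would always return exactly one row, so the outer count is never zero.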
4. You wrote a custom singular test SQL file but when running dbt test, it fails with a syntax error. Which of the following is the most likely cause?
medium
A. The model referenced in {{ ref() }} does not exist
B. The test SQL returns zero rows
C. The SQL file is missing the required SELECT statement
D. The test is not listed in schema.yml

Solution

  1. Step 1: Identify causes of SQL syntax errors

    Syntax errors happen when SQL is malformed, such as missing SELECT statements.
  2. Step 2: Evaluate options for syntax error causes

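    Option C, a missing SELECT statement, directly relates to SQL syntax. A model missing from {{ ref() }} causes a compilation error, zero rows means the test passes, and a singular test does not need to be listed in schema.yml, so none of these produce a syntax error.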
  3. Final Answer:

    The SQL file is missing the required SELECT statement -> Option C
  4. Quick Check:

    Syntax error = malformed SQL like missing SELECT [OK]
Hint: Syntax errors usually mean SQL is incomplete or malformed [OK]
Common Mistakes:
  • Confusing missing test listing with syntax error
  • Assuming zero rows cause syntax errors
  • Ignoring missing model references
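
Two patterns that commonly break a singular test file are sketched below, followed by a well-formed version. The file name and the orders model follow question 3; the trailing-semicolon case rests on dbt wrapping the test query inside a larger statement, so treat it as an assumption to verify against your dbt version and adapter.

```sql
-- tests/check_positive_values.sql

-- Broken: no SELECT clause, so the database rejects the statement.
--   FROM <orders relation> WHERE amount <= 0

-- Risky: a trailing semicolon can end up in the middle of the wrapped query.
--   SELECT * FROM <orders relation> WHERE amount <= 0;

-- Well-formed: a single SELECT with no trailing semicolon.
select *
from {{ ref('orders') }}
where amount <= 0
```

The broken variants are written without Jinja on purpose, because dbt still renders Jinja expressions that appear inside SQL comments.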
5. You want to create a custom singular test that checks if any user has a NULL email in the users table. Which SQL query should you write in your test file?
hard
A. SELECT * FROM {{ ref('users') }} WHERE email IS NULL
B. SELECT COUNT(*) FROM {{ ref('users') }} WHERE email IS NULL
C. SELECT email FROM {{ ref('users') }} WHERE email IS NOT NULL
D. SELECT * FROM {{ ref('users') }} WHERE email = ''

Solution

  1. Step 1: Understand the test goal

    The test should return rows where email is NULL to detect missing emails.
  2. Step 2: Choose the SQL that returns rows with NULL emails

    SELECT * FROM {{ ref('users') }} WHERE email IS NULL returns rows only when there are NULL emails (0 rows = pass). COUNT(*) always returns one row, failing even with zero NULLs. IS NOT NULL selects good rows (opposite). = '' checks empty strings, not NULLs.
  3. Final Answer:

    SELECT * FROM {{ ref('users') }} WHERE email IS NULL -> Option A
  4. Quick Check:

    Return rows with NULL email = SELECT * FROM {{ ref('users') }} WHERE email IS NULL [OK]
Hint: Use SELECT * WHERE column IS NULL to find missing values [OK]
Common Mistakes:
  • Using COUNT(*) instead of returning rows
  • Checking for empty string instead of NULL
  • Selecting non-NULL emails
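
If blank strings should also count as missing emails, the NULL check can be widened. This sketch goes slightly beyond the question: treating whitespace-only values as missing is an assumption, and the file name is hypothetical.

```sql
-- tests/assert_users_have_email.sql (hypothetical file name)
-- Returns every user whose email is NULL or blank after trimming whitespace.
-- Zero rows returned means every user has a non-empty email.
select *
from {{ ref('users') }}
where email is null
   or trim(email) = ''
```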