What if you could catch hidden data problems with just one simple test?
Why Use Custom Singular Tests in dbt? Purpose and Use Cases
Imagine you have a big data table and you want to check if a specific rule is true, like "Are all user emails unique?" or "Is the total sales number above zero?" Doing this by looking at raw data or writing many manual queries can be confusing and slow.
Manually writing queries for each check takes a lot of time and can easily lead to mistakes. You might forget to check some important rules or write inconsistent queries. It's hard to keep track of all these checks as your data grows.
Custom singular tests let you write one clear, reusable test that checks exactly what you want. You can run these tests automatically every time your data changes, so you catch problems early without extra work.
SELECT COUNT(*) FROM users WHERE email IS NULL OR email IN (SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1);
tests:
  - unique_email_check
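A singular test reports problems as rows, so the manual COUNT(*) query above can be recast to return the offending users themselves. The sketch below runs that idea against an in-memory SQLite table using Python's standard sqlite3 module. The table contents and the name BAD_EMAILS_SQL are made up for illustration, and in a dbt project the table name would be written as {{ ref('users') }}.

```python
import sqlite3

# Hypothetical in-memory stand-in for the users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users (id, email) VALUES (?, ?)",
    [(1, "a@example.com"), (2, "b@example.com"), (3, "a@example.com"), (4, None)],
)

# Row-returning form of the check: one row per user whose email is
# missing or shared with another user.
BAD_EMAILS_SQL = """
SELECT id, email
FROM users
WHERE email IS NULL
   OR email IN (SELECT email FROM users GROUP BY email HAVING COUNT(*) > 1)
ORDER BY id
"""

bad_rows = conn.execute(BAD_EMAILS_SQL).fetchall()
print(bad_rows)  # any rows here mean the check found problems
```

Returning the rows rather than a count also tells you which records to fix.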
Custom singular tests make data quality checks simple, consistent, and automatic, so you can trust your data without spending hours on manual queries.
A company uses a custom singular test to ensure their daily sales data always has positive totals before running reports, preventing wrong business decisions.
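A check like the daily sales one could be sketched as an aggregate query that returns only the days whose total is not positive. This is an illustration under assumptions: the daily_sales table, its columns, and the sample figures are hypothetical, and SQLite stands in for the warehouse.

```python
import sqlite3

# Hypothetical sales data: the second day nets out to zero.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sale_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO daily_sales VALUES (?, ?)",
    [
        ("2024-01-01", 120.0),
        ("2024-01-01", 30.0),
        ("2024-01-02", 50.0),
        ("2024-01-02", -50.0),
    ],
)

# Return one row per day whose total is zero or negative.
NON_POSITIVE_DAYS_SQL = """
SELECT sale_date, SUM(amount) AS total
FROM daily_sales
GROUP BY sale_date
HAVING SUM(amount) <= 0
"""

failing_days = conn.execute(NON_POSITIVE_DAYS_SQL).fetchall()
print(failing_days)  # an empty list would mean every day has a positive total
```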
Manual checks are slow and error-prone.
Custom singular tests automate and simplify data validation.
They help keep data trustworthy and save time.
Practice
Solution
Step 1: Understand the role of custom singular tests
Custom singular tests are SQL queries that check data quality by returning rows only when problems exist.
Step 2: Compare options with this definition
Only the option "To write your own SQL query that checks data quality and returns rows only if there are issues" describes a SQL query that returns rows when data issues exist, which matches the purpose of custom singular tests.
Final Answer:
To write your own SQL query that checks data quality and returns rows only if there are issues -> Option B
Quick Check:
Custom singular test = SQL check returning problem rows [OK]
- Confusing tests with documentation generation
- Thinking tests create tables
- Assuming tests schedule runs
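The rule "rows returned means problems found" can be sketched as a tiny runner. This is not how dbt is implemented; run_singular_test, the payments table, and its values are hypothetical names used only to show the pass/fail logic.

```python
import sqlite3


def run_singular_test(conn, sql):
    """Return ('pass', []) when the query yields no rows, else ('fail', rows)."""
    rows = conn.execute(sql).fetchall()
    return ("pass" if not rows else "fail", rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)", [(1, "paid"), (2, "refunded")]
)

# The check selects rows that break the rule "status must be a known value".
CHECK_SQL = "SELECT * FROM payments WHERE status NOT IN ('paid', 'refunded')"

clean = run_singular_test(conn, CHECK_SQL)  # no bad rows yet

conn.execute("INSERT INTO payments VALUES (3, 'unknown')")
dirty = run_singular_test(conn, CHECK_SQL)  # now one bad row

print(clean, dirty)
```

The test query never creates tables, generates documentation, or schedules anything. It only selects the rows that violate a rule.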
schema.yml file?
Solution
Step 1: Recall the schema.yml syntax for custom singular tests
Custom singular tests are referenced by their filename (without .sql) in the tests list of schema.yml.
Step 2: Match options to this syntax
tests: - my_custom_test correctly references the test file tests/my_custom_test.sql. Other options use incorrect structure, extra keys, or include .sql.
Final Answer:
tests: - my_custom_test -> Option D
Quick Check:
schema.yml test syntax = - test_filename_without_sql [OK]
- Using 'name' or 'test' keys
- Including .sql extension
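- Using map/dict structure
Note: in a dbt project, a singular test needs no schema.yml entry. Any .sql file under the tests/ directory runs with dbt test, and names listed under a model's tests: key are resolved as generic tests.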
tests/check_positive_values.sql:
SELECT * FROM {{ ref('orders') }} WHERE amount <= 0
What will be the output if all amounts in the orders table are positive?
Solution
Step 1: Understand the test SQL logic
The test selects rows where amount is less than or equal to zero.
Step 2: Analyze the data condition
If all amounts are positive, no rows satisfy the condition, so the query returns zero rows.
Final Answer:
An empty result with zero rows -> Option A
Quick Check:
All positive amounts means zero rows returned [OK]
- Expecting a count instead of rows
- Thinking it returns all rows
- Assuming SQL syntax error
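To watch this behavior outside dbt, the sketch below takes the text of tests/check_positive_values.sql, swaps the ref() call for a plain table name, and runs it in SQLite. The string replacement is a crude stand-in for dbt's Jinja rendering, and the sample orders are hypothetical.

```python
import sqlite3

# Contents of tests/check_positive_values.sql, as given above.
TEST_SQL = "SELECT * FROM {{ ref('orders') }} WHERE amount <= 0"

# Crude stand-in for dbt's rendering step: swap the ref() call for a table name.
rendered = TEST_SQL.replace("{{ ref('orders') }}", "orders")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 19.99), (2, 5.0)])

# Every amount is positive, so the query matches nothing.
all_positive = conn.execute(rendered).fetchall()

# Add one order with a zero amount and run the same query again.
conn.execute("INSERT INTO orders VALUES (3, 0)")
with_zero = conn.execute(rendered).fetchall()

print(all_positive, with_zero)
```

The result is a set of rows, possibly empty. It is not a count and not the whole table.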
Solution
Step 1: Identify causes of SQL syntax errors
Syntax errors happen when SQL is malformed, such as missing SELECT statements.
Step 2: Evaluate options for syntax error causes
The SQL file is missing the required SELECT statement directly relates to SQL syntax. Other options cause runtime or configuration errors, not syntax errors.
Final Answer:
The SQL file is missing the required SELECT statement -> Option C
Quick Check:
Syntax error = malformed SQL like missing SELECT [OK]
- Confusing missing test listing with syntax error
- Assuming zero rows cause syntax errors
- Ignoring missing model references
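The difference between a syntax error, a missing table, and a clean zero-row result can be seen in a small SQLite experiment. The classify helper and the table names are hypothetical, and the exact error text from a real warehouse will differ from SQLite's.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")


def classify(sql):
    """Run a query and report 'pass', 'fail', or the database error text."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.OperationalError as exc:
        return f"error: {exc}"
    return "pass" if not rows else "fail"


# Malformed SQL: the SELECT keyword is missing, so parsing fails.
missing_select = classify("FROM orders WHERE amount <= 0")

# Valid SQL against a table that does not exist: an error, but not a syntax error.
missing_table = classify("SELECT * FROM no_such_orders WHERE amount <= 0")

# Valid SQL that matches nothing: zero rows is a pass, not an error.
zero_rows = classify("SELECT * FROM orders WHERE amount <= 0")

print(missing_select, missing_table, zero_rows, sep="\n")
```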
users table. Which SQL query should you write in your test file?
Solution
Step 1: Understand the test goal
The test should return rows where email is NULL to detect missing emails.
Step 2: Choose the SQL that returns rows with NULL emails
SELECT * FROM {{ ref('users') }} WHERE email IS NULL returns rows only when there are NULL emails (0 rows = pass). COUNT(*) always returns one row, failing even with zero NULLs. IS NOT NULL selects good rows (opposite). = '' checks empty strings, not NULLs.
Final Answer:
SELECT * FROM {{ ref('users') }} WHERE email IS NULL -> Option A
Quick Check:
Return rows with NULL email = SELECT * FROM {{ ref('users') }} WHERE email IS NULL [OK]
- Using COUNT(*) instead of returning rows
- Checking for empty string instead of NULL
- Selecting non-NULL emails
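Counting the rows each candidate query returns on a table with no NULL emails makes the comparison concrete. The four queries below are reconstructions of the options described in the solution, with {{ ref('users') }} replaced by a plain table name, and the sample data is hypothetical.

```python
import sqlite3

# No NULL emails here, so a correct test should return zero rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [(1, "a@example.com"), (2, "")]
)

candidates = {
    "IS NULL": "SELECT * FROM users WHERE email IS NULL",
    "COUNT(*)": "SELECT COUNT(*) FROM users WHERE email IS NULL",
    "IS NOT NULL": "SELECT * FROM users WHERE email IS NOT NULL",
    "= ''": "SELECT * FROM users WHERE email = ''",
}

# Number of rows each query returns; zero rows means the test would pass.
row_counts = {
    name: len(conn.execute(sql).fetchall()) for name, sql in candidates.items()
}
print(row_counts)
```

Only the IS NULL query returns zero rows on this data. The COUNT(*) form always yields one row holding the count, and the other two match rows that are not missing emails.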
