0
0
dbtdata~20 mins

Why advanced testing catches subtle data issues in dbt - Challenge Your Understanding

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Advanced Data Testing Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate
1:30remaining
Why do advanced tests catch subtle data issues?

In data projects, simple tests check for obvious problems like missing values. Advanced tests go deeper. Why do advanced tests catch subtle data issues better?

ABecause they check complex patterns and relationships in data beyond basic errors.
BBecause they only check if data files exist and have correct names.
CBecause they ignore data quality and focus on performance speed.
DBecause they only verify the number of rows without checking content.
Attempts:
2 left
💡 Hint

Think about what subtle data issues might look like beyond missing or null values.

Predict Output
intermediate
1:30remaining
Output of a dbt test on unique user IDs

Given this dbt test SQL snippet, what is the output if the 'user_id' column has duplicates?

dbt
select user_id, count(*) as count from {{ ref('users') }} group by user_id having count > 1
AA syntax error due to missing parentheses.
BA list of user_id values that appear more than once.
CAn empty result set because duplicates are ignored.
DA count of all user_id values including unique ones.
Attempts:
2 left
💡 Hint

Look at the GROUP BY and HAVING clause to understand what rows are returned.

data_output
advanced
2:00remaining
Result of an advanced dbt test for data drift detection

Consider a dbt test that compares the average order amount this month to last month and flags if the difference is over 20%. What would the test output be if the average changed from 100 to 130?

dbt
with last_month as (select avg(order_amount) as avg_last from orders where order_date between '2024-04-01' and '2024-04-30'), this_month as (select avg(order_amount) as avg_this from orders where order_date between '2024-05-01' and '2024-05-31') select case when abs(avg_this - avg_last)/avg_last > 0.2 then 'fail' else 'pass' end as test_result from last_month, this_month
AAn error because the date ranges overlap.
B'pass' because the average increased, which is good.
C'fail' because the average decreased by 30%.
D'fail' because the increase is 30%, which is more than 20%.
Attempts:
2 left
💡 Hint

Calculate the percentage change between 100 and 130.

🔧 Debug
advanced
1:30remaining
Identify the error in this dbt test SQL

Find the error in this dbt test SQL that checks for nulls in the 'email' column:

select * from {{ ref('customers') }} where email = NULL
AThe test will return no rows because 'email = NULL' is always false; use 'email IS NULL' instead.
BThe test will return all rows because NULL matches everything.
CThe test has a syntax error due to missing quotes around NULL.
DThe test will run correctly and find null emails.
Attempts:
2 left
💡 Hint

Remember how SQL treats NULL comparisons.

🚀 Application
expert
2:30remaining
Choosing the best advanced test for subtle data issues

You want to detect subtle data issues like unexpected shifts in customer demographics over time. Which advanced dbt test approach is best?

AImplement a test that counts total rows in the customer table each day.
BImplement a test that only checks for missing values in demographic fields.
CImplement a test that compares statistical distributions (e.g., mean, median) of demographic fields month-over-month.
DImplement a test that verifies the presence of primary key constraints.
Attempts:
2 left
💡 Hint

Think about how to detect subtle changes, not just obvious errors.