In data projects, simple tests check for obvious problems like missing values. Advanced tests go deeper. Why do advanced tests catch subtle data issues better?
Think about what subtle data issues might look like beyond missing or null values.
Advanced tests look for complex data patterns, inconsistencies, or unexpected changes that simple tests miss. This helps catch subtle issues early.
Given this dbt test SQL snippet, what is the output if the 'user_id' column has duplicates?
select user_id, count(*) as count from {{ ref('users') }} group by user_id having count(*) > 1
Look at the GROUP BY and HAVING clause to understand what rows are returned.
The query groups by user_id and returns only the user_ids whose count is greater than 1, so each duplicated user_id is listed along with how many times it appears.
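The duplicate-detection query can be sketched against a tiny in-memory table. This uses SQLite via Python's sqlite3 module as a stand-in for the dbt warehouse, and the user_id values are hypothetical sample data.

```python
import sqlite3

# Build a toy 'users' table with duplicated user_ids (sample data).
conn = sqlite3.connect(":memory:")
conn.execute("create table users (user_id integer)")
conn.executemany("insert into users values (?)", [(1,), (2,), (2,), (3,), (3,), (3,)])

# Same shape as the dbt test: group by user_id, keep groups with count > 1.
rows = conn.execute(
    "select user_id, count(*) as count from users "
    "group by user_id having count(*) > 1"
).fetchall()

# Each returned row is a duplicated user_id with its occurrence count;
# user_id 1 appears once, so it is not returned.
print(rows)
```

A passing dbt test is one where this query returns zero rows, so any output here means the test fails.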
Consider a dbt test that compares the average order amount this month to last month and flags if the difference is over 20%. What would the test output be if the average changed from 100 to 130?
with last_month as (select avg(order_amount) as avg_last from orders where order_date between '2024-04-01' and '2024-04-30'), this_month as (select avg(order_amount) as avg_this from orders where order_date between '2024-05-01' and '2024-05-31') select case when abs(avg_this - avg_last)/avg_last > 0.2 then 'fail' else 'pass' end as test_result from last_month, this_month
Calculate the percentage change between 100 and 130.
The change is (130-100)/100 = 0.3 or 30%, which is greater than 20%, so the test fails.
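The arithmetic behind the pass/fail decision can be checked directly. This is a minimal sketch of the CASE expression's logic, using the 100 and 130 averages from the question:

```python
# Averages from the scenario: last month 100, this month 130.
avg_last, avg_this = 100, 130

# Relative change, mirroring abs(avg_this - avg_last) / avg_last in the SQL.
change = abs(avg_this - avg_last) / avg_last

# Mirror the CASE WHEN ... > 0.2 THEN 'fail' ELSE 'pass' END expression.
result = "fail" if change > 0.2 else "pass"

print(change, result)  # 0.3 fail
```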
Find the error in this dbt test SQL that checks for nulls in the 'email' column:
select * from {{ ref('customers') }} where email = NULL
Remember how SQL treats NULL comparisons.
In SQL, 'email = NULL' evaluates to NULL (unknown) rather than true, so it never matches any row. To check for nulls, use 'email IS NULL'.
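The difference between the broken and the corrected predicate can be demonstrated on a toy table. This sketch uses SQLite via Python's sqlite3 module, with hypothetical customer rows:

```python
import sqlite3

# Toy 'customers' table: one real email and one NULL (sample data).
conn = sqlite3.connect(":memory:")
conn.execute("create table customers (email text)")
conn.executemany("insert into customers values (?)", [("a@x.com",), (None,)])

# 'email = NULL' evaluates to NULL (unknown), so it matches no rows.
wrong = conn.execute("select * from customers where email = NULL").fetchall()

# 'email IS NULL' correctly finds the row with the missing email.
right = conn.execute("select * from customers where email is null").fetchall()

print(len(wrong), len(right))  # 0 1
```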
You want to detect subtle data issues like unexpected shifts in customer demographics over time. Which advanced dbt test approach is best?
Think about how to detect subtle changes, not just obvious errors.
Comparing statistical distributions over time helps catch subtle shifts in data that simple tests miss.
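One way to compare distributions over time is to measure how far two category breakdowns have drifted apart. This sketch uses total variation distance with hypothetical age-bucket proportions and an assumed 10% threshold; in dbt this comparison would typically be expressed in SQL over two time windows:

```python
# Hypothetical share of customers per age bucket in two periods.
last = {"18-25": 0.40, "26-40": 0.45, "41+": 0.15}
this = {"18-25": 0.25, "26-40": 0.50, "41+": 0.25}

# Total variation distance: half the sum of absolute differences in shares.
tvd = 0.5 * sum(abs(this[k] - last[k]) for k in last)

# Flag a shift larger than an assumed 0.1 (10%) threshold.
result = "fail" if tvd > 0.1 else "pass"

print(round(tvd, 2), result)
```

Here the 18-25 bucket shrank sharply while 41+ grew, so the distance exceeds the threshold and the test flags a demographic shift even though no individual row is invalid.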