Consider a dbt test that checks for null values in a column user_id. The following SQL stores failures in a table. What rows will be stored?
INSERT INTO test_failures (test_name, failed_row_id)
SELECT 'not_null_user_id', id
FROM users
WHERE user_id IS NULL;

Think about the WHERE clause filtering rows with user_id IS NULL.
The query inserts rows where user_id is NULL, labeling them with the test name. This stores only failing rows.
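This behavior can be sketched with an in-memory SQLite database (the table names follow the question; the sample rows are invented for illustration):

```python
import sqlite3

# In-memory database with the tables from the question (sample data invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, user_id INTEGER)")
conn.execute("CREATE TABLE test_failures (test_name TEXT, failed_row_id INTEGER)")

# Two rows pass the test; row 2 fails because its user_id is NULL.
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, 101), (2, None), (3, 103)])

# The test query: only rows where user_id IS NULL are stored.
conn.execute("""
    INSERT INTO test_failures (test_name, failed_row_id)
    SELECT 'not_null_user_id', id FROM users WHERE user_id IS NULL
""")

failures = conn.execute("SELECT * FROM test_failures").fetchall()
print(failures)  # [('not_null_user_id', 2)]
```

Only the single failing row lands in test_failures, tagged with the test name.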
Given the table orders with 1000 rows, 50 have order_date NULL. The test stores failures with:
INSERT INTO failure_log (test_name, order_id) SELECT 'not_null_order_date', order_id FROM orders WHERE order_date IS NULL;
How many rows will failure_log have after this?
Count rows where order_date IS NULL.
The query inserts one row per order with NULL order_date, so 50 rows are stored.
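A quick sketch confirms the count, again using an in-memory SQLite database with invented sample data matching the question's description (1000 orders, 50 with a NULL order_date):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, order_date TEXT)")
conn.execute("CREATE TABLE failure_log (test_name TEXT, order_id INTEGER)")

# 1000 orders; the first 50 have a NULL order_date.
rows = [(i, None if i <= 50 else "2024-01-01") for i in range(1, 1001)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

conn.execute("""
    INSERT INTO failure_log (test_name, order_id)
    SELECT 'not_null_order_date', order_id FROM orders
    WHERE order_date IS NULL
""")

count = conn.execute("SELECT COUNT(*) FROM failure_log").fetchone()[0]
print(count)  # 50
```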
Look at this SQL snippet intended to store failures where email is NULL:
INSERT INTO test_failures (test_name, failed_row_id) SELECT 'not_null_email', user_id FROM users WHERE email = NULL;
Why does it not store any rows?
Remember how SQL compares NULL values.
In SQL, email = NULL evaluates to NULL (unknown), never TRUE, so the WHERE clause filters out every row. The correct check is email IS NULL, which matches rows where email is NULL.
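The difference between the two comparisons can be demonstrated directly (in-memory SQLite, invented sample rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "a@example.com"), (2, None)])

# email = NULL evaluates to NULL (not TRUE), so no rows match.
eq_rows = conn.execute(
    "SELECT user_id FROM users WHERE email = NULL").fetchall()

# email IS NULL correctly matches the row with a NULL email.
is_rows = conn.execute(
    "SELECT user_id FROM users WHERE email IS NULL").fetchall()

print(eq_rows)  # []
print(is_rows)  # [(2,)]
```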
You want to store rows where customer_id is duplicated in sales. Which query does this correctly?
Think about how to find duplicates using GROUP BY and HAVING.
The correct query groups by customer_id and keeps groups where COUNT(*) > 1, which identifies exactly the duplicated customer_id values.
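The GROUP BY / HAVING approach can be sketched as follows (in-memory SQLite; the sales rows are invented, with customer 10 appearing three times):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER, customer_id INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [(1, 10), (2, 20), (3, 10), (4, 30), (5, 10)])

# Group by customer_id and keep only groups that occur more than once.
dupes = conn.execute("""
    SELECT customer_id, COUNT(*) AS n
    FROM sales
    GROUP BY customer_id
    HAVING COUNT(*) > 1
""").fetchall()

print(dupes)  # [(10, 3)]
```

Note that HAVING (not WHERE) is required here, because the filter applies to the aggregated groups rather than to individual rows.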
Why do data teams store test failures in dedicated tables instead of just logging errors?
Think about how stored failures help in understanding data quality issues.
Storing failures in dedicated tables lets teams query, analyze, and track problematic rows over time, which aids debugging and data-quality improvement in a way that transient log messages cannot.