Jump into concepts and practice - no test required
or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Testing model outputs
📖 Scenario: You are working with a data transformation project using dbt. You want to ensure that your model outputs are correct by writing tests that check the data quality and expected results.
🎯 Goal: Build a simple dbt model and write tests to verify the output data meets expected conditions.
📋 What You'll Learn
Create a dbt model with sample data
Add a configuration variable for a threshold
Write a dbt test to check model output against the threshold
Display the test results
💡 Why This Matters
🌍 Real World
Data engineers and analysts use dbt to transform data and ensure quality by writing tests that catch errors early.
💼 Career
Knowing how to write and run dbt tests is essential for roles involving data pipeline development and maintenance.
Progress0 / 4 steps
1
Create a dbt model with sample data
Create a dbt model file called models/sample_model.sql with a SELECT statement that returns these exact rows: id and value columns with values (1, 10), (2, 20), and (3, 30).
dbt
Hint
Use union all to combine multiple SELECT statements with exact values.
2
Add a threshold configuration variable
In the same model file models/sample_model.sql, add a variable called threshold and set it to 15 using a with clause or CTE.
dbt
Hint
Use a CTE to define the threshold value before the main SELECT statements.
3
Write a dbt test to check values above threshold
Create a test file tests/test_values_above_threshold.sql that selects rows from ref('sample_model') where value is less than or equal to 15. This test should fail if any such rows exist.
dbt
Hint
The test query should return rows that violate the condition. Use ref('sample_model') to reference the model.
4
Display the test results
Run the dbt test command and print the output showing if the test passed or failed.
dbt
Hint
Use the terminal command dbt test to execute tests and see results.
Practice
(1/5)
1. What is the main purpose of testing model outputs in dbt?
easy
A. To ensure the data is accurate and reliable
B. To speed up the data loading process
C. To create new tables automatically
D. To delete old data from the database
Solution
Step 1: Understand the goal of testing in dbt
Testing checks if the data produced by models is correct and trustworthy.
Step 2: Identify the main benefit of testing outputs
Accurate and reliable data helps users make good decisions and trust reports.
Final Answer:
To ensure the data is accurate and reliable -> Option A
Quick Check:
Testing = Accurate data [OK]
Hint: Testing checks data correctness, not speed or deletion [OK]
Common Mistakes:
Thinking tests speed up loading
Confusing testing with table creation
Assuming tests delete data
2. Which of the following is the correct syntax to define a uniqueness test on a column user_id in a dbt model's schema.yml file?
But when you run dbt test, you get an error saying Invalid test configuration. What is the likely cause?
medium
A. The test name should be unique and not_null without dashes
B. The indentation of the tests list is incorrect
C. The model name should be under models key in schema.yml
D. The email column does not exist in the model
Solution
Step 1: Check the structure of schema.yml
Tests must be defined under the models: key in schema.yml.
Step 2: Identify missing models: key
The snippet misses the models: root key, causing invalid configuration.
Final Answer:
The model name should be under models key in schema.yml -> Option C
Quick Check:
Missing 'models:' key = config error [OK]
Hint: Always start schema.yml tests under 'models:' key [OK]
Common Mistakes:
Removing dashes from test names
Incorrect indentation of tests list
Not placing model under 'models:'
5. You want to test that the status column in your transactions model only contains the values 'pending', 'completed', or 'failed'. Which test definition in schema.yml correctly enforces this?
hard
A. - name: transactions
columns:
- name: status
tests:
- values_in: ['pending', 'completed', 'failed']
B. - name: transactions
columns:
- name: status
tests:
- accepted_values:
values: ['pending', 'completed', 'failed']
C. - name: transactions
columns:
- name: status
tests:
- accepted_values: ['pending', 'completed', 'failed']
D. - name: transactions
columns:
- name: status
tests:
- unique
- not_null
Solution
Step 1: Identify the correct test for allowed values
The accepted_values test checks if column values are in a list.
Step 2: Check correct syntax for accepted_values
The test requires a dictionary with key values listing allowed values.
Final Answer:
- name: transactions
columns:
- name: status
tests:
- accepted_values:
values: ['pending', 'completed', 'failed'] -> Option B
Quick Check:
accepted_values with 'values' key = - name: transactions
columns:
- name: status
tests:
- accepted_values:
values: ['pending', 'completed', 'failed'] [OK]
Hint: Use accepted_values with 'values' list for allowed values [OK]
Common Mistakes:
Using unique or not_null instead of accepted_values