Bird
Raised Fist0
dbtdata~5 mins

Testing model outputs in dbt - Cheat Sheet & Quick Revision

Choose your learning style10 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Recall & Review
beginner
What is the main purpose of testing model outputs in dbt?
To ensure that the data produced by your models is accurate, consistent, and meets expected quality standards before using it for analysis or reporting.
Click to reveal answer
beginner
Name two common types of tests used in dbt to validate model outputs.
1. Unique tests - to check if a column has unique values.
2. Not null tests - to ensure no missing values in a column.
Click to reveal answer
intermediate
How does a dbt 'schema test' help in testing model outputs?
Schema tests check the structure and data quality of your model outputs by validating constraints like uniqueness, non-null values, and referential integrity.
Click to reveal answer
intermediate
What is a 'custom test' in dbt and when would you use it?
A custom test is a user-defined SQL query that checks specific conditions in your data. Use it when built-in tests don't cover your unique data quality rules.
Click to reveal answer
beginner
Why is it important to run tests regularly on model outputs in dbt?
Regular testing catches data issues early, prevents bad data from spreading, and maintains trust in your data pipelines and reports.
Click to reveal answer
Which dbt test checks if a column contains no missing values?
AReferential integrity test
BUnique test
CNot null test
DCustom test
What does a unique test in dbt verify?
AThat data types are correct
BThat no values are missing
CThat values match another table
DThat all values in a column are different
When should you create a custom test in dbt?
AWhen you want to speed up model runs
BWhen built-in tests do not cover your specific data rules
CWhen you want to change data types
DWhen you want to create new models
What is the benefit of running tests on model outputs before analysis?
ATo ensure data accuracy and reliability
BTo reduce storage space
CTo speed up SQL queries
DTo create visualizations
Which of the following is NOT a typical dbt test?
AData type test
BUnique test
CNot null test
DReferential integrity test
Explain how you would use dbt tests to ensure your model outputs are trustworthy.
Think about built-in tests and when to create your own.
You got /5 concepts.
    Describe the difference between a schema test and a custom test in dbt.
    Consider what each test type validates.
    You got /3 concepts.

      Practice

      (1/5)
      1. What is the main purpose of testing model outputs in dbt?
      easy
      A. To ensure the data is accurate and reliable
      B. To speed up the data loading process
      C. To create new tables automatically
      D. To delete old data from the database

      Solution

      1. Step 1: Understand the goal of testing in dbt

        Testing checks if the data produced by models is correct and trustworthy.
      2. Step 2: Identify the main benefit of testing outputs

        Accurate and reliable data helps users make good decisions and trust reports.
      3. Final Answer:

        To ensure the data is accurate and reliable -> Option A
      4. Quick Check:

        Testing = Accurate data [OK]
      Hint: Testing checks data correctness, not speed or deletion [OK]
      Common Mistakes:
      • Thinking tests speed up loading
      • Confusing testing with table creation
      • Assuming tests delete data
      2. Which of the following is the correct syntax to define a uniqueness test on a column user_id in a dbt model's schema.yml file?
      easy
      A. - model: users columns: - name: user_id tests: - unique
      B. - name: users columns: - name: user_id test: unique
      C. - name: users columns: - user_id tests: - unique
      D. - name: users columns: - name: user_id tests: - unique

      Solution

      1. Step 1: Check the correct key for model name

        The key to specify the model is name, not model.
      2. Step 2: Verify column and test syntax

        Each column uses name and tests are listed under tests as a list.
      3. Final Answer:

        - name: users columns: - name: user_id tests: - unique -> Option D
      4. Quick Check:

        Correct schema.yml syntax = - name: users columns: - name: user_id tests: - unique [OK]
      Hint: Use 'name' for model and column, 'tests' as list [OK]
      Common Mistakes:
      • Using 'model' instead of 'name' for model
      • Writing 'test' instead of 'tests'
      • Omitting 'name' for column
      3. Given this dbt test defined in schema.yml for the model orders:
      - name: orders
        columns:
          - name: order_id
            tests:
              - unique
              - not_null
      What will happen if the orders table has two rows with the same order_id and one row with order_id as NULL when you run dbt test?
      medium
      A. The test will fail because of duplicate and NULL values in order_id
      B. The test will pass because only one test can fail at a time
      C. The test will fail only for duplicate values, NULLs are ignored
      D. The test will pass because NULLs are allowed in unique tests

      Solution

      1. Step 1: Understand the tests applied

        The tests are unique and not_null on order_id.
      2. Step 2: Analyze the data issues

        Two rows have the same order_id (violates uniqueness) and one row has NULL order_id (violates not_null).
      3. Final Answer:

        The test will fail because of duplicate and NULL values in order_id -> Option A
      4. Quick Check:

        Duplicates + NULLs = test fail [OK]
      Hint: Both unique and not_null must pass for success [OK]
      Common Mistakes:
      • Assuming NULLs are allowed in unique tests
      • Thinking only one test failure causes pass
      • Ignoring NULL violation in not_null test
      4. You wrote this test in your schema.yml file:
      - name: customers
        columns:
          - name: email
            tests:
              - unique
              - not_null
      But when you run dbt test, you get an error saying Invalid test configuration. What is the likely cause?
      medium
      A. The test name should be unique and not_null without dashes
      B. The indentation of the tests list is incorrect
      C. The model name should be under models key in schema.yml
      D. The email column does not exist in the model

      Solution

      1. Step 1: Check the structure of schema.yml

        Tests must be defined under the models: key in schema.yml.
      2. Step 2: Identify missing models: key

        The snippet misses the models: root key, causing invalid configuration.
      3. Final Answer:

        The model name should be under models key in schema.yml -> Option C
      4. Quick Check:

        Missing 'models:' key = config error [OK]
      Hint: Always start schema.yml tests under 'models:' key [OK]
      Common Mistakes:
      • Removing dashes from test names
      • Incorrect indentation of tests list
      • Not placing model under 'models:'
      5. You want to test that the status column in your transactions model only contains the values 'pending', 'completed', or 'failed'. Which test definition in schema.yml correctly enforces this?
      hard
      A. - name: transactions columns: - name: status tests: - values_in: ['pending', 'completed', 'failed']
      B. - name: transactions columns: - name: status tests: - accepted_values: values: ['pending', 'completed', 'failed']
      C. - name: transactions columns: - name: status tests: - accepted_values: ['pending', 'completed', 'failed']
      D. - name: transactions columns: - name: status tests: - unique - not_null

      Solution

      1. Step 1: Identify the correct test for allowed values

        The accepted_values test checks if column values are in a list.
      2. Step 2: Check correct syntax for accepted_values

        The test requires a dictionary with key values listing allowed values.
      3. Final Answer:

        - name: transactions columns: - name: status tests: - accepted_values: values: ['pending', 'completed', 'failed'] -> Option B
      4. Quick Check:

        accepted_values with 'values' key = - name: transactions columns: - name: status tests: - accepted_values: values: ['pending', 'completed', 'failed'] [OK]
      Hint: Use accepted_values with 'values' list for allowed values [OK]
      Common Mistakes:
      • Using unique or not_null instead of accepted_values
      • Omitting 'values:' key under accepted_values
      • Using wrong test name like values_in