
dbt-expectations for data quality - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of a dbt-expectations test on null values
Given the following dbt-expectations test configuration, what will be the result when run on a table where 5% of the rows have null values in the email column?

tests:
  - dbt_expectations.expect_column_values_to_not_be_null:
      column: email
      mostly: 0.95
A. Test passes because 95% of email values are not null
B. Test fails because any null value causes failure
C. Test passes only if no null values exist
D. Test fails because the mostly parameter is ignored
💡 Hint
The mostly parameter sets the minimum fraction of rows that must satisfy the expectation, so a limited share of exceptions is tolerated.
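The threshold rule described in the hint can be sketched in plain Python. This is an illustration of the idea only, not the package's implementation: the function name passes_with_mostly, the sample data, and the treatment of an empty column are all assumptions made for the example.

```python
def passes_with_mostly(values, mostly=1.0):
    """Return True when the share of non-null values is at least `mostly`."""
    total = len(values)
    if total == 0:
        return True  # assumption: an empty column has nothing to violate
    non_null = sum(1 for v in values if v is not None)
    return non_null / total >= mostly


# Hypothetical column: 100 rows, 5 of them null, so 95% are non-null.
emails = ["user@example.com"] * 95 + [None] * 5

print(passes_with_mostly(emails, mostly=0.95))  # True: 0.95 >= 0.95
print(passes_with_mostly(emails))               # False: the default demands 100%
```

The comparison is inclusive, so a column sitting exactly on the threshold still passes.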
Data Output
intermediate
Resulting data from a uniqueness test
You run the following dbt-expectations test on the user_id column:

tests:
  - dbt_expectations.expect_column_values_to_be_unique:
      column: user_id

What is the expected output if the table contains duplicate user_id values?
A. Test passes and returns all rows
B. Test passes but returns duplicate rows
C. Test fails but returns no rows
D. Test fails and returns rows with duplicate user_id values
💡 Hint
Uniqueness tests identify duplicates and fail if any exist.
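A dbt test passes when its query returns zero rows and fails otherwise, so a uniqueness check amounts to selecting the offending rows. The sketch below mimics that in Python; the helper duplicate_rows and the sample table are hypothetical, and the exact shape of what the real test returns (full rows versus grouped values) is an assumption here.

```python
from collections import Counter


def duplicate_rows(rows, key):
    """Return every row whose `key` value appears more than once."""
    counts = Counter(row[key] for row in rows)
    return [row for row in rows if counts[row[key]] > 1]


# Hypothetical table with one repeated user_id.
users = [
    {"user_id": 1, "name": "Ana"},
    {"user_id": 2, "name": "Ben"},
    {"user_id": 2, "name": "Cy"},
]

failing = duplicate_rows(users, "user_id")
test_passed = len(failing) == 0  # zero failing rows means the test passes

print(failing)      # the two rows that share user_id 2
print(test_passed)  # False
```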
Visualization
advanced
Visualizing distribution with dbt-expectations
You want to check if the age column in your dataset follows a normal distribution using dbt-expectations. Which visualization would best help you confirm this after running the test?
A. Pie chart showing percentage of null vs non-null age values
B. Histogram of age values with a normal curve overlay
C. Scatter plot of age vs user_id
D. Bar chart of counts of unique age values
💡 Hint
Normal distribution is best visualized with histograms and density curves.
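The idea behind a histogram with a normal overlay can be shown without a plotting library: bin the values, then compare each bin's observed count with the count a fitted normal distribution would predict. This is a rough sketch using only the standard library; the function histogram_vs_normal, the bin width, and the simulated ages are assumptions for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def histogram_vs_normal(values, bin_width=5):
    """Pair each bin's observed count with the count a fitted normal predicts."""
    fitted = NormalDist(mean(values), stdev(values))
    lo = int(min(values) // bin_width) * bin_width
    hi = int(max(values) // bin_width + 1) * bin_width
    rows = []
    for left in range(lo, hi, bin_width):
        right = left + bin_width
        observed = sum(1 for v in values if left <= v < right)
        expected = len(values) * (fitted.cdf(right) - fitted.cdf(left))
        rows.append((left, right, observed, expected))
    return rows


# Simulated ages (hypothetical data), seeded so the run is repeatable.
random.seed(0)
ages = [random.gauss(40, 10) for _ in range(1000)]

for left, right, observed, expected in histogram_vs_normal(ages):
    print(f"{left:>4}-{right:<4} observed={observed:<4} expected={expected:6.1f}")
```

Bins where the observed and expected counts diverge sharply are the places where the column departs from a normal shape; a chart makes the same comparison visually.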
🧠 Conceptual
advanced
Understanding the use of mostly in dbt-expectations
What is the main purpose of the mostly parameter in dbt-expectations tests?
A. To specify the minimum proportion of rows that must meet the expectation for the test to pass
B. To set the maximum number of rows to test
C. To define the exact number of rows that must fail the test
D. To ignore null values in the tested column
💡 Hint
Think about tolerance for exceptions in data quality tests.
🔧 Debug
expert
Identifying error in a dbt-expectations test configuration
You wrote this test in your dbt model:

tests:
  - dbt_expectations.expect_column_values_to_be_in_set:
      column: status
      value_set: ['active', 'inactive', 'pending']

When running dbt, you get an error. What is the cause?
A. The parameter name should be value_set not value_set (typo in option name)
B. The correct parameter name is value_set but it should be value_set (case sensitive)
C. The correct parameter name is value_set but it should be value_set (wrong parameter name)
D. The parameter name should be value_set but it must be value_set as a string, not a list
💡 Hint
Check the exact parameter name required by dbt-expectations for this test.
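Setting the configuration question aside, the check an in-set test is meant to perform is simple: flag every value that falls outside the allowed list. A minimal Python sketch follows; the helper values_outside_set and the sample statuses are hypothetical, and skipping nulls is an assumption (it mirrors how a SQL not in comparison treats NULL).

```python
def values_outside_set(values, value_set):
    """Return the non-null values that are not members of `value_set`."""
    allowed = set(value_set)
    # Assumption: nulls are skipped rather than flagged.
    return [v for v in values if v is not None and v not in allowed]


# Hypothetical status column with one unexpected value and one null.
statuses = ["active", "pending", "archived", "inactive", None]

unexpected = values_outside_set(statuses, ["active", "inactive", "pending"])
print(unexpected)  # ['archived']
```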