0
0
dbtdata~20 mins

Staging, intermediate, and marts pattern in dbt - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Master of Staging, Intermediate, and Marts Pattern
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate
2:00remaining
Purpose of the Staging Layer in dbt

In the Staging, intermediate, and marts pattern used in dbt, what is the main purpose of the staging layer?

ATo clean and standardize raw source data for easier use in later models
BTo create final business reports and dashboards directly from raw data
CTo aggregate data into summary tables for quick analysis
DTo store backup copies of raw data without transformation
Attempts:
2 left
💡 Hint

Think about what you do first when you get messy data from different sources.

Predict Output
intermediate
2:00remaining
Output of a dbt Model SQL Query in the Intermediate Layer

Given this dbt model SQL in the intermediate layer:

select
  user_id,
  count(order_id) as total_orders
from {{ ref('stg_orders') }}
group by user_id

What will be the output schema of this model?

dbt
select
  user_id,
  count(order_id) as total_orders
from {{ ref('stg_orders') }}
group by user_id
AColumns: user_id (int), total_orders (int) with one row per order
BColumns: order_id (int), total_orders (int) with one row per order
CColumns: user_id (int), total_orders (int) with one row per user
DColumns: user_id (int), order_id (int) with one row per order
Attempts:
2 left
💡 Hint

Look at the GROUP BY clause and the aggregation function.

data_output
advanced
2:00remaining
Resulting DataFrame Shape After Running a Mart Model

Consider a mart model that joins the intermediate user orders model with a user demographics model on user_id. If the intermediate model has 1000 users and the demographics model has 950 users, what is the number of rows in the resulting mart model after an inner join?

A1000 rows, because all users from the intermediate model are kept
B50 rows, because only users missing in demographics are kept
C1950 rows, because rows from both models are combined
D950 rows, because only users present in both models are kept
Attempts:
2 left
💡 Hint

Remember what an inner join does to rows when keys don't match.

🔧 Debug
advanced
2:00remaining
Identifying the Error in a dbt Model Reference

Look at this dbt model SQL snippet:

select * from {{ ref('intermediate_user_orders') }}

When running dbt, it fails with an error saying the referenced model does not exist. What is the most likely cause?

AThe model name in ref is misspelled or does not match any model file
BThe SQL syntax is invalid because of missing semicolon
CThe {{ ref() }} function cannot be used inside select statements
DThe model file is missing a schema declaration
Attempts:
2 left
💡 Hint

Check the spelling and existence of the referenced model.

🚀 Application
expert
3:00remaining
Choosing the Correct Layer for a New Data Transformation

You have raw sales data coming from multiple sources with inconsistent date formats and missing values. You want to create a model that standardizes dates and fills missing values before calculating monthly sales totals. According to the staging, intermediate, and marts pattern, where should you place the model that standardizes dates and fills missing values?

AIn the marts layer, because it creates final reports
BIn the staging layer, because it prepares raw data for further use
COutside dbt, because data cleaning should not be done in dbt
DIn the intermediate layer, because it aggregates data for analysis
Attempts:
2 left
💡 Hint

Think about where raw data is first cleaned and standardized.