What is the main purpose of dbt in data workflows?
Think about what dbt does after data is loaded into a warehouse.
dbt focuses on transforming and testing data inside the warehouse using SQL and version control, not on storage, visualization, or data collection.
In the ELT (Extract, Load, Transform) process, where does dbt fit?
dbt works after data is already loaded into the warehouse.
dbt is designed to transform data inside the warehouse after extraction and loading are complete.
Given a dbt model that selects customer_id and total_sales from a raw sales table, what will be the structure of the resulting table after running dbt?
select customer_id, sum(sales_amount) as total_sales from raw_sales group by customer_id
dbt models create tables or views based on the SQL query you write.
dbt runs the query and materializes the result as a table or view with two columns, customer_id and total_sales, holding one row per customer_id.
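The shape of that result can be checked with a small sketch. It assumes an in-memory SQLite database as a stand-in for the warehouse and a few made-up raw_sales rows; dbt itself is not invoked, and the ORDER BY is added only to keep the output stable.

```python
import sqlite3

# In-memory SQLite stands in for the warehouse; dbt itself is not invoked here.
conn = sqlite3.connect(":memory:")
conn.execute("create table raw_sales (customer_id integer, sales_amount real)")

# Hypothetical sample rows, invented for illustration.
conn.executemany(
    "insert into raw_sales values (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5)],
)

# The model's SELECT, with an ORDER BY added only to make the output stable.
cursor = conn.execute(
    "select customer_id, sum(sales_amount) as total_sales "
    "from raw_sales group by customer_id order by customer_id"
)
columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(columns)  # ['customer_id', 'total_sales']
print(rows)     # [(1, 15.0), (2, 7.5)]
```

Three input rows collapse to two output rows, one per customer, which is the structure the materialized model would have.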
What error will occur when running this dbt model SQL?
select customer_id, total_sales from raw_sales group by customer_id
Check if all selected columns are either grouped or aggregated.
Because total_sales is selected without being aggregated or listed in GROUP BY, most warehouses reject the query with a GROUP BY error.
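Two common ways to repair such a query are to aggregate the column or to add it to the GROUP BY. The sketch below runs both repaired forms. It assumes raw_sales has a total_sales column (as the failing query implies) and uses made-up rows in an in-memory SQLite database. SQLite is used only to run the corrected queries: it tolerates bare columns in a GROUP BY, so it would not reproduce the error a stricter warehouse raises.

```python
import sqlite3

# In-memory SQLite as a stand-in warehouse; the rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table raw_sales (customer_id integer, total_sales real)")
conn.executemany(
    "insert into raw_sales values (?, ?)",
    [(1, 10.0), (1, 5.0), (2, 7.5)],
)

# Repair 1: aggregate the column that is not in the GROUP BY.
aggregated = conn.execute(
    "select customer_id, sum(total_sales) as total_sales "
    "from raw_sales group by customer_id order by customer_id"
).fetchall()

# Repair 2: group by every selected column instead of aggregating.
grouped = conn.execute(
    "select customer_id, total_sales "
    "from raw_sales group by customer_id, total_sales "
    "order by customer_id, total_sales"
).fetchall()

print(aggregated)  # [(1, 15.0), (2, 7.5)]
print(grouped)     # [(1, 5.0), (1, 10.0), (2, 7.5)]
```

The two repairs answer different questions: the first returns one total per customer, the second returns each distinct (customer_id, total_sales) pair.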
After running dbt tests on a model, you see this output:
PASS test_not_null_customer_id (1 passed)
FAIL test_unique_order_id (2 failed)
What does this mean?
Look at what PASS and FAIL indicate for each test.
PASS means the test succeeded: customer_id contains no nulls. FAIL means the uniqueness test on order_id found duplicates, and the count of 2 is the number of failing records the test returned.
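A dbt test passes when its query returns no rows and fails when it returns some. The sketch below imitates that idea with hand-written queries that approximate what a not-null check and a uniqueness check look for. The orders table, its rows, and the exact SQL are assumptions for illustration, run against in-memory SQLite rather than through dbt.

```python
import sqlite3

# In-memory SQLite as a stand-in warehouse; table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("create table orders (order_id integer, customer_id integer)")
conn.executemany(
    "insert into orders values (?, ?)",
    [(100, 1), (100, 2), (101, 1), (101, 3), (102, 2)],
)

# Not-null check: any row returned is a failure.
null_failures = conn.execute(
    "select * from orders where customer_id is null"
).fetchall()

# Uniqueness check: one failing record per order_id that appears more than once.
unique_failures = conn.execute(
    "select order_id, count(*) as n_records from orders "
    "where order_id is not null "
    "group by order_id having count(*) > 1 order by order_id"
).fetchall()

print("not_null:", "PASS" if not null_failures else f"FAIL {len(null_failures)}")
print("unique:", "PASS" if not unique_failures else f"FAIL {len(unique_failures)}")
print(unique_failures)  # [(100, 2), (101, 2)]
```

With this sample data the not-null check returns nothing and passes, while the uniqueness check returns two records (order_id 100 and 101) and fails.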