
Query profiling and optimization in dbt - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this dbt model's run result summary?

Given a dbt run with the following model configuration, what will be the value of execution_time in seconds if the model took 3 minutes and 45 seconds to run?

dbt
models/my_model.sql
-- model config:
-- materialized: table
-- run_started_at: 2024-04-01 10:00:00
-- run_finished_at: 2024-04-01 10:03:45
A. 345
B. 180
C. 225
D. 375
💡 Hint

Convert minutes and seconds to total seconds.
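
The conversion in the hint can be checked with a short Python sketch. It uses the two timestamps from the model comments above; the variable names are illustrative and are not dbt identifiers.

```python
from datetime import datetime

# Timestamps copied from the model comments in the question.
run_started_at = datetime.fromisoformat("2024-04-01 10:00:00")
run_finished_at = datetime.fromisoformat("2024-04-01 10:03:45")

# Subtracting two datetimes yields a timedelta; total_seconds() returns a float.
execution_time = (run_finished_at - run_started_at).total_seconds()

# The same figure from the stated duration: 3 minutes and 45 seconds.
from_duration = 3 * 60 + 45

print(execution_time)  # 225.0
print(from_duration)   # 225
```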

Data Output
intermediate
Which option shows the correct profile output for a slow query?

You ran a dbt model, and its profile output showed that the query took 120 seconds and scanned 1 billion rows. Which option correctly represents this profiling data as a dictionary?

A. {"execution_time_sec": 120, "rows_scanned": 1000000000}
B. {"execution_time_sec": "120 seconds", "rows_scanned": 1000000000}
C. {"execution_time_sec": 120, "rows_scanned": 1000000}
D. {"execution_time_sec": 120, "rows_scanned": "1 billion"}
💡 Hint

Use numeric values for time and rows scanned.
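
One way to see why numeric values matter is to compute with them. The sketch below builds a hypothetical profile record from the figures in the question and derives a throughput number. The key names follow the answer options and are not a dbt-defined schema.

```python
# Hypothetical profile record using the figures from the question.
profile = {"execution_time_sec": 120, "rows_scanned": 1_000_000_000}

# Numeric values support arithmetic directly, e.g. throughput in rows per second.
rows_per_sec = profile["rows_scanned"] / profile["execution_time_sec"]

# A string such as "120 seconds" would need parsing before any calculation.
all_numeric = all(isinstance(v, (int, float)) for v in profile.values())

print(rows_per_sec)
print(all_numeric)
```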

🔧 Debug
advanced
Identify the cause of slow query in this dbt model

Consider this dbt model SQL snippet:

select
  user_id,
  count(*) as total_orders
from orders
where order_date >= '2023-01-01'
group by user_id
order by total_orders desc
limit 10

Which of the following is the most likely cause of slow performance?

A. Missing index on order_date column causing full table scan
B. Using limit without order by causes slow sorting
C. Grouping by user_id is invalid syntax
D. Filtering on order_date is ignored by the database
💡 Hint

Think about what helps speed up filtering on date columns.
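
The effect of an index on a date filter can be inspected with a query plan. The sketch below runs the query from the question against an in-memory SQLite database, which stands in for a real warehouse purely for illustration. The table definition and the index name `idx_orders_order_date` are assumptions. Plan wording differs between engines, and some warehouses used with dbt rely on partitioning or clustering rather than indexes.

```python
import sqlite3

QUERY = """
select user_id, count(*) as total_orders
from orders
where order_date >= '2023-01-01'
group by user_id
order by total_orders desc
limit 10
"""

def plan_details(conn):
    """Return the detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

conn = sqlite3.connect(":memory:")
# Assumed minimal schema; the question does not define the orders table.
conn.execute(
    "create table orders (id integer primary key, user_id integer, order_date text)"
)

before = plan_details(conn)  # no index on order_date yet
conn.execute("create index idx_orders_order_date on orders (order_date)")
after = plan_details(conn)   # index now available for the date filter

print(before)
print(after)
```

Comparing the two printed plans shows whether the engine switches from scanning the whole table to searching through the index for the `order_date` condition.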

🚀 Application
advanced
Choose the best dbt configuration to optimize incremental model runs

You have a large table that updates daily. You want your dbt incremental model to only process new rows based on a timestamp column updated_at. Which configuration snippet will achieve this?

A
incremental_strategy: append
unique_key: id
where: updated_at > (select max(updated_at) from {{ this }})
B
incremental_strategy: merge
unique_key: id
where: updated_at > (select max(updated_at) from {{ this }})
C
incremental_strategy: insert_overwrite
unique_key: id
where: updated_at < (select max(updated_at) from {{ this }})
D
incremental_strategy: append
unique_key: id
where: updated_at < (select min(updated_at) from {{ this }})
💡 Hint

Use a strategy that updates existing rows and inserts new ones.
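
The behaviour the hint describes can be simulated in plain Python. This is a sketch of merge semantics only, not dbt's implementation: the function name, row data, and column values are made up for illustration. In an actual dbt project the timestamp filter is usually written in the model SQL inside an `is_incremental()` check rather than as a config key.

```python
def incremental_merge(target, source, unique_key="id", ts="updated_at"):
    """Sketch of merge semantics: keep only source rows newer than the target's
    latest timestamp, then update matching keys and insert the rest."""
    high_water = max((row[ts] for row in target), default=None)
    new_rows = [r for r in source if high_water is None or r[ts] > high_water]

    merged = {row[unique_key]: row for row in target}
    for row in new_rows:
        merged[row[unique_key]] = row  # update if key exists, insert otherwise
    return sorted(merged.values(), key=lambda r: r[unique_key])

# Made-up rows for illustration only.
target = [
    {"id": 1, "status": "placed", "updated_at": "2024-04-01"},
    {"id": 2, "status": "placed", "updated_at": "2024-04-02"},
]
source = target + [
    {"id": 1, "status": "shipped", "updated_at": "2024-04-03"},  # changed row
    {"id": 3, "status": "placed", "updated_at": "2024-04-03"},   # new row
]

result = incremental_merge(target, source)
for row in result:
    print(row)
```

With these rows the result holds one row per `id`, and the row for `id` 1 carries the newer status. An append-only approach would instead leave two rows for `id` 1.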

🧠 Conceptual
expert
What is the primary benefit of using dbt's query profiling features?

Choose the best explanation for why dbt's query profiling is important in data projects.

A. It generates visual dashboards for business users without coding.
B. It automatically fixes SQL syntax errors in models.
C. It replaces the need for database indexes by caching results.
D. It helps identify slow queries and resource-heavy operations to optimize performance.
💡 Hint

Think about what profiling means in computing.
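
As a concrete illustration of finding slow models, the sketch below ranks models by run time. It assumes a structure shaped like dbt's `target/run_results.json` artifact, with a `results` list whose entries carry `unique_id` and `execution_time`. The sample values and model names are made up, and the field layout should be checked against the dbt version in use.

```python
# Made-up sample mimicking the assumed shape of dbt's target/run_results.json.
run_results = {
    "results": [
        {"unique_id": "model.shop.stg_orders", "status": "success", "execution_time": 4.2},
        {"unique_id": "model.shop.fct_orders", "status": "success", "execution_time": 225.0},
        {"unique_id": "model.shop.dim_users", "status": "success", "execution_time": 12.7},
    ]
}

def slowest_models(results, top_n=2):
    """Return (unique_id, execution_time) pairs, slowest first."""
    ranked = sorted(
        results["results"], key=lambda r: r["execution_time"], reverse=True
    )
    return [(r["unique_id"], r["execution_time"]) for r in ranked[:top_n]]

slowest = slowest_models(run_results)
for unique_id, seconds in slowest:
    print(f"{unique_id}: {seconds:.1f}s")
```

Sorting by run time is only a starting point; the models at the top of the list are the ones worth inspecting with the warehouse's own query plan tools.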