0
0
dbtdata~20 mins

Warehouse-specific optimizations in dbt - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Warehouse Optimization Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
Predict Output
intermediate
2:00remaining
Effect of clustering keys on query performance

Given a table clustered by customer_id, what is the expected effect on query performance when filtering by customer_id?

dbt
select * from sales where customer_id = 12345;
AThe query will perform slower because clustering keys add overhead.
BThe query will scan all micro-partitions, no performance change.
CThe query will scan fewer micro-partitions, improving performance.
DThe query will fail due to clustering key restrictions.
Attempts:
2 left
💡 Hint

Clustering keys group data physically by the key, reducing scan scope.

data_output
intermediate
2:00remaining
Result of using incremental model with unique key

What will be the number of rows in the target table after running this incremental model if the source has 1000 rows and 200 rows have changed?

dbt
incremental model with unique_key='id' that updates changed rows only
A1000 rows
B1200 rows
C800 rows
D200 rows
Attempts:
2 left
💡 Hint

Incremental models update existing rows based on unique keys.

🔧 Debug
advanced
2:00remaining
Identify the cause of slow query in dbt model

A dbt model runs slowly despite using clustering keys. Which option explains the likely cause?

dbt
select * from orders where order_date between '2023-01-01' and '2023-01-31'; -- clustered on customer_id
AThe clustering key is on customer_id, but the filter is on order_date, so clustering is ineffective.
BThe model is missing a primary key, causing slow performance.
CThe query is slow because clustering keys always slow down queries.
DThe query is slow because the table is not partitioned by order_date.
Attempts:
2 left
💡 Hint

Clustering helps only when filtering on the clustered columns.

🚀 Application
advanced
2:00remaining
Choosing partitioning strategy for large fact table

You have a large sales fact table with billions of rows. Which partitioning strategy is best to optimize queries filtering by date and region?

ADo not partition or cluster; rely on full table scans.
BPartition by region column and cluster by date column.
CCluster by both date and region columns without partitioning.
DPartition by date column and cluster by region column.
Attempts:
2 left
💡 Hint

Partitioning works best on low-cardinality columns used in filters.

🧠 Conceptual
expert
2:00remaining
Impact of materialized views on warehouse costs

How do materialized views affect compute costs in a cloud data warehouse?

AMaterialized views increase storage costs but do not affect compute costs.
BMaterialized views reduce compute costs by precomputing and storing results, reducing query time.
CMaterialized views have no impact on compute costs as they are just query shortcuts.
DMaterialized views increase compute costs because they require recomputation on every query.
Attempts:
2 left
💡 Hint

Think about how precomputed data affects query execution.