0
0
dbtdata~10 mins

Query profiling and optimization in dbt - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - Query profiling and optimization
Write SQL Model in dbt
Run dbt compile & run
Capture Query Profile
Analyze Query Steps
Identify Bottlenecks
Apply Optimizations
Re-run & Compare Performance
Repeat if needed or Finalize
This flow shows how dbt runs a query, profiles it, finds slow parts, applies fixes, and checks improvements.
Execution Sample
dbt
select user_id, count(*) as orders_count
from {{ ref('orders') }}
group by user_id
order by orders_count desc
limit 5
This dbt model counts orders per user, orders by count descending, and limits to top 5 users.
Execution Table
StepActionQuery PartProfile InfoResult/Next Step
1Compile dbt modelFull SQL with ref resolvedSQL ready to runProceed to run query
2Run querySELECT with GROUP BY and ORDER BYExecution time: 1200ms, Rows scanned: 100000Collect query profile
3Profile analysisGROUP BY aggregationHigh CPU usage, 800ms spent hereConsider indexing or pre-aggregation
4Profile analysisORDER BY and LIMITSort takes 300msCheck if sorting can be optimized
5Optimization appliedAdd index on user_idExpected to reduce GROUP BY timeRe-run query
6Re-run querySame SQLExecution time: 700ms, Rows scanned: 100000Performance improved
7Final checkCompare before and after43% faster executionOptimization successful
💡 Query optimized and performance improved, no further changes needed
Variable Tracker
VariableStartAfter Step 3After Step 5Final
Execution Time (ms)N/A1200N/A700
Rows ScannedN/A100000N/A100000
CPU Usage (ms)N/A800N/A500
Sort Time (ms)N/A300N/A200
Key Moments - 3 Insights
Why does the GROUP BY step take most of the time?
Because it processes many rows and aggregates them, as shown in execution_table row 3 where CPU usage is high.
How does adding an index help the query?
Adding an index on user_id speeds up grouping by allowing faster data access, reflected in execution_table row 6 with reduced execution time.
Why do we re-run the query after optimization?
To verify if the changes improved performance, as shown in execution_table rows 6 and 7 comparing before and after times.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, what was the execution time before optimization?
A700 ms
B1200 ms
C800 ms
D300 ms
💡 Hint
Check the 'Execution time' in row 2 of the execution_table.
At which step was the index added to optimize the query?
AStep 5
BStep 3
CStep 4
DStep 6
💡 Hint
Look for 'Optimization applied' in the execution_table.
If the sort time was not reduced after optimization, what would likely happen?
AExecution time would decrease more
BExecution time would increase
COverall execution time would remain the same
DRows scanned would increase
💡 Hint
Refer to variable_tracker columns for Sort Time and Execution Time changes.
Concept Snapshot
Query profiling in dbt means running your SQL models,
checking how long each part takes,
finding slow steps like GROUP BY or ORDER BY,
then applying fixes like indexes or pre-aggregations.
Re-run to confirm improvements.
Repeat until performance is good.
Full Transcript
In dbt, query profiling and optimization involves writing SQL models and running them to see how long each part takes. We start by compiling the model and running the query. Then we collect profile data like execution time and CPU usage. We analyze which parts, such as GROUP BY or ORDER BY, take the most time. Next, we apply optimizations like adding indexes. After that, we re-run the query to check if performance improved. This process repeats until the query runs efficiently. The execution table shows each step with timings and actions. Variable tracking helps us see how execution time and CPU usage change. Key moments clarify why grouping is slow, how indexes help, and why re-running is important. The quiz tests understanding of these steps and their effects. This method helps make dbt queries faster and more efficient.