dbt · data · ~10 mins

Why incremental models save time and cost in dbt - Visual Breakdown

Concept Flow - Why incremental models save time and cost
Start → Full Model Build → Process All Data Rows → Store Full Result → Next Run: Incremental Model → Identify New/Changed Rows → Process Only New/Changed Rows → Append/Update Incremental Result → Save Time and Cost → End
The first run builds the full dataset; every later run processes only rows that are new or have changed, saving time and cost.
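The flow above reduces to a single branch: build everything on the first run, otherwise filter to rows past a watermark. A minimal Python sketch of that control flow (the `run_model` function and `updated_at` column are illustrative, not dbt internals):

```python
def run_model(source, target, watermark_col="updated_at"):
    """Full build when the target is empty; otherwise incremental append."""
    if not target:
        # First run: process all rows.
        target.extend(source)
    else:
        # Later runs: process only rows newer than the watermark.
        watermark = max(row[watermark_col] for row in target)
        target.extend(row for row in source if row[watermark_col] > watermark)
    return target

# First run: full build of two rows.
target = run_model([{"id": 1, "updated_at": 1}, {"id": 2, "updated_at": 2}], [])
print(len(target))  # 2

# Second run: the source grew by one row; only that row is appended.
target = run_model([{"id": 1, "updated_at": 1}, {"id": 3, "updated_at": 3}], target)
print(len(target))  # 3
```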
Execution Sample
dbt
select * from source_table
where updated_at > (select max(updated_at) from target_table)
This query selects only rows added or updated since the last run, so each incremental run processes just the new slice of data.
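In a real dbt project this filter usually sits inside an `is_incremental()` guard, so it only applies after the first full build. A minimal sketch of such a model (the file name, `source_table` ref, and `updated_at`/`id` columns are illustrative):

```sql
-- models/orders_incremental.sql (illustrative name)
{{ config(materialized='incremental', unique_key='id') }}

select *
from {{ ref('source_table') }}

{% if is_incremental() %}
-- Only on incremental runs: keep rows newer than what the target already holds.
-- {{ this }} refers to the existing target table in the warehouse.
where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

On the first run the `where` clause is skipped entirely, producing the full build; on every later run it limits processing to new rows.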
Execution Table
| Step | Action | Data Processed | Time Taken | Cost Impact |
|------|--------|----------------|------------|-------------|
| 1 | Full model build | All rows (100,000) | 10 minutes | High cost |
| 2 | Store full result | 100,000 rows saved | Instant | No extra cost |
| 3 | Next run starts | Trigger incremental | Instant | No cost yet |
| 4 | Identify new rows | 5,000 new rows | 1 minute | Low cost |
| 5 | Process new rows only | 5,000 rows processed | 1 minute | Low cost |
| 6 | Append new rows | Result updated | Instant | No extra cost |
| 7 | End incremental run | Total rows 105,000 | 2 minutes total | Low cost |
💡 Incremental run stops after processing only new rows, saving time and cost compared to full build.
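The run pattern in the table can be reproduced with a small simulation. A sketch in Python, using synthetic rows, the row counts from the table, and an `updated_at` watermark (the timestamps themselves are made up):

```python
from datetime import datetime, timedelta

# Synthetic source table: 100,000 rows, each with an updated_at timestamp.
start = datetime(2024, 1, 1)
source = [{"id": i, "updated_at": start + timedelta(seconds=i)}
          for i in range(100_000)]

# Run 1 (steps 1-2): full build processes every row.
target = list(source)
print(len(target))  # 100000

# New data arrives: 5,000 rows with later timestamps.
source += [{"id": 100_000 + i, "updated_at": start + timedelta(days=2, seconds=i)}
           for i in range(5_000)]

# Run 2 (steps 4-5): incremental build processes only rows past the watermark.
watermark = max(row["updated_at"] for row in target)
new_rows = [row for row in source if row["updated_at"] > watermark]
print(len(new_rows))  # 5000

# Step 6: append the new rows to the stored result.
target += new_rows
print(len(target))  # 105000
```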
Variable Tracker
| Variable | Start (Full Build) | After Incremental Run |
|----------|--------------------|-----------------------|
| Rows Processed | 100,000 | 5,000 |
| Time Taken (minutes) | 10 | 2 |
| Cost | High | Low |
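The tracker's savings are easy to quantify. A quick check in Python using the numbers above:

```python
full_rows, incr_rows = 100_000, 5_000
full_minutes, incr_minutes = 10, 2

# The incremental run touches a small fraction of the rows and of the runtime.
print(f"{incr_rows / full_rows:.0%} of the rows")      # 5% of the rows
print(f"{incr_minutes / full_minutes:.0%} of the time")  # 20% of the time
```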
Key Moments - 3 Insights
Why does the incremental model process fewer rows on subsequent runs?
Because it only processes rows that are new or have changed since the last run, as shown in steps 4 and 5 of the execution table.
How does processing fewer rows save cost?
Less data processed means less compute time and fewer resources used, lowering cost, as seen in the reduced Time Taken and Cost Impact columns of the execution table.
What happens if the incremental model runs a full build instead?
It would process all rows again, taking more time and costing more, like step 1 in the execution table.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, how many rows are processed during the incremental run?
A. 105,000
B. 100,000
C. 5,000
D. 0
💡 Hint
Check the 'Data Processed' column at step 5 in the execution table.
At which step does the incremental model identify new or changed rows?
A. Step 2
B. Step 4
C. Step 6
D. Step 1
💡 Hint
Look at the 'Action' column in the execution table for the step that identifies new rows.
If the incremental model processed all rows every time, how would the 'Time Taken' change?
A. It would increase to 10 minutes
B. It would stay the same at 2 minutes
C. It would decrease to 1 minute
D. It would be zero
💡 Hint
Compare 'Time Taken' at step 1 and step 7 in the execution table.
Concept Snapshot
Incremental models process only new or changed data after the first full build.
This reduces the amount of data processed each run.
Less data means less compute time and lower cost.
Use a filter on updated timestamps to select new rows.
Incremental saves resources and speeds up data workflows.
Full Transcript
Incremental models in dbt start by building the full dataset once. On later runs, they only process new or updated rows since the last run. This is done by filtering data using a timestamp or similar marker. Processing fewer rows saves time and reduces compute costs. The execution table shows a full build processing 100,000 rows taking 10 minutes, while the incremental run processes only 5,000 rows in 2 minutes. This approach is efficient and cost-effective for large datasets.