We use full refresh and incremental methods to update data efficiently. Full refresh reloads everything, while incremental updates only add new or changed data.
Full refresh vs incremental in dbt
Introduction
Use a full refresh:
When you want to reload all data from scratch to ensure accuracy.
When your data source is small or is frequently replaced in full.
Use an incremental update:
When you want to save time by only adding new data instead of reloading all.
When your data grows large and full reloads take too long.
When you want to keep your data warehouse up to date with recent changes.
Syntax
dbt
models:
  my_model:
    materialized: incremental
    incremental_strategy: merge  # availability depends on the adapter; merge matches rows on unique_key
    unique_key: id
-- SQL inside the model
{{ config(materialized='incremental', unique_key='id') }}
SELECT * FROM source_table
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
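{% endif %}
materialized: defines how dbt builds the model (for example table, view, or incremental).
is_incremental(): returns true only when the target table already exists, the model is materialized as incremental, and dbt is not running with --full-refresh. SQL inside the block is included only in that case.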
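The conditions under which is_incremental() is true can be sketched in plain Python. This is not dbt's code: would_run_incrementally and its parameters are hypothetical names used only to spell out the three checks.

```python
def would_run_incrementally(table_exists: bool, materialized: str, full_refresh: bool) -> bool:
    """Hypothetical helper mirroring the documented conditions for is_incremental()."""
    return table_exists and materialized == "incremental" and not full_refresh


# First run: the target table does not exist yet, so the WHERE filter is skipped.
print(would_run_incrementally(False, "incremental", False))  # False
# Later run: the table exists, so only new or changed rows are selected.
print(would_run_incrementally(True, "incremental", False))   # True
# Run with --full-refresh: the filter is skipped and the table is rebuilt.
print(would_run_incrementally(True, "incremental", True))    # False
```

If any one of the three checks fails, the model's SELECT runs without the WHERE filter and loads everything.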
Examples
This is a full refresh example. With materialized='table', dbt rebuilds the whole table on every run.
dbt
{{ config(materialized='table') }}
SELECT * FROM source_table
This is an incremental model. It adds only new or updated rows based on updated_at.
dbt
{{ config(materialized='incremental', unique_key='id') }}
SELECT * FROM source_table
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
Sample Program
This dbt model uses incremental materialization. On first run, it loads all data. On later runs, it adds only rows with newer updated_at values.
dbt
-- dbt model: incremental_example.sql
{{ config(materialized='incremental', unique_key='id') }}
SELECT id, name, updated_at FROM source_table
{% if is_incremental() %}
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
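To see the effect of the filter without a warehouse, the same two-phase behavior can be imitated with Python's built-in sqlite3 module. This is a rough simulation, not what dbt executes: the table name target stands in for {{ this }}, run_model and target_exists are hypothetical helpers, and the sample rows are made up. It only appends rows, so unique_key handling is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_table (id INTEGER, name TEXT, updated_at TEXT)")
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?, ?)",
    [(1, "alice", "2024-01-01"), (2, "bob", "2024-01-02")],
)


def target_exists(conn):
    """Stand-in for the 'table already exists' check behind is_incremental()."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'target'"
    ).fetchone()
    return row is not None


def run_model(conn):
    """Simulate one run of the model and return the row count of target."""
    base = "SELECT id, name, updated_at FROM source_table"
    if not target_exists(conn):
        # First run: build the table from all source rows.
        conn.execute("CREATE TABLE target AS " + base)
    else:
        # Later runs: add only rows newer than what target already holds.
        conn.execute(
            "INSERT INTO target " + base
            + " WHERE updated_at > (SELECT MAX(updated_at) FROM target)"
        )
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]


first = run_model(conn)   # loads both existing rows
conn.execute("INSERT INTO source_table VALUES (3, 'carol', '2024-01-03')")
second = run_model(conn)  # adds only the new row
third = run_model(conn)   # nothing newer, so nothing is added
print(first, second, third)  # 2 3 3
```

The third run shows why incremental loads are cheap: when the source has nothing newer than MAX(updated_at), no rows are copied.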
Important Notes
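Full refresh rebuilds the entire table and can be slow for large data. To force one on an incremental model, run dbt with the --full-refresh flag.
Incremental models need a unique key to avoid duplicates. Without unique_key, dbt appends the selected rows, so a changed row can appear twice.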
Use is_incremental() to write SQL that runs only during incremental updates.
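The note about unique keys can be illustrated with a small sqlite3 simulation. It is only a sketch under assumptions: incremental_run and build are hypothetical helpers, target stands in for the model's table, and the delete-then-insert step is one simple way to replace matching rows, not necessarily what your adapter does.

```python
import sqlite3


def incremental_run(conn, unique_key=None):
    """Simulate one incremental run, optionally replacing rows that share unique_key."""
    select_sql = (
        "SELECT id, name, updated_at FROM source_table "
        "WHERE updated_at > (SELECT MAX(updated_at) FROM target)"
    )
    conn.execute("CREATE TEMP TABLE new_rows AS " + select_sql)
    if unique_key is not None:
        # Remove the old versions of rows that are about to be re-inserted.
        # The column name is interpolated directly, which is fine for a sketch only.
        conn.execute(
            f"DELETE FROM target WHERE {unique_key} IN (SELECT {unique_key} FROM new_rows)"
        )
    conn.execute("INSERT INTO target SELECT id, name, updated_at FROM new_rows")
    conn.execute("DROP TABLE new_rows")


def build(unique_key):
    """Load two rows, change one in the source, then run incrementally."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source_table (id INTEGER, name TEXT, updated_at TEXT)")
    conn.execute("CREATE TABLE target (id INTEGER, name TEXT, updated_at TEXT)")
    rows = [(1, "alice", "2024-01-01"), (2, "bob", "2024-01-02")]
    conn.executemany("INSERT INTO source_table VALUES (?, ?, ?)", rows)
    conn.executemany("INSERT INTO target VALUES (?, ?, ?)", rows)
    # Row 1 changes in the source after the first load.
    conn.execute(
        "UPDATE source_table SET name = 'alicia', updated_at = '2024-01-03' WHERE id = 1"
    )
    incremental_run(conn, unique_key)
    return conn.execute("SELECT id, name FROM target ORDER BY id, updated_at").fetchall()


without_key = build(None)
with_key = build("id")
print(without_key)  # [(1, 'alice'), (1, 'alicia'), (2, 'bob')]
print(with_key)     # [(1, 'alicia'), (2, 'bob')]
```

Without a key the changed row is appended next to its old version. With the key, the old version is replaced.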
Summary
Full refresh reloads all data every time.
Incremental updates add only new or changed data.
Incremental saves time and resources for large datasets.