How to Use unique_key in Incremental Models in dbt
In dbt, use
unique_key in your incremental model configuration to specify the column(s) that uniquely identify rows. This helps dbt update only changed or new rows during incremental runs, avoiding duplicates and improving performance.Syntax
The unique_key is set inside the config block of an incremental model. It defines the column or list of columns that uniquely identify each row in your data.
Example parts:
unique_key='id': Single column as unique key.unique_key=['id', 'date']: Composite key with multiple columns.materialized='incremental': Specifies the model is incremental.
jinja
{{ config(
materialized='incremental',
unique_key='id'
) }}Example
This example shows a simple incremental model using unique_key to update rows based on the user_id column. It inserts new rows and updates existing ones without duplicates.
jinja
{{ config(
materialized='incremental',
unique_key='user_id'
) }}
select
user_id,
max(event_date) as last_event_date,
count(*) as event_count
from {{ source('app', 'events') }}
{% if is_incremental() %}
where event_date > (select max(last_event_date) from {{ this }})
{% endif %}
group by user_idOutput
A table with unique user_id rows, updated incrementally with new or changed data.
Common Pitfalls
Common mistakes when using unique_key in incremental models include:
- Not specifying
unique_key, causing duplicates on incremental runs. - Using a non-unique column as
unique_key, which breaks updates. - Forgetting to filter new data with
is_incremental(), leading to full reloads.
Always ensure your unique_key truly identifies each row uniquely.
jinja
{{ config(
materialized='incremental'
# Missing unique_key causes duplicates
) }}
-- Correct way:
{{ config(
materialized='incremental',
unique_key='id'
) }}Quick Reference
| Property | Description | Example |
|---|---|---|
| materialized | Sets model type to incremental | 'incremental' |
| unique_key | Column(s) that uniquely identify rows | 'id' or ['id', 'date'] |
| is_incremental() | Function to filter new data in incremental runs | if is_incremental() then filter new rows |
Key Takeaways
Set unique_key in incremental models to identify rows uniquely for updates.
Use is_incremental() to filter only new or changed data during incremental runs.
Without unique_key, incremental models may create duplicate rows.
Ensure unique_key columns truly have unique values to avoid update errors.
Incremental models improve performance by processing only new data.