0
0
DbtHow-ToBeginner ยท 3 min read

How to Use unique_key in Incremental Models in dbt

In dbt, use unique_key in your incremental model configuration to specify the column(s) that uniquely identify rows. This helps dbt update only changed or new rows during incremental runs, avoiding duplicates and improving performance.
๐Ÿ“

Syntax

The unique_key is set inside the config block of an incremental model. It defines the column or list of columns that uniquely identify each row in your data.

Example parts:

  • unique_key='id': Single column as unique key.
  • unique_key=['id', 'date']: Composite key with multiple columns.
  • materialized='incremental': Specifies the model is incremental.
jinja
{{ config(
    materialized='incremental',
    unique_key='id'
) }}
๐Ÿ’ป

Example

This example shows a simple incremental model using unique_key to update rows based on the user_id column. It inserts new rows and updates existing ones without duplicates.

jinja
{{ config(
    materialized='incremental',
    unique_key='user_id'
) }}

select
  user_id,
  max(event_date) as last_event_date,
  count(*) as event_count
from {{ source('app', 'events') }}
{% if is_incremental() %}
  where event_date > (select max(last_event_date) from {{ this }})
{% endif %}
group by user_id
Output
A table with unique user_id rows, updated incrementally with new or changed data.
โš ๏ธ

Common Pitfalls

Common mistakes when using unique_key in incremental models include:

  • Not specifying unique_key, causing duplicates on incremental runs.
  • Using a non-unique column as unique_key, which breaks updates.
  • Forgetting to filter new data with is_incremental(), leading to full reloads.

Always ensure your unique_key truly identifies each row uniquely.

jinja
{{ config(
    materialized='incremental'
    # Missing unique_key causes duplicates
) }}

-- Correct way:
{{ config(
    materialized='incremental',
    unique_key='id'
) }}
๐Ÿ“Š

Quick Reference

PropertyDescriptionExample
materializedSets model type to incremental'incremental'
unique_keyColumn(s) that uniquely identify rows'id' or ['id', 'date']
is_incremental()Function to filter new data in incremental runsif is_incremental() then filter new rows
โœ…

Key Takeaways

Set unique_key in incremental models to identify rows uniquely for updates.
Use is_incremental() to filter only new or changed data during incremental runs.
Without unique_key, incremental models may create duplicate rows.
Ensure unique_key columns truly have unique values to avoid update errors.
Incremental models improve performance by processing only new data.