How to Use dbt Snapshot: Syntax, Example, and Tips

Use dbt snapshot to track changes in source data by defining a snapshot file with a unique key and a change-detection strategy. Each time you run dbt snapshot, dbt records the changes it finds in a snapshot table, building up a history of the data over time.

Syntax

A dbt snapshot is defined in a .sql file inside the snapshots/ folder. It requires a unique key to identify records and a strategy for change detection. The timestamp strategy also requires an updated_at column, while the check strategy requires a list of columns to compare.

  • unique_key: The column(s) that uniquely identify a record.
  • strategy: Either timestamp or check. timestamp uses a timestamp column to detect changes. check compares the columns listed in check_cols (or every column when check_cols is set to 'all').
  • updated_at: The timestamp column used with the timestamp strategy.
sql
{% snapshot my_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='id',
    strategy='timestamp',
    updated_at='last_updated'
) }}

select * from source_table

{% endsnapshot %}
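
When the source has no reliable timestamp column, the check strategy is the alternative described above. A minimal sketch, assuming a hypothetical source_table with status and amount columns (these column names and the snapshot name are illustrative, not from a real project):

```sql
{% snapshot my_check_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='id',
    strategy='check',
    check_cols=['status', 'amount']
) }}

-- status and amount are hypothetical columns; a new snapshot row is
-- written only when one of the columns in check_cols changes
select * from source_table

{% endsnapshot %}
```

Listing specific columns in check_cols keeps the comparison narrow; setting check_cols to 'all' compares every selected column instead.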

Example

This example shows a snapshot that tracks changes in a customers table using the timestamp strategy. It captures changes based on the updated_at column and stores snapshots in the snapshots schema.

sql
{% snapshot customers_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
) }}

select
  customer_id,
  name,
  email,
  updated_at
from raw.customers

{% endsnapshot %}
Output
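On the first run, dbt creates the table snapshots.customers_snapshot. On later runs, it inserts a new row for each changed customer record and closes out the previous version. dbt adds metadata columns such as dbt_valid_from and dbt_valid_to to mark the period in which each version was valid.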
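
Once the snapshot has run, the history can be queried like any other table. A minimal sketch, assuming dbt's default metadata columns dbt_valid_from and dbt_valid_to, where the current version of a record has dbt_valid_to set to null by default (the customer_id value 42 is made up for illustration):

```sql
-- Current version of every customer: the open-ended rows
select customer_id, name, email
from snapshots.customers_snapshot
where dbt_valid_to is null;

-- Full change history for one customer (42 is a hypothetical id)
select customer_id, email, dbt_valid_from, dbt_valid_to
from snapshots.customers_snapshot
where customer_id = 42
order by dbt_valid_from;
```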

Common Pitfalls

  • Not specifying a unique_key causes errors because dbt cannot identify records uniquely.
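  • Using the wrong strategy for your data can miss changes or create unnecessary snapshot rows. For example, timestamp misses changes when the timestamp column is not updated, and check adds rows whenever a frequently changing compared column moves.
  • For the timestamp strategy, the updated_at column must be accurate and must be updated whenever the source row changes.
  • Running dbt snapshot with an incomplete snapshot configuration will fail, so confirm the required settings before the first run.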
sql
{# Wrong: missing unique_key #}
{% snapshot bad_snapshot %}
{{ config(
    target_schema='snapshots',
    strategy='timestamp',
    updated_at='last_modified'
) }}
select * from source_table
{% endsnapshot %}

{# Right: include unique_key #}
{% snapshot good_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='id',
    strategy='timestamp',
    updated_at='last_modified'
) }}
select * from source_table
{% endsnapshot %}
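
Because the snapshot relies on unique_key identifying each record, it can help to confirm that the chosen column really is unique before the first run. A minimal sketch of such a check, assuming the same source_table and id column used in the example above:

```sql
-- Returns one row per duplicated id; an empty result means
-- id is safe to use as the snapshot's unique_key
select id, count(*) as row_count
from source_table
group by id
having count(*) > 1;
```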

Quick Reference

| Property | Description | Example |
|---|---|---|
| unique_key | Column(s) that uniquely identify each record | 'id' |
| strategy | Method to detect changes: 'timestamp' or 'check' | 'timestamp' |
| updated_at | Timestamp column used with 'timestamp' strategy | 'last_updated' |
| target_schema | Schema where snapshots are stored | 'snapshots' |
| select statement | Query selecting source data to snapshot | select * from source_table |

Key Takeaways

  • Define a snapshot with unique_key, strategy, and (for the timestamp strategy) updated_at to track data changes.
  • Use the timestamp strategy when you have a reliable updated_at column.
  • Run the dbt snapshot command to create or update snapshot tables.
  • Always set unique_key and choose the strategy that matches your data to prevent errors and missed changes.
  • Snapshots keep historical versions of your source data for auditing and analysis.