How to Create a Snapshot in dbt: Syntax, Example, and Tips
To create a snapshot in dbt, define a snapshot block in a .sql file in the snapshots/ directory, specifying the source query, a unique key, and a change-detection strategy. Then run the dbt snapshot command to capture changes in your data over time.
Syntax
A dbt snapshot is defined in a .sql file inside the snapshots/ folder, wrapped in a Jinja {% snapshot %} ... {% endsnapshot %} block. The main parts are:
- snapshot name: A unique name for the snapshot, given in the opening {% snapshot %} tag.
- config block: Sets the target schema, unique key, and strategy.
- select statement: The source data to snapshot.
- strategy: How changes are detected (timestamp or check).
```sql
{% snapshot my_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='id',
        strategy='timestamp',
        updated_at='last_updated'
    )
}}

-- The query whose rows are tracked over time
select * from source_schema.source_table

{% endsnapshot %}
```
Example
This example creates a snapshot named customer_snapshot that tracks changes in the customers table using the updated_at timestamp column.
```sql
{% snapshot customer_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

-- A row is treated as changed when its updated_at value is newer
select * from raw.customers

{% endsnapshot %}
```
Output
Running dbt snapshot will create or update the snapshots.customer_snapshot table with historical versions of each customer row based on changes in the updated_at column.
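As a rough sketch of how the resulting table can be read: dbt adds validity columns such as dbt_valid_from and dbt_valid_to to every snapshot row, and by default the current version of a row has dbt_valid_to set to null. The queries below assume the snapshots.customer_snapshot table from the example above; the customer_id value 42 is a hypothetical id used only for illustration.

```sql
-- Current version of each customer: rows whose validity window is still open
-- (assumes the default behavior where dbt_valid_to is null for current rows).
select *
from snapshots.customer_snapshot
where dbt_valid_to is null;

-- Full change history for one customer, oldest version first.
-- 42 is a hypothetical customer_id.
select
    customer_id,
    dbt_valid_from,
    dbt_valid_to
from snapshots.customer_snapshot
where customer_id = 42
order by dbt_valid_from;
```

Each time a change is detected for a customer_id, dbt closes the previous row by filling in dbt_valid_to and inserts a new row with a fresh dbt_valid_from, so the second query returns one row per recorded version.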
Common Pitfalls
- Not specifying a unique_key causes errors because dbt needs a way to identify each row uniquely.
- Using the timestamp strategy without a reliable updated_at column will not track changes correctly.
- With the check strategy, leaving a column out of check_cols means changes to that column are missed.
- Snapshot files must live in a directory listed under snapshot-paths in dbt_project.yml (snapshots/ by default); files placed elsewhere are not picked up.
```sql
{# Wrong: missing unique_key #}
{% snapshot bad_snapshot %}

{{
    config(
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

select * from raw.customers

{% endsnapshot %}

{# Correct: added unique_key #}
{% snapshot good_snapshot %}

{{
    config(
        unique_key='customer_id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

select * from raw.customers

{% endsnapshot %}
```
Quick Reference
| Property | Description | Example |
|---|---|---|
| unique_key | Column(s) that uniquely identify a row | 'customer_id' |
| strategy | Method to detect changes: 'timestamp' or 'check' | 'timestamp' |
| updated_at | Column with last update timestamp (for 'timestamp' strategy) | 'updated_at' |
| check_cols | Columns to check for changes (for 'check' strategy) | ['name', 'email'] |
| target_schema | Schema where snapshot table is created | 'snapshots' |
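The table above lists check_cols, but the earlier examples only use the timestamp strategy. A minimal sketch of the check strategy might look like the following, written with dbt's Jinja snapshot block. The snapshot name customer_check_snapshot is hypothetical, and the sketch assumes raw.customers has name and email columns, as the check_cols example in the table suggests.

```sql
{# Hypothetical snapshot name; assumes raw.customers has name and email columns #}
{% snapshot customer_check_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='customer_id',
        strategy='check',
        check_cols=['name', 'email']
    )
}}

-- A new version is recorded when name or email differs from the stored row
select * from raw.customers

{% endsnapshot %}
```

The check strategy is useful when the source table has no trustworthy updated_at column. Changes to columns not listed in check_cols are ignored; setting check_cols='all' compares every column instead, at the cost of a wider comparison on each run.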
Key Takeaways
Create snapshots by defining a snapshot file in the snapshots/ directory with a unique name.
Specify a unique_key to identify rows and a strategy to detect changes.
Use the timestamp strategy with an updated_at column or the check strategy with columns to monitor.
Run the dbt snapshot command to capture and store historical data versions.
Avoid common mistakes like missing unique_key or incorrect strategy configuration.