How to Create a Snapshot in dbt: Syntax, Example, and Tips

To create a snapshot in dbt, add a .sql file to the snapshots/ directory containing a snapshot block that specifies the source query, a unique key, and a change-detection strategy. Then run the dbt snapshot command to capture changes in your data over time.

Syntax

A dbt snapshot is defined in a .sql file inside the snapshots/ folder. The main parts are:

  • snapshot name: Unique name for the snapshot.
  • config block: Defines the target schema, unique key, and strategy.
  • select statement: The source data to snapshot.
  • strategy: How changes are detected (e.g., timestamp or check).
sql
{# Snapshot name goes in the opening tag #}
{% snapshot my_snapshot %}

{{
  config(
    target_schema='snapshots',
    unique_key='id',
    strategy='timestamp',
    updated_at='last_updated'
  )
}}

select * from source_schema.source_table

{% endsnapshot %}

Example

This example creates a snapshot named customer_snapshot that tracks changes in the customers table using the updated_at timestamp column.

sql
{% snapshot customer_snapshot %}

{{
  config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
  )
}}

select * from raw.customers

{% endsnapshot %}
Output
The first dbt snapshot run creates the snapshots.customer_snapshot table. Each later run adds a new row version for every customer whose updated_at value has advanced and closes out the previous version, so the table accumulates the full history of each customer row.
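
To make the output concrete, here is a minimal sketch of querying the snapshot table. It assumes the snapshots.customer_snapshot table from the example above and relies on the dbt_valid_from and dbt_valid_to metadata columns that dbt adds to snapshot tables. By default the current version of a row has a null dbt_valid_to. The customer_id value 42 is a made-up placeholder.

```sql
-- Current version of every customer: the open record has no end date.
select *
from snapshots.customer_snapshot
where dbt_valid_to is null;

-- Change history for one customer (42 is a hypothetical id).
select customer_id, dbt_valid_from, dbt_valid_to
from snapshots.customer_snapshot
where customer_id = 42
order by dbt_valid_from;
```

Downstream models usually want only the first pattern. The second is useful for checking that a snapshot is recording changes as expected.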

Common Pitfalls

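  • Omitting unique_key causes an error, because dbt needs it to match each source row to its existing snapshot records.
  • The timestamp strategy depends on an updated_at column that reliably changes whenever a row changes; if the column is missing or not maintained, changes go undetected.
  • With the check strategy, only the columns listed in check_cols are compared, so changes to any unlisted column are missed.
  • By default dbt only looks for snapshot files in the snapshots/ directory (configurable through snapshot-paths in dbt_project.yml); snapshot blocks placed elsewhere are not picked up.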
sql
{# Wrong: missing unique_key #}
{% snapshot bad_snapshot %}

{{
  config(
    target_schema='snapshots',
    strategy='timestamp',
    updated_at='updated_at'
  )
}}

select * from raw.customers

{% endsnapshot %}

{# Correct: unique_key added #}
{% snapshot good_snapshot %}

{{
  config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
  )
}}

select * from raw.customers

{% endsnapshot %}
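
The check-strategy pitfall is easier to see with an example. The sketch below assumes the same raw.customers table and uses name and email, the columns shown in the Quick Reference table, as the columns to compare. The snapshot name customer_check_snapshot is hypothetical.

```sql
{% snapshot customer_check_snapshot %}

{{
  config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='check',
    check_cols=['name', 'email']
  )
}}

{# Only changes to name or email produce a new row version;
   changes to any other column go unrecorded. #}
select * from raw.customers

{% endsnapshot %}
```

Setting check_cols='all' compares every column instead, which avoids missed changes at the cost of more comparison work on each run.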

Quick Reference

| Property | Description | Example |
| --- | --- | --- |
| unique_key | Column(s) that uniquely identify a row | 'customer_id' |
| strategy | Method to detect changes: 'timestamp' or 'check' | 'timestamp' |
| updated_at | Column with last update timestamp (for 'timestamp' strategy) | 'updated_at' |
| check_cols | Columns to check for changes (for 'check' strategy) | ['name', 'email'] |
| target_schema | Schema where snapshot table is created | 'snapshots' |

Key Takeaways

  • Define each snapshot, under a unique name, in a .sql file in the snapshots/ directory.
  • Specify a unique_key to identify rows and a strategy to detect changes.
  • Use the timestamp strategy with an updated_at column, or the check strategy with a list of columns to monitor.
  • Run the dbt snapshot command to capture and store historical versions of your data.
  • Avoid common mistakes such as a missing unique_key or a misconfigured strategy.