0
0
dbtdata~10 mins

Snapshot tables for historical tracking in dbt - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - Snapshot tables for historical tracking
Start: Define source data
Create snapshot config
Run snapshot command
dbt compares current source data with snapshot
If changes detected
Insert new record
Snapshot table updated with history
End
This flow shows how dbt creates and updates snapshot tables by comparing source data and storing changes over time.
Execution Sample
dbt
{{ config(
  strategy = 'timestamp',
  updated_at = 'last_updated',
  unique_key = 'customer_id',
  check_cols = ['name', 'email']
)}}

select * from raw.customers
This snapshot config tracks changes in the customers table using the last_updated timestamp.
Execution Table
StepActionSource Data RowSnapshot Table StateChange DetectedResult
1Initial snapshot run{customer_id: 1, name: 'Alice', email: 'a@example.com', last_updated: '2024-01-01'}EmptyYes (new record)Insert record into snapshot
2Initial snapshot run{customer_id: 2, name: 'Bob', email: 'b@example.com', last_updated: '2024-01-01'}Contains AliceYes (new record)Insert record into snapshot
3Second run with no changes{customer_id: 1, name: 'Alice', email: 'a@example.com', last_updated: '2024-01-01'}Contains Alice, BobNoNo insert
4Second run with change{customer_id: 1, name: 'Alice', email: 'alice_new@example.com', last_updated: '2024-02-01'}Contains Alice, BobYes (email changed)Insert new version of Alice
5Second run with unchanged Bob{customer_id: 2, name: 'Bob', email: 'b@example.com', last_updated: '2024-01-01'}Contains Alice (2 versions), BobNoNo insert
💡 All source rows processed; snapshot table updated with new versions only when changes detected.
Variable Tracker
VariableStartAfter Step 1After Step 2After Step 3After Step 4After Step 5
Snapshot Table RowsEmptyAlice v1Alice v1, Bob v1Alice v1, Bob v1Alice v1, Alice v2, Bob v1Alice v1, Alice v2, Bob v1
Key Moments - 2 Insights
Why does the snapshot insert a new record for Alice in step 4 even though customer_id is the same?
Because the email changed, dbt detects a difference in the tracked columns and inserts a new version to keep history, as shown in execution_table step 4.
Why is there no insert for Bob in step 5?
Bob's data did not change since last snapshot, so dbt skips inserting a new record, as shown in execution_table step 5.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution table, what is the state of the snapshot table after step 2?
AContains only Alice's first version
BContains Alice and Bob's first versions
CEmpty
DContains Alice's second version
💡 Hint
Check the Snapshot Table State column in row for step 2.
At which step does dbt detect a change in Alice's data?
AStep 4
BStep 3
CStep 1
DStep 5
💡 Hint
Look for 'Change Detected' column showing 'email changed' in execution_table.
If the 'check_cols' included 'last_updated' only, what would happen at step 4?
ANo new version inserted because email changed but not last_updated
BSnapshot table would be emptied
CNew version inserted because last_updated changed
DAll records would be duplicated
💡 Hint
Consider how 'check_cols' controls which columns trigger snapshot inserts, see execution_table step 4.
Concept Snapshot
Snapshot tables track changes in source data over time.
Use dbt snapshot with unique_key and strategy.
Changes in tracked columns insert new records.
Unchanged rows are skipped to save space.
Snapshot tables keep full history for analysis.
Full Transcript
This visual execution shows how dbt snapshot tables work for historical tracking. First, source data is defined. Then a snapshot configuration specifies the unique key, strategy, and columns to check for changes. When the snapshot runs, dbt compares current source rows with existing snapshot records. If a row is new or changed, dbt inserts a new record version into the snapshot table. If no changes are detected, no insert happens. The execution table traces each step with source data, snapshot state, and actions. Variable tracking shows how snapshot table rows grow only when changes occur. Key moments clarify why new versions are inserted or skipped. The quiz tests understanding of snapshot state and change detection. The snapshot table approach helps keep a full history of data changes for analysis and auditing.