dbt · Concept · Beginner · 3 min read

What is Snapshot in dbt: Definition and Usage Explained

In dbt, a snapshot is a way to capture and store changes in your source data over time. It helps you track how records evolve by saving historical versions, enabling analysis of data changes and trends.
⚙️

How It Works

A snapshot in dbt works like taking a photo of your data each time you run the dbt snapshot command, which is usually scheduled at regular intervals. Imagine you have a list of customers whose details, such as address or status, can change over time. Instead of keeping only the latest version, a snapshot saves each detected change as a new row with timestamps.

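This means you can see not only the current state but also how the data looked in the past. Each time it runs, dbt compares the current source data with the latest version of each record in the snapshot table. For a record that changed, it closes out the old version by setting an end timestamp and inserts the new version as a new row. Unchanged records are left alone, which keeps the history compact.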
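The comparison step can be pictured with a query like the one below. This is only a conceptual sketch, not the SQL that dbt actually generates. It assumes the table names used in this article's example (raw.customers and snapshots.customers_snapshot) and a warehouse that supports the null-safe "is distinct from" operator, such as PostgreSQL or Snowflake.

```sql
-- Conceptual sketch: find customers whose tracked columns differ
-- from the current version stored in the snapshot table.
select src.customer_id
from raw.customers as src
join snapshots.customers_snapshot as snap
  on snap.customer_id = src.customer_id
 and snap.dbt_valid_to is null            -- compare against the current version only
where src.email  is distinct from snap.email
   or src.status is distinct from snap.status;
```

Rows returned by a query of this shape are the ones that would get a new version in the snapshot.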

💻

Example

This example shows a simple dbt snapshot that tracks changes in a customers table by comparing the email and status fields.

sql
{% snapshot customers_snapshot %}

{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='check',
      check_cols=['email', 'status']
    )
}}

{# A new version is recorded when email or status changes for a customer_id #}
select customer_id, email, status, updated_at
from raw.customers

{% endsnapshot %}
Output
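After dbt snapshot runs, the table 'snapshots.customers_snapshot' holds one row per version of each customer record. A new version is added whenever 'email' or 'status' changes, and dbt adds metadata columns such as 'dbt_valid_from' and 'dbt_valid_to' that show when each version was valid. By default, the current version of a record has 'dbt_valid_to' set to null.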
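Because the example query also selects updated_at, the timestamp strategy is a possible alternative to the check strategy. The sketch below assumes that updated_at is reliably set on every change in raw.customers, which the article does not confirm. The snapshot name customers_snapshot_by_timestamp is hypothetical.

```sql
{% snapshot customers_snapshot_by_timestamp %}

{{
    config(
      target_schema='snapshots',
      unique_key='customer_id',
      strategy='timestamp',
      updated_at='updated_at'
    )
}}

{# A new version is recorded when updated_at is newer than the stored version #}
select customer_id, email, status, updated_at
from raw.customers

{% endsnapshot %}
```

With this strategy dbt only needs to compare one column, so it is often the simpler choice when a trustworthy update timestamp exists. The check strategy remains useful when no such column is available.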
🎯

When to Use

Use snapshots when you need to track how data changes over time but your source system does not keep history. For example, tracking customer status changes, product price updates, or employee role changes.

This is useful for audits, trend analysis, or building slowly changing dimensions in data warehouses where you want to keep a full history of changes.
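For an audit, a common question is what a record looked like on a given date. A minimal sketch of such a point-in-time query follows, assuming the snapshot table from the example above and dbt's default dbt_valid_from and dbt_valid_to columns. The date is a hypothetical value chosen for illustration.

```sql
-- Customer status as it was recorded at the start of 2024
select customer_id, status
from snapshots.customers_snapshot
where dbt_valid_from <= timestamp '2024-01-01 00:00:00'
  and (dbt_valid_to > timestamp '2024-01-01 00:00:00'
       or dbt_valid_to is null);   -- null means the version is still current
```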

Key Points

  • Snapshots capture historical changes by storing versions of records.
  • dbt compares current data with previous snapshots to detect changes.
  • They are useful for tracking slowly changing data without native history in source systems.
  • Snapshots create tables that include start and end timestamps (dbt_valid_from and dbt_valid_to) for each record version.
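The points above translate into two everyday queries against a snapshot table. This sketch assumes the snapshots.customers_snapshot table from the example and dbt's default metadata columns. The customer_id value 42 is hypothetical.

```sql
-- Current version of every customer
select customer_id, email, status
from snapshots.customers_snapshot
where dbt_valid_to is null;

-- Full change history for one customer, oldest version first
select customer_id, email, status, dbt_valid_from, dbt_valid_to
from snapshots.customers_snapshot
where customer_id = 42
order by dbt_valid_from;
```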

Key Takeaways

A snapshot in dbt saves historical versions of data to track changes over time.
It compares current data with previous snapshots to store only changed records.
Use snapshots to audit or analyze data that changes slowly without built-in history.
Snapshots create tables with timestamps showing when each record version was valid.