0
0
Snowflakecloud~15 mins

Querying historical data (AT, BEFORE) in Snowflake - Deep Dive

Choose your learning style9 modes available
Overview - Querying historical data (AT, BEFORE)
What is it?
Querying historical data (AT, BEFORE) means looking at how your data looked at a specific past time or before that time. It lets you see previous versions of your data, even if it has changed or been deleted since then. This is useful for audits, troubleshooting, or understanding trends over time. Snowflake provides special features to easily access this past data without complex backups.
Why it matters
Without the ability to query historical data, you would lose important information about how your data changed over time. This makes it hard to fix mistakes, verify past reports, or comply with regulations. Being able to see data as it was at a certain time helps businesses trust their data and make better decisions based on accurate history.
Where it fits
Before learning this, you should understand basic SQL queries and how data is stored in Snowflake tables. After mastering historical querying, you can explore advanced data versioning, time travel features, and data recovery techniques in Snowflake.
Mental Model
Core Idea
Querying historical data lets you travel back in time to see exactly what your data looked like at or before a specific moment.
Think of it like...
It's like looking at old photos in an album to see how a place or person looked on a certain day, even if things have changed since then.
┌───────────────────────────────┐
│          Current Data          │
│  (Latest version in table)    │
├──────────────┬────────────────┤
│  Time Travel │ Historical View│
│   Query     │  (Past snapshot)│
└──────────────┴────────────────┘
Build-Up - 7 Steps
1
FoundationUnderstanding Snowflake Tables
🤔
Concept: Learn what a Snowflake table is and how data is stored in it.
A Snowflake table holds rows and columns of data, like a spreadsheet. When you add, change, or delete data, the table updates to show the latest version. Normally, you only see the current data when you query the table.
Result
You know that querying a table shows the newest data by default.
Understanding that tables hold the latest data sets the stage for why we need special methods to see past data.
2
FoundationBasics of Time Travel in Snowflake
🤔
Concept: Introduce Snowflake's Time Travel feature that stores past data versions.
Snowflake keeps a history of changes for a set period (default 1 day). This lets you query the table as it was at any point in that time window. This history is automatic and doesn't require extra backups.
Result
You realize Snowflake can show past data without manual snapshots.
Knowing that Snowflake stores past data versions automatically explains how historical queries are possible.
3
IntermediateQuerying Data AT a Specific Time
🤔Before reading on: Do you think querying data AT a time returns data exactly as it was then, or just data changed since then? Commit to your answer.
Concept: Learn how to write queries that show data exactly as it was at a given timestamp.
Use the 'AT' clause with a timestamp or offset in your SELECT statement. For example: SELECT * FROM table_name AT (TIMESTAMP => '2024-06-01 10:00:00'); This returns the table's state exactly at that moment.
Result
You get a snapshot of the table as it existed at the chosen time.
Understanding the 'AT' clause lets you pinpoint the exact data state, crucial for precise audits or reports.
4
IntermediateQuerying Data BEFORE a Specific Time
🤔Before reading on: Does querying data BEFORE a time include data changes at that time or only strictly earlier? Commit to your answer.
Concept: Learn how to query data as it was before a certain timestamp, excluding changes at that time.
Use the 'BEFORE' clause with a timestamp. For example: SELECT * FROM table_name BEFORE (TIMESTAMP => '2024-06-01 10:00:00'); This returns the table state just before that time, excluding changes made exactly at that timestamp.
Result
You see the data state immediately before the specified time.
Knowing the difference between 'AT' and 'BEFORE' helps avoid off-by-one errors in historical data analysis.
5
IntermediateUsing Offsets for Relative Time Travel
🤔
Concept: Learn to query data relative to the current time using offsets like minutes or hours.
Instead of exact timestamps, you can use offsets like: SELECT * FROM table_name AT (OFFSET => -3600); This shows data as it was one hour ago. Offsets are handy when you don't know exact timestamps.
Result
You can easily query recent past data without precise time knowledge.
Using offsets makes historical queries flexible and practical for monitoring recent changes.
6
AdvancedLimits and Retention of Time Travel Data
🤔Before reading on: Do you think Snowflake keeps historical data forever or only for a limited time? Commit to your answer.
Concept: Understand how long Snowflake retains historical data and what affects this retention.
By default, Snowflake keeps historical data for 1 day, but this can be extended up to 90 days with Enterprise editions. After this period, old data is permanently removed and cannot be queried.
Result
You know the time window for querying historical data and plan accordingly.
Knowing retention limits prevents surprises when historical queries return no data.
7
ExpertPerformance and Cost Implications of Time Travel
🤔Before reading on: Do you think querying historical data costs the same as querying current data? Commit to your answer.
Concept: Explore how querying historical data affects performance and billing in Snowflake.
Time Travel queries may scan additional storage to reconstruct past states, which can increase query time and cost. Also, longer retention periods increase storage costs. Efficient use involves balancing retention needs with cost and performance.
Result
You can optimize your use of Time Travel to control costs and maintain performance.
Understanding cost and performance tradeoffs helps design sustainable data retention strategies.
Under the Hood
Snowflake stores data changes as micro-partitions with metadata timestamps. When you query historical data, Snowflake reconstructs the table state by combining relevant micro-partitions valid at the requested time. This avoids full backups and enables fast, flexible time travel.
Why designed this way?
This design allows efficient storage and retrieval of historical data without duplicating entire tables. It balances storage cost and query speed, unlike traditional backup systems that are slow and bulky.
┌───────────────┐
│  Micro-Partition│
│  Storage with  │
│  Timestamps   │
└──────┬────────┘
       │
       ▼
┌─────────────────────────────┐
│ Query with AT/BEFORE clause │
│  Snowflake finds partitions │
│  valid at requested time    │
└──────┬──────────────────────┘
       │
       ▼
┌─────────────────────────────┐
│ Reconstructs table snapshot │
│  from relevant partitions   │
└─────────────────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does querying data AT a time include changes made exactly at that time? Commit to yes or no.
Common Belief:Querying data AT a time excludes changes made exactly at that timestamp.
Tap to reveal reality
Reality:Querying AT a time includes all changes up to and including that exact timestamp.
Why it matters:Misunderstanding this causes missing or extra data in reports, leading to incorrect conclusions.
Quick: Do you think Snowflake keeps historical data forever? Commit to yes or no.
Common Belief:Snowflake stores all historical data indefinitely for any table.
Tap to reveal reality
Reality:Snowflake retains historical data only for a limited retention period (default 1 day, max 90 days).
Why it matters:Assuming infinite retention can cause failed queries when data is older than retention, risking data loss.
Quick: Does querying historical data cost the same as querying current data? Commit to yes or no.
Common Belief:Historical queries cost the same as normal queries and have no performance impact.
Tap to reveal reality
Reality:Historical queries may scan more data and take longer, increasing cost and query time.
Why it matters:Ignoring this can lead to unexpected billing spikes and slow performance in production.
Quick: Can you use Time Travel to recover data deleted months ago? Commit to yes or no.
Common Belief:Time Travel lets you recover any deleted data regardless of age.
Tap to reveal reality
Reality:Time Travel only works within the retention window; older data requires Fail-safe or backups.
Why it matters:Relying solely on Time Travel for long-term recovery risks permanent data loss.
Expert Zone
1
Time Travel queries internally merge micro-partitions dynamically, which can cause variable query performance depending on data change frequency.
2
Retention periods can be set per table or database, allowing fine-grained control over historical data availability and cost.
3
Fail-safe is a separate Snowflake feature that provides an additional 7-day recovery window after Time Travel expires, but it is not queryable.
When NOT to use
Avoid relying on Time Travel for long-term archival or compliance data retention; use dedicated data archiving or backup solutions instead.
Production Patterns
In production, teams use Time Travel for quick error recovery and auditing, combined with scheduled snapshots for long-term storage and compliance.
Connections
Version Control Systems
Similar pattern of storing changes over time to reconstruct past states.
Understanding how Git tracks file changes helps grasp how Snowflake tracks data changes for time travel.
Backup and Restore
Builds on the idea of preserving past data states but with different mechanisms and use cases.
Knowing traditional backups clarifies why Snowflake's time travel is faster and more flexible for recent data recovery.
Memory Snapshotting in Operating Systems
Shares the concept of capturing system state at points in time for rollback or analysis.
Recognizing snapshotting in OS memory management deepens understanding of how Snowflake snapshots data states.
Common Pitfalls
#1Querying historical data without specifying AT or BEFORE clause returns current data, not past data.
Wrong approach:SELECT * FROM sales_data WHERE sale_date = '2024-06-01';
Correct approach:SELECT * FROM sales_data AT (TIMESTAMP => '2024-06-01 00:00:00') WHERE sale_date = '2024-06-01';
Root cause:Assuming normal queries automatically access historical data without using Time Travel syntax.
#2Using BEFORE clause expecting to include changes made exactly at the timestamp.
Wrong approach:SELECT * FROM customers BEFORE (TIMESTAMP => '2024-06-01 12:00:00'); -- expects data including changes at noon
Correct approach:SELECT * FROM customers AT (TIMESTAMP => '2024-06-01 12:00:00'); -- includes changes at noon
Root cause:Confusing the difference between BEFORE (strictly before) and AT (up to and including) clauses.
#3Setting Time Travel retention too short and losing ability to query needed historical data.
Wrong approach:ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 0;
Correct approach:ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 7;
Root cause:Not understanding the impact of retention settings on historical data availability.
Key Takeaways
Snowflake's Time Travel feature lets you query data as it was at or before a specific past time, enabling powerful historical analysis.
The AT clause includes changes made exactly at the timestamp, while BEFORE excludes them, so choose carefully to get accurate results.
Historical data is retained only for a limited time, so plan retention settings based on your recovery and audit needs.
Querying historical data may cost more and run slower than current data queries due to extra processing.
Understanding Snowflake's micro-partition storage and time travel mechanics helps optimize your use of historical queries.