
Full refresh vs incremental in dbt - When to Use Which

The Big Idea

What if you could update huge datasets in minutes instead of hours without risking mistakes?

The Scenario

Imagine you have a huge spreadsheet that tracks daily sales. Every day, you add new sales data. Now, you want to update your report. You can either rewrite the entire spreadsheet from scratch or just add the new sales data.

The Problem

Rewriting the whole spreadsheet every day takes a lot of time and computing power. It can also cause mistakes if you accidentally delete or overwrite data. On the other hand, adding new data by hand is error-prone: it is easy to miss rows, which leaves reports incomplete.

The Solution

dbt automates both approaches. A full refresh rebuilds the entire table from scratch, which is what you want when the model's logic changes or historical data needs correcting. An incremental run processes only the rows that are new or changed since the last run, which saves time and avoids manual, error-prone updates.
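As a rough sketch, an incremental dbt model for the sales example might look like the following. The file path is hypothetical, and the sketch assumes `daily_sales` is another model in the same dbt project with a `date` column, as in the scenario above. `config`, `ref`, `is_incremental()` and `this` are standard dbt Jinja constructs.

```sql
-- models/sales_report.sql (hypothetical path)
{{ config(materialized='incremental') }}

select *
from {{ ref('daily_sales') }}  -- assumes daily_sales is a model in this project

{% if is_incremental() %}
  -- Applied only when the target table already exists and
  -- the run was not started with --full-refresh
  where date > (select coalesce(max(date), '1900-01-01') from {{ this }})
{% endif %}
```

Running `dbt run --select sales_report` would then add only the newer rows, while `dbt run --full-refresh --select sales_report` drops and rebuilds the whole table.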

Before vs After
Before
-- Full rebuild: wipe the report, then reload every row
DELETE FROM sales_report;
INSERT INTO sales_report SELECT * FROM daily_sales;
After
-- Incremental: pick up only rows newer than the latest date already loaded
SELECT * FROM daily_sales
WHERE date > (SELECT COALESCE(MAX(date), '1900-01-01') FROM sales_report);
What It Enables

This concept lets you keep your data up-to-date efficiently, handling large datasets without wasting time or resources.

Real Life Example

A retail company updates its sales dashboard daily. Using incremental updates, it processes only the new sales data each day, which keeps the dashboard fast and reliable.

Key Takeaways

Full refresh rebuilds all data, ensuring completeness.

Incremental updates add only new or changed data, saving time.

Choosing the right method for each run keeps data fresh without wasting time or compute.
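To handle rows that change after they are first loaded, and not only brand-new rows, an incremental model can declare a `unique_key`. The sketch below is illustrative only: the `sale_id` and `updated_at` columns are hypothetical and do not appear in the example above.

```sql
-- models/sales_report.sql (hypothetical path)
{{ config(
    materialized='incremental',
    unique_key='sale_id'  -- hypothetical column that identifies one sale
) }}

select *
from {{ ref('daily_sales') }}  -- assumes daily_sales is a model in this project

{% if is_incremental() %}
  -- Pick up rows inserted or modified since the last run;
  -- updated_at is a hypothetical last-modified timestamp column
  where updated_at > (select coalesce(max(updated_at), '1900-01-01') from {{ this }})
{% endif %}
```

With a `unique_key` set, dbt updates existing rows that match the key instead of appending duplicates. A full refresh remains the fallback when the model's logic changes or the table has drifted from its source.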