0
0
dbtdata~30 mins

Slowly changing dimensions (SCD Type 2) in dbt - Mini Project: Build & Apply

Choose your learning style9 modes available
Slowly Changing Dimensions (SCD Type 2) with dbt
📖 Scenario: You work as a data analyst for an online store. The store keeps track of customer information like their city and membership level. Sometimes customers move to a new city or change their membership level. You want to keep a history of these changes in your data warehouse.This is called Slowly Changing Dimensions Type 2 (SCD Type 2). It means you keep old records and add new records when customer info changes.
🎯 Goal: Build a dbt model that implements SCD Type 2 for a customers table. Your model will detect changes in customer city or membership level and keep historical records with start and end dates.
📋 What You'll Learn
Create a source table called raw_customers with customer_id, city, membership_level, and effective_date columns.
Create a variable current_date to represent the current processing date.
Write a dbt model SQL query that compares new data with existing data to detect changes in city or membership_level.
Output the full SCD Type 2 table with columns: customer_id, city, membership_level, start_date, end_date, and is_current.
💡 Why This Matters
🌍 Real World
SCD Type 2 is used in data warehouses to keep full history of changes in important data like customers, products, or employees.
💼 Career
Data engineers and analysts use SCD Type 2 to build reliable historical datasets for reporting and analysis.
Progress0 / 4 steps
1
Create the source table raw_customers
Create a source table called raw_customers with these exact rows:
(1, 'New York', 'Gold', '2024-01-01'), (2, 'Los Angeles', 'Silver', '2024-01-01'), (1, 'Boston', 'Gold', '2024-06-01').
Use columns customer_id, city, membership_level, and effective_date.
dbt
Need a hint?

Use a CTE named raw_customers with union all to combine rows.

2
Define the current_date variable
Create a variable called current_date and set it to the date '2024-07-01'.
dbt
Need a hint?

Create a CTE named current_date with a single date value.

3
Write the SCD Type 2 logic to detect changes
Write a SQL query that:
- Joins raw_customers with the existing SCD table (assume empty for first run)
- Detects changes in city or membership_level
- Assigns start_date as effective_date from raw_customers
- Sets end_date to NULL for current records
- Sets is_current to TRUE for current records
Use customer_id, city, membership_level, start_date, end_date, and is_current columns.
dbt
Need a hint?

For this beginner project, assume the existing SCD table is empty and just select from raw_customers with added columns.

4
Output the final SCD Type 2 table
Write a select statement to output all columns from the scd CTE: customer_id, city, membership_level, start_date, end_date, and is_current.
dbt
Need a hint?

Use select customer_id, city, membership_level, start_date, end_date, is_current from scd.