
dbt-utils (surrogate_key, pivot, unpivot)

Introduction

The dbt-utils package provides macros that build row IDs from several columns and reshape tables between wide and long layouts. Using them saves you from hand-writing repetitive, error-prone SQL.

When you need a unique ID for each row combining multiple columns.
When you want to turn rows into columns to summarize data.
When you want to turn columns into rows to analyze data differently.
When preparing data for reports that need a specific layout.
When cleaning data to make it easier to work with in SQL.
Syntax
dbt
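-- surrogate key (generate_surrogate_key in dbt-utils 1.0+, surrogate_key in earlier versions)
{{ dbt_utils.generate_surrogate_key(['column1', 'column2']) }}

-- pivot: generates one aggregate column per value, so it goes inside a select list
select
    group_column,
    {{ dbt_utils.pivot(
        'column_to_pivot',
        dbt_utils.get_column_values(ref('your_table'), 'column_to_pivot'),
        agg='sum',
        then_value='value_column'
    ) }}
from {{ ref('your_table') }}
group by group_column

-- unpivot: generates a complete select statement
{{ dbt_utils.unpivot(
    relation=ref('your_table'),
    cast_to='varchar',
    exclude=['id_column'],
    field_name='new_column_name',
    value_name='new_value_name'
) }}

generate_surrogate_key returns a SQL expression that hashes the listed columns into one string ID. The ID is unique only if the column combination is unique.

pivot turns row values into columns. It generates one aggregated column per value you pass in; you write the surrounding select, from, and group by yourself.

unpivot turns columns into rows, making data longer. It unpivots every column of the relation that is not listed in exclude (kept as-is) or remove (dropped).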

Examples
Creates an ID by hashing user_id and order_id together.
dbt
{{ dbt_utils.generate_surrogate_key(['user_id', 'order_id']) }}
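As a rough sketch of what the surrogate key macro compiles to: each column is cast to text, nulls are replaced with a placeholder, the values are joined with a hyphen, and the result is hashed with MD5. The exact cast type, null placeholder, and hash function depend on the dbt-utils version and the adapter, so treat this as an approximation rather than the literal compiled SQL. The orders table name is hypothetical.

```sql
-- Approximate expansion of the surrogate key macro for user_id and order_id.
-- Assumption: a Postgres-like warehouse with md5() and the || operator.
select
    md5(
        coalesce(cast(user_id as varchar), '')
        || '-'
        || coalesce(cast(order_id as varchar), '')
    ) as surrogate_id
from orders  -- hypothetical table
```

The separator matters: without it, the pairs (1, 23) and (12, 3) would both concatenate to 123 and produce the same ID.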
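Turns product categories into columns showing total sales per category. The list of categories is read from the table with get_column_values.
dbt
select
    {{ dbt_utils.pivot(
        'product_category',
        dbt_utils.get_column_values(ref('sales_data'), 'product_category'),
        agg='sum',
        then_value='sales_amount'
    ) }}
from {{ ref('sales_data') }}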
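A pivot that sums sales_amount per product_category renders to roughly one conditional aggregate per category value. The sketch below assumes the only categories are Books and Toys, which are made-up values for illustration; identifier quoting varies by adapter.

```sql
-- Approximate output of a sum pivot over product_category.
-- Assumption: the only category values are 'Books' and 'Toys' (hypothetical).
select
    sum(case when product_category = 'Books' then sales_amount else 0 end) as "Books",
    sum(case when product_category = 'Toys' then sales_amount else 0 end) as "Toys"
from sales_data
```

The set of output columns is fixed when dbt compiles the model, so a category that appears in the data afterwards gets no column until the model is compiled again.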
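Turns the jan, feb, and mar columns into rows with month and sales columns. The macro unpivots every column not named in exclude, so this assumes monthly_sales has one other column, user_id, which is kept on every row.
dbt
{{ dbt_utils.unpivot(
    relation=ref('monthly_sales'),
    cast_to='numeric',
    exclude=['user_id'],
    field_name='month',
    value_name='sales'
) }}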
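Unpivoting amounts to one select per source column, stacked with union all. The sketch below shows approximately what that looks like for jan, feb, and mar, assuming monthly_sales also carries a user_id column that stays on every row; the cast types are illustrative and differ by adapter.

```sql
-- Approximate output of unpivoting jan, feb and mar into (month, sales) rows.
-- Assumption: monthly_sales also has a user_id column that is kept as-is.
select user_id, cast('jan' as varchar) as month, cast(jan as numeric) as sales
from monthly_sales
union all
select user_id, cast('feb' as varchar) as month, cast(feb as numeric) as sales
from monthly_sales
union all
select user_id, cast('mar' as varchar) as month, cast(mar as numeric) as sales
from monthly_sales
```

Each input row becomes three output rows, one per month column.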
Sample Program

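This example builds a surrogate key from user_id and product_category in the keyed CTE. The pivoted and unpivoted CTEs then reshape the same rows with the warehouse's native PIVOT and UNPIVOT clauses rather than the dbt-utils macros, so they only run on warehouses that support that syntax.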

dbt
with source_data as (
    select * from (values
        (1, 'A', 100, 10, 20),
        (2, 'B', 200, 15, 25),
        (3, 'A', 150, 12, 22)
    ) as t(user_id, product_category, sales, jan, feb)
),

-- Create surrogate key combining user_id and product_category
keyed as (
    select
        user_id,
        product_category,
        sales,
        {{ dbt_utils.generate_surrogate_key(['user_id', 'product_category']) }} as surrogate_id,
        jan,
        feb
    from source_data
),

-- Pivot product_category to columns with sum of sales
pivoted as (
    select * from (
        select user_id, product_category, sales from keyed
    )
    pivot (
        sum(sales) for product_category in ('A' as A_sales, 'B' as B_sales)
    )
),

-- Unpivot jan and feb sales into month and sales columns
unpivoted as (
    select * from (
        select user_id, jan, feb from keyed
    )
    unpivot (
        sales for month in (jan, feb)
    )
)

select * from keyed

-- To see the pivoted or unpivoted results, replace the last select with:
-- select * from pivoted
-- select * from unpivoted
Important Notes

The surrogate key macro returns an MD5 hash of the column values as a string. The hash is only as unique as the column combination you pass in.
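One way to confirm that a key is unique is a query that lists duplicated IDs. This is a minimal sketch, assuming the model that builds surrogate_id is named keyed_sales, which is a hypothetical name.

```sql
-- Returns one row per duplicated key; an empty result means the key is unique.
-- Assumption: the model that builds surrogate_id is named keyed_sales (hypothetical).
select
    surrogate_id,
    count(*) as row_count
from {{ ref('keyed_sales') }}
group by surrogate_id
having count(*) > 1
```

Saved as a file in the project's tests directory, a query like this works as a dbt singular test, which fails when it returns any rows.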

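Pivot and unpivot help reshape data but require careful column and value selection: a value left out of a pivot gets no column, and a stray column included in an unpivot adds unwanted rows.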

Always check your data after pivot/unpivot to ensure it matches your analysis needs.
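A simple check after unpivoting is to compare totals between the wide and long tables. The sketch below assumes the wide table monthly_sales has jan, feb, and mar columns and that the unpivoted model is named monthly_sales_long with a sales column; that model name is hypothetical.

```sql
-- Compares the total before and after unpivoting; returns a row only on mismatch.
-- Assumptions: monthly_sales has jan, feb, mar columns; monthly_sales_long
-- (hypothetical) is the unpivoted model with a sales column.
with wide_totals as (
    select sum(coalesce(jan, 0) + coalesce(feb, 0) + coalesce(mar, 0)) as total
    from {{ ref('monthly_sales') }}
),

long_totals as (
    select sum(sales) as total
    from {{ ref('monthly_sales_long') }}
)

select
    wide_totals.total as wide_total,
    long_totals.total as long_total
from wide_totals
cross join long_totals
where wide_totals.total <> long_totals.total
```

An empty result means the two totals agree. If either table is empty, its total is null and the comparison returns no row, so check row counts separately in that case.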

Summary

surrogate_key (generate_surrogate_key in dbt-utils 1.0 and later) creates IDs by hashing a combination of columns.

pivot turns rows into columns to summarize data.

unpivot turns columns into rows to reshape data.