Concept Flow - dbt-utils (surrogate_key, pivot, unpivot)

Start with raw data

↓

Use generate_surrogate_key (named surrogate_key before dbt-utils 1.0) to create hashed IDs

↓

Apply pivot to transform rows into columns

↓

Apply unpivot to transform columns back to rows

↓

Final transformed dataset

This flow shows how raw data first gets a hashed ID from generate_surrogate_key (named surrogate_key before dbt-utils 1.0), is then reshaped wide with pivot, and is finally reshaped long again with unpivot.

Execution Sample

dbt

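-- models/metrics_wide.sql (illustrative file name)
with source as (
  select * from raw_data
),
keyed as (
  -- generate_surrogate_key replaced surrogate_key in dbt-utils 1.0
  select
    {{ dbt_utils.generate_surrogate_key(['user_id', 'date']) }} as id,
    *
  from source
),
pivoted as (
  -- pivot emits one aggregate expression per listed value,
  -- so it belongs in the select list next to a group by
  select
    id,
    user_id,
    date,
    {{ dbt_utils.pivot('metric', ['clicks', 'views'], then_value='value') }}
  from keyed
  group by id, user_id, date
)
select * from pivoted

-- models/metrics_long.sql (illustrative file name)
-- unpivot reads the column list from the warehouse, so it needs a
-- real relation (ref or source) rather than a CTE
{{ dbt_utils.unpivot(
    relation=ref('metrics_wide'),
    cast_to='integer',
    exclude=['id', 'user_id', 'date'],
    field_name='metric',
    value_name='value'
) }}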

This code adds a hashed key built from user_id and date, pivots the metrics into columns, then unpivots them back into rows.
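When the metric names are not known in advance, the list of pivot values can be read from the data instead of being hard-coded. The sketch below is one way to do that, under two assumptions that are not in the lesson: raw_data is reachable through ref() (a model or seed), and the file name metrics_wide_dynamic.sql is hypothetical. get_column_values is another dbt-utils macro that returns the distinct values of a column.

```sql
-- models/metrics_wide_dynamic.sql (hypothetical model name)
-- Assumes raw_data is a model or seed in the project, so ref() can resolve it.

-- Collect the distinct metric names at compile time.
{% set metrics = dbt_utils.get_column_values(table=ref('raw_data'), column='metric') %}

select
  user_id,
  date,
  -- One sum(case when metric = '<name>' then value else 0 end) per metric name.
  {{ dbt_utils.pivot('metric', metrics, then_value='value') }}
from {{ ref('raw_data') }}
group by user_id, date
```

The trade-off is that the output columns now depend on the data, so a new metric name changes the shape of the model on the next run.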

Execution Table

Step	Action	Input Data Sample	Output Data Sample	Notes
1	Read raw_data	[{user_id:1, date:'2024-01-01', metric:'clicks', value:10}, {user_id:1, date:'2024-01-01', metric:'views', value:100}]	Same as input	Starting raw data with user metrics
2	Apply generate_surrogate_key(['user_id','date']) (formerly surrogate_key)	Same as step 1	[{id:'abc123', user_id:1, date:'2024-01-01', metric:'clicks', value:10}, {id:'abc123', user_id:1, date:'2024-01-01', metric:'views', value:100}]	Both rows get the same id because they share user_id and date; 'abc123' is a shortened placeholder for the real hash
3	Pivot on 'metric' column	Step 2 output	[{id:'abc123', user_id:1, date:'2024-01-01', clicks:10, views:100}]	Each metric name becomes a column; the two rows collapse into one
4	Unpivot columns ['clicks','views']	Step 3 output	[{id:'abc123', user_id:1, date:'2024-01-01', metric:'clicks', value:10}, {id:'abc123', user_id:1, date:'2024-01-01', metric:'views', value:100}]	Columns turned back into rows
5	End	Step 4 output	Same as step 4	Transformation complete

💡 All steps executed, data transformed from raw to keyed, pivoted, then unpivoted.

Variable Tracker

Variable	Start	After Step 2	After Step 3	After Step 4	Final
data_rows	Raw rows with user_id, date, metric, value	Rows with added 'id' column (surrogate_key)	Rows pivoted to wide format with metric columns	Rows unpivoted back to long format	Final unpivoted rows

Key Moments - 3 Insights

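Why does surrogate_key create the same id for rows with the same user_id and date?

The id is a hash (MD5 by default) of the listed column values and nothing else. Rows that agree on user_id and date feed identical input to the hash, so they get identical ids; metric and value are not part of the key.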
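To make the hashing concrete, here is a rough plain-SQL equivalent of the key expression, written in Postgres-flavored SQL with inline sample rows. It is a simplification: the real macro also replaces nulls with a placeholder string before hashing, and the third sample row is added here only for contrast.

```sql
-- Inline sample data; the user_id 2 row is an extra row for illustration.
with raw_data (user_id, date, metric, value) as (
  values
    (1, '2024-01-01', 'clicks', 10),
    (1, '2024-01-01', 'views', 100),
    (2, '2024-01-01', 'clicks', 7)
)
select
  -- Concatenate the key columns with a separator, then hash the result.
  md5(cast(user_id as text) || '-' || date) as id,
  user_id,
  date,
  metric,
  value
from raw_data;
```

The two user_id 1 rows produce the same id, and the user_id 2 row produces a different one, because only user_id and date enter the hash.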

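What happens to the 'metric' column after pivot?

It no longer exists as a column. Each of its distinct values ('clicks', 'views') becomes a column header, and the matching 'value' entries fill those columns.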
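The pivot step can be sketched in plain SQL as one conditional aggregate per metric name. This is an approximation of what the pivot macro generates, using inline sample rows rather than a real table.

```sql
-- Inline copy of the long-format sample rows.
with keyed (user_id, date, metric, value) as (
  values
    (1, '2024-01-01', 'clicks', 10),
    (1, '2024-01-01', 'views', 100)
)
select
  user_id,
  date,
  -- Each metric name becomes its own column; non-matching rows contribute 0.
  sum(case when metric = 'clicks' then value else 0 end) as clicks,
  sum(case when metric = 'views' then value else 0 end) as views
from keyed
group by user_id, date;
-- One row: user_id 1, date 2024-01-01, clicks 10, views 100
```

The group by is what collapses the two input rows into one, which is why the pivoted model needs every non-pivoted column in its grouping.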

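Why do we use unpivot after pivot?

In a real project you usually need only one direction. Here the round trip demonstrates that unpivot restores the long format, which is easier to filter and aggregate when the set of metric names changes.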
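The reverse direction can be sketched in plain SQL as one select per wide column, stacked with union all. This is roughly the shape of the query the unpivot macro builds; the inline row stands in for the pivoted table.

```sql
-- Inline copy of the wide-format sample row.
with pivoted (user_id, date, clicks, views) as (
  values (1, '2024-01-01', 10, 100)
)
-- Each branch turns one wide column into a (metric, value) pair.
select user_id, date, 'clicks' as metric, clicks as value from pivoted
union all
select user_id, date, 'views' as metric, views as value from pivoted;
```

Every wide column adds one branch, so the output has (number of input rows) times (number of unpivoted columns) rows.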

Visual Quiz - 3 Questions

Test your understanding

Look at step 2 of the Execution Table. What does the 'id' column represent?

A. The original user_id value

B. A unique identifier created by combining user_id and date

C. A random number assigned to each row

D. The metric name

Concept Snapshot

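dbt-utils generate_surrogate_key (named surrogate_key before dbt-utils 1.0) builds a deterministic ID by hashing the listed columns.
Pivot turns the values of one column into separate columns.
Unpivot reverses pivot, turning columns back into rows.
Use a surrogate key when you need a stable key that is the same on every run.
Pivot and unpivot reshape data between long and wide formats for analysis.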

Full Transcript

This lesson shows how to use the dbt-utils macros generate_surrogate_key (named surrogate_key before dbt-utils 1.0), pivot, and unpivot. We start with raw data containing user metrics. The surrogate key macro builds an ID by hashing user_id and date together, so rows that share both values share the ID. Pivot then reshapes the data into wide format by turning metric names into columns. Finally, unpivot reverses this, turning the columns back into rows in long format. The execution table traces each step with sample data. The key moments explain why the surrogate key is the same for rows with matching columns, how pivot turns the metric column into headers, and why unpivot is used after pivot. The visual quiz tests understanding of these steps, and the snapshot summarizes the key points for quick reference.