0
0
dbtdata~10 mins

dbt-utils (surrogate_key, pivot, unpivot) - Step-by-Step Execution

Choose your learning style9 modes available
Concept Flow - dbt-utils (surrogate_key, pivot, unpivot)
Start with raw data
Use surrogate_key to create unique IDs
Apply pivot to transform rows into columns
Apply unpivot to transform columns back to rows
Final transformed dataset
This flow shows how raw data is first given unique IDs with surrogate_key, then reshaped wide with pivot, and finally reshaped long with unpivot.
Execution Sample
dbt
with source as (
  select * from raw_data
),
keyed as (
  select {{ dbt_utils.surrogate_key(['user_id', 'date']) }} as id, * from source
),
pivoted as (
  select * from {{ dbt_utils.pivot(
    source=ref('keyed'),
    pivot_column='metric',
    value_column='value'
  ) }}
),
unpivoted as (
  select * from {{ dbt_utils.unpivot(
    source=ref('pivoted'),
    column_name='metric',
    value_name='value',
    columns=['clicks', 'views']
  ) }}
)
select * from unpivoted
This code creates unique keys, pivots metrics into columns, then unpivots them back to rows.
Execution Table
StepActionInput Data SampleOutput Data SampleNotes
1Read raw_data[{user_id:1, date:'2024-01-01', metric:'clicks', value:10}, {user_id:1, date:'2024-01-01', metric:'views', value:100}]Same as inputStarting raw data with user metrics
2Apply surrogate_key(['user_id','date'])Same as step 1[{id:'abc123', user_id:1, date:'2024-01-01', metric:'clicks', value:10}, {id:'abc123', user_id:1, date:'2024-01-01', metric:'views', value:100}]Unique id 'abc123' created for user_id and date
3Pivot on 'metric' columnStep 2 output[{id:'abc123', user_id:1, date:'2024-01-01', clicks:10, views:100}]Rows turned into columns by metric names
4Unpivot columns ['clicks','views']Step 3 output[{id:'abc123', user_id:1, date:'2024-01-01', metric:'clicks', value:10}, {id:'abc123', user_id:1, date:'2024-01-01', metric:'views', value:100}]Columns turned back into rows
5EndStep 4 outputSame as step 4Transformation complete
💡 All steps executed, data transformed from raw to keyed, pivoted, then unpivoted.
Variable Tracker
VariableStartAfter Step 2After Step 3After Step 4Final
data_rowsRaw rows with user_id, date, metric, valueRows with added 'id' column (surrogate_key)Rows pivoted to wide format with metric columnsRows unpivoted back to long formatFinal unpivoted rows
Key Moments - 3 Insights
Why does surrogate_key create the same id for rows with the same user_id and date?
Because surrogate_key hashes the combination of specified columns (user_id and date), so rows sharing those values get the same unique id, as shown in execution_table step 2.
What happens to the 'metric' column after pivot?
The 'metric' values become new column names, so the original 'metric' column disappears and its values become column headers, as seen in execution_table step 3.
Why do we use unpivot after pivot?
Unpivot reverses pivot by turning columns back into rows, useful to restore long format for analysis, shown in execution_table step 4.
Visual Quiz - 3 Questions
Test your understanding
Look at the execution_table at step 2, what does the 'id' column represent?
AThe original user_id value
BA unique identifier created by combining user_id and date
CA random number assigned to each row
DThe metric name
💡 Hint
Check step 2 in execution_table where surrogate_key creates 'id' from user_id and date
At which step does the data change from long format to wide format?
AStep 3
BStep 2
CStep 1
DStep 4
💡 Hint
Look at execution_table step 3 where pivot turns rows into columns
If we skip the unpivot step, what will the final data look like?
AData will remain in long format with metric and value columns
BData will lose the surrogate_key id
CData will be in wide format with metric columns
DData will be empty
💡 Hint
Refer to execution_table step 3 output before unpivot
Concept Snapshot
dbt-utils surrogate_key creates unique IDs by hashing columns.
Pivot turns row values into columns.
Unpivot reverses pivot, turning columns back to rows.
Use surrogate_key for stable keys.
Pivot/unpivot reshape data for analysis.
Full Transcript
This lesson shows how to use dbt-utils macros surrogate_key, pivot, and unpivot. We start with raw data containing user metrics. Surrogate_key creates a unique ID by combining user_id and date. Then pivot reshapes the data by turning metric names into columns, making it wide format. Finally, unpivot reverses this, turning columns back into rows for long format. The execution table traces each step with sample data. Key moments clarify why surrogate_key produces the same ID for matching columns, how pivot changes the metric column into headers, and why unpivot is used after pivot. The visual quiz tests understanding of these steps. The snapshot summarizes the key points for quick reference.