0
0
Snowflakecloud~30 mins

Data lineage tracking in Snowflake - Mini Project: Build & Apply

Choose your learning style9 modes available
Data Lineage Tracking in Snowflake
📖 Scenario: You work as a data engineer in a company that uses Snowflake for data warehousing. Your team wants to track the flow of data between tables to understand where data comes from and where it goes. This helps in debugging and auditing data pipelines.
🎯 Goal: Build a simple data lineage tracking setup in Snowflake by creating tables, adding a configuration variable, writing a query to capture lineage, and finally creating a view to show the lineage.
📋 What You'll Learn
Create a source table with sample data
Create a target table to receive data
Add a configuration variable to hold the current timestamp
Write a query to insert data from source to target while capturing lineage
Create a view that shows the lineage information
💡 Why This Matters
🌍 Real World
Tracking data lineage helps companies understand data flow, troubleshoot issues, and comply with regulations.
💼 Career
Data engineers and analysts use lineage tracking to maintain data quality and transparency in data pipelines.
Progress0 / 4 steps
1
Create source and target tables
Create a table called source_data with columns id INT and value STRING. Insert three rows with ids 1, 2, 3 and values 'A', 'B', 'C'. Then create an empty table called target_data with the same columns.
Snowflake
Need a hint?

Use CREATE OR REPLACE TABLE to create tables and INSERT INTO to add rows.

2
Add a configuration variable for timestamp
Create a variable called current_time and set it to the current timestamp using CURRENT_TIMESTAMP().
Snowflake
Need a hint?

Use SET variable_name = value; to create a variable in Snowflake.

3
Insert data with lineage tracking
Insert data from source_data into target_data. Add a new column load_time to target_data to store the value of current_time for lineage tracking. Alter target_data to add load_time TIMESTAMP before inserting.
Snowflake
Need a hint?

Use ALTER TABLE to add a column. Use INSERT INTO ... SELECT ... to copy data with the timestamp.

4
Create a view to show data lineage
Create a view called data_lineage_view that selects id, value, and load_time from target_data to show the lineage information.
Snowflake
Need a hint?

Use CREATE OR REPLACE VIEW view_name AS SELECT ... to create the view.