0
0
Snowflakecloud~20 mins

Data lineage tracking in Snowflake - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Data Lineage Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
🧠 Conceptual
intermediate
2:00remaining
Understanding Data Lineage Purpose

What is the primary purpose of data lineage tracking in a Snowflake data warehouse?

ATo automatically scale compute resources based on query load.
BTo encrypt data at rest and in transit within Snowflake.
CTo monitor and visualize the flow of data from source to destination for auditing and troubleshooting.
DTo create backups of data at regular intervals.
Attempts:
2 left
💡 Hint

Think about why knowing where data comes from and where it goes is important.

Architecture
intermediate
2:00remaining
Snowflake Data Lineage Components

Which Snowflake feature is essential to capture detailed data lineage information automatically?

ASnowflake Access History
BSnowflake Query History
CSnowflake Tasks
DSnowflake Streams
Attempts:
2 left
💡 Hint

Consider which feature records executed queries and their details.

Configuration
advanced
3:00remaining
Query to Extract Table Lineage

Given Snowflake's QUERY_HISTORY view, which SQL query correctly extracts the source tables used in queries for lineage tracking?

Snowflake
SELECT query_id, array_agg(DISTINCT referenced_object_name) AS source_tables FROM snowflake.account_usage.query_history WHERE start_time > CURRENT_TIMESTAMP() - INTERVAL '1 DAY' GROUP BY query_id;
ASELECT query_id, array_agg(DISTINCT referenced_object_name) AS source_tables FROM snowflake.account_usage.query_history WHERE start_time > CURRENT_TIMESTAMP() - INTERVAL '1 DAY' GROUP BY query_id;
BSELECT query_id, array_agg(DISTINCT table_name) AS source_tables FROM snowflake.account_usage.query_history WHERE start_time > CURRENT_TIMESTAMP() - INTERVAL '1 DAY' GROUP BY query_id;
CSELECT query_id, array_agg(DISTINCT referenced_table) AS source_tables FROM snowflake.account_usage.query_history WHERE start_time > CURRENT_TIMESTAMP() - INTERVAL '1 DAY' GROUP BY query_id;
DSELECT query_id, array_agg(DISTINCT source_table) AS source_tables FROM snowflake.account_usage.query_history WHERE start_time > CURRENT_TIMESTAMP() - INTERVAL '1 DAY' GROUP BY query_id;
Attempts:
2 left
💡 Hint

Check the exact column names in the QUERY_HISTORY view for referenced tables.

security
advanced
2:00remaining
Securing Data Lineage Metadata

Which Snowflake feature should you use to restrict access to sensitive data lineage metadata to only authorized users?

ARow Access Policies
BNetwork Policies
CResource Monitors
DVirtual Warehouses
Attempts:
2 left
💡 Hint

Think about controlling access to specific rows or data within tables.

service_behavior
expert
2:30remaining
Impact of Query History Retention on Lineage

Snowflake retains query history for 365 days by default. What is the impact on data lineage tracking if this retention period is reduced to 7 days?

ALineage tracking will stop completely because query history is mandatory for lineage.
BLineage tracking will continue unaffected because lineage data is stored separately.
CLineage tracking will improve because less data means faster processing.
DLineage tracking will only be possible for queries executed within the last 7 days, limiting historical analysis.
Attempts:
2 left
💡 Hint

Consider how query history retention affects available lineage data.