Query history and profiling in Snowflake - Time & Space Complexity
When we look at query history and profiling in Snowflake, we want to know how the time to get this data changes as we ask for more records.
We ask: How does the effort grow when we request more query history entries?
Analyze the time complexity of the following operation sequence.
SELECT query_id, user_name, start_time, total_elapsed_time
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD(day, -7, CURRENT_TIMESTAMP())
ORDER BY start_time DESC
LIMIT 1000;
This query returns up to 1000 of the most recent queries from the past 7 days, newest first.
Identify the operations that repeat: API calls, resource provisioning, and data transfers.
- Primary operation: Reading rows from the query_history table in the account_usage schema.
- How many times: The system scans or filters rows for the past 7 days, then returns up to the requested limit (e.g., 1000 rows).
As you ask for more query history rows, the system reads more data to find and return those rows.
| Input Size (n) | Approx. API Calls/Operations |
|---|---|
| 10 | Reads enough rows to find 10 recent queries. |
| 100 | Reads more rows to find 100 recent queries. |
| 1000 | Reads even more rows to find 1000 recent queries. |
Pattern observation: The work grows roughly in direct proportion to how many rows you request.
Time Complexity: O(n)
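This means the time to return query history grows roughly linearly with the number of rows requested. The cost of filtering the 7-day window is separate: it depends on how many queries ran in that window, not on the LIMIT value.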
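One way to check this pattern against a real run is to look at the operator-level profile of the statement just executed. The sketch below assumes the history query was the previous statement in the same session, and uses Snowflake's GET_QUERY_OPERATOR_STATS table function together with LAST_QUERY_ID(). The choice of columns is illustrative, not a recommended report.

```sql
-- Run immediately after the history query, in the same session.
-- LAST_QUERY_ID() returns the ID of the most recent statement in this session.
SELECT operator_id,
       operator_type,
       operator_statistics,        -- per-operator counters such as rows and bytes
       execution_time_breakdown    -- where each operator spent its time
FROM TABLE(GET_QUERY_OPERATOR_STATS(LAST_QUERY_ID()))
ORDER BY operator_id;
```

Comparing the scan operator's statistics across runs with LIMIT 10, 100, and 1000 gives a rough picture of how much work comes from reading the window versus returning rows.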
[X] Wrong: "Getting query history is always instant no matter how many rows I ask for."
[OK] Correct: More rows mean more data to scan and transfer, so it takes more time.
Understanding how data retrieval scales helps you design efficient monitoring and troubleshooting tools in real projects.
"What if we added a filter on user_name to only get queries from one user? How would the time complexity change?"
Practice
What is the main purpose of the QUERY_HISTORY view in Snowflake?
Solution
Step 1: Understand the role of QUERY_HISTORY
The QUERY_HISTORY view stores information about queries that have already run, including their text, execution time, and status. Its main purpose is to show details of past queries executed in the system.
Final Answer:
To see details of past queries executed in the system -> Option A
Quick Check:
QUERY_HISTORY = past query details [OK]
- Confusing QUERY_HISTORY with user management
- Thinking it manages network or security settings
- Assuming it creates or modifies database objects
Which clause filters rows for a specific user in the QUERY_HISTORY view?
Solution
Step 1: Recall SQL filtering syntax
To filter rows in SQL, the WHERE clause is used with a condition like USER_NAME = 'value'. WHERE USER_NAME = 'john_doe' is valid and standard SQL syntax.
Final Answer:
WHERE USER_NAME = 'john_doe' -> Option B
Quick Check:
Filter with WHERE clause = WHERE USER_NAME = 'john_doe' [OK]
- Using FILTER BY instead of WHERE
- Misplacing HAVING without GROUP BY
- Incorrect SELECT syntax without WHERE
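In a complete statement, the WHERE clause from this answer might be used as sketched below. This assumes the table-function form of QUERY_HISTORY that appears later in this practice set; the user name is a placeholder, and the RESULT_LIMIT value is an arbitrary choice for illustration.

```sql
-- 'john_doe' is a placeholder user name.
-- RESULT_LIMIT raises the number of rows the function returns
-- before the WHERE clause filters them.
SELECT query_id, query_text, start_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY(RESULT_LIMIT => 1000))
WHERE USER_NAME = 'john_doe'
ORDER BY start_time DESC;
```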
SELECT query_text, total_elapsed_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE execution_status = 'FAILED'
ORDER BY start_time DESC
LIMIT 1;
What does this query return?
Solution
Step 1: Analyze the query clauses
The query uses TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()) to access query history, filters for execution_status = 'FAILED', orders by start_time DESC (most recent first), and limits the result to 1 row. It returns the most recent failed query's text and total elapsed time.
Final Answer:
The most recent failed query's text and its total elapsed time -> Option A
Quick Check:
Filter failed + order desc + limit 1 = most recent failed query [OK]
- Thinking QUERY_HISTORY is a normal table
- Confusing oldest vs most recent due to ORDER BY
- Ignoring the WHERE filter on execution_status
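Snowflake's documentation for this table function lists the failure statuses as FAILED_WITH_ERROR and FAILED_WITH_INCIDENT rather than FAILED, so a filter on the literal 'FAILED' may match nothing. A hedged variant, assuming those documented values are what your account returns:

```sql
-- Status literals are taken from the documented values for the
-- INFORMATION_SCHEMA.QUERY_HISTORY table function; verify them with
-- SELECT DISTINCT execution_status before relying on this filter.
SELECT query_text, total_elapsed_time, execution_status
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE execution_status IN ('FAILED_WITH_ERROR', 'FAILED_WITH_INCIDENT')
ORDER BY start_time DESC
LIMIT 1;
```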
SELECT query_text FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()) WHERE total_elapsed_time > 1000;
What is the likely issue?
Solution
Step 1: Check the unit of total_elapsed_time
In Snowflake, total_elapsed_time is measured in milliseconds, not microseconds or seconds. A threshold of 1000 therefore means 1 second, so the filter keeps only queries that ran longer than 1 second. If the author meant a different duration, the threshold is in the wrong unit and must be rescaled (for example, 1000 seconds is 1000000).
Final Answer:
The threshold is interpreted in the wrong unit; total_elapsed_time is in milliseconds -> Option D
Quick Check:
Elapsed time unit = milliseconds, so 1000 = 1 second [OK]
- Assuming elapsed time is in seconds or microseconds
- Thinking QUERY_HISTORY lacks columns
- Misusing TABLE() function syntax
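To make the unit explicit when reading results, the elapsed time can be converted alongside the raw value. This is a minimal sketch that assumes the milliseconds unit documented for total_elapsed_time; the alias elapsed_seconds is an illustrative name.

```sql
SELECT query_text,
       total_elapsed_time,                         -- raw value, milliseconds
       total_elapsed_time / 1000 AS elapsed_seconds
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE total_elapsed_time > 1000                    -- ran longer than 1 second
ORDER BY total_elapsed_time DESC;
```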
Solution
Step 1: Identify correct aggregation and grouping
To get average execution time per user, use AVG(total_elapsed_time) with GROUP BY user_name from TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()), as in SELECT user_name, AVG(total_elapsed_time) AS avg_time ... GROUP BY user_name ORDER BY avg_time DESC.
Final Answer:
SELECT user_name, AVG(total_elapsed_time) AS avg_time FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()) GROUP BY user_name ORDER BY avg_time DESC; -> Option C
Quick Check:
Group by user + AVG + ORDER BY avg_time = SELECT user_name, AVG(total_elapsed_time) AS avg_time FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY()) GROUP BY user_name ORDER BY avg_time DESC; [OK]
- Missing GROUP BY when using aggregation
- Selecting columns without aggregation
- Not using TABLE() function for QUERY_HISTORY
