Data classification and tagging in Snowflake - Time & Space Complexity
When we classify and tag data in Snowflake, we want to know how the running time changes as the data grows.
The guiding question: how does the work grow as the number of objects grows?
Analyze the time complexity of the following operation sequence.
```sql
-- Create a tag (one-time setup, does not repeat)
CREATE TAG IF NOT EXISTS sensitive_data;

-- Apply the tag to every table in the SALES schema
DECLARE
  tables CURSOR FOR
    SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'SALES'
      AND table_type = 'BASE TABLE';
BEGIN
  -- One ALTER TABLE per table: this loop is the part that scales
  FOR rec IN tables DO
    EXECUTE IMMEDIATE
      'ALTER TABLE SALES.' || rec.table_name ||
      ' SET TAG sensitive_data = ''true''';
  END FOR;
END;
```
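To confirm the loop reached every table, you can query Snowflake's tag metadata afterward. A minimal sketch, assuming your role can read the `SNOWFLAKE.ACCOUNT_USAGE` share (note this view can lag live changes by up to a couple of hours):

```sql
-- List objects currently carrying the tag; tag names are stored uppercase
SELECT object_name, tag_value
FROM snowflake.account_usage.tag_references
WHERE tag_name = 'SENSITIVE_DATA';
```

Counting the rows returned and comparing against the number of tables in the schema is a quick sanity check that no table was skipped.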
This sequence creates a tag and applies it to every table in a schema.
Identify the API calls, resource provisioning, and data transfers that repeat.
- Primary operation: Applying the tag to each table with ALTER TABLE SET TAG.
- How many times: Once for each table in the schema.
As the number of tables grows, the number of tag applications grows the same way.
| Input Size (n) | Approx. API Calls / Operations |
|---|---|
| 10 | 10 ALTER TABLE SET TAG calls |
| 100 | 100 ALTER TABLE SET TAG calls |
| 1000 | 1000 ALTER TABLE SET TAG calls |
Pattern observation: The number of operations grows directly with the number of tables.
Time Complexity: O(n)
This means the time to tag all tables grows linearly with the number of tables: doubling the tables roughly doubles the work.
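Because each table costs exactly one ALTER statement, you can estimate n (and hence the total work) before running the loop. A simple sketch using the same schema filter as the cursor above:

```sql
-- n = number of tables the loop will touch = number of ALTER TABLE SET TAG calls
SELECT COUNT(*) AS n_tables
FROM information_schema.tables
WHERE table_schema = 'SALES';
```

If this returns 1000, expect roughly 1000 sequential tag operations, matching the table above.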
[X] Wrong: "Applying a tag to many tables happens all at once, so time stays the same no matter how many tables."
[OK] Correct: Each table needs its own tag operation, so more tables mean more work and more time.
Understanding how tagging scales helps you plan and explain data governance tasks clearly in real projects.
"What if we tagged columns instead of tables? How would the time complexity change?"