0
0
Snowflakecloud~30 mins

Clustering keys for large tables in Snowflake - Mini Project: Build & Apply

Choose your learning style9 modes available
Clustering Keys for Large Tables in Snowflake
📖 Scenario: You work as a data engineer managing a large Snowflake table that stores sales data. Queries on this table are slow because the data is not organized efficiently. To improve query performance, you will add clustering keys to the table.
🎯 Goal: Build a Snowflake table with clustering keys on specific columns to optimize query performance on large datasets.
📋 What You'll Learn
Create a table named sales_data with columns sale_id, sale_date, region, and amount.
Add a clustering key on the sale_date column.
Add a clustering key on both sale_date and region columns.
Alter the table to add clustering keys after creation.
💡 Why This Matters
🌍 Real World
Clustering keys help speed up queries on large tables by organizing data physically on disk, reducing scan times.
💼 Career
Data engineers and cloud architects use clustering keys to optimize data warehouse performance and reduce costs.
Progress0 / 4 steps
1
Create the sales_data table
Create a table called sales_data with these columns and types exactly: sale_id INT, sale_date DATE, region STRING, and amount FLOAT.
Snowflake
Need a hint?

Use the CREATE TABLE statement with the exact column names and types.

2
Add a clustering key on sale_date
Create a new table called sales_data_clustered with the same columns as sales_data and add a clustering key on the sale_date column using the CLUSTER BY clause.
Snowflake
Need a hint?

Use CLUSTER BY (sale_date) after the column definitions.

3
Add a clustering key on sale_date and region
Create a table called sales_data_multi_clustered with the same columns and add a clustering key on both sale_date and region columns using CLUSTER BY.
Snowflake
Need a hint?

List both columns inside CLUSTER BY separated by a comma.

4
Alter the sales_data table to add clustering keys
Write an ALTER TABLE statement to add clustering keys on sale_date and region columns to the existing sales_data table using ALTER TABLE ... CLUSTER BY.
Snowflake
Need a hint?

Use ALTER TABLE sales_data CLUSTER BY (sale_date, region); to add clustering keys after table creation.