ML model training in Snowflake - Commands & Configuration

Introduction
Training machine learning models inside Snowflake lets you work on your data where it already lives. Nothing has to be exported, which saves time and reduces the risk that comes with copying data to other systems. This approach fits the following situations:
You want to build a prediction model from data stored in Snowflake without exporting it.
You need to quickly test a simple machine learning model on data that is already in Snowflake.
You want to automate model training as a step in your Snowflake data pipeline.
You want to keep data secure by never moving it outside Snowflake during training.
You want to create and train machine learning models with SQL commands.
Commands
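This command creates and trains a linear regression model named 'my_model'. The SELECT statement supplies the training rows from the 'training_data' table: feature1 and feature2 are the input features, and target is the column the model learns to predict.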
Terminal
CREATE OR REPLACE MODEL my_model
  OPTIONS(
    model_type = 'linear_regression',
    input_label_cols = ('target')
  ) AS
  SELECT feature1, feature2, target
  FROM training_data;
Expected Output
Model MY_MODEL created successfully.
model_type - Specifies the type of machine learning model to train.
input_label_cols - Defines the target column(s) the model will predict.
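To make "training a linear regression model" concrete, here is a minimal Python sketch of the underlying idea: find the coefficients that best fit target as a weighted sum of feature1 and feature2. It is a conceptual illustration only and says nothing about how Snowflake implements training. The function names and the sample rows are hypothetical; the rows were generated from target = 1 + 2*feature1 + 3*feature2 so the result is easy to check.

```python
def fit_linear_regression(rows):
    """Fit target = b0 + b1*feature1 + b2*feature2 by ordinary least squares.

    rows: list of (feature1, feature2, target) tuples.
    Returns the coefficients [b0, b1, b2].
    """
    # Design matrix with a leading 1.0 for the intercept term.
    X = [[1.0, f1, f2] for f1, f2, _ in rows]
    y = [t for _, _, t in rows]
    n = 3

    # Normal equations (X^T X) b = X^T y, stored as an augmented matrix.
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * t for x, t in zip(X, y))]
        for i in range(n)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; no unique fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict(coefficients, feature1, feature2):
    """Apply the fitted coefficients to one new data point."""
    b0, b1, b2 = coefficients
    return b0 + b1 * feature1 + b2 * feature2


# Hypothetical rows generated from target = 1 + 2*feature1 + 3*feature2.
training_data = [(1.0, 2.0, 9.0), (2.0, 1.0, 8.0), (3.0, 4.0, 19.0), (0.0, 5.0, 16.0)]
coefficients = fit_linear_regression(training_data)
```

On this noise-free data the fit recovers coefficients close to 1, 2 and 3. With real data the fit is only approximate, which is why the label column must be the one you actually want predicted.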
This command lists the machine learning models stored in Snowflake that your current role is allowed to see.
Terminal
SHOW MODELS;
Expected Output
name     | database | schema | created_on
MY_MODEL | MY_DB    | PUBLIC | 2024-06-01 12:00:00
This command uses the trained model 'my_model' to predict the target value for a new data point with feature1=5.1 and feature2=3.2.
Terminal
SELECT MODEL_PREDICT('my_model', OBJECT_CONSTRUCT('feature1', 5.1, 'feature2', 3.2)) AS prediction;
Expected Output
prediction
42.7
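When the feature values come from application code, typing the OBJECT_CONSTRUCT argument by hand is an easy place to slip. The following Python sketch renders that argument from a dictionary. The helper name object_construct_sql is hypothetical, and the sketch assumes numeric feature values only. It builds SQL text for illustration; in real client code, bind parameters are usually the safer choice.

```python
import math


def object_construct_sql(features):
    """Render a dict of numeric features as an OBJECT_CONSTRUCT(...) expression."""
    parts = []
    for name, value in features.items():
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"feature {name!r} must be numeric, got {type(value).__name__}")
        if not math.isfinite(value):
            raise ValueError(f"feature {name!r} must be a finite number")
        # Double any single quote so the key stays a valid SQL string literal.
        key = name.replace("'", "''")
        parts.append(f"'{key}', {value!r}")
    return "OBJECT_CONSTRUCT(" + ", ".join(parts) + ")"


expression = object_construct_sql({"feature1": 5.1, "feature2": 3.2})
print(expression)  # OBJECT_CONSTRUCT('feature1', 5.1, 'feature2', 3.2)
```

The rendered expression matches the second argument of the MODEL_PREDICT call shown above.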
Key Concept

If you remember nothing else from this pattern, remember: Snowflake lets you train and use machine learning models directly with SQL on your data without moving it.

Common Mistakes
Mistake: Not specifying the correct target column in the input_label_cols option.
Why it matters: The model does not know which column to predict, so training fails or produces wrong results.
Fix: Always set input_label_cols to the exact name of the target column in your training data.
Mistake: Using an unsupported model type or misspelling model_type.
Why it matters: Snowflake rejects the command because it only supports certain model types, such as 'linear_regression' or 'logistic_regression'.
Fix: Check the Snowflake documentation for the supported model types and the exact statement syntax, and copy the spelling exactly.
Mistake: Passing feature values incorrectly to the MODEL_PREDICT function.
Why it matters: MODEL_PREDICT expects an OBJECT_CONSTRUCT of feature names and values; any other format causes an error.
Fix: Use OBJECT_CONSTRUCT with the feature names as keys and their values, as shown in the example.
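The last mistake can be caught before any query is sent. The Python sketch below compares the feature names you are about to pass against the names the model was trained on. The helper check_features and the TRAINED_FEATURES list are hypothetical, and the sketch assumes names must match exactly, including case.

```python
def check_features(expected, supplied):
    """Return a list of problems; an empty list means the payload is usable."""
    expected_set = set(expected)
    supplied_set = set(supplied)
    problems = []
    for name in sorted(expected_set - supplied_set):
        problems.append(f"missing feature: {name}")
    for name in sorted(supplied_set - expected_set):
        problems.append(f"unexpected feature: {name}")
    return problems


# Feature names used in the training SELECT (assumed for this example).
TRAINED_FEATURES = ["feature1", "feature2"]

good = check_features(TRAINED_FEATURES, {"feature1": 5.1, "feature2": 3.2})
bad = check_features(TRAINED_FEATURES, {"feature1": 5.1, "Feature2": 3.2})

print(good)  # []
print(bad)   # ['missing feature: feature2', 'unexpected feature: Feature2']
```

A misspelled or differently cased key shows up as one missing and one unexpected feature, which points directly at the typo.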
Summary
Use CREATE MODEL with a SELECT query to train a machine learning model inside Snowflake.
Use SHOW MODELS to list your trained models.
Use MODEL_PREDICT with OBJECT_CONSTRUCT to make predictions with your model.