How to Use BigQuery ML: Simple Guide with Examples
Use CREATE MODEL in BigQuery ML to build machine learning models with SQL. Then use ML.PREDICT to get predictions from the model directly in BigQuery, without moving data.
Syntax
The basic syntax to create a model in BigQuery ML is:
- CREATE MODEL: starts the model creation.
- model_name: the name you give your model.
- OPTIONS(): settings such as the model type and the label column.
- AS SELECT: the SQL query that selects the training data (features and label).
To predict, use ML.PREDICT with your model and input data.
```sql
CREATE MODEL `project.dataset.model_name`
OPTIONS(
  model_type = 'linear_reg',
  input_label_cols = ['target_column']
) AS
SELECT
  feature1,
  feature2,
  target_column
FROM `project.dataset.training_table`;
```
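The syntax above covers only model creation. The prediction side can be sketched in the same way: ML.PREDICT takes the model and a table or subquery of input rows. The table name `project.dataset.new_table` is a placeholder assumed for illustration, and the input columns are assumed to carry the same names as the training features.

```sql
-- Sketch: general shape of an ML.PREDICT call.
-- `project.dataset.new_table` is a hypothetical table of unlabeled rows.
SELECT *
FROM ML.PREDICT(
  MODEL `project.dataset.model_name`,
  (
    SELECT
      feature1,
      feature2
    FROM `project.dataset.new_table`
  )
);
```

For a model trained with input_label_cols=['target_column'], the result includes a predicted_target_column column alongside the input columns.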
Example
This example creates a linear regression model to predict a target value using two features, then shows how to get predictions.
```sql
-- Train a linear regression model on two features.
CREATE MODEL `mydataset.my_linear_model`
OPTIONS(
  model_type = 'linear_reg',
  input_label_cols = ['target']
) AS
SELECT
  feature1,
  feature2,
  target
FROM `mydataset.training_data`;

-- Get predictions for new, unlabeled rows.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.my_linear_model`,
  (SELECT feature1, feature2 FROM `mydataset.new_data`)
);
```
Output
feature1 | feature2 | predicted_target
-------- | -------- | ----------------
10 | 5 | 23.4
7 | 3 | 17.8
12 | 6 | 26.1
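Before relying on predictions like these, it can help to check how well the model fits. A minimal sketch using ML.EVALUATE follows. The table `mydataset.holdout_data` is a hypothetical set of labeled rows that were not used for training; it does not appear in the example above and is assumed here only for illustration.

```sql
-- Sketch: evaluate the example model on labeled rows it has not seen.
-- `mydataset.holdout_data` is an assumed table with feature1, feature2, target.
SELECT *
FROM ML.EVALUATE(
  MODEL `mydataset.my_linear_model`,
  (
    SELECT
      feature1,
      feature2,
      target
    FROM `mydataset.holdout_data`
  )
);
```

For a linear regression model, the result is a single row of regression metrics such as mean_absolute_error and r2_score.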
Common Pitfalls
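Common mistakes include:
- Omitting input_label_cols for a supervised model when the label column is not named label. BigQuery ML falls back to a column named label by default, and the statement fails if no such column exists.
- Using an unsupported model type, or options that do not apply to the chosen model type.
- Ignoring missing or NULL values in the training data. BigQuery ML imputes NULL feature values automatically during training, which can hide data-quality problems and degrade the model.
- Passing input to ML.PREDICT whose column names or types do not match the features used in training.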
Always check your data and options before running.
```sql
/* Wrong: no input_label_cols, and the training data has no column named label */
CREATE MODEL `mydataset.bad_model`
OPTIONS(
  model_type = 'linear_reg'
) AS
SELECT feature1, feature2, target
FROM `mydataset.training_data`;

/* Right: name the label column with input_label_cols */
CREATE MODEL `mydataset.good_model`
OPTIONS(
  model_type = 'linear_reg',
  input_label_cols = ['target']
) AS
SELECT feature1, feature2, target
FROM `mydataset.training_data`;
```
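The advice to check your data before training can be made concrete with a quick NULL count per column. This is a minimal sketch against the example table `mydataset.training_data`; the output aliases are illustrative names, not anything BigQuery ML requires.

```sql
-- Sketch: count missing values in each training column before CREATE MODEL.
SELECT
  COUNT(*) AS total_rows,
  COUNTIF(feature1 IS NULL) AS null_feature1,
  COUNTIF(feature2 IS NULL) AS null_feature2,
  COUNTIF(target IS NULL) AS null_target
FROM `mydataset.training_data`;
```

Non-zero counts are a signal to decide explicitly how those rows should be handled (filtered out, filled in, or left to automatic imputation) rather than discovering the issue after training.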
Quick Reference
| Command | Purpose | Example |
|---|---|---|
| CREATE MODEL | Builds a machine learning model | CREATE MODEL `dataset.model` OPTIONS(model_type='linear_reg', input_label_cols=['target']) AS SELECT * FROM `dataset.table` |
| ML.PREDICT | Generates predictions from a model | SELECT * FROM ML.PREDICT(MODEL `dataset.model`, (SELECT * FROM `dataset.new_data`)) |
| DROP MODEL | Deletes a model | DROP MODEL `dataset.model` |
| bq ls --models | Lists models in a dataset (bq command-line tool; BigQuery SQL has no SHOW MODELS statement) | bq ls --models dataset |
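When a script is rerun, a plain CREATE MODEL fails if the model already exists, and a plain DROP MODEL fails if it does not. A minimal sketch of the rerunnable variants, reusing the example names from this guide:

```sql
-- Sketch: retrain in place, replacing any existing model of the same name.
CREATE OR REPLACE MODEL `mydataset.my_linear_model`
OPTIONS(
  model_type = 'linear_reg',
  input_label_cols = ['target']
) AS
SELECT feature1, feature2, target
FROM `mydataset.training_data`;

-- Sketch: delete the model only if it exists, so the statement is safe to rerun.
DROP MODEL IF EXISTS `mydataset.my_linear_model`;
```

CREATE OR REPLACE MODEL overwrites the previous model, so it suits retraining jobs; use CREATE MODEL IF NOT EXISTS instead when an existing model should be left untouched.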
Key Takeaways
Use SQL commands like CREATE MODEL and ML.PREDICT to build and use models directly in BigQuery.
Always specify the label column with input_label_cols for supervised learning models.
Check your training data for missing values and matching feature columns.
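BigQuery ML supports multiple model types, such as linear regression (linear_reg), logistic regression for classification (logistic_reg), and k-means clustering (kmeans).
You can delete a model with DROP MODEL and list the models in a dataset with the bq command-line tool (bq ls --models).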