How to Schedule Queries in BigQuery Easily

The most direct way to schedule a query in BigQuery is the built-in scheduled queries feature, which runs on the BigQuery Data Transfer Service and can be set up from the UI, the bq CLI, or the API. Alternatively, use Cloud Scheduler to trigger a Cloud Function or Cloud Run service that runs your query.

Syntax

A BigQuery scheduled query is defined by the query itself, a destination table, and a schedule. The key parts are:

  • query: The SQL statement to run.
  • destination_table: Where results are saved.
  • schedule: How often the query runs, written as a schedule string such as 'every 24 hours' (not a Unix cron expression).
  • write_disposition: How to handle existing data (e.g., append or overwrite).

You can create scheduled queries via the BigQuery UI, CLI, or API.

bash
bq query --use_legacy_sql=false --destination_table=project.dataset.table --schedule='every 24 hours' --display_name='Daily Query' 'SELECT * FROM `project.dataset.source_table` WHERE DATE(timestamp) = CURRENT_DATE() - 1'
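
The command above does not set write_disposition. With bq query, this is controlled by the --replace and --append_table flags. The following is a minimal sketch rather than a definitive script: the function name schedule_daily_query and the PRINT_ONLY variable are hypothetical helpers introduced here for illustration, and the table names are the placeholders used above.

```bash
# Sketch: pick the write disposition for a scheduled query.
# schedule_daily_query and PRINT_ONLY are hypothetical names, not part of bq.
schedule_daily_query() {
  case "$1" in
    append)    disposition='--append_table=true' ;;  # add rows to the existing table
    overwrite) disposition='--replace=true' ;;       # truncate the table, then write
    *) echo "usage: schedule_daily_query append|overwrite" >&2; return 1 ;;
  esac

  # Build the command as positional parameters so each argument stays intact.
  set -- bq query \
    --use_legacy_sql=false \
    --destination_table=project.dataset.table \
    "$disposition" \
    --schedule='every 24 hours' \
    --display_name='Daily Query' \
    'SELECT * FROM `project.dataset.source_table` WHERE DATE(timestamp) = CURRENT_DATE() - 1'

  if [ "${PRINT_ONLY:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the arguments (unquoted) without contacting BigQuery
  else
    "$@"
  fi
}

# Usage: PRINT_ONLY=1 schedule_daily_query overwrite
```

Setting PRINT_ONLY=1 lets you inspect the arguments before anything is created; without it, the function runs bq directly.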

Example

This example shows how to create a scheduled query using the bq command-line tool that runs daily and saves results to a table.

bash
bq query \
  --use_legacy_sql=false \
  --destination_table=myproject.mydataset.daily_results \
  --schedule='every 24 hours' \
  --display_name='Daily Sales Summary' \
  'SELECT product_id, SUM(sales) AS total_sales FROM `myproject.mydataset.sales` WHERE DATE(sale_date) = CURRENT_DATE() - 1 GROUP BY product_id'
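Output
On success, bq confirms that a transfer configuration was created and prints its resource name. Scheduled queries are stored as BigQuery Data Transfer Service configurations, so the message refers to a transfer configuration rather than a scheduled query.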
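
To check that the scheduled query from this example exists, you can list the transfer configurations in the dataset's location. This is a minimal sketch, assuming the dataset lives in the US multi-region unless another location is passed; list_scheduled_queries and PRINT_ONLY are hypothetical names introduced for illustration.

```bash
# Sketch: list transfer configurations (scheduled queries are among them).
# list_scheduled_queries and PRINT_ONLY are hypothetical names, not part of bq.
list_scheduled_queries() {
  location="${1:-us}"   # assumed default; pass the location of your dataset

  set -- bq ls --transfer_config --transfer_location="$location"

  if [ "${PRINT_ONLY:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the arguments without contacting BigQuery
  else
    "$@"
  fi
}

# Usage: list_scheduled_queries us
```

The listing includes every transfer configuration in that location, not only scheduled queries, so look for the display name you chose ('Daily Sales Summary' in this example).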

Common Pitfalls

Common mistakes when scheduling BigQuery queries include:

  • Using legacy SQL instead of standard SQL (always use --use_legacy_sql=false).
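  • Not specifying a destination table for a SELECT query, so the results are not kept in a permanent table (DML and DDL statements do not need one).
  • Using Unix cron syntax in the schedule expression; scheduled queries expect a schedule string such as 'every 24 hours'.
  • Not granting the required permissions to the user or service account that runs the scheduled query.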

Always test your query manually before scheduling.

bash
# Wrong: legacy SQL is enabled, which may cause errors (the backtick-quoted table name is standard SQL syntax)
bq query --use_legacy_sql=true --destination_table=myproject.mydataset.results --schedule='every 24 hours' 'SELECT * FROM `myproject.mydataset.table`'

# Correct: standard SQL enabled
bq query --use_legacy_sql=false --destination_table=myproject.mydataset.results --schedule='every 24 hours' 'SELECT * FROM `myproject.mydataset.table`'
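
One way to follow the advice to test a query before scheduling it is a dry run, which asks BigQuery to validate the SQL without executing it. The sketch below assumes the bq CLI is installed and authenticated; validate_query and PRINT_ONLY are hypothetical names introduced for illustration.

```bash
# Sketch: validate a query with a dry run before scheduling it.
# validate_query and PRINT_ONLY are hypothetical names, not part of bq.
validate_query() {
  sql="$1"
  if [ -z "$sql" ]; then
    echo "usage: validate_query 'SQL statement'" >&2
    return 1
  fi

  # --dry_run checks the query and estimates bytes processed; no rows are read.
  set -- bq query --use_legacy_sql=false --dry_run "$sql"

  if [ "${PRINT_ONLY:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the arguments without contacting BigQuery
  else
    "$@"
  fi
}

# Usage: validate_query 'SELECT * FROM `myproject.mydataset.table`'
```

A dry run catches syntax errors and missing tables, but it does not prove the results are what you expect, so running the query once by hand is still worthwhile.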

Quick Reference

Here is a quick summary of scheduling queries in BigQuery:

Step | Description
Write SQL query | Create the query you want to run regularly.
Choose destination | Set the table where results will be saved.
Set schedule | Use a schedule string such as 'every 24 hours'.
Use standard SQL | Always set --use_legacy_sql=false.
Create scheduled query | Use the BigQuery UI, CLI, or API to schedule.
Check permissions | Ensure the account that runs the scheduled query has access to run it.

Key Takeaways

Use the BigQuery scheduled queries feature, or Cloud Scheduler with Cloud Functions, to automate queries.
Always use standard SQL by setting --use_legacy_sql=false when scheduling queries.
Specify a destination table to save query results and avoid data loss.
Test your query manually before scheduling to ensure it runs correctly.
Set correct permissions for the service account running the scheduled query.
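
For the last point, a scheduled query can be created to run as a specific service account by using bq mk --transfer_config with the scheduled_query data source. This is a minimal sketch under stated assumptions, not a definitive setup: create_with_service_account and PRINT_ONLY are hypothetical names, the service account email in the usage line is a placeholder, and the dataset, table, and query are the ones from the earlier example.

```bash
# Sketch: create a scheduled query that runs as a given service account.
# create_with_service_account and PRINT_ONLY are hypothetical names, not part of bq.
create_with_service_account() {
  sa_email="$1"
  if [ -z "$sa_email" ]; then
    echo "usage: create_with_service_account SERVICE_ACCOUNT_EMAIL" >&2
    return 1
  fi

  # Query settings are passed as JSON; WRITE_TRUNCATE overwrites the table each run.
  params='{"query":"SELECT product_id, SUM(sales) AS total_sales FROM `myproject.mydataset.sales` WHERE DATE(sale_date) = CURRENT_DATE() - 1 GROUP BY product_id","destination_table_name_template":"daily_results","write_disposition":"WRITE_TRUNCATE"}'

  set -- bq mk --transfer_config \
    --data_source=scheduled_query \
    --target_dataset=mydataset \
    --display_name='Daily Sales Summary' \
    --schedule='every 24 hours' \
    --params="$params" \
    --service_account_name="$sa_email"

  if [ "${PRINT_ONLY:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the arguments without contacting BigQuery
  else
    "$@"
  fi
}

# Usage (placeholder email): create_with_service_account my-sa@myproject.iam.gserviceaccount.com
```

The service account needs permission to run query jobs and to read the source table and write to the destination dataset; which roles grant that depends on how your project is set up.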