How to Load Data into BigQuery: Simple Steps and Examples
To load data into BigQuery, you can use the bq load command, the BigQuery web UI, or the client libraries to import files such as CSV or JSON into a table. You specify the source data, the target dataset, and the table, and BigQuery handles the rest.
Syntax
The basic syntax to load data using the bq command-line tool is:
bq load [OPTIONS] dataset.table source_uri schema
Where:
- dataset.table is the target BigQuery table.
- source_uri is the path to your data file (local or Cloud Storage).
- schema defines the table columns and types.
- OPTIONS can include format type, write disposition, and more.
```bash
bq load --source_format=CSV mydataset.mytable gs://mybucket/myfile.csv name:STRING,age:INTEGER
```
Example
This example loads a CSV file from Google Cloud Storage into a BigQuery table named users in the mydataset dataset. The CSV has two columns: name (a string) and age (an integer).
```bash
bq load --source_format=CSV mydataset.users gs://mybucket/users.csv name:STRING,age:INTEGER
```
Output
Waiting on bqjob_r1234567890_000001... Current status: DONE
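The four parts of the command (format, table, source, schema) can be kept in one place and printed for review before anything is sent to BigQuery. The following is a minimal sketch, not part of the bq tool: build_load_cmd is a hypothetical helper name, and the values passed to it are the ones from the example above.

```bash
#!/usr/bin/env bash
# Sketch: assemble a bq load command from its named parts.
# build_load_cmd is a hypothetical helper, not something bq provides.
build_load_cmd() {
  local format="$1" table="$2" source_uri="$3" schema="$4"
  printf 'bq load --source_format=%s %s %s %s\n' \
    "$format" "$table" "$source_uri" "$schema"
}

# Print the command only; nothing is loaded at this point.
build_load_cmd CSV mydataset.users gs://mybucket/users.csv name:STRING,age:INTEGER
```

Once the printed command looks right, it can be pasted into the terminal and run as is.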
Common Pitfalls
Common mistakes when loading data into BigQuery include:
- Not specifying the correct schema, causing load failures.
- Using the wrong file format option (e.g., CSV vs JSON).
- Trying to load data into a dataset that does not exist (bq load can create a missing table, but the dataset must already be there).
- Not having proper permissions to access the source file or BigQuery.
- Overwriting or duplicating data unintentionally by not choosing the write disposition deliberately.
Always check that your schema matches your data, and choose deliberately between --replace (overwrite the table) and --noreplace (append to it, which is the default for bq load).
```bash
# Wrong: the schema has no column types, so age is not loaded as INTEGER
bq load mydataset.users gs://mybucket/users.csv name,age

# Right: format and typed schema are specified, and replace mode is explicit
bq load --source_format=CSV --replace mydataset.users gs://mybucket/users.csv name:STRING,age:INTEGER
```
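Schema mistakes like the one above can be caught locally before a load job is submitted. The sketch below is illustrative only: check_schema and columns_match are hypothetical helper names, the type list is a partial, assumed set of common BigQuery types, and the column count check assumes a simple local CSV with no quoted commas in its first line.

```bash
#!/usr/bin/env bash
# Sketch: pre-flight checks for an inline schema string.
# Both helpers are hypothetical; they are not part of the bq tool.

# Succeeds only if every entry looks like name:TYPE.
# The accepted types are an assumed, partial list.
check_schema() {
  local schema="$1" entry
  local -a entries
  IFS=',' read -r -a entries <<< "$schema"
  [ "${#entries[@]}" -gt 0 ] || return 1
  for entry in "${entries[@]}"; do
    case "$entry" in
      ?*:STRING|?*:INTEGER|?*:FLOAT|?*:BOOLEAN|?*:TIMESTAMP|?*:DATE) ;;
      *) echo "bad schema entry: $entry" >&2; return 1 ;;
    esac
  done
  return 0
}

# Succeeds if the first line of a local CSV has as many columns
# as the schema has entries (assumes no quoted commas).
columns_match() {
  local file="$1" schema="$2" first=""
  local -a cols entries
  IFS= read -r first < "$file" || [ -n "$first" ]
  IFS=',' read -r -a cols <<< "$first"
  IFS=',' read -r -a entries <<< "$schema"
  [ "${#cols[@]}" -eq "${#entries[@]}" ]
}
```

A typical use is to run both checks and call bq load only if they pass, for example `check_schema "$schema" && columns_match users.csv "$schema"`, where users.csv is a local copy of the file.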
Quick Reference
| Command/Option | Description |
|---|---|
| bq load | Command to load data into BigQuery |
| --source_format=FORMAT | Specify the format of the source data (CSV, NEWLINE_DELIMITED_JSON, or AVRO) |
| dataset.table | Target dataset and table name |
| source_uri | Path to data file (local or gs://) |
| schema | Table schema in column:type format |
| --replace | Overwrite existing table data |
| --noreplace | Add data to existing table (default) |
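One way to avoid passing the wrong format option is to derive it from the file name. This is a rough sketch under an assumed naming convention: format_for is a hypothetical helper, and the mapping from extensions to formats is a habit of the person naming the files, not something BigQuery enforces.

```bash
#!/usr/bin/env bash
# Sketch: choose a --source_format value from a file extension.
# format_for is a hypothetical helper; the mapping is an assumed convention.
format_for() {
  case "${1##*.}" in
    csv)         echo CSV ;;
    json|ndjson) echo NEWLINE_DELIMITED_JSON ;;
    avro)        echo AVRO ;;
    *)           echo "unknown extension: $1" >&2; return 1 ;;
  esac
}
```

It could then be used inline, as in `bq load --source_format="$(format_for gs://mybucket/users.csv)" mydataset.users gs://mybucket/users.csv name:STRING,age:INTEGER`.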
Key Takeaways
Use the bq load command with correct dataset, table, source, and schema to load data.
Always specify the source data format to avoid load errors.
Check your schema matches the data columns and types exactly.
Use write disposition options to control overwriting or appending data.
Ensure you have permissions for both source data and BigQuery target.