Snowflakecloud~5 mins

File formats (CSV, JSON, Parquet, Avro) in Snowflake - Commands & Configuration

Choose your learning style9 modes available

Learn Why Deep Visual Try Challenge Project Recall Time

Introduction

When you want to load or save data in Snowflake, you use file formats to tell Snowflake how the data is organized. Different file formats like CSV, JSON, Parquet, and Avro help Snowflake understand the data structure so it can read or write it correctly.

When you want to load a simple table from a text file with columns separated by commas, use CSV.

When you have nested or complex data like lists or objects, use JSON.

When you want efficient storage and fast queries on large datasets, use Parquet.

When you need a compact binary format that supports schema evolution, use Avro.

Config File - file_format.sql

file_format.sql

CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  NULL_IF = ('NULL', 'null');

CREATE OR REPLACE FILE FORMAT my_json_format
  TYPE = 'JSON';

CREATE OR REPLACE FILE FORMAT my_parquet_format
  TYPE = 'PARQUET';

CREATE OR REPLACE FILE FORMAT my_avro_format
  TYPE = 'AVRO';

This SQL script creates four file formats in Snowflake:

my_csv_format: Defines how CSV files are read, including comma as separator, skipping the header row, and handling quoted fields.
my_json_format: Defines JSON file format for semi-structured data.
my_parquet_format: Defines Parquet file format for columnar storage.
my_avro_format: Defines Avro file format for compact binary data.

Commands

This command creates a CSV file format named my_csv_format. It tells Snowflake how to read CSV files with commas, skip the first header row, and treat quoted fields correctly.

Terminal

CREATE OR REPLACE FILE FORMAT my_csv_format TYPE = 'CSV' FIELD_DELIMITER = ',' SKIP_HEADER = 1 FIELD_OPTIONALLY_ENCLOSED_BY = '"' NULL_IF = ('NULL', 'null');

Expected OutputExpected

File format MY_CSV_FORMAT successfully created.

→

TYPE - Specifies the file format type, here CSV.

→

FIELD_DELIMITER - Defines the character that separates fields, here a comma.

→

SKIP_HEADER - Skips the first row, usually headers.

This command creates a JSON file format named my_json_format. It tells Snowflake to expect JSON structured data.

Terminal

CREATE OR REPLACE FILE FORMAT my_json_format TYPE = 'JSON';

Expected OutputExpected

File format MY_JSON_FORMAT successfully created.

→

TYPE - Specifies the file format type, here JSON.

This command creates a Parquet file format named my_parquet_format. Parquet is a columnar storage format good for big data.

Terminal

CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = 'PARQUET';

Expected OutputExpected

File format MY_PARQUET_FORMAT successfully created.

→

TYPE - Specifies the file format type, here Parquet.

This command creates an Avro file format named my_avro_format. Avro is a compact binary format that supports schema changes.

Terminal

CREATE OR REPLACE FILE FORMAT my_avro_format TYPE = 'AVRO';

Expected OutputExpected

File format MY_AVRO_FORMAT successfully created.

→

TYPE - Specifies the file format type, here Avro.

This command shows details about the CSV file format to verify it was created correctly.

Terminal

DESC FILE FORMAT my_csv_format;

Expected OutputExpected

Key Concept

If you remember nothing else from this pattern, remember: file formats tell Snowflake how to read or write your data files correctly.

Common Mistakes

Not specifying SKIP_HEADER for CSV files with headers

Snowflake will treat the header row as data, causing errors or wrong data loading.

Always set SKIP_HEADER = 1 if your CSV file has a header row.

Using wrong TYPE for the file format

Snowflake will fail to parse the file because it expects a different format.

Choose the correct TYPE like CSV, JSON, PARQUET, or AVRO matching your file.

Not enclosing fields properly in CSV

Fields with commas inside may break parsing if not enclosed by quotes.

Use FIELD_OPTIONALLY_ENCLOSED_BY = '"' to handle quoted fields.

Summary

Create file formats in Snowflake to define how data files are read or written.

Use commands like CREATE FILE FORMAT with TYPE and options for CSV, JSON, Parquet, or Avro.

Verify file formats with DESC FILE FORMAT to ensure correct settings.