File formats (CSV, JSON, Parquet, Avro) in Snowflake - Step-by-Step Execution

Process Flow - File formats (CSV, JSON, Parquet, Avro)
1. Start: Data to store
2. Choose file format
3. Data saved in chosen format
4. Data loaded by Snowflake
5. Data parsed and used in queries
6. End
This flow shows how data is saved in different file formats and then loaded and used in Snowflake.
Execution Sample
Snowflake
CREATE OR REPLACE FILE FORMAT my_csv_format
  TYPE = 'CSV'
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1;

COPY INTO my_table FROM @my_stage FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');
This code creates a CSV file format and loads data from a stage into a table using that format.
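Before running the load, it can help to check how the format parses the staged files. The following is a minimal sketch, assuming the staged CSV files have at least two columns; the positional references `$1` and `$2` are illustrative and not part of the original sample.

```sql
-- Inspect the options stored in the file format object.
DESCRIBE FILE FORMAT my_csv_format;

-- Preview the staged files through the named format without loading them.
-- $1 and $2 are positional column references (assumed layout);
-- adjust them to match the real files.
SELECT $1, $2
FROM @my_stage (FILE_FORMAT => 'my_csv_format')
LIMIT 10;
```

If the preview shows every value packed into `$1`, the delimiter in the file format probably does not match the files.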
Process Table
| Step | Action | File Format | Effect | Snowflake Behavior |
|---|---|---|---|---|
| 1 | Create CSV file format | CSV | Defines delimiter and header skip | Snowflake knows how to parse CSV files |
| 2 | Load data from stage | CSV | Reads CSV files with defined format | Data is parsed into table rows |
| 3 | Query data | CSV | Uses parsed data | Returns table rows for queries |
| 4 | Create JSON file format | JSON | Defines JSON parsing rules | Snowflake parses JSON objects |
| 5 | Load JSON data | JSON | Reads JSON files | Data loaded as semi-structured data |
| 6 | Create Parquet file format | Parquet | Defines columnar binary format | Snowflake reads efficient columnar data |
| 7 | Load Parquet data | Parquet | Reads Parquet files | Data loaded with schema and compression |
| 8 | Create Avro file format | Avro | Defines schema-based binary format | Snowflake reads Avro files with schema |
| 9 | Load Avro data | Avro | Reads Avro files | Data loaded with schema validation |
| 10 | Query all data | All | Uses internal parsing | Snowflake returns data rows correctly |
| 11 | End | - | - | Data ready for use in Snowflake |
💡 All data formats are parsed and loaded successfully for querying in Snowflake.
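Steps 4 through 9 could look roughly like the sketch below. The format names, table names, and stage sub-paths (`my_json_format`, `my_json_table`, `@my_stage/json/`, and so on) are hypothetical and chosen only for illustration. The sketch also assumes that `my_parquet_table` already exists with column names that match the fields in the Parquet files.

```sql
-- Steps 4-5: JSON. Each JSON document is loaded into a single VARIANT column.
CREATE OR REPLACE FILE FORMAT my_json_format
  TYPE = 'JSON'
  STRIP_OUTER_ARRAY = TRUE;  -- load each element of a top-level array as its own row

CREATE OR REPLACE TABLE my_json_table (raw VARIANT);  -- hypothetical table

COPY INTO my_json_table
  FROM @my_stage/json/       -- hypothetical stage path
  FILE_FORMAT = (FORMAT_NAME = 'my_json_format');

-- Steps 6-7: Parquet. Map file fields to table columns by name.
CREATE OR REPLACE FILE FORMAT my_parquet_format
  TYPE = 'PARQUET';

COPY INTO my_parquet_table   -- assumed to exist with matching column names
  FROM @my_stage/parquet/    -- hypothetical stage path
  FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

-- Steps 8-9: Avro. Like JSON, each record is loaded into a VARIANT column.
CREATE OR REPLACE FILE FORMAT my_avro_format
  TYPE = 'AVRO';

CREATE OR REPLACE TABLE my_avro_table (raw VARIANT);  -- hypothetical table

COPY INTO my_avro_table
  FROM @my_stage/avro/       -- hypothetical stage path
  FILE_FORMAT = (FORMAT_NAME = 'my_avro_format');
```

Loading into a `VARIANT` column keeps the original structure of each record. `MATCH_BY_COLUMN_NAME` is one way to spread fields across regular columns instead.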
Status Tracker
| Variable | Start | After CSV Load | After JSON Load | After Parquet Load | After Avro Load | Final |
|---|---|---|---|---|---|---|
| Data in Stage | Raw files | CSV files | JSON files | Parquet files | Avro files | All files ready |
| File Format Object | None | CSV format defined | JSON format defined | Parquet format defined | Avro format defined | All formats defined |
| Table Data | Empty | CSV data loaded | JSON data loaded | Parquet data loaded | Avro data loaded | All data loaded |
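Each of the three tracked items can be checked from a worksheet. This is a small sketch using the stage and table names from the sample above; the `my_%` pattern is an assumption about how the format objects are named.

```sql
-- Data in Stage: list the files currently sitting in the stage.
LIST @my_stage;

-- File Format Object: show the file formats whose names match the pattern.
SHOW FILE FORMATS LIKE 'my_%';

-- Table Data: count the rows that have been loaded so far.
SELECT COUNT(*) AS loaded_rows FROM my_table;
```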
Key Moments - 3 Insights
Why do we need to define a file format before loading data?
Snowflake needs to know how to read the file correctly. For example, CSV files need a field delimiter and header handling, while JSON needs its own parsing rules. See steps 1 and 4 in the Process Table.
What happens if the file format does not match the actual file?
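If the declared format does not match the file, Snowflake cannot parse the data correctly: the load either fails with errors or puts wrong values into the table. This is why the correct format matters for every load (Process Table steps 2, 5, 7, and 9).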
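One way to catch a mismatch before it affects the table is to validate first and then choose an error policy for the real load. The sketch below reuses the names from the sample above. `ON_ERROR = CONTINUE` is shown only as one possible policy, not as a recommendation.

```sql
-- Dry run: report parse errors without loading any rows.
COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  VALIDATION_MODE = RETURN_ERRORS;

-- Real load: skip rows that fail to parse instead of aborting the statement.
COPY INTO my_table
  FROM @my_stage
  FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
  ON_ERROR = CONTINUE;
```

Note that validation only reports rows the parser rejects. A wrong delimiter that still yields parseable rows can load incorrect values without raising an error.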
How does Snowflake handle different file formats internally?
Snowflake uses a format-specific parser to convert each file into table rows that queries can read, as shown in Process Table steps 3, 5, 7, 9, and 10.
Visual Quiz - 3 Questions
Test your understanding
In the Process Table, at which step is the Parquet file format created?
A. Step 8
B. Step 3
C. Step 6
D. Step 4
💡 Hint
Check the 'Action' and 'File Format' columns in the Process Table rows.
According to the Status Tracker, what is the state of 'Table Data' after the JSON load?
A. Empty
B. JSON data loaded
C. CSV data loaded
D. All data loaded
💡 Hint
Look at the 'Table Data' row and the 'After JSON Load' column in the Status Tracker.
If the CSV file format is not defined correctly, what will happen during loading?
A. Loading will fail or data will be incorrect
B. Data will load but queries will be slow
C. Snowflake will guess the format and load data correctly
D. Snowflake will convert CSV to JSON automatically
💡 Hint
Refer to the Key Moments on why the file format matters and to Process Table steps 1 and 2.
Concept Snapshot
File formats in Snowflake:
- Define file format objects (CSV, JSON, Parquet, Avro)
- Specify parsing rules (delimiter, schema, etc.)
- Load data from stage using file format
- Snowflake parses files into table rows
- Correct format ensures successful data load and query
Full Transcript
This visual execution shows how Snowflake handles four file formats: CSV, JSON, Parquet, and Avro. First, you create a file format object that tells Snowflake how to read the files. For example, CSV needs delimiter information and JSON needs parsing rules, while Parquet and Avro files carry their own schemas. Then, you load data from a stage using the defined format. Snowflake parses the files and loads the data into tables. Finally, you can query the data as normal. The Process Table traces each step from creating formats to loading and querying data. The Status Tracker shows how the staged data, the file format objects, and the table data change state during execution. The Key Moments clarify why defining the correct format is crucial and what happens if the format does not match the file. The quiz tests understanding of the steps and states. Together, these help beginners see how Snowflake processes different file formats step by step.