Jump into concepts and practice - no test required
or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Handling Load Errors in Snowflake
📖 Scenario: You are working with Snowflake to load data from a CSV file into a table. Sometimes, the data has errors like wrong data types or missing values. You want to handle these load errors gracefully so you can see which rows failed and why.
🎯 Goal: Build a Snowflake data loading process that captures load errors into an error table for review.
📋 What You'll Learn
Create a target table called customers with columns id (integer), name (string), and age (integer).
Create a stage called my_stage to hold the CSV file.
Create an error table called load_errors to capture load errors.
Write a COPY INTO command to load data from @my_stage/customers.csv into customers with error handling that stores errors in load_errors.
💡 Why This Matters
🌍 Real World
In real projects, data often has mistakes. Handling load errors helps keep your data clean and lets you fix problems quickly.
💼 Career
Knowing how to handle load errors in Snowflake is important for data engineers and cloud architects to ensure reliable data pipelines.
Progress0 / 4 steps
1
Create the customers table
Create a table called customers with columns id as INTEGER, name as VARCHAR(100), and age as INTEGER.
Snowflake
Hint
Use CREATE OR REPLACE TABLE customers and define the columns with their types.
2
Create the stage my_stage
Create a named internal stage called my_stage to hold your CSV files.
Snowflake
Hint
Use CREATE OR REPLACE STAGE my_stage with a CSV file format.
3
Create the load_errors table
Create a table called load_errors with columns error_line as VARIANT and error_message as STRING to store load error details.
Snowflake
Hint
Use CREATE OR REPLACE TABLE load_errors with columns to hold error data.
4
Load data with error handling
Write a COPY INTO customers command to load data from @my_stage/customers.csv. Use the ON_ERROR = 'CONTINUE' option and specify LOAD_ERRORS = 'load_errors' to capture errors in the load_errors table.
Snowflake
Hint
Use COPY INTO customers FROM @my_stage/customers.csv with ON_ERROR = 'CONTINUE' and LOAD_ERRORS = 'load_errors'.
Practice
(1/5)
1. What does the ON_ERROR option do when loading data into Snowflake?
easy
A. It controls how Snowflake handles errors during data loading.
B. It encrypts the data before loading.
C. It speeds up the data loading process.
D. It automatically deletes duplicate rows.
Solution
Step 1: Understand the purpose of ON_ERROR
The ON_ERROR option is used to specify how Snowflake should behave when it encounters errors during data loading.
Step 2: Identify the correct behavior
It can skip bad rows, skip files, or stop the load depending on the setting, thus controlling error handling.
Final Answer:
It controls how Snowflake handles errors during data loading. -> Option A
Quick Check:
ON_ERROR controls error handling [OK]
Hint: ON_ERROR sets error handling behavior during load [OK]
Common Mistakes:
Confusing ON_ERROR with encryption settings
Thinking ON_ERROR speeds up loading
Assuming ON_ERROR deletes duplicates
2. Which of the following is the correct syntax to skip bad rows during a Snowflake COPY INTO command?
easy
A. COPY INTO table_name FROM @stage FILE_FORMAT = (TYPE = 'CSV') ON_ERROR = 'ignore_error';
B. COPY INTO table_name FROM @stage FILE_FORMAT = (TYPE = 'CSV') ON_ERROR = 'skip_file';
C. COPY INTO table_name FROM @stage FILE_FORMAT = (TYPE = 'CSV') ON_ERROR = 'abort_load';
D. COPY INTO table_name FROM @stage FILE_FORMAT = (TYPE = 'CSV') ON_ERROR = 'skip_row';
Solution
Step 1: Recall valid ON_ERROR options
Snowflake supports options like 'skip_file', 'skip_row', and 'abort_load' for ON_ERROR.
Step 2: Identify option to skip bad rows
'skip_row' tells Snowflake to skip only the bad rows, not the entire file.
Final Answer:
ON_ERROR = 'skip_row' -> Option D
Quick Check:
Skip bad rows = skip_row [OK]
Hint: Use 'skip_row' to skip bad rows in ON_ERROR [OK]
Common Mistakes:
Using 'skip_file' to skip rows instead of files
Using invalid ON_ERROR values like 'ignore_error'
Confusing 'abort_load' with skipping errors
3. Given this COPY command:
COPY INTO my_table FROM @my_stage FILE_FORMAT = (TYPE = 'CSV') ON_ERROR = 'skip_file';
If one file has 5 bad rows, what happens?
medium
A. The entire file with bad rows is skipped, other files load normally.
B. Only the 5 bad rows are skipped, rest of the file loads.
C. The load stops immediately with an error.
D. All files are skipped regardless of errors.
Solution
Step 1: Understand ON_ERROR = 'skip_file'
This option skips the entire file if any error occurs in it.
Step 2: Apply to scenario
Since one file has 5 bad rows, Snowflake skips that whole file but continues loading other files.
Final Answer:
The entire file with bad rows is skipped, other files load normally. -> Option A
Quick Check:
skip_file skips whole file on error [OK]
Hint: 'skip_file' skips whole file if any error found [OK]
Common Mistakes:
Thinking only bad rows are skipped with 'skip_file'
Assuming load stops on first error
Believing all files skip on one bad file
4. You run a COPY INTO command with ON_ERROR = 'skip_row' but still see the load failing. What is a likely cause?
medium
A. ON_ERROR only works for JSON files, not CSV.
B. The file format is incorrect causing parsing errors.
C. You must set ON_ERROR to 'skip_file' to avoid failures.
D. The target table does not exist.
Solution
Step 1: Understand ON_ERROR limitations
ON_ERROR skips bad rows but cannot fix fundamental file format or parsing errors.
Step 2: Identify cause of failure
If file format is wrong, Snowflake cannot parse data, causing load failure despite ON_ERROR.
Final Answer:
The file format is incorrect causing parsing errors. -> Option B
Quick Check:
Wrong file format causes failure despite ON_ERROR [OK]
Hint: Check file format if ON_ERROR skip_row still fails [OK]
Common Mistakes:
Thinking ON_ERROR fixes all errors
Believing ON_ERROR only works for JSON
Ignoring table existence errors
5. You want to load multiple CSV files but skip any file with more than 10 bad rows, while loading others fully. Which ON_ERROR setting should you use?
hard
A. ON_ERROR = 'continue' with MAX_ERROR = 10
B. ON_ERROR = 'skip_row' with MAX_ERROR = 10
C. ON_ERROR = 'skip_file' with MAX_ERROR = 10
D. ON_ERROR = 'abort_load' with MAX_ERROR = 10
Solution
Step 1: Understand requirement
Skip entire files only if bad rows exceed 10, otherwise load fully.
Step 2: Match ON_ERROR and MAX_ERROR
Using ON_ERROR = 'skip_file' with MAX_ERROR = 10 skips files exceeding 10 errors, loads others fully.
Final Answer:
ON_ERROR = 'skip_file' with MAX_ERROR = 10 -> Option C
Quick Check:
Skip files over 10 errors = skip_file + MAX_ERROR [OK]
Hint: Use skip_file with MAX_ERROR to limit bad rows per file [OK]