When loading data into a Snowflake table using the COPY INTO command, what happens if some rows in the source file violate constraints or have invalid data?
COPY INTO my_table FROM @my_stage/file.csv FILE_FORMAT = (TYPE = 'CSV');
Think about how Snowflake handles bad data rows during bulk loading.
By default (ON_ERROR = 'ABORT_STATEMENT'), Snowflake's COPY INTO command fails the entire load on the first error. With ON_ERROR = 'CONTINUE', it instead loads all valid rows, skips invalid rows, and records the errors so they can be reviewed separately. This allows partial success without failing the entire load.
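As an illustration (the table and stage names here are hypothetical), the skip-and-continue behavior must be requested explicitly:

```sql
-- Hypothetical table and stage; ON_ERROR = 'CONTINUE' loads the valid
-- rows and skips bad ones instead of aborting on the first error.
COPY INTO my_table
  FROM @my_stage/file.csv
  FILE_FORMAT = (TYPE = 'CSV')
  ON_ERROR = 'CONTINUE';
```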
Which COPY INTO option should you use to have Snowflake return details about rows that would fail to load?
COPY INTO my_table FROM @my_stage/file.csv FILE_FORMAT = (TYPE = 'CSV') ???;
Look for the option that returns error details without loading any data.
The VALIDATION_MODE = RETURN_ERRORS option (an unquoted keyword, not a string literal) tells Snowflake to validate the files and return all errors without loading any data, which is useful for inspecting problems before the actual load. Errors from a completed load can also be retrieved afterward with the VALIDATE table function.
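A sketch of both error-inspection paths, using the same hypothetical table and stage as above:

```sql
-- Dry run: validate the file and return errors without loading anything.
COPY INTO my_table
  FROM @my_stage/file.csv
  FILE_FORMAT = (TYPE = 'CSV')
  VALIDATION_MODE = RETURN_ERRORS;

-- After a real load, retrieve the errors from the most recent COPY job.
SELECT * FROM TABLE(VALIDATE(my_table, JOB_ID => '_last'));
```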
You want to build a data pipeline that loads CSV files into Snowflake tables. The pipeline must load all valid rows, capture invalid rows separately for review, and automatically retry loading after fixing errors. Which architecture best supports this?
Consider how to handle partial loads and error review in an automated pipeline.
Option D loads the valid rows (e.g., with ON_ERROR = 'CONTINUE') while capturing the rejected rows separately for review, and supports manual or automated retry once the errors are fixed, which makes the pipeline robust.
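One way to sketch such a pipeline (all object names are hypothetical, and the retry step is shown commented out since it depends on how the fixed file is produced):

```sql
-- 1. Load valid rows, skipping bad ones.
COPY INTO my_table
  FROM @my_stage/file.csv
  FILE_FORMAT = (TYPE = 'CSV')
  ON_ERROR = 'CONTINUE';

-- 2. Capture the rejected rows from the most recent load for review.
CREATE TABLE IF NOT EXISTS load_errors AS
  SELECT * FROM TABLE(VALIDATE(my_table, JOB_ID => '_last')) WHERE 1 = 0;

INSERT INTO load_errors
  SELECT * FROM TABLE(VALIDATE(my_table, JOB_ID => '_last'));

-- 3. After fixing the source data, re-stage the corrected file and re-run
--    the COPY; FORCE = TRUE reloads files Snowflake has already seen.
-- COPY INTO my_table FROM @my_stage/file_fixed.csv
--   FILE_FORMAT = (TYPE = 'CSV') FORCE = TRUE;
```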
When Snowflake writes error files for failed data loads, what is the best practice to ensure sensitive data in those error files is protected?
Think about protecting sensitive data while still allowing error review.
Encrypting error files and restricting access to them protects sensitive data while still allowing authorized users to review and fix the failed rows. Note that Snowflake already encrypts data at rest in tables and internal stages, so the main remaining control is limiting who can read the error output.
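A minimal sketch of that access restriction, assuming a hypothetical load_errors table and reviewer role:

```sql
-- Keep captured error rows out of general reach: only the reviewer
-- role may read them (role and table names are illustrative).
REVOKE SELECT ON TABLE load_errors FROM ROLE PUBLIC;
GRANT SELECT ON TABLE load_errors TO ROLE data_quality_reviewer;
```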
You are loading very large CSV files into Snowflake. To minimize load failures and maximize throughput, which approach to error handling is best?
Consider how to balance load speed and error visibility for large data.
Using ON_ERROR = 'CONTINUE' loads all good rows quickly while recording errors for later review, which is efficient for large datasets. (Variants such as SKIP_FILE or SKIP_FILE_<num> skip entire files once errors are found, trading granularity for simpler retries.)
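Because COPY INTO returns per-file load statistics, a large load run with ON_ERROR = 'CONTINUE' can be monitored without slowing it down (table and stage names hypothetical):

```sql
COPY INTO my_table
  FROM @my_stage/
  FILE_FORMAT = (TYPE = 'CSV')
  ON_ERROR = 'CONTINUE';

-- The statement's result set includes, per file: status, rows_parsed,
-- rows_loaded, errors_seen, and first_error, so partially loaded files
-- are easy to spot and queue for review.
```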