In Google Cloud Data Fusion, when an ETL pipeline reads from a source and the source schema changes (e.g., a new column is added), what is the default behavior of the pipeline if no schema update is applied?
Think about how strict the schema enforcement is by default in Data Fusion pipelines.
By default, a Data Fusion pipeline uses the schema captured at design time. If the source schema changes, newly added columns are silently dropped from the pipeline's output until the schema is updated manually or through schema registry integration.
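To illustrate, Data Fusion stages carry an Avro-style record schema fixed at design time. A hypothetical sketch (the field names here are invented for the example) of such a design-time schema:

```json
{
  "type": "record",
  "name": "etlSchemaBody",
  "fields": [
    { "name": "customer_id", "type": "long" },
    { "name": "email", "type": ["string", "null"] }
  ]
}
```

If the source table later gains, say, a loyalty_tier column, records still flow through the pipeline, but that column never appears downstream until this schema is re-imported or edited in the stage configuration.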
You need to build an ETL pipeline in Google Cloud Data Fusion that requires custom plugins and advanced transformations. Which Data Fusion edition should you choose?
Consider which edition supports extensibility and advanced features.
The Enterprise edition of Data Fusion supports custom plugins and advanced transformations, making it suitable for complex ETL scenarios.
You have an ETL pipeline in Data Fusion that processes sensitive customer data. Which approach best protects this data during pipeline execution?
Think about encryption and access control best practices.
Encrypting data at rest with customer-managed keys in Cloud KMS, and running the pipeline under a least-privilege service account, protects sensitive data both during processing and in storage.
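A minimal sketch of that setup with the gcloud CLI, assuming a project named my-project, a dedicated service account etl-runner, and a key ring etl-keys (all hypothetical names; the roles granted should match only what the pipeline actually touches):

```shell
# All resource names below are illustrative; substitute your own.
# Create a dedicated, minimally privileged service account for the pipeline.
gcloud iam service-accounts create etl-runner \
    --project=my-project \
    --display-name="Data Fusion ETL runner"

# Grant only the roles the pipeline needs (e.g. read-only access to input buckets).
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:etl-runner@my-project.iam.gserviceaccount.com" \
    --role="roles/storage.objectViewer"

# Create a customer-managed key for encrypting pipeline data at rest.
gcloud kms keyrings create etl-keys \
    --project=my-project --location=us-central1
gcloud kms keys create etl-data-key \
    --project=my-project --location=us-central1 \
    --keyring=etl-keys --purpose=encryption
```

The key can then be referenced as the CMEK for the pipeline's sinks, and the service account assigned to the pipeline's compute profile.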
Which configuration correctly sets up a source plugin in Data Fusion to read CSV files from a Google Cloud Storage bucket named my-data-bucket?
Consider which plugin reads files from Cloud Storage.
The 'GCS File' source plugin reads files from Cloud Storage buckets addressed with 'gs://' URIs; setting its format property to CSV parses each file into records.
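A stage configured this way could appear in the exported pipeline JSON roughly as follows. This is a hedged sketch: the stage name ReadCSV and referenceName csv_input are invented, and exact property names can vary by plugin version.

```json
{
  "name": "ReadCSV",
  "plugin": {
    "name": "GCSFile",
    "type": "batchsource",
    "properties": {
      "referenceName": "csv_input",
      "path": "gs://my-data-bucket/",
      "format": "csv",
      "skipHeader": "true"
    }
  }
}
```

The path points at the bucket from the question, and skipHeader drops the header row so only data rows become records.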
Which practice best improves performance and scalability of a Data Fusion ETL pipeline processing terabytes of data?
Think about how to handle large data volumes efficiently.
Partitioning data at the sources and sinks and enabling parallelism lets a Data Fusion pipeline process large data sets efficiently and scale horizontally across executors.
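Parallelism can be tuned in the pipeline's engine configuration. A hedged sketch of the relevant fragment of a pipeline config, assuming the Spark engine; the resources and driverResources blocks follow the CDAP pipeline config shape, while the raw Spark keys in properties (spark.executor.instances, spark.sql.shuffle.partitions) are standard Spark settings whose exact pass-through mechanism varies by Data Fusion/CDAP version:

```json
{
  "config": {
    "engine": "spark",
    "resources": { "virtualCores": 2, "memoryMB": 4096 },
    "driverResources": { "virtualCores": 1, "memoryMB": 2048 },
    "properties": {
      "spark.executor.instances": "8",
      "spark.sql.shuffle.partitions": "200"
    }
  }
}
```

More executors with adequate per-executor memory let partitioned reads and writes proceed in parallel rather than funneling terabytes through a single worker.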