GCP · Cloud · ~20 mins

Data Fusion for ETL in GCP - Practice Problems & Coding Challenges

Challenge - 5 Problems
Problem 1 · Service Behavior · Intermediate · 2:00 time limit
How does Data Fusion handle schema changes during ETL?

In Google Cloud Data Fusion, when an ETL pipeline reads from a source and the source schema changes (e.g., a new column is added), what is the default behavior of the pipeline if no schema update is applied?

A. The pipeline fails with a schema mismatch error and stops processing.
B. The pipeline automatically updates the schema and processes the new column without errors.
C. The pipeline ignores the new column and continues processing with the old schema.
D. The pipeline duplicates the data for the new column, causing data inconsistency.
💡 Hint

Think about how strict the schema enforcement is by default in Data Fusion pipelines.
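To make the trade-off concrete, here is an illustrative sketch (not Data Fusion source code) contrasting the two behaviors the answer choices describe: failing on an unexpected column versus silently projecting each record onto the schema that was fixed at design time. All names here are hypothetical.

```python
# Illustrative only -- not how Data Fusion is implemented internally.
DECLARED_SCHEMA = ["id", "name", "email"]  # output schema set at pipeline design time

def read_record(record, schema, strict):
    """Apply a design-time schema to an incoming record."""
    extra = set(record) - set(schema)
    if strict and extra:
        # Strict mode: treat any unexpected column as a schema mismatch.
        raise ValueError(f"schema mismatch: unexpected fields {sorted(extra)}")
    # Lenient mode: keep only the declared columns and drop the rest.
    return {field: record.get(field) for field in schema}

# A source row that gained a 'phone' column after the pipeline was deployed:
row = {"id": 1, "name": "Ada", "email": "ada@example.com", "phone": "555-0100"}
print(read_record(row, DECLARED_SCHEMA, strict=False))
# {'id': 1, 'name': 'Ada', 'email': 'ada@example.com'}
```

Asking which of these two modes a Data Fusion pipeline exhibits by default, when no schema update is applied, is exactly what the question is testing.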

Problem 2 · Architecture · Intermediate · 2:00 time limit
Choosing the right Data Fusion edition for complex ETL

You need to build an ETL pipeline in Google Cloud Data Fusion that requires custom plugins and advanced transformations. Which Data Fusion edition should you choose?

A. Basic edition, because it supports all plugin types and transformations.
B. Standard edition, because it supports only built-in plugins.
C. Trial edition, because it has no limits on plugin usage.
D. Enterprise edition, because it supports custom plugins and advanced transformations.
💡 Hint

Consider which edition supports extensibility and advanced features.
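For context: the edition is fixed when the Data Fusion instance is created. A rough CLI sketch (instance name and location are hypothetical, and the exact flag spelling should be checked against the current gcloud reference) looks like:

```shell
# Sketch only -- verify flags against the current gcloud documentation.
# The edition cannot be changed after creation, so it must be chosen up front.
gcloud beta data-fusion instances create my-etl-instance \
    --location=us-central1 \
    --edition=enterprise
```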

Problem 3 · Security · Advanced · 2:00 time limit
Securing sensitive data in Data Fusion pipelines

You have an ETL pipeline in Data Fusion that processes sensitive customer data. Which approach best protects this data during pipeline execution?

A. Use Cloud KMS to encrypt data at rest and configure Data Fusion to use service accounts with least privilege.
B. Store sensitive data in plain text and rely on network security only.
C. Disable audit logs to prevent data exposure in logs.
D. Use public service accounts with broad permissions for easy access.
💡 Hint

Think about encryption and access control best practices.
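The two practices the question pairs together, encryption keys and least-privilege access, can be sketched with gcloud. Project, key, and service-account names below are hypothetical, and flag spellings should be verified against the gcloud reference:

```shell
# Sketch only -- all resource names are placeholders.

# Create a KMS key for encrypting data at rest.
gcloud kms keyrings create etl-keyring --location=us-central1
gcloud kms keys create etl-key \
    --keyring=etl-keyring --location=us-central1 --purpose=encryption

# Grant the pipeline's service account only the narrow role it needs
# (least privilege) rather than a broad editor/owner role.
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:etl-pipeline@my-project.iam.gserviceaccount.com" \
    --role="roles/datafusion.runner"
```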

Problem 4 · Configuration · Advanced · 2:00 time limit
Configuring a Data Fusion pipeline to read from a Cloud Storage bucket

Which configuration correctly sets up a source plugin in Data Fusion to read CSV files from a Google Cloud Storage bucket named my-data-bucket?

A. Set the source plugin type to 'BigQuery' and the table to 'my-data-bucket'.
B. Set the source plugin type to 'GCS File' and the path to 'gs://my-data-bucket/*.csv'.
C. Set the source plugin type to 'Database' and the connection string to 'gs://my-data-bucket'.
D. Set the source plugin type to 'Kafka' and the topic to 'my-data-bucket'.
💡 Hint

Consider which plugin reads files from Cloud Storage.
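For reference, an exported Data Fusion pipeline stores each stage's plugin configuration as JSON. A rough sketch of what a Cloud Storage file source stage might look like (property names are assumptions and should be checked against the plugin's documentation):

```json
{
  "name": "GCSFile",
  "type": "batchsource",
  "properties": {
    "referenceName": "csv-input",
    "path": "gs://my-data-bucket/*.csv",
    "format": "csv"
  }
}
```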

Problem 5 · Best Practice · Expert · 3:00 time limit
Optimizing Data Fusion pipelines for large-scale ETL workloads

Which practice best improves performance and scalability of a Data Fusion ETL pipeline processing terabytes of data?

A. Use batch pipelines with partitioned sources and sinks, and enable pipeline parallelism.
B. Use single-threaded pipelines to avoid concurrency issues.
C. Store all intermediate data in local temporary files on the Data Fusion instance.
D. Disable pipeline retries to reduce execution time.
💡 Hint

Think about how to handle large data volumes efficiently.
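The partition-then-parallelize idea behind the answer choices can be sketched generically in plain Python. This is not Data Fusion code (Data Fusion delegates this work to its execution engine), just an illustration of why splitting data into independent partitions lets them be processed concurrently:

```python
# Generic sketch of partitioned, parallel batch processing -- not Data Fusion code.
from concurrent.futures import ThreadPoolExecutor

def transform_partition(rows):
    # Stand-in for a per-partition transform (e.g. clean and reshape records).
    return [r * 2 for r in rows]

def run_pipeline(rows, n_partitions=4):
    # Split the input into independent partitions...
    partitions = [rows[i::n_partitions] for i in range(n_partitions)]
    # ...process them in parallel, then merge the partial results.
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        results = pool.map(transform_partition, partitions)
    return [x for part in results for x in part]

print(sorted(run_pipeline(list(range(8)))))  # [0, 2, 4, 6, 8, 10, 12, 14]
```

Because each partition is independent, adding workers scales throughput; with a single shared stream (option B) or local temp files (option C), that parallelism is lost.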