In Google Cloud Data Fusion, when an ETL pipeline reads from a source and the source schema changes (e.g., a new column is added), what is the default behavior of the pipeline if no schema update is applied?
Think about how strict the schema enforcement is by default in Data Fusion pipelines.
By default, a Data Fusion pipeline uses the schema captured at design time. If the source schema changes, newly added columns are silently dropped from the pipeline's output until the schema is updated manually or through schema registry integration.
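To illustrate, Data Fusion stages carry an Avro-style record schema fixed at design time. A hypothetical sketch (the field names here are invented for the example) of such a design-time schema:

```json
{
  "type": "record",
  "name": "etlSchemaBody",
  "fields": [
    { "name": "customer_id", "type": "long" },
    { "name": "email", "type": ["string", "null"] }
  ]
}
```

If the source table later gains, say, a loyalty_tier column, records still flow through the pipeline, but that column never appears downstream until this schema is re-imported or edited in the stage configuration.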
You need to build an ETL pipeline in Google Cloud Data Fusion that requires custom plugins and advanced transformations. Which Data Fusion edition should you choose?
Consider which edition supports extensibility and advanced features.
The Enterprise edition of Data Fusion supports custom plugins and advanced transformations, making it suitable for complex ETL scenarios.
You have an ETL pipeline in Data Fusion that processes sensitive customer data. Which approach best protects this data during pipeline execution?
Think about encryption and access control best practices.
Encrypting data at rest with customer-managed keys in Cloud KMS, and running the pipeline under a least-privilege service account, protects sensitive data both during processing and in storage.
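A minimal sketch of that setup with the gcloud CLI, assuming a project named my-project, a dedicated service account etl-runner, and a key ring etl-keys (all hypothetical names; the roles granted should match only what the pipeline actually touches):

```shell
# All resource names below are illustrative; substitute your own.
# Create a dedicated, minimally privileged service account for the pipeline.
gcloud iam service-accounts create etl-runner \
    --project=my-project \
    --display-name="Data Fusion ETL runner"

# Grant only the roles the pipeline needs (e.g. read-only access to input buckets).
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:etl-runner@my-project.iam.gserviceaccount.com" \
    --role="roles/storage.objectViewer"

# Create a customer-managed key for encrypting pipeline data at rest.
gcloud kms keyrings create etl-keys \
    --project=my-project --location=us-central1
gcloud kms keys create etl-data-key \
    --project=my-project --location=us-central1 \
    --keyring=etl-keys --purpose=encryption
```

The key can then be referenced as the CMEK for the pipeline's sinks, and the service account assigned to the pipeline's compute profile.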
Which configuration correctly sets up a source plugin in Data Fusion to read CSV files from a Google Cloud Storage bucket named my-data-bucket?
Consider which plugin reads files from Cloud Storage.
The 'GCS File' source plugin reads files from Cloud Storage buckets addressed with 'gs://' URIs; setting its format property to CSV parses each file into records.
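A stage configured this way could appear in the exported pipeline JSON roughly as follows. This is a hedged sketch: the stage name ReadCSV and referenceName csv_input are invented, and exact property names can vary by plugin version.

```json
{
  "name": "ReadCSV",
  "plugin": {
    "name": "GCSFile",
    "type": "batchsource",
    "properties": {
      "referenceName": "csv_input",
      "path": "gs://my-data-bucket/",
      "format": "csv",
      "skipHeader": "true"
    }
  }
}
```

The path points at the bucket from the question, and skipHeader drops the header row so only data rows become records.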
Which practice best improves performance and scalability of a Data Fusion ETL pipeline processing terabytes of data?
Think about how to handle large data volumes efficiently.
Partitioning data at the sources and sinks and enabling parallelism lets a Data Fusion pipeline process large data sets efficiently and scale horizontally across executors.
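Parallelism can be tuned in the pipeline's engine configuration. A hedged sketch of the relevant fragment of a pipeline config, assuming the Spark engine; the resources and driverResources blocks follow the CDAP pipeline config shape, while the raw Spark keys in properties (spark.executor.instances, spark.sql.shuffle.partitions) are standard Spark settings whose exact pass-through mechanism varies by Data Fusion/CDAP version:

```json
{
  "config": {
    "engine": "spark",
    "resources": { "virtualCores": 2, "memoryMB": 4096 },
    "driverResources": { "virtualCores": 1, "memoryMB": 2048 },
    "properties": {
      "spark.executor.instances": "8",
      "spark.sql.shuffle.partitions": "200"
    }
  }
}
```

More executors with adequate per-executor memory let partitioned reads and writes proceed in parallel rather than funneling terabytes through a single worker.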