GCP · Cloud · ~20 min

Data pipeline patterns in GCP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🎖️
Data Pipeline Mastery
Get all challenges correct to earn this badge!
Test your skills under time pressure!
Architecture · Intermediate · Time limit: 2:00
Identify the correct data pipeline pattern for batch processing

You want to process large amounts of data collected over a day and run analytics once daily. Which data pipeline pattern fits best?

A. Streaming pipeline that processes data in real-time as it arrives.
B. Lambda architecture combining batch and streaming for fault tolerance.
C. Batch pipeline that processes data in scheduled chunks, like daily batches.
D. Micro-batch pipeline that processes data every few seconds.
Attempts: 2 left
💡 Hint

Think about when the data is processed: all at once or continuously.
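
The batch idea can be sketched in a few lines of plain Python (no GCP services involved; the dates and event names are made up for illustration). A daily batch pipeline groups everything collected during a day and processes each group in one scheduled run:

```python
from collections import defaultdict
from datetime import datetime

def batch_by_day(events):
    """Group raw (timestamp, payload) events into per-day batches - the unit
    of work a daily batch pipeline processes in one scheduled run."""
    batches = defaultdict(list)
    for ts, payload in events:
        batches[ts.date()].append(payload)
    return dict(batches)

events = [
    (datetime(2024, 5, 1, 9, 30), "pageview"),
    (datetime(2024, 5, 1, 23, 59), "click"),
    (datetime(2024, 5, 2, 0, 1), "pageview"),
]
daily = batch_by_day(events)
# Two batches: everything from May 1 is processed together, May 2 separately,
# regardless of when each event actually arrived.
```

Contrast this with a streaming pipeline, which would handle each event the moment it arrives instead of waiting for the day's batch.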

Service Behavior · Intermediate · Time limit: 2:00
What happens when a streaming pipeline in GCP Pub/Sub loses a message?

In a streaming pipeline using Google Cloud Pub/Sub, if a message is not acknowledged by the subscriber, what is the expected behavior?

A. The message is deleted immediately and lost permanently.
B. The message is retried and redelivered until acknowledged or expires.
C. The message is sent to a dead-letter queue instantly.
D. The message is duplicated and sent to all subscribers.
Attempts: 2 left
💡 Hint

Consider how Pub/Sub ensures message delivery reliability.
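
A toy simulation of Pub/Sub's at-least-once delivery makes the redelivery loop concrete. This is plain Python, not the Pub/Sub client library; `max_attempts` stands in for the real service's retention/expiry window, and the flaky subscriber is invented for the demo:

```python
def deliver(message, subscriber, max_attempts=5):
    """Redeliver a message until the subscriber acks it, mimicking
    at-least-once delivery. Returns the number of delivery attempts."""
    for attempt in range(1, max_attempts + 1):
        if subscriber(message):  # subscriber returns True to acknowledge
            return attempt
    # Real Pub/Sub: after retention expires, the message is dropped or
    # routed per the subscription's dead-letter policy.
    raise TimeoutError("message expired without ack")

calls = {"n": 0}
def flaky_subscriber(msg):
    calls["n"] += 1
    return calls["n"] >= 3  # fails twice, acks on the third delivery

attempts = deliver("order-created", flaky_subscriber)
# attempts == 3: the unacked message was redelivered until acknowledged
```

The key behavior to notice: an unacknowledged message is never silently dropped mid-stream; it keeps coming back until it is acked or it expires.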

Security · Advanced · Time limit: 2:30
Secure data transfer in a multi-stage pipeline

You have a multi-stage data pipeline in GCP involving Cloud Storage, Dataflow, and BigQuery. Which practice best secures data in transit between these services?

A. Enable VPC Service Controls and use private IPs for communication.
B. Use FTP protocol for faster data transfer.
C. Encrypt data only at rest, not during transfer.
D. Use public internet with IP whitelisting for data transfer.
Attempts: 2 left
💡 Hint

Think about how to keep data inside Google's private network.
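
The perimeter idea behind VPC Service Controls can be illustrated with a small policy check (pure Python; the route fields and rules here are an invented model of the principle, not the actual GCP API): a transfer is only acceptable when both endpoints sit inside the trusted perimeter and the traffic stays on encrypted private networking.

```python
def transfer_allowed(route):
    """Allow a pipeline hop only if both endpoints are inside the trusted
    perimeter AND the link is encrypted private networking - never the
    public internet."""
    inside = route["source_in_perimeter"] and route["dest_in_perimeter"]
    private = route["network"] == "private" and route["encrypted_in_transit"]
    return inside and private

good = {"source_in_perimeter": True, "dest_in_perimeter": True,
        "network": "private", "encrypted_in_transit": True}
bad = dict(good, network="public")  # public internet: rejected even if encrypted

ok, blocked = transfer_allowed(good), transfer_allowed(bad)  # True, False
```

This mirrors why option A beats D in the question above: IP whitelisting still routes data over the public internet, while a service perimeter with private connectivity keeps Cloud Storage → Dataflow → BigQuery traffic inside Google's network.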

Conceptual · Advanced · Time limit: 2:30
Choosing the right pipeline pattern for low-latency analytics

You need to analyze data with minimal delay after it is generated. Which data pipeline pattern is most suitable?

A. Streaming processing with event-driven triggers.
B. Batch processing with daily scheduled jobs.
C. Micro-batch processing with hourly intervals.
D. Manual data export and import for analysis.
Attempts: 2 left
💡 Hint

Consider how quickly data should be available for analysis.
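
A minimal event-driven sketch shows why streaming gives the lowest latency (plain Python; the event names and the uppercase "analytics" step are placeholders): each event is processed the moment it arrives, so its result is available immediately instead of after the next scheduled batch.

```python
class StreamingProcessor:
    """Process each event the instant it arrives (event-driven trigger),
    so results are available with minimal delay."""
    def __init__(self):
        self.results = []

    def on_event(self, event):
        # Stand-in for real streaming analytics on a single event.
        self.results.append(event.upper())

p = StreamingProcessor()
for e in ["signup", "purchase"]:
    p.on_event(e)  # result exists as soon as this call returns
# p.results == ["SIGNUP", "PURCHASE"]
```

With a daily batch (option B) the same two events would sit unprocessed until the next scheduled run, which is exactly the delay the question asks you to avoid.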

Best Practice · Expert · Time limit: 3:00
Optimizing cost and performance in a hybrid data pipeline

You have a hybrid data pipeline combining batch and streaming in GCP. What is the best practice to optimize cost without sacrificing performance?

A. Run streaming jobs continuously on large instances regardless of load.
B. Use only batch processing to avoid streaming costs.
C. Schedule batch jobs during peak hours and disable autoscaling.
D. Use autoscaling for streaming jobs and schedule batch jobs during off-peak hours.
Attempts: 2 left
💡 Hint

Think about adjusting resources based on workload and timing.
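
The two halves of the best-practice answer can be sketched as simple rules (pure Python; the 500 msgs/sec-per-worker throughput, the worker bounds, and the 01:00-05:00 off-peak window are all assumed numbers for illustration, not GCP defaults): size the streaming pool to the current load, and run batch work when interactive traffic is low.

```python
def streaming_workers(msgs_per_sec, per_worker=500, min_w=1, max_w=20):
    """Autoscaling rule of thumb: provision just enough streaming workers
    for the current load, bounded so cost cannot run away during spikes."""
    need = -(-msgs_per_sec // per_worker)  # ceiling division
    return max(min_w, min(max_w, need))

def is_off_peak(hour):
    """Schedule batch jobs in a low-traffic window (assumed 01:00-05:00)."""
    return 1 <= hour < 5

light = streaming_workers(120)    # quiet period -> scale down to 1 worker
burst = streaming_workers(4000)   # traffic spike -> scale out to 8 workers
```

This is why option D wins: fixed large instances (A) pay for idle capacity, while autoscaling plus off-peak batch scheduling matches spend to actual workload without sacrificing performance.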