Pipeline scheduling and triggers
📖 Scenario: You work as a machine learning engineer. You want to automate your ML pipeline to run regularly and also trigger it when new data arrives. This saves time and keeps your models updated without manual work.
🎯 Goal: Build a simple Python script that simulates an ML pipeline scheduled to run every day at a fixed time and triggered immediately when new data is detected. You will create a schedule, a trigger condition, and then run the pipeline accordingly.
📋 What You'll Learn
Create a variable called pipeline_name with the value 'daily_ml_pipeline'
Create a variable called schedule_time with the value '02:00' representing 2 AM daily
Create a variable called new_data_arrived set to True or False to simulate data arrival
Write an if statement that triggers the pipeline if new_data_arrived is True
Write an else statement that prints the scheduled run time of the pipeline
💡 Why This Matters
🌍 Real World
Automating ML pipelines saves time and ensures models stay updated by running at set times or when new data arrives.
💼 Career
Understanding pipeline scheduling and triggers is essential for ML engineers and DevOps professionals working with automated workflows.
1
Set up pipeline name
Create a variable called pipeline_name and set it to the string 'daily_ml_pipeline'.
Hint
Use the assignment operator = to set the variable.
2
Add schedule time variable
Create a variable called schedule_time and set it to the string '02:00' representing 2 AM daily.
Hint
Use a string to represent the time in 24-hour format.
3
Simulate new data arrival
Create a variable called new_data_arrived and set it to True to simulate that new data has arrived.
Hint
Use the boolean value True to indicate data arrival.
4
Trigger pipeline based on data arrival or schedule
Write an if statement that checks if new_data_arrived is True. If yes, print "Triggering {pipeline_name} due to new data." using an f-string. Otherwise, print "Scheduled run of {pipeline_name} at {schedule_time}." using an f-string.
Hint
Use if new_data_arrived: and f-strings for printing.
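Putting the four steps together, a minimal sketch of the finished script could look like the following. The intermediate message variable is an addition for readability and is not required by the exercise; printing directly inside each branch works equally well.

```python
# Step 1: name of the pipeline
pipeline_name = 'daily_ml_pipeline'

# Step 2: daily run time in 24-hour format (2 AM)
schedule_time = '02:00'

# Step 3: simulate whether new data has arrived
new_data_arrived = True

# Step 4: trigger on new data, otherwise report the scheduled run
if new_data_arrived:
    message = f"Triggering {pipeline_name} due to new data."
else:
    message = f"Scheduled run of {pipeline_name} at {schedule_time}."

print(message)
```

Setting new_data_arrived to False switches the output to the scheduled-run message.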
Practice
1. What is the main purpose of pipeline scheduling in MLOps?
easy
A. To store pipeline logs for debugging
B. To manually start pipelines whenever needed
C. To run tasks automatically at specific times without manual intervention
D. To create new machine learning models from scratch
Solution
Step 1: Understand pipeline scheduling
Pipeline scheduling is designed to run tasks automatically at set times, like daily or hourly, without needing a person to start them.
Step 2: Compare options
Only option C describes running tasks automatically at specific times. The other options describe manual actions or unrelated tasks.
Final Answer:
To run tasks automatically at specific times without manual intervention -> Option C
Quick Check:
Pipeline scheduling = automatic timed runs [OK]
Hint: Scheduling means automatic runs at set times [OK]
Common Mistakes:
Confusing scheduling with manual triggering
Thinking scheduling stores logs
Assuming scheduling creates models directly
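To make "automatic runs at set times" concrete, here is a rough sketch of how a scheduler might work out the next daily run from a time string such as '02:00'. The function name next_daily_run is hypothetical, and real schedulers also handle time zones and missed runs, which this sketch ignores.

```python
from datetime import datetime, timedelta


def next_daily_run(now, schedule_time):
    """Return the next datetime at which a daily 'HH:MM' schedule fires."""
    hour, minute = map(int, schedule_time.split(":"))
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot is already past (or is exactly now), use tomorrow's slot.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


# Before 2 AM the run is later the same day; after 2 AM it is the next day.
print(next_daily_run(datetime(2024, 1, 1, 1, 0), "02:00"))
print(next_daily_run(datetime(2024, 1, 1, 3, 0), "02:00"))
```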
2. Which of the following is a correct cron expression to schedule a pipeline to run every day at 3 AM?
easy
A. 3 0 * * *
B. 0 3 * * *
C. * 3 * * *
D. 0 0 3 * * *
Solution
Step 1: Understand cron format
Cron syntax is: minute hour day month weekday. To run at 3 AM daily, minute=0, hour=3, day/month/weekday=any (*).
Step 2: Match expression
"0 3 * * *" means minute 0, hour 3, every day. The other options swap the minute and hour, use '*' for the minute (which runs every minute of the 3 AM hour), or add an extra field.
Final Answer:
0 3 * * * -> Option B
Quick Check:
Minute=0, Hour=3 daily = 0 3 * * * [OK]
Hint: Cron: minute hour day month weekday; 3 AM is '0 3 * * *' [OK]
Common Mistakes:
Swapping hour and minute fields
Adding extra fields in cron
Using '*' in wrong positions
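The field order is easier to check when each position is labelled. The sketch below assumes the standard five-field cron format described in the solution; the helper name label_cron_fields is hypothetical, and some schedulers accept a sixth (seconds) field that this sketch deliberately rejects.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def label_cron_fields(expression):
    """Map each field of a five-field cron expression to its name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# Label each answer option from the question to see which one means 3 AM daily.
for option in ("3 0 * * *", "0 3 * * *", "* 3 * * *", "0 0 3 * * *"):
    try:
        print(option, "->", label_cron_fields(option))
    except ValueError as err:
        print(option, "-> rejected:", err)
```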
3. Given a pipeline trigger configuration that listens for 'data_arrival' events and filters on file_type 'csv':
What happens when a new JSON file arrives in the data folder?
medium
A. The pipeline does not run because the file type is not CSV
B. The pipeline runs because any new file triggers it
C. The pipeline runs only if the JSON file is large
D. The pipeline runs but ignores the file type
Solution
Step 1: Analyze trigger filter
The trigger listens for 'data_arrival' events but only runs if the file type is 'csv'.
Step 2: Apply to JSON file
A JSON file does not match the 'csv' filter, so the pipeline will not run.
Final Answer:
The pipeline does not run because the file type is not CSV -> Option A
Quick Check:
Filter file_type=csv blocks JSON files [OK]
Hint: Triggers with filters run only on matching events [OK]
Common Mistakes:
Ignoring filter conditions
Assuming any file triggers pipeline
Confusing event type with file type
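The filter logic can be simulated in a few lines. The dictionary layout and the should_run helper below are hypothetical, chosen only to mirror the 'data_arrival' event and file_type filter described in the solution; real orchestration tools each have their own configuration format.

```python
# Hypothetical trigger configuration: one event type plus a filter.
trigger = {"event": "data_arrival", "filter": {"file_type": "csv"}}


def should_run(trigger, event):
    """Return True only if the event type and every filter key match."""
    if event.get("type") != trigger["event"]:
        return False
    return all(event.get(key) == value for key, value in trigger["filter"].items())


json_event = {"type": "data_arrival", "file_type": "json"}
csv_event = {"type": "data_arrival", "file_type": "csv"}

print(should_run(trigger, json_event))  # the JSON file does not match the filter
print(should_run(trigger, csv_event))   # the CSV file matches
```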
4. You wrote this cron expression to schedule a pipeline every hour:
60 * * * *
Why does the pipeline never run?
medium
A. Because the hour field is missing
B. Because cron requires seconds field
C. Because the asterisks are misplaced
D. Because 60 is not a valid minute value in cron syntax
Solution
Step 1: Check minute field validity
Cron minute values must be 0-59. '60' is out of range, so the expression is invalid and the pipeline never runs.
Step 2: Confirm other fields
The hour and other fields are correct as '*', meaning every hour/day. The error is only the minute value.
Final Answer:
Because 60 is not a valid minute value in cron syntax -> Option D
Quick Check:
Minute must be 0-59; 60 is invalid [OK]
Hint: Minutes in cron go 0-59, never 60 [OK]
Common Mistakes:
Using 60 as minute value
Thinking cron needs seconds field
Misplacing asterisks
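A range check like the one in the solution can be sketched as follows. The helper is_valid_minute is hypothetical and only handles a plain number or '*'; it does not cover ranges, lists, or step values that cron also allows. An expression that runs at the top of every hour would be '0 * * * *'.

```python
def is_valid_minute(field):
    """Check a cron minute field that is either '*' or a plain number 0-59."""
    if field == "*":
        return True
    # Require ASCII digits so that int() cannot fail on unusual characters.
    if not (field.isascii() and field.isdigit()):
        return False
    return 0 <= int(field) <= 59


for expression in ("60 * * * *", "0 * * * *"):
    minute = expression.split()[0]
    print(expression, "-> minute valid:", is_valid_minute(minute))
```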
5. You want a pipeline to run automatically when new data arrives and also every Sunday at midnight. Which setup correctly combines scheduling and event triggers?
hard
A. Use a cron schedule '0 0 * * 0' and an event trigger for 'data_arrival' together
B. Use only a cron schedule '0 0 * * 0' because event triggers conflict with schedules
C. Use only an event trigger for 'data_arrival' and manually run on Sundays
D. Use a cron schedule '0 0 * * 7' and ignore event triggers
Solution
Step 1: Understand combined triggers
Pipelines can have both cron schedules and event triggers to run on different conditions.
Step 2: Verify cron expression for Sunday midnight
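'0 0 * * 0' runs at midnight on Sundays. Many cron implementations also accept 7 for Sunday, but 0 is the standard value; option D is wrong because it ignores event triggers, not because of the 7.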
Step 3: Confirm event trigger for data arrival
Adding an event trigger for 'data_arrival' ensures pipeline runs when new data arrives.
Final Answer:
Use a cron schedule '0 0 * * 0' and an event trigger for 'data_arrival' together -> Option A
Quick Check:
Combine cron and event triggers for full automation [OK]
Hint: Combine cron and event triggers for multiple run conditions [OK]
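As a rough illustration of combining both conditions, the sketch below checks a Sunday-midnight schedule and a data-arrival flag independently and reports every reason a run should start. The function run_reasons is hypothetical; in practice the scheduler and the event system of your orchestration tool would perform these checks.

```python
from datetime import datetime


def run_reasons(now, new_data_arrived):
    """List the reasons the pipeline should run at this moment."""
    reasons = []
    # Python's weekday() uses Monday=0 ... Sunday=6, unlike cron's Sunday=0.
    if now.weekday() == 6 and now.hour == 0 and now.minute == 0:
        reasons.append("schedule")
    if new_data_arrived:
        reasons.append("data_arrival")
    return reasons


# 7 January 2024 is a Sunday; 8 January 2024 is a Monday.
print(run_reasons(datetime(2024, 1, 7, 0, 0), False))
print(run_reasons(datetime(2024, 1, 8, 9, 30), True))
print(run_reasons(datetime(2024, 1, 8, 9, 30), False))
```

Because the two checks do not depend on each other, either one alone is enough to start a run, which matches the idea that schedules and event triggers can coexist.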