
Pipeline scheduling and triggers in MLOps - Step-by-Step Execution

Process Flow - Pipeline scheduling and triggers
Define Pipeline
-> Set Schedule or Trigger
-> Wait for Trigger Event
   - Time-based Schedule? Yes -> Run Pipeline -> Complete
   - Event-based Trigger? Yes -> Run Pipeline -> Complete
This flow shows how a pipeline is defined, then scheduled or triggered by time or events, leading to pipeline execution and completion.
Execution Sample
pipeline:
  name: example-pipeline
  schedule:
    cron: '0 6 * * *'
  triggers:
    - event: data_arrival
Defines a pipeline named 'example-pipeline' that runs daily at 6 AM or when new data arrives.
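To make the schedule line concrete, here is a minimal Python sketch of how a scheduler might decide whether a cron expression such as '0 6 * * *' matches a given moment. The names `field_matches` and `cron_matches` are hypothetical and do not come from any particular MLOps tool. The sketch assumes five fields that are each either `*` or a single integer, and it does not handle ranges, lists, step values, or cron's special rule for combining day-of-month with day-of-week.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """A field matches if it is the wildcard '*' or equals the value."""
    return field == "*" or int(field) == value


def cron_matches(expr: str, when: datetime) -> bool:
    """Check a simplified five-field cron expression against a datetime."""
    minute, hour, day, month, weekday = expr.split()
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )


print(cron_matches("0 6 * * *", datetime(2024, 1, 1, 5, 59)))  # False
print(cron_matches("0 6 * * *", datetime(2024, 1, 1, 6, 0)))   # True
```

A real scheduler would typically call a check like this once per minute and start a run whenever it returns True.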
Process Table
| Step | Event | Condition Checked | Action Taken | Pipeline State |
|------|-------|-------------------|--------------|----------------|
| 1 | Pipeline Defined | N/A | Pipeline configuration saved | Idle |
| 2 | Time Tick at 5:59 AM | Is current time 6:00 AM? No | Wait | Idle |
| 3 | Time Tick at 6:00 AM | Is current time 6:00 AM? Yes | Start pipeline run | Running |
| 4 | Pipeline Running | Pipeline tasks executing | Process data | Running |
| 5 | Pipeline Completed | All tasks done | Mark pipeline complete | Completed |
| 6 | Event: data_arrival at 7:00 AM | Is event trigger enabled? Yes | Start pipeline run | Running |
| 7 | Pipeline Running | Pipeline tasks executing | Process new data | Running |
| 8 | Pipeline Completed | All tasks done | Mark pipeline complete | Completed |
| 9 | Time Tick at 6:00 AM next day | Is current time 6:00 AM? Yes | Start pipeline run | Running |
💡 The pipeline runs when triggered by a schedule or an event, completes, and then waits for the next trigger.
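The state changes in the table can be replayed with a small Python sketch. The `Pipeline` class and its method names are hypothetical and used only for illustration. The sketch also assumes that a trigger arriving while a run is in progress is ignored, which the walkthrough above does not specify.

```python
from datetime import datetime


class Pipeline:
    """Toy pipeline with one daily schedule and one event trigger."""

    def __init__(self, run_hour: int, run_minute: int, event_name: str):
        self.run_hour = run_hour
        self.run_minute = run_minute
        self.event_name = event_name
        self.state = "Idle"

    def _start(self) -> None:
        # Assumption: a trigger that arrives mid-run is ignored.
        if self.state != "Running":
            self.state = "Running"

    def on_tick(self, when: datetime) -> None:
        """Start a run if the clock matches the scheduled hour and minute."""
        if (when.hour, when.minute) == (self.run_hour, self.run_minute):
            self._start()

    def on_event(self, name: str) -> None:
        """Start a run if the event is the one this pipeline listens for."""
        if name == self.event_name:
            self._start()

    def finish(self) -> None:
        """Mark the current run as complete."""
        if self.state == "Running":
            self.state = "Completed"


pipeline = Pipeline(6, 0, "data_arrival")
trace = [pipeline.state]                      # step 1: defined

pipeline.on_tick(datetime(2024, 1, 1, 5, 59))
trace.append(pipeline.state)                  # step 2: no match, still idle
pipeline.on_tick(datetime(2024, 1, 1, 6, 0))
trace.append(pipeline.state)                  # step 3: scheduled run starts
pipeline.finish()
trace.append(pipeline.state)                  # step 5: run completes
pipeline.on_event("data_arrival")
trace.append(pipeline.state)                  # step 6: event run starts
pipeline.finish()
trace.append(pipeline.state)                  # step 8: run completes
pipeline.on_tick(datetime(2024, 1, 2, 6, 0))
trace.append(pipeline.state)                  # step 9: next day's run starts

print(trace)
```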
Status Tracker
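| Variable | Start | After Step 3 | After Step 5 | After Step 6 | After Step 8 | After Step 9 |
|----------|-------|--------------|--------------|--------------|--------------|--------------|
| pipeline_state | Idle | Running | Completed | Running | Completed | Running |
| current_time | N/A | 6:00 AM | N/A | 7:00 AM | N/A | 6:00 AM next day |
| event_triggered | No | No | No | Yes | No | No |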
Key Moments - 3 Insights
Why doesn't the pipeline run at 5:59 AM even though time is close to 6:00 AM?
Because the schedule fires only when the clock matches the cron expression, here minute 0 of hour 6 (see steps 2 and 3 of the Process Table). At 5:59 AM the check fails, so the pipeline keeps waiting until 6:00 AM.
How can the pipeline run twice in one day?
The pipeline can run once by schedule (6:00 AM) and again by event trigger (data arrival at 7:00 AM), as shown in steps 3 and 6.
What happens if the event trigger is disabled?
If event triggers are disabled, the pipeline runs only on its schedule: the data_arrival event in step 6 would not start a run, so steps 7 and 8 would not occur.
Visual Quiz - 3 Questions
Test your understanding
Look at step 3 of the Process Table. What is the pipeline state?
A. Idle
B. Running
C. Completed
D. Waiting
💡 Hint
Check the 'Pipeline State' column at step 3 in the Process Table.
At which step does the pipeline run because of an event trigger?
A. Step 3
B. Step 9
C. Step 6
D. Step 2
💡 Hint
Look for the row where 'Event: data_arrival' occurs in the Process Table.
If the schedule cron is changed to '0 7 * * *', when will the scheduled run from step 3 start instead?
A. 7:00 AM
B. 6:00 AM
C. 5:59 AM
D. Immediately
💡 Hint
Refer to the 'current_time' row in the Status Tracker and the schedule condition at step 3 in the Process Table.
Concept Snapshot
Pipeline Scheduling and Triggers:
- Define pipeline with name and triggers
- Schedule uses cron syntax for time-based runs
- Triggers can be event-based (e.g., data arrival)
- Pipeline waits for schedule or event
- On trigger, pipeline runs and completes
- Can have multiple triggers for flexibility
Full Transcript
Pipeline scheduling and triggers let you run your pipeline automatically. First, you define the pipeline and set when it should run. This can be at a specific time using a schedule or when something happens, like new data arriving. The system waits for these triggers. When the time matches the schedule or the event happens, the pipeline starts running. After it finishes, it waits again for the next trigger. This way, your pipeline runs without you needing to start it manually.

Practice

1. What is the main purpose of pipeline scheduling in MLOps?
easy
A. To store pipeline logs for debugging
B. To manually start pipelines whenever needed
C. To run tasks automatically at specific times without manual intervention
D. To create new machine learning models from scratch

Solution

  1. Step 1: Understand pipeline scheduling

    Pipeline scheduling is designed to run tasks automatically at set times, like daily or hourly, without needing a person to start them.
  2. Step 2: Compare options

    Only option C, running tasks automatically at specific times without manual intervention, describes scheduling. The other options describe log storage, manual starts, or model creation.
  3. Final Answer:

    To run tasks automatically at specific times without manual intervention -> Option C
  4. Quick Check:

    Pipeline scheduling = automatic timed runs [OK]
Hint: Scheduling means automatic runs at set times [OK]
Common Mistakes:
  • Confusing scheduling with manual triggering
  • Thinking scheduling stores logs
  • Assuming scheduling creates models directly
2. Which of the following is a correct cron expression to schedule a pipeline to run every day at 3 AM?
easy
A. 3 0 * * *
B. 0 3 * * *
C. * 3 * * *
D. 0 0 3 * * *

Solution

  1. Step 1: Understand cron format

    Cron syntax is: minute hour day month weekday. To run at 3 AM daily, minute=0, hour=3, day/month/weekday=any (*).
  2. Step 2: Match expression

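    "0 3 * * *" means minute 0, hour 3, every day. Option A swaps the minute and hour fields (it runs at 12:03 AM), option C runs every minute from 3:00 to 3:59 AM, and option D has an extra sixth field.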
  3. Final Answer:

    0 3 * * * -> Option B
  4. Quick Check:

    Minute=0, Hour=3 daily = 0 3 * * * [OK]
Hint: Cron: minute hour day month weekday; 3 AM is '0 3 * * *' [OK]
Common Mistakes:
  • Swapping hour and minute fields
  • Adding extra fields in cron
  • Using '*' in wrong positions
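To see why the other options fail, the following Python sketch lists the times of day at which a cron expression would fire. The function name `fire_times` is hypothetical. The sketch reads only the minute and hour fields, assumes each is `*` or a single integer, and assumes the remaining three fields are `*`.

```python
def fire_times(expr: str) -> list:
    """Return the HH:MM times in one day at which the expression fires."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    minute, hour = fields[0], fields[1]

    def matches(field: str, value: int) -> bool:
        return field == "*" or int(field) == value

    times = []
    for h in range(24):
        for m in range(60):
            if matches(hour, h) and matches(minute, m):
                times.append(f"{h:02d}:{m:02d}")
    return times


print(fire_times("0 3 * * *"))        # option B: once, at 03:00
print(fire_times("3 0 * * *"))        # option A: once, at 00:03
print(len(fire_times("* 3 * * *")))   # option C: every minute of the 3 AM hour

try:
    fire_times("0 0 3 * * *")         # option D: six fields
except ValueError as err:
    print(err)
```

Some schedulers do accept a six-field form with a leading seconds field, so option D is wrong for standard five-field cron rather than for every tool.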
3. Given this pipeline trigger configuration snippet:
{
  "trigger": {
    "event": "data_arrival",
    "filter": {
      "file_type": "csv"
    }
  }
}

What happens when a new JSON file arrives in the data folder?
medium
A. The pipeline does not run because the file type is not CSV
B. The pipeline runs because any new file triggers it
C. The pipeline runs only if the JSON file is large
D. The pipeline runs but ignores the file type

Solution

  1. Step 1: Analyze trigger filter

    The trigger listens for 'data_arrival' events but only runs if the file type is 'csv'.
  2. Step 2: Apply to JSON file

    A JSON file does not match the 'csv' filter, so the pipeline will not run.
  3. Final Answer:

    The pipeline does not run because the file type is not CSV -> Option A
  4. Quick Check:

    Filter file_type=csv blocks JSON files [OK]
Hint: Triggers with filters run only on matching events [OK]
Common Mistakes:
  • Ignoring filter conditions
  • Assuming any file triggers pipeline
  • Confusing event type with file type
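The filter logic in this question can be sketched in Python as follows. The function `should_trigger` and the shape of the incoming event dictionary are assumptions made for illustration; a real platform defines its own event payload.

```python
import json

# The trigger configuration from the question, loaded as a dictionary.
config = json.loads("""
{
  "trigger": {
    "event": "data_arrival",
    "filter": {
      "file_type": "csv"
    }
  }
}
""")


def should_trigger(config: dict, event: dict) -> bool:
    """Run only if the event name and every filter key match."""
    trigger = config["trigger"]
    if event.get("event") != trigger["event"]:
        return False
    for key, expected in trigger.get("filter", {}).items():
        if event.get(key) != expected:
            return False
    return True


# Hypothetical event payloads for a JSON file and a CSV file.
print(should_trigger(config, {"event": "data_arrival", "file_type": "json"}))  # False
print(should_trigger(config, {"event": "data_arrival", "file_type": "csv"}))   # True
```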
4. You wrote this cron expression to schedule a pipeline every hour:
60 * * * *

Why does the pipeline never run?
medium
A. Because the hour field is missing
B. Because cron requires seconds field
C. Because the asterisks are misplaced
D. Because 60 is not a valid minute value in cron syntax

Solution

  1. Step 1: Check minute field validity

    Cron minute values must be 0-59. '60' is out of range, so a scheduler will typically reject the expression or never find a matching time.
  2. Step 2: Confirm other fields

    The hour and other fields are correct as '*', meaning every hour/day. The error is only the minute value.
  3. Final Answer:

    Because 60 is not a valid minute value in cron syntax -> Option D
  4. Quick Check:

    Minute must be 0-59; 60 is invalid [OK]
Hint: Minutes in cron go 0-59, never 60 [OK]
Common Mistakes:
  • Using 60 as minute value
  • Thinking cron needs seconds field
  • Misplacing asterisks
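A range check like the one described here can be sketched in Python. The function `validate_cron` and its error messages are hypothetical, and the sketch accepts only `*` or a single integer per field. It allows 0-7 for day of week because many cron implementations treat both 0 and 7 as Sunday.

```python
# Field names with their allowed ranges, in cron field order.
RANGES = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 7),
]


def validate_cron(expr: str) -> list:
    """Return a list of problems found; an empty list means the check passed."""
    fields = expr.split()
    if len(fields) != 5:
        return [f"expected 5 fields, got {len(fields)}"]
    errors = []
    for field, (name, low, high) in zip(fields, RANGES):
        if field == "*":
            continue
        if not field.isdigit():
            errors.append(f"{name}: {field!r} is not supported in this sketch")
        elif not low <= int(field) <= high:
            errors.append(f"{name}: {field} is outside {low}-{high}")
    return errors


print(validate_cron("60 * * * *"))  # ['minute: 60 is outside 0-59']
print(validate_cron("0 * * * *"))   # []
```

The second expression, '0 * * * *', is the usual way to run at the start of every hour.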
5. You want a pipeline to run automatically when new data arrives and also every Sunday at midnight. Which setup correctly combines scheduling and event triggers?
hard
A. Use a cron schedule '0 0 * * 0' and an event trigger for 'data_arrival' together
B. Use only a cron schedule '0 0 * * 0' because event triggers conflict with schedules
C. Use only an event trigger for 'data_arrival' and manually run on Sundays
D. Use a cron schedule '0 0 * * 7' and ignore event triggers

Solution

  1. Step 1: Understand combined triggers

    Pipelines can have both cron schedules and event triggers to run on different conditions.
  2. Step 2: Verify cron expression for Sunday midnight

    '0 0 * * 0' runs at midnight on Sundays (many cron implementations accept both 0 and 7 for Sunday, but 0 is the more portable choice).
  3. Step 3: Confirm event trigger for data arrival

    Adding an event trigger for 'data_arrival' ensures pipeline runs when new data arrives.
  4. Final Answer:

    Use a cron schedule '0 0 * * 0' and an event trigger for 'data_arrival' together -> Option A
  5. Quick Check:

    Combine cron and event triggers for full automation [OK]
Hint: Combine cron and event triggers for multiple run conditions [OK]
Common Mistakes:
  • Thinking schedules and triggers cannot coexist
  • Using wrong cron day for Sunday
  • Ignoring event triggers for data arrival
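The combined setup can be sketched in Python as two independent checks whose results are merged. The functions `weekly_schedule_due` and `run_reasons` are hypothetical names, and the list of pending event names stands in for whatever event source a real platform provides.

```python
from datetime import datetime


def weekly_schedule_due(cron_weekday: str, when: datetime) -> bool:
    """True at 00:00 on the given cron weekday (0 or 7 means Sunday)."""
    target = int(cron_weekday) % 7
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    today = (when.weekday() + 1) % 7
    return when.hour == 0 and when.minute == 0 and today == target


def run_reasons(when: datetime, pending_events: list) -> list:
    """List every reason the pipeline should start at this moment."""
    reasons = []
    if weekly_schedule_due("0", when):
        reasons.append("schedule")
    if "data_arrival" in pending_events:
        reasons.append("event:data_arrival")
    return reasons


sunday_midnight = datetime(2024, 1, 7, 0, 0)        # 7 January 2024 is a Sunday
wednesday_afternoon = datetime(2024, 1, 3, 14, 30)

print(run_reasons(sunday_midnight, []))                    # ['schedule']
print(run_reasons(wednesday_afternoon, ["data_arrival"]))  # ['event:data_arrival']
print(run_reasons(wednesday_afternoon, []))                # []
```

Keeping the two checks separate means either one can start a run without the other, which is the behavior option A describes.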