ML lifecycle stages in MLOps - Time & Space Complexity
We want to understand how the time needed to complete an ML lifecycle changes as the amount of data or tasks grows.
How does the work increase when we add more data or models?
Analyze the time complexity of the following ML lifecycle code snippet.
for dataset in datasets:
    preprocess(dataset)
    model = train_model(dataset)
    evaluate(model, dataset)
    deploy(model)
This code runs the main ML lifecycle steps for each dataset in a list.
Look at what repeats as input grows.
- Primary operation: Running the full ML lifecycle (preprocess, train, evaluate, deploy) for each dataset.
- How many times: Once for each dataset in the list.
As the number of datasets increases, the total work grows proportionally.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 full ML lifecycles |
| 100 | 100 full ML lifecycles |
| 1000 | 1000 full ML lifecycles |
Pattern observation: Doubling the datasets doubles the total work.
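Time Complexity: O(n), where n is the number of datasets
This means the total time grows linearly with the number of datasets processed, assuming each dataset takes roughly the same time to run through the lifecycle.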
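The linear pattern in the table can be checked with a small counting sketch. The four step functions here are hypothetical stand-ins that only record that they were called; they are not real preprocessing, training, evaluation, or deployment code.

```python
def run_lifecycles(datasets):
    """Run the lifecycle loop with stub steps and return how often each step ran."""
    calls = {"preprocess": 0, "train": 0, "evaluate": 0, "deploy": 0}

    # Hypothetical stubs: each one only counts its own invocation.
    def preprocess(dataset):
        calls["preprocess"] += 1

    def train_model(dataset):
        calls["train"] += 1
        return f"model-for-{dataset}"

    def evaluate(model, dataset):
        calls["evaluate"] += 1

    def deploy(model):
        calls["deploy"] += 1

    # Same loop shape as the snippet being analyzed.
    for dataset in datasets:
        preprocess(dataset)
        model = train_model(dataset)
        evaluate(model, dataset)
        deploy(model)
    return calls


for n in (10, 100, 1000):
    counts = run_lifecycles(range(n))
    # Prints n, the number of training runs, and the total number of step calls.
    print(n, counts["train"], sum(counts.values()))
```

Each step runs exactly once per dataset, so the total number of step calls is 4n. The constant factor 4 is dropped in Big-O notation, which leaves O(n).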
[X] Wrong: "Adding more datasets won't affect total time much because each step is fast."
[OK] Correct: Each dataset requires a full set of steps, so more datasets mean more total work and time.
Understanding how work grows with input size helps you explain and plan ML workflows clearly in real projects.
"What if we parallelize the training for all datasets? How would the time complexity change?"
Practice
Solution
Step 1: Understand the role of data in the ML lifecycle
Data must be collected and cleaned before training a model.
Step 2: Identify the stage focused on data tasks
Data Preparation is the stage where data is gathered and made ready for training.
Final Answer:
Data Preparation -> Option B
Quick Check:
Data Preparation = Collecting and cleaning data [OK]
Common mistakes:
- Confusing deployment with data tasks
- Thinking monitoring includes data cleaning
- Mixing training with data preparation
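As a rough illustration of what "cleaning" can mean in the Data Preparation stage, the sketch below drops records with missing values and exact duplicates. The `prepare` function and the sample records are hypothetical; real pipelines usually involve more steps and dedicated libraries.

```python
def prepare(records):
    """Drop records with missing values and exact duplicates, keeping input order."""
    seen = set()
    cleaned = []
    for record in records:
        # Skip any record that has a missing (None) value.
        if any(value is None for value in record.values()):
            continue
        # Build a hashable fingerprint so duplicates can be detected.
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


raw = [
    {"age": 34, "label": 1},
    {"age": None, "label": 0},   # missing value, dropped
    {"age": 34, "label": 1},     # duplicate of the first record, dropped
    {"age": 51, "label": 0},
]
print(prepare(raw))
```

Only the cleaned records would then be passed on to Model Training, which is why data preparation comes first.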
Solution
Step 1: Recall the logical flow of ML lifecycle stages
First, data is prepared, then the model is trained, followed by deployment and monitoring.
Step 2: Match the correct sequence from the options
Option A lists the stages in the correct order: Data Preparation -> Model Training -> Model Deployment -> Model Monitoring.
Final Answer:
Data Preparation -> Model Training -> Model Deployment -> Model Monitoring -> Option A
Quick Check:
Correct stage order = Data Preparation -> Model Training -> Model Deployment -> Model Monitoring [OK]
Common mistakes:
- Placing deployment before training
- Starting with monitoring instead of data
- Incorrect stage order
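The ordering rule can also be expressed as a small check. This is a minimal sketch: `CANONICAL_ORDER` and `respects_order` are hypothetical names introduced for illustration, and the check only compares relative positions of the four stages named above.

```python
CANONICAL_ORDER = [
    "Data Preparation",
    "Model Training",
    "Model Deployment",
    "Model Monitoring",
]


def respects_order(sequence):
    """Return True if the stages in `sequence` appear in lifecycle order."""
    # Map each stage to its position in the canonical order
    # (raises ValueError for a stage name that is not in the list).
    positions = [CANONICAL_ORDER.index(stage) for stage in sequence]
    return positions == sorted(positions)


print(respects_order(CANONICAL_ORDER))
print(respects_order(["Model Deployment", "Model Training"]))
```

The first call reports a valid order, while the second places deployment before training and is rejected.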
stages = ['Data Preparation', 'Model Training', 'Model Deployment', 'Model Monitoring']
for i, stage in enumerate(stages):
    print(f"Stage {i+1}: {stage}")
What will be the output of this code?
Solution
Step 1: Understand enumerate behavior in the loop
enumerate(stages) yields (index, value) pairs with the index starting at 0, but the print call uses i+1 for the stage number.
Step 2: Check the order of stages printed
The loop prints the stages in list order, numbered 1 to 4.
Final Answer:
Stage 1: Data Preparation
Stage 2: Model Training
Stage 3: Model Deployment
Stage 4: Model Monitoring
-> Option C
Quick Check:
Index + 1 matches stage number [OK]
Common mistakes:
- Forgetting that the index starts at 0
- Mixing stage order in output
- Printing wrong stage names
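An equivalent way to get 1-based numbering is the `start` parameter of the built-in `enumerate`, which avoids the manual `i+1`. The variable names `number` and `lines` are chosen here for illustration.

```python
stages = ['Data Preparation', 'Model Training', 'Model Deployment', 'Model Monitoring']

# start=1 makes the counter begin at 1 instead of the default 0.
lines = [f"Stage {number}: {stage}" for number, stage in enumerate(stages, start=1)]
print("\n".join(lines))
```

This prints the same four lines as the loop in the question.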
stages = ['Data Preparation', 'Model Training', 'Model Deployment', 'Model Monitoring']
stages.remove('Model Training')
print(stages)
What is the output after running this code?
Solution
Step 1: Understand what stages.remove('Model Training') does
This removes the first occurrence of 'Model Training' from the list, modifying the list in place.
Step 2: Check the list after removal
The list now excludes 'Model Training', leaving the other three stages.
Final Answer:
['Data Preparation', 'Model Deployment', 'Model Monitoring'] -> Option A
Quick Check:
Remove deletes specified item from list [OK]
Common mistakes:
- Expecting an error from remove()
- Thinking remove() deletes by index
- Not updating the list after removal
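The mistakes listed above map to three behaviors of Python lists that are easy to verify. The list below is a modified example (with a deliberate duplicate) rather than the one from the question.

```python
stages = ['Data Preparation', 'Model Training', 'Model Training', 'Model Monitoring']

# remove() deletes by value and only removes the first match.
stages.remove('Model Training')
print(stages)  # ['Data Preparation', 'Model Training', 'Model Monitoring']

# remove() raises ValueError only when the value is not in the list.
raised = False
try:
    stages.remove('Model Deployment')
except ValueError:
    raised = True
print(raised)  # True

# To delete by index instead, use del (or pop).
del stages[0]
print(stages)  # ['Model Training', 'Model Monitoring']
```

In the quiz code, 'Model Training' is present exactly once, so `remove()` succeeds without an error and leaves three stages.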
Solution
Step 1: Identify stages involved in retraining after data changes
Retraining requires fresh data preparation and then training the model again.
Step 2: Select stages that automate retraining
Data Preparation and Model Training together form the pipeline for retraining.
Final Answer:
Data Preparation and Model Training -> Option D
Quick Check:
Retrain = Prepare data + Train model [OK]
Common mistakes:
- Confusing deployment with retraining
- Thinking monitoring triggers retraining alone
- Ignoring data preparation before training
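The "prepare data, then train" idea can be sketched as a two-step function. Everything here is a toy assumption: `prepare_data`, `train_model`, and `retrain` are hypothetical names, and the "model" is just the mean of the first column so the effect of new data is easy to see.

```python
def prepare_data(raw_rows):
    """Hypothetical cleaning step: keep only rows with no missing values."""
    return [row for row in raw_rows if None not in row]


def train_model(rows):
    """Hypothetical 'training': the model is the mean of the first column."""
    return sum(row[0] for row in rows) / len(rows)


def retrain(raw_rows):
    """Retraining = prepare the data, then train the model on it."""
    return train_model(prepare_data(raw_rows))


old_model = retrain([(1.0, 0), (3.0, 1)])
# New data arrives, including one row with a missing value.
new_model = retrain([(1.0, 0), (3.0, 1), (None, 1), (8.0, 0)])
print(old_model, new_model)  # 2.0 4.0
```

Skipping `prepare_data` in this sketch would make training fail on the row containing `None`, which mirrors the mistake of ignoring data preparation before training.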
