Why scaling requires different strategies in MLOps - Performance Analysis
When systems grow bigger, the way they handle work changes. We want to see how the cost of running tasks grows as the system scales.
What happens to the work needed when we add more data or users?
Analyze the time complexity of the following code snippet.
```python
# Each batch moves through three sequential steps; results are combined once at the end.
for batch in data_batches:
    preprocess(batch)    # clean and transform the raw data
    train_model(batch)   # update the model on this batch
    evaluate(batch)      # score the model on this batch
aggregate_results()      # combine per-batch results after the loop
```
This code processes data in batches: it prepares, trains, and evaluates each batch, then combines results at the end.
Identify the loops, recursion, or array traversals that repeat work.
- Primary operation: Loop over each data batch to preprocess, train, and evaluate.
- How many times: Once per batch, so the number of batches controls repetition.
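One way to check the "once per batch" claim is to instrument the loop with an operation counter. This is a minimal sketch: the counter stands in for the three per-batch steps and the final aggregation, not real preprocessing or training.

```python
def count_operations(num_batches):
    """Count how many step executions the batch loop performs."""
    ops = 0
    for _ in range(num_batches):
        ops += 3  # preprocess + train + evaluate, once per batch
    ops += 1      # the single aggregate_results call at the end
    return ops

# Each batch adds a fixed amount of work, so the count grows with n.
print(count_operations(10))    # 31
print(count_operations(100))   # 301
print(count_operations(1000))  # 3001
```

The constant-per-batch cost (3 here) disappears in big-O notation; what matters is that the total is proportional to the number of batches.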
Explain the growth pattern intuitively.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 batches | 10 times the work per batch |
| 100 batches | 100 times the work per batch |
| 1000 batches | 1000 times the work per batch |
Pattern observation: The total work grows directly with the number of batches. More batches mean more total work.
Time Complexity: O(n)
This means total work grows linearly with the number of batches: doubling the batches roughly doubles the work.
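That linear relationship can be sanity-checked with a small simulation. This is a sketch: the inner loop is a stand-in for a fixed per-batch cost, not actual preprocessing or model training.

```python
def run_pipeline(num_batches, work_per_batch=50_000):
    """Simulate the batch loop with a fixed cost per batch."""
    total = 0
    for _ in range(num_batches):
        for _ in range(work_per_batch):  # stand-in for preprocess/train/evaluate
            total += 1
    return total

# Doubling the number of batches exactly doubles the simulated work.
print(run_pipeline(100))                           # 5000000
print(run_pipeline(200) == 2 * run_pipeline(100))  # True
```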
[X] Wrong: "Scaling up just means running the same code more times without changing anything."
[OK] Correct: As data grows, running the same steps repeatedly can become too slow or costly. Different strategies like parallel processing or smarter batching are needed to handle growth efficiently.
Understanding how work grows with input size helps you explain why systems need new approaches as they scale. This skill shows you can think about real-world challenges calmly and clearly.
"What if we processed batches in parallel instead of one by one? How would the time complexity change?"