Logging artifacts and models in MLOps - Time & Space Complexity
When logging artifacts and models in MLOps, it's important to understand how the time to save these items grows as their size or number increases. In other words, we want to know how the total work changes as we log more, or larger, files.
Analyze the time complexity of the following code snippet.
```python
# The snippet under analysis. Assumes mlflow is available; note that in
# practice the model-logging call is usually flavor-specific
# (e.g. mlflow.sklearn.log_model).
import mlflow

for artifact in artifacts_list:
    mlflow.log_artifact(artifact)      # runs once per artifact
mlflow.log_model(model, "model_path")  # runs exactly once
```
This code logs each artifact file one by one, then logs a model once.
Identify the loops, recursion, or array traversals: the operations that repeat.
- Primary operation: Loop over the list of artifacts to log each one.
- How many times: Once for each artifact in the list.
- The model logging happens only once, so it does not repeat.
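To make this breakdown concrete, here is a minimal sketch that counts the calls with a stub tracker. The `FakeTracker` class and `log_run` helper are hypothetical stand-ins for the MLflow calls, used only to count operations:

```python
# Stub tracker that counts logging calls (hypothetical names, standing in
# for mlflow.log_artifact and model logging).
class FakeTracker:
    def __init__(self):
        self.artifact_calls = 0
        self.model_calls = 0

    def log_artifact(self, artifact):
        self.artifact_calls += 1  # one call per artifact -> repeats n times

    def log_model(self, model, path):
        self.model_calls += 1     # outside the loop -> runs once

def log_run(tracker, artifacts_list, model):
    for artifact in artifacts_list:
        tracker.log_artifact(artifact)
    tracker.log_model(model, "model_path")

tracker = FakeTracker()
log_run(tracker, [f"file_{i}.txt" for i in range(100)], model=object())
print(tracker.artifact_calls, tracker.model_calls)  # 100 1
```

Running this with 100 artifacts records 100 artifact calls and exactly 1 model call, matching the breakdown above.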
As the number of artifacts grows, the total time to log them grows roughly in direct proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 artifact logs + 1 model log |
| 100 | 100 artifact logs + 1 model log |
| 1000 | 1000 artifact logs + 1 model log |
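The table above follows from a simple total-operations formula, total(n) = n + 1, which a short sketch can reproduce:

```python
# Total logging operations: n artifact logs plus exactly one model log.
def total_operations(n_artifacts: int) -> int:
    return n_artifacts + 1

for n in (10, 100, 1000):
    print(f"{n} artifacts -> {total_operations(n)} operations")
```

The constant `+ 1` for the model log is dropped in big-O notation, leaving O(n).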
Pattern observation: The time grows linearly with the number of artifacts logged.
Time Complexity: O(n)
This means the time to log artifacts grows directly with how many artifacts you have.
[X] Wrong: "Logging multiple artifacts happens all at once, so time stays the same no matter how many artifacts there are."
[OK] Correct: Each artifact is logged one by one, so more artifacts mean more work and more time.
Understanding how logging scales helps you design efficient MLOps pipelines and shows you can think about system performance clearly.
"What if we logged artifacts in parallel instead of one by one? How would the time complexity change?"