Which of the following best explains why pipelines are used to automate ML workflows?
Think about how automation helps in daily tasks to avoid repeating the same work manually.
Pipelines automate repeated steps like data cleaning, training, and testing. This reduces mistakes and saves time.
What is the expected output when running the command mlflow run . --entry-point train in an ML pipeline project?
mlflow run . --entry-point train
Consider what the mlflow run command does with an entry point.
The command runs the train entry point of the MLflow project in the current directory (.), as defined in its MLproject file. It executes the training step and records the run in MLflow tracking.
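The same launch can also be done from Python, which may help when a pipeline script drives several entry points. This is a minimal sketch: it assumes MLflow is installed and that the project's MLproject file defines an entry point named train. The wrapper name run_train_entry_point is hypothetical.

```python
def run_train_entry_point(project_uri="."):
    """Launch the `train` entry point of the MLflow project at project_uri.

    Hypothetical helper; assumes the MLproject file declares `train`.
    """
    # Imported lazily so this sketch can be loaded without MLflow installed.
    import mlflow.projects

    # mlflow.projects.run is the Python counterpart of the `mlflow run` CLI.
    # By default it blocks until the run finishes.
    submitted = mlflow.projects.run(uri=project_uri, entry_point="train")
    return submitted.run_id
```

Calling run_train_entry_point() from the project directory should behave like the command above and return the ID of the run that was recorded.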
Arrange the typical ML pipeline steps in the correct order.
Think about what must happen before training and what comes after evaluation.
First, the data is prepared; then the model is trained, evaluated, and finally deployed.
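The ordering can be sketched as a chain of plain functions, each consuming the output of the step before it. This is only an illustration, not a real framework: the function names, the toy data, and the threshold "model" are all hypothetical.

```python
# Hypothetical toy pipeline: prepare -> train -> evaluate -> deploy.

def prepare_data(raw):
    """Drop rows with missing values so training only sees clean rows."""
    return [(x, y) for x, y in raw if x is not None and y is not None]


def train_model(rows):
    """'Train' a toy threshold: the midpoint between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def evaluate_model(threshold, rows):
    """Return the accuracy of the rule 'predict 1 when x > threshold'."""
    correct = sum(1 for x, y in rows if (x > threshold) == (y == 1))
    return correct / len(rows)


def deploy_model(threshold, accuracy, min_accuracy=0.9):
    """Stand-in for deployment: only 'ship' a model that passed evaluation."""
    return {"deployed": accuracy >= min_accuracy, "threshold": threshold}


def run_pipeline(raw):
    """Run the four steps in order, passing each result to the next step."""
    rows = prepare_data(raw)
    threshold = train_model(rows)
    accuracy = evaluate_model(threshold, rows)
    return deploy_model(threshold, accuracy)


RAW = [(1.0, 0), (2.0, 0), (None, 1), (8.0, 1), (9.0, 1)]
result = run_pipeline(RAW)
```

Each step depends on the one before it, which is why the order cannot be shuffled: there is nothing to evaluate before training, and nothing worth deploying before evaluation.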
An ML pipeline fails at the training step with an error saying 'FileNotFoundError: data.csv not found'. What is the most likely cause?
Consider what step creates the data file needed for training.
If the data file is missing, it usually means the preprocessing step did not run or failed to save the file.
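One way to make this failure easier to diagnose is to have the training step check for its input and name the upstream step in the error. The sketch below is a hypothetical illustration using a temporary directory; the preprocess and train functions and the CSV contents are made up for the example.

```python
from pathlib import Path
import tempfile


def preprocess(out_path):
    """Hypothetical preprocessing step: writes the file training depends on."""
    out_path.write_text("feature,label\n1.0,0\n8.0,1\n")


def train(data_path):
    """Fail early with a message that points at the upstream step."""
    if not data_path.exists():
        raise FileNotFoundError(
            f"{data_path} not found: did the preprocessing step run and save its output?"
        )
    # Count data rows, skipping the header line.
    return len(data_path.read_text().splitlines()) - 1


with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "data.csv"

    # Training before preprocessing reproduces the failure.
    try:
        train(data_path)
        failed_without_preprocess = False
    except FileNotFoundError:
        failed_without_preprocess = True

    # Running preprocessing first resolves it.
    preprocess(data_path)
    rows_seen = train(data_path)
```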
Which practice best ensures reproducibility and traceability in ML pipelines?
Think about how software projects keep track of changes and versions.
Keeping code under version control and tracking the data and model versions used in each run lets you reproduce and audit pipeline runs.
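As a rough illustration of tracking, a run can be summarized in a small manifest that records the code version, a hash of the input data, and the parameters. This is a sketch using only the standard library; build_manifest, the "abc123" commit ID, and the parameter names are hypothetical, and real projects would more likely rely on a dedicated tracking or data-versioning tool.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest that changes whenever the data changes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(code_version, data_bytes, params):
    """Collect what is needed to identify a pipeline run later."""
    return {
        "code_version": code_version,  # e.g. a commit ID (placeholder here)
        "data_sha256": fingerprint(data_bytes),
        "params": params,
    }


same_a = build_manifest("abc123", b"feature,label\n1.0,0\n", {"lr": 0.1})
same_b = build_manifest("abc123", b"feature,label\n1.0,0\n", {"lr": 0.1})
changed = build_manifest("abc123", b"feature,label\n1.0,1\n", {"lr": 0.1})

# Serialize with sorted keys so identical runs give identical text.
manifest_json = json.dumps(same_a, sort_keys=True)
```

Two runs with the same code, data, and parameters produce the same manifest, while a one-character change in the data produces a different hash, which is what makes a run auditable.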