Comparing experiment runs in MLOps - Commands & Configuration

Introduction
When you run machine learning experiments, you often try different settings to see which works best. Comparing experiment runs helps you find the best model by looking at their results side by side.
Typical use cases:
  • Seeing which model version has the highest accuracy after several training runs.
  • Comparing hyperparameter settings to choose the best combination.
  • Tracking improvements over time by comparing new runs with older ones.
  • Sharing results with your team to decide which model to deploy.
  • Checking whether a change in data preprocessing improved the model.
Commands
This command runs an MLflow project in the current directory with a parameter alpha set to 0.1. It starts an experiment run to track results.
Terminal
mlflow run . -P alpha=0.1
Expected Output
2024/06/01 12:00:00 INFO mlflow.projects: === Run (ID 1a2b3c4d) started ===
2024/06/01 12:00:10 INFO mlflow.projects: === Run (ID 1a2b3c4d) succeeded ===
-P - Set a parameter value for the run
Runs the same MLflow project but with alpha set to 0.5 to compare results with the previous run.
Terminal
mlflow run . -P alpha=0.5
Expected Output
2024/06/01 12:01:00 INFO mlflow.projects: === Run (ID 5e6f7g8h) started ===
2024/06/01 12:01:10 INFO mlflow.projects: === Run (ID 5e6f7g8h) succeeded ===
-P - Set a parameter value for the run
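The two commands above differ only in the value passed to -P, so a parameter sweep can be generated rather than typed by hand. The following is a minimal sketch, not part of MLflow: build_run_command is a hypothetical helper, and it only prints the commands. It assumes the project lives in the current directory and that alpha is the parameter being varied.

```python
import shlex


def build_run_command(project_uri, params):
    """Return the argument list for `mlflow run <project_uri> -P key=value ...`."""
    command = ["mlflow", "run", project_uri]
    for name, value in params.items():
        # Each parameter gets its own -P flag.
        command += ["-P", f"{name}={value}"]
    return command


# Sweep over the two alpha values used in the commands above.
sweep = [build_run_command(".", {"alpha": alpha}) for alpha in (0.1, 0.5)]

for command in sweep:
    # Printed rather than executed; to launch the runs, pass each list to
    # subprocess.run(command, check=True) on a machine with MLflow installed.
    print(shlex.join(command))
```

Building the command as a list, rather than a single string, avoids shell-quoting problems if a parameter value ever contains spaces.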
Starts the MLflow tracking UI as a local web server (by default at http://127.0.0.1:5000). Open that address in your browser to compare experiment runs side by side.
Terminal
mlflow ui
Expected Output
2024/06/01 12:02:00 INFO mlflow.server: Starting MLflow UI at http://127.0.0.1:5000
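Lists all experiments so you can find the experiment ID to compare runs within it. In MLflow 2.x the command is 'mlflow experiments search'; before MLflow 2.0 it was 'mlflow experiments list'. The built-in Default experiment always has ID 0.
Terminal
mlflow experiments search
Expected Output
Experiment ID  Name
0              Default
1              MyExperiment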
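Lists all runs under experiment ID 1 so you can find the runs to compare. The columns shown depend on the MLflow version; recent versions print only each run's date, name, and ID, in which case 'mlflow runs describe --run-id <RUN_ID>' prints that run's parameters and metrics as JSON.
Terminal
mlflow runs list --experiment-id 1
Expected Output
Run ID    Status    Start Time           Metrics
1a2b3c4d  FINISHED  2024-06-01 12:00:00  accuracy=0.85
5e6f7g8h  FINISHED  2024-06-01 12:01:00  accuracy=0.90
--experiment-id - Specify which experiment's runs to list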
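Runs can also be pulled into Python and laid out side by side without the UI. The sketch below assumes MLflow and pandas are installed for fetch_runs, which relies on mlflow.search_runs returning a DataFrame with columns such as params.alpha and metrics.accuracy. Both fetch_runs and side_by_side are hypothetical helper names, and the sample rows are stand-ins built from the values shown above, not real tracking data.

```python
def fetch_runs(experiment_id):
    """Fetch every run of one experiment as a list of dicts (needs mlflow + pandas)."""
    import mlflow  # imported lazily so side_by_side works without MLflow installed

    frame = mlflow.search_runs(experiment_ids=[experiment_id])
    return frame.to_dict("records")


def side_by_side(rows, columns):
    """Format the chosen columns of each run as an aligned plain-text table."""
    table = [list(columns)] + [[str(row.get(c, "")) for c in columns] for row in rows]
    widths = [max(len(line[i]) for line in table) for i in range(len(columns))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(line, widths)).rstrip()
        for line in table
    )


# Stand-in rows shaped like the records fetch_runs would return.
rows = [
    {"run_id": "1a2b3c4d", "params.alpha": "0.1", "metrics.accuracy": 0.85},
    {"run_id": "5e6f7g8h", "params.alpha": "0.5", "metrics.accuracy": 0.90},
]
print(side_by_side(rows, ["run_id", "params.alpha", "metrics.accuracy"]))
```

With a tracking store available, replacing the stand-in rows with fetch_runs("1") would print the same table for the real runs of experiment 1.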
Key Concept

If you remember nothing else from this pattern, remember: comparing experiment runs side by side helps you pick the best model by looking at their results clearly.

Code Example
MLOps
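import random

import mlflow


def train_model(alpha):
    # Simulated training: accuracy is a made-up function of alpha plus noise,
    # standing in for a real model's evaluation score.
    accuracy = 0.8 + alpha * 0.2 + random.uniform(-0.05, 0.05)
    # Log the parameter and the metric to the active run so runs can be compared.
    mlflow.log_param("alpha", alpha)
    mlflow.log_metric("accuracy", accuracy)
    print(f"Run finished with accuracy: {accuracy:.3f}")


if __name__ == "__main__":
    # One MLflow run per alpha value; each shows up as a separate row in the UI.
    for alpha_value in [0.1, 0.5]:
        with mlflow.start_run():
            train_model(alpha_value)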
Common Mistakes
Not setting different parameters for each run and then trying to compare them.
Without different parameters, runs look the same and you can't tell which setting is better.
Set a different parameter value for each run with the -P flag so that the comparison is meaningful.
Not starting the MLflow UI and trying to compare runs only by reading logs.
Logs are hard to read and compare; the UI shows metrics and parameters side by side clearly.
Run 'mlflow ui' to open the tracking interface and compare runs visually.
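The first mistake above can be caught before any comparison is made by checking which parameters actually differ between runs. This is a minimal sketch in plain Python: varying_params is a hypothetical helper, the run dictionaries are illustrative, and the epochs parameter is an assumption added only to show a value that stays constant.

```python
def varying_params(runs):
    """Return the sorted parameter names whose values differ across the runs."""
    names = set()
    for run in runs:
        names.update(run["params"])
    return sorted(
        name
        for name in names
        if len({run["params"].get(name) for run in runs}) > 1
    )


# Two runs that differ only in alpha; epochs is a made-up constant parameter.
runs = [
    {"run_id": "1a2b3c4d", "params": {"alpha": "0.1", "epochs": "10"}},
    {"run_id": "5e6f7g8h", "params": {"alpha": "0.5", "epochs": "10"}},
]

changed = varying_params(runs)
if changed:
    print("Parameters that differ:", ", ".join(changed))
else:
    print("All runs share the same parameters; the comparison says little.")
```

An empty result means the runs were launched with identical settings, so any difference in their metrics comes from randomness rather than from a parameter choice.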
Summary
Run experiments with different parameters using 'mlflow run . -P parameter=value' to track variations.
Start the MLflow UI with 'mlflow ui' to visually compare experiment runs side by side.
List experiments and runs using 'mlflow experiments search' (named 'mlflow experiments list' before MLflow 2.0) and 'mlflow runs list --experiment-id <ID>' to review results.

Practice

1.

What is the main purpose of comparing experiment runs in MLOps?

easy
A. To identify which model performs best by reviewing their results side by side
B. To delete old experiment runs to save space
C. To create new experiment runs automatically
D. To change the code of the model during training

Solution

  1. Step 1: Understand experiment runs

    Experiment runs record model training results and metrics.
  2. Step 2: Purpose of comparing runs

    Comparing runs helps see which model version performs better by looking at their results side by side.
  3. Final Answer:

    To identify which model performs best by reviewing their results side by side -> Option A
  4. Quick Check:

    Comparing runs = find best model [OK]
Hint: Comparing runs means checking results to pick the best model [OK]
Common Mistakes:
  • Thinking comparing runs deletes data
  • Confusing comparing with creating runs
  • Believing comparing changes model code
2.

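In the command syntax used by this exercise, which option correctly compares two experiment runs with IDs run1 and run2 under experiment exp123? (Note: 'compare-runs' is not listed among the subcommands of the standard MLflow CLI; in practice, runs are compared in the MLflow UI or through the Python API.)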

easy
A. mlflow compare runs --experiment exp123 --ids run1,run2
B. mlflow experiments compare-runs --experiment-id exp123 --run-ids run1 run2
C. mlflow compare-runs --experiment exp123 --run-ids run1 run2
D. mlflow experiments compare --experiment-id exp123 --runs run1 run2

Solution

  1. Step 1: Check the command format used in this exercise

    The exercise's syntax is 'mlflow experiments compare-runs' with the '--experiment-id' and '--run-ids' flags.
  2. Step 2: Match options to syntax

    mlflow experiments compare-runs --experiment-id exp123 --run-ids run1 run2 matches the correct syntax exactly with proper flags and parameters.
  3. Final Answer:

    mlflow experiments compare-runs --experiment-id exp123 --run-ids run1 run2 -> Option B
  4. Quick Check:

    Correct command syntax = mlflow experiments compare-runs --experiment-id exp123 --run-ids run1 run2 [OK]
Hint: Use 'mlflow experiments compare-runs' with correct flags [OK]
Common Mistakes:
  • Using wrong flags like --runs instead of --run-ids
  • Mixing command order or names
  • Separating run IDs with commas instead of spaces
3.

Given two runs with metrics:
run1: accuracy=0.85, loss=0.35
run2: accuracy=0.88, loss=0.40
Which run is better if accuracy is the main metric?

medium
A. run1 because it has higher accuracy
B. run1 because it has lower loss
C. run2 because it has higher accuracy
D. run2 because it has lower loss

Solution

  1. Step 1: Identify main metric

    The question states accuracy is the main metric to compare runs.
  2. Step 2: Compare accuracy values

    run1 accuracy = 0.85, run2 accuracy = 0.88. Higher accuracy is better.
  3. Final Answer:

    run2 because it has higher accuracy -> Option C
  4. Quick Check:

    Main metric accuracy = higher is better [OK]
Hint: Focus on main metric value to pick best run [OK]
Common Mistakes:
  • Choosing run with lower loss when accuracy is main metric
  • Confusing higher and lower metric values
  • Ignoring stated main metric
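The selection rule in this question can be written down directly: pick the run with the best value of the main metric, remembering that some metrics (accuracy) are better when higher and others (loss) when lower. A minimal sketch in plain Python, using the two runs from the question; best_run and its higher_is_better argument are hypothetical names, not an MLflow API.

```python
def best_run(runs, metric, higher_is_better=True):
    """Return the run with the best value of `metric`, skipping runs that lack it."""
    candidates = [run for run in runs if metric in run["metrics"]]
    if not candidates:
        raise ValueError(f"no run logged the metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(candidates, key=lambda run: run["metrics"][metric])


# The two runs from the question above.
runs = [
    {"run_id": "run1", "metrics": {"accuracy": 0.85, "loss": 0.35}},
    {"run_id": "run2", "metrics": {"accuracy": 0.88, "loss": 0.40}},
]

print(best_run(runs, "accuracy")["run_id"])                      # run2
print(best_run(runs, "loss", higher_is_better=False)["run_id"])  # run1
```

The two calls disagree, which is why the main metric has to be fixed before the comparison: run2 wins on accuracy while run1 wins on loss.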
4.

What is wrong with this command to compare runs?
mlflow experiments compare-runs --experiment-id exp123 --run-ids run1,run2

medium
A. Command should be 'mlflow compare-runs' without 'experiments'
B. Experiment ID flag should be --experiment, not --experiment-id
C. Run IDs must be specified with --runs, not --run-ids
D. Run IDs should be separated by spaces, not commas

Solution

  1. Step 1: Check run IDs format

    The exercise's compare-runs syntax expects run IDs separated by spaces, not commas.
  2. Step 2: Verify other flags

    --experiment-id and --run-ids are correct flags; command includes 'experiments' correctly.
  3. Final Answer:

    Run IDs should be separated by spaces, not commas -> Option D
  4. Quick Check:

    Run IDs separated by spaces [OK]
Hint: Separate run IDs with spaces, not commas [OK]
Common Mistakes:
  • Using commas between run IDs
  • Changing correct flags incorrectly
  • Removing 'experiments' from command
5.

You want to compare three runs but only focus on the f1_score metric. Which command correctly filters to show only this metric?

hard
A. mlflow experiments compare-runs --experiment-id exp456 --run-ids runA runB runC --metric-keys f1_score
B. mlflow experiments compare-runs --experiment-id exp456 --run-ids runA runB runC --metrics f1_score
C. mlflow experiments compare-runs --experiment-id exp456 --run-ids runA runB runC --filter f1_score
D. mlflow experiments compare-runs --experiment-id exp456 --run-ids runA runB runC --metric-filter f1_score

Solution

  1. Step 1: Identify correct flag for metric filtering

    In this exercise's compare-runs syntax, the flag that filters metrics is '--metric-keys'.
  2. Step 2: Match command with options

    mlflow experiments compare-runs --experiment-id exp456 --run-ids runA runB runC --metric-keys f1_score uses '--metric-keys' correctly with the metric name 'f1_score'.
  3. Final Answer:

    mlflow experiments compare-runs --experiment-id exp456 --run-ids runA runB runC --metric-keys f1_score -> Option A
  4. Quick Check:

    Use --metric-keys to focus on specific metric [OK]
Hint: Use --metric-keys flag to show only chosen metric [OK]
Common Mistakes:
  • Using wrong flag like --metrics or --filter
  • Misspelling flag names
  • Omitting metric filter when needed