
A/B testing model versions in MLOps - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is A/B testing in the context of model versions?
A/B testing is a method to compare two versions of a model by running them simultaneously on different user groups to see which performs better.
beginner
Why do we use A/B testing for model versions?
To safely evaluate a new model's performance against the current one without affecting all users, ensuring improvements before full deployment.
beginner
What is a key metric in A/B testing model versions?
A key metric is a measurable value like accuracy, click-through rate, or error rate used to compare model performance.
intermediate
How do you split traffic in A/B testing for model versions?
Traffic is split by directing a percentage of users to each model version, often 50/50 or weighted based on risk tolerance.
intermediate
What is a common risk when running A/B tests on model versions?
The new model might perform worse, affecting user experience for the group exposed to it, so monitoring is essential.
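The monitoring mentioned above can be sketched as a simple guardrail check. This is a minimal illustration, not a prescribed method: the `VariantStats` structure, the `should_stop_test` helper, the 10% relative margin, and the 1,000-request minimum are all assumptions made for the example, and the check is a plain threshold comparison rather than a statistical significance test.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Running totals for one model version (hypothetical structure)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Avoid division by zero before any traffic has arrived.
        return self.errors / self.requests if self.requests else 0.0


def should_stop_test(control: VariantStats, candidate: VariantStats,
                     max_relative_increase: float = 0.10,
                     min_requests: int = 1000) -> bool:
    """Return True when the candidate's error rate exceeds the control's
    by more than the allowed relative margin (an assumed threshold)."""
    if candidate.requests < min_requests:
        return False  # too little data to judge
    limit = control.error_rate * (1 + max_relative_increase)
    return candidate.error_rate > limit


control = VariantStats(requests=7000, errors=140)    # 2.0% error rate
candidate = VariantStats(requests=3000, errors=90)   # 3.0% error rate
print(should_stop_test(control, candidate))          # True
```

In practice the threshold and minimum sample size would be chosen to match the risk tolerance of the test.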
What does A/B testing help you decide in model deployment?
A. Which model version performs better in real use
B. How to write code faster
C. How to reduce model size
D. How to train models without data
In A/B testing, how is user traffic usually divided?
A. All users to one model
B. No users see the new model
C. Only new users to the new model
D. Random split between model versions
Which metric is NOT typically used in A/B testing model versions?
A. Model file size
B. Click-through rate
C. Accuracy
D. Error rate
What should you do if the new model performs worse in A/B testing?
A. Immediately replace the old model
B. Stop the test and investigate
C. Ignore results and continue
D. Increase traffic to the new model
What is a benefit of A/B testing model versions?
A. Train models faster
B. Deploy models without any monitoring
C. Test models safely with partial user exposure
D. Avoid collecting user data
Explain how A/B testing helps in deploying new model versions safely.
Think about how you can test something new without affecting everyone.
Describe the steps to set up an A/B test for two model versions.
Consider what you need before, during, and after the test.

      Practice

      1. What is the main purpose of A/B testing in model deployment?
      easy
      A. To train a model faster using multiple GPUs
      B. To compare two model versions by splitting user traffic
      C. To backup model data in the cloud
      D. To monitor server CPU usage during training

      Solution

      1. Step 1: Understand A/B testing concept

        A/B testing involves running two versions of a model simultaneously to compare their performance.
      2. Step 2: Identify the main goal

        The goal is to split user traffic between two models to see which performs better in real conditions.
      3. Final Answer:

        To compare two model versions by splitting user traffic -> Option B
      4. Quick Check:

        A/B testing = compare models by traffic split [OK]
      Hint: A/B testing means splitting users to compare models [OK]
      Common Mistakes:
      • Confusing A/B testing with training speedup
      • Thinking it is about data backup
      • Mixing it with server monitoring
      2. Which of the following is the correct way to define a traffic split for A/B testing in YAML?
      easy
      A. traffic: - model: v1 split: 50 - model: v2 split: 50
      B. traffic: modelVersion: v1 percent: 50 modelVersion: v2 percent: 50
      C. traffic: - version: v1 percent: 50 - version: v2 percent: 50
      D. traffic: - modelVersion: v1 percent: 50 - modelVersion: v2 percent: 50

      Solution

      1. Step 1: Check YAML list syntax for traffic split

        A YAML sequence marks each item with a dash (-). The schema assumed in this question names the keys 'modelVersion' and 'percent'; other serving platforms may use different key names.
      2. Step 2: Validate keys and indentation

        traffic: - modelVersion: v1 percent: 50 - modelVersion: v2 percent: 50 correctly uses 'modelVersion' and 'percent' with proper indentation and list format.
      3. Final Answer:

        traffic: - modelVersion: v1 percent: 50 - modelVersion: v2 percent: 50 -> Option D
      4. Quick Check:

        YAML list with modelVersion and percent = traffic: - modelVersion: v1 percent: 50 - modelVersion: v2 percent: 50 [OK]
      Hint: YAML lists use dash and proper keys for traffic split [OK]
      Common Mistakes:
      • Missing dash for list items
      • Wrong key names like 'model' or 'version'
      • Incorrect indentation breaking YAML
      3. Given this Python snippet for A/B testing traffic assignment:
      import random
      traffic_split = {'v1': 70, 'v2': 30}
      user_id = 12345
      random.seed(user_id)
      roll = random.randint(1, 100)
      if roll <= traffic_split['v1']:
          assigned_version = 'v1'
      else:
          assigned_version = 'v2'
      print(assigned_version)
      What will be the printed output?
      medium
      A. Random output each run
      B. v2
      C. v1
      D. Error due to wrong seed usage

      Solution

      1. Step 1: Understand random seed and randint

        Seeding the generator with user_id makes the output deterministic for that user. randint(1, 100) returns an integer from 1 to 100, inclusive on both ends.
      2. Step 2: Calculate roll value for user_id=12345

        With seed 12345, roll is 54 (verified by running the code). Since 54 <= 70, assigned_version is 'v1'.
      3. Final Answer:

        v1 -> Option C
      4. Quick Check:

        roll=54 <= 70 means assign v1 [OK]
      Hint: Seed fixes random; check roll against split [OK]
      Common Mistakes:
      • Assuming random changes every run despite seed
      • Misreading comparison operator
      • Confusing randint range
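One caveat about the snippet in this question: `random.seed()` reseeds the module-wide generator, which can affect any other code that relies on `random`. A minimal alternative sketch, assuming the same 70/30 split, gives each assignment its own `random.Random` instance. The `assign_version` helper and its `v1_percent` parameter are hypothetical names introduced for the example.

```python
import random


def assign_version(user_id: int, v1_percent: int = 70) -> str:
    """Assign a user to 'v1' or 'v2' using a private, seeded generator."""
    rng = random.Random(user_id)   # local generator; global state is untouched
    roll = rng.randint(1, 100)     # inclusive on both ends
    return 'v1' if roll <= v1_percent else 'v2'


# The same user ID yields the same assignment on every call.
print(assign_version(12345))
```

Because `random.Random(user_id)` is seeded the same way as `random.seed(user_id)`, the assignment should match the original snippet for the same user ID.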
      4. You have this traffic split config for A/B testing:
      traffic:
        - modelVersion: v1
          percent: 60
        - modelVersion: v2
          percent: 50
      What is the main problem with this configuration?
      medium
      A. Percentages add up to more than 100%
      B. Missing modelVersion key for v2
      C. Percentages must be equal for A/B testing
      D. YAML syntax error due to indentation

      Solution

      1. Step 1: Sum the traffic percentages

        60% + 50% = 110%, which exceeds the 100% of traffic available to split.
      2. Step 2: Understand traffic split constraints

        Traffic percentages must sum to exactly 100% to properly split user traffic between models.
      3. Final Answer:

        Percentages add up to more than 100% -> Option A
      4. Quick Check:

        Sum of percents > 100% is invalid [OK]
      Hint: Traffic split percentages must total 100% [OK]
      Common Mistakes:
      • Ignoring total percentage sum
      • Assuming that because percentages may be unequal, they may also sum to more than 100
      • Confusing syntax error with logic error
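The check described in this solution can be sketched as a small validation function. This is an illustration under stated assumptions: the config is represented as an already-parsed list of dictionaries mirroring the YAML above, and `validate_traffic_split` is a hypothetical helper, not part of any serving platform.

```python
def validate_traffic_split(traffic: list[dict]) -> None:
    """Raise ValueError if the split is malformed or does not total 100."""
    total = 0
    for entry in traffic:
        if 'modelVersion' not in entry or 'percent' not in entry:
            raise ValueError(f"entry is missing a required key: {entry}")
        if not 0 <= entry['percent'] <= 100:
            raise ValueError(f"percent out of range: {entry}")
        total += entry['percent']
    if total != 100:
        raise ValueError(f"percentages sum to {total}, expected 100")


# The configuration from the question, as parsed data.
bad_config = [
    {'modelVersion': 'v1', 'percent': 60},
    {'modelVersion': 'v2', 'percent': 50},
]
try:
    validate_traffic_split(bad_config)
except ValueError as err:
    print(err)  # percentages sum to 110, expected 100
```

Running a check like this before deployment turns the logic error into an explicit failure, since the YAML itself parses without complaint.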
      5. You want to run an A/B test comparing model versions v1 and v2. You have 10,000 users and want to assign 70% traffic to v1 and 30% to v2. Which approach ensures consistent user assignment and fair metric tracking?
      hard
      A. Assign users based on hashing their user ID modulo 100 and map to traffic split
      B. Assign users manually by checking their signup date
      C. Assign all users to v1 for the first week, then switch all to v2
      D. Randomly assign users on each request without storing assignment

      Solution

      1. Step 1: Understand consistent user assignment need

        Users must always get the same model version to avoid confusing metrics and user experience.
      2. Step 2: Evaluate assignment methods

        Hashing the user ID modulo 100 maps each user to a stable bucket from 0 to 99; buckets 0-69 go to v1 and buckets 70-99 go to v2, giving the 70/30 split.
      3. Step 3: Reject other options

        Random assignment on each request gives the same user different models over time; switching all users at once compares two time periods instead of two concurrent groups; manual assignment by signup date is impractical and biased.
      4. Final Answer:

        Assign users based on hashing their user ID modulo 100 and map to traffic split -> Option A
      5. Quick Check:

        Hashing the user ID gives a stable A/B assignment [OK]
      Hint: Use hashing on user ID for stable traffic split [OK]
      Common Mistakes:
      • Random assignment causing inconsistent user experience
      • Switching all users breaks test validity
      • Manual assignment is error-prone and biased
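The hashing approach from this solution can be sketched as follows. This is one possible implementation under assumptions: it uses SHA-256 from `hashlib` instead of Python's built-in `hash()`, which is randomized per process for strings, and the `bucket_for` and `assign_version` helpers and the `salt` value are hypothetical names chosen for the example.

```python
import hashlib


def bucket_for(user_id: str, salt: str = "model-ab-test") -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def assign_version(user_id: str, v1_percent: int = 70) -> str:
    """Buckets below v1_percent go to v1; the rest go to v2."""
    return "v1" if bucket_for(user_id) < v1_percent else "v2"


# Simulate the 10,000 users from the question.
counts = {"v1": 0, "v2": 0}
for uid in range(10_000):
    counts[assign_version(str(uid))] += 1
print(counts)  # expected to land close to a 70/30 split
```

No assignment needs to be stored, because the same user ID always hashes to the same bucket. Changing the salt per experiment reshuffles users so that successive tests do not reuse the same groups.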