A/B testing model versions in MLOps - Time & Space Complexity
We want to understand how the time to run A/B testing grows as we increase the number of users or model versions.
How does the system handle more users or more models in terms of time?
Analyze the time complexity of the following code snippet.
# Distribute users to model versions
for user in users:
    model_version = select_model_version(user)
    prediction = model_version.predict(user.data)
    log_result(user.id, model_version.id, prediction)
# Aggregate results
results = aggregate_logs()
This code assigns each user to a model version, gets a prediction, logs it, and then aggregates results.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Loop over all users to get predictions and log results.
- How many times: Once per user, so n times, where n is the number of users.
As the number of users grows, the processing time grows in proportion.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 predictions and logs |
| 100 | 100 predictions and logs |
| 1000 | 1000 predictions and logs |
Pattern observation: Doubling users roughly doubles the work.
Time Complexity: O(n)
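This means the time grows linearly with the number of users tested, assuming select_model_version, predict, and log_result each take constant time per user and aggregate_logs makes a single pass over the n logged results.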
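The linear growth can be checked with a small counting sketch. The `User` and `ModelVersion` classes, the alternating assignment rule, and the in-memory `logs` list are hypothetical stand-ins for the snippet's undefined helpers, each assumed to run in constant time per call.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    data: float


class ModelVersion:
    """Hypothetical stand-in for a deployed model version."""

    def __init__(self, version_id: str) -> None:
        self.id = version_id
        self.predict_calls = 0

    def predict(self, data: float) -> float:
        self.predict_calls += 1  # count work instead of running a real model
        return data * 2.0


def run_ab_test(n_users: int) -> int:
    """Run the loop for n_users and return the total number of predictions."""
    versions = [ModelVersion("v1"), ModelVersion("v2")]
    logs = []
    for user in (User(id=i, data=float(i)) for i in range(n_users)):
        # Assumed O(1) assignment: alternate users between the two versions.
        model_version = versions[user.id % len(versions)]
        prediction = model_version.predict(user.data)
        logs.append((user.id, model_version.id, prediction))
    return sum(v.predict_calls for v in versions)


for n in (10, 100, 1000):
    print(n, run_ab_test(n))  # the prediction count equals n each time
```

Under these assumptions the counts match the table above: one prediction and one log entry per user.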
[X] Wrong: "Adding more model versions multiplies the time by the number of versions squared."
[OK] Correct: Each user is assigned to only one model version, so time grows with users, not the square of versions.
Understanding how time grows with users helps you design scalable testing systems and shows you can think about real-world system limits.
"What if we tested every user on every model version instead of just one? How would the time complexity change?"
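One way to explore this question is to count predictions under both strategies. In the sketch below, `count_single_assignment` and `count_all_versions` are hypothetical helper names, and each prediction is assumed to cost one unit of work.

```python
def count_single_assignment(n_users: int, n_versions: int) -> int:
    """Each user is routed to exactly one version: one prediction per user."""
    predictions = 0
    for user_id in range(n_users):
        _version = user_id % n_versions  # assumed O(1) assignment
        predictions += 1
    return predictions


def count_all_versions(n_users: int, n_versions: int) -> int:
    """Each user is scored by every version: a nested loop over both."""
    predictions = 0
    for _user_id in range(n_users):
        for _version in range(n_versions):
            predictions += 1
    return predictions


# Hold the user count at 1000 and vary the number of versions m.
for m in (2, 5, 10):
    print(m, count_single_assignment(1000, m), count_all_versions(1000, m))
```

With n users and m versions, the first count stays at n while the second equals n × m, which suggests O(n · m) when every user is tested on every version.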
Practice
Solution
Step 1: Understand A/B testing concept
A/B testing involves running two versions of a model simultaneously to compare their performance.
Step 2: Identify the main goal
The goal is to split user traffic between two models to see which performs better in real conditions.
Final Answer:
To compare two model versions by splitting user traffic -> Option B
Quick Check:
A/B testing = compare models by traffic split [OK]
- Confusing A/B testing with training speedup
- Thinking it is about data backup
- Mixing it with server monitoring
Solution
Step 1: Check YAML list syntax for traffic split
The correct YAML uses a list with dash (-) for each model version and keys 'modelVersion' and 'percent'.
Step 2: Validate keys and indentation
The following configuration correctly uses 'modelVersion' and 'percent' with proper indentation and list format:
traffic:
  - modelVersion: v1
    percent: 50
  - modelVersion: v2
    percent: 50
Final Answer:
The configuration above (v1 at 50 percent, v2 at 50 percent) -> Option D
Quick Check:
YAML list with modelVersion and percent = valid traffic split [OK]
- Missing dash for list items
- Wrong key names like 'model' or 'version'
- Incorrect indentation breaking YAML
import random
traffic_split = {'v1': 70, 'v2': 30}
user_id = 12345
random.seed(user_id)
roll = random.randint(1, 100)
if roll <= traffic_split['v1']:
    assigned_version = 'v1'
else:
    assigned_version = 'v2'
print(assigned_version)
What will be the printed output?
Solution
Step 1: Understand random seed and randint
Setting the seed to user_id makes the random output deterministic for that user. randint(1, 100) returns an integer between 1 and 100, inclusive.
Step 2: Calculate roll value for user_id=12345
With seed 12345, roll is 54 (verified by running the code). Since 54 <= 70, assigned_version is 'v1'.
Final Answer:
v1 -> Option C
Quick Check:
roll=54 <= 70 means assign v1 [OK]
- Assuming random changes every run despite seed
- Misreading comparison operator
- Confusing randint range
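Calling `random.seed` reseeds Python's module-level generator, which other code in the same process may also be using. A possible alternative, sketched here with a hypothetical `assign_version` helper, is to give each user a private `random.Random` instance seeded with the user ID.

```python
import random


def assign_version(user_id: int, v1_percent: int = 70) -> str:
    """Return 'v1' or 'v2' deterministically for a given user ID."""
    rng = random.Random(user_id)  # private generator; global state is untouched
    roll = rng.randint(1, 100)    # integer in [1, 100], both ends included
    return "v1" if roll <= v1_percent else "v2"


# The same user always lands in the same group.
print(assign_version(12345) == assign_version(12345))  # True
```

The comparison logic is the same as in the snippet above; only the source of randomness is isolated.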
traffic:
  - modelVersion: v1
    percent: 60
  - modelVersion: v2
    percent: 50
What is the main problem with this configuration?
Solution
Step 1: Sum the traffic percentages
60% + 50% = 110%, which exceeds the 100% available for the traffic split.
Step 2: Understand traffic split constraints
Traffic percentages must sum to exactly 100% to properly split user traffic between models.
Final Answer:
Percentages add up to more than 100% -> Option A
Quick Check:
Sum of percents > 100% is invalid [OK]
- Ignoring total percentage sum
- Thinking unequal percentages may sum to more than 100
- Confusing syntax error with logic error
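A check like the one in Step 1 could be automated before a configuration is deployed. The sketch below assumes the YAML has already been parsed into a list of dictionaries with the `modelVersion` and `percent` keys; `validate_traffic` is a hypothetical helper, not part of any particular serving platform.

```python
def validate_traffic(traffic: list[dict]) -> None:
    """Raise ValueError unless every percent is in range and the total is 100."""
    for entry in traffic:
        if not 0 <= entry["percent"] <= 100:
            raise ValueError(f"{entry['modelVersion']}: percent out of range")
    total = sum(entry["percent"] for entry in traffic)
    if total != 100:
        raise ValueError(f"traffic percentages sum to {total}, expected 100")


# Parsed form of the configuration from the question above.
bad_config = [
    {"modelVersion": "v1", "percent": 60},
    {"modelVersion": "v2", "percent": 50},
]

try:
    validate_traffic(bad_config)
except ValueError as err:
    print(err)  # traffic percentages sum to 110, expected 100
```

The configuration is syntactically valid YAML, so a parser alone would not catch it; the failure only shows up in a semantic check such as this one.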
Solution
Step 1: Understand consistent user assignment need
Users must always get the same model version to avoid confusing metrics and user experience.
Step 2: Evaluate assignment methods
Hashing the user ID modulo 100 maps each user consistently to a number from 0 to 99, which can be split 70/30 between v1 and v2.
Step 3: Reject other options
Random assignment on each request causes inconsistency; switching all users breaks the A/B test; manual assignment is impractical and biased.
Final Answer:
Assign users based on hashing their user ID modulo 100 and map to traffic split -> Option A
Quick Check:
Deterministic hashing ensures stable A/B assignment [OK]
- Random assignment causing inconsistent user experience
- Switching all users breaks test validity
- Manual assignment is error-prone and biased
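The hashing approach from Step 2 might look like the following. This is a sketch under assumptions: it uses SHA-256 from `hashlib`, because Python's built-in `hash()` is randomized per process for strings and so would not be stable across restarts. `bucket_for` and `assign_version` are hypothetical names.

```python
import hashlib


def bucket_for(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def assign_version(user_id: str, v1_percent: int = 70) -> str:
    """Buckets 0..v1_percent-1 go to v1; the remaining buckets go to v2."""
    return "v1" if bucket_for(user_id) < v1_percent else "v2"


# Assignment is stable across calls, and the split is roughly 70/30.
users = [f"user-{i}" for i in range(10_000)]
share_v1 = sum(assign_version(u) == "v1" for u in users) / len(users)
print(assign_version("user-42") == assign_version("user-42"), round(share_v1, 2))
```

Because the bucket depends only on the user ID, no assignment table has to be stored, and changing `v1_percent` moves only the users whose buckets fall between the old and new thresholds.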
