Random seed management in MLOps - Time & Space Complexity
When managing random seeds in machine learning pipelines, it's important to understand how the time to set or use seeds grows as the number of operations increases.
We want to know how the cost changes when we repeat random operations with seed control.
Analyze the time complexity of the following code snippet.
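import random

n = 1000  # number of iterations (example value; the analysis holds for any n)
for i in range(n):
    random.seed(i)            # reseed the generator on every iteration
    value = random.random()   # draw one number from the freshly seeded generator
    # use value in pipeline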
This code sets a new random seed and generates a random number for each iteration up to n.
Identify the loops, recursion, or array traversals that repeat.
- Primary operation: Loop running from 0 to n-1.
- How many times: Exactly n times, each time setting a seed and generating one random number.
Each iteration does a fixed amount of work: setting a seed and generating one number.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | 10 seed sets + 10 random generations |
| 100 | 100 seed sets + 100 random generations |
| 1000 | 1000 seed sets + 1000 random generations |
Pattern observation: The total work grows in direct proportion to n, so doubling n doubles the work.
Time Complexity: O(n)
This means the time to run this code grows linearly as the number of iterations increases.
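The linear growth can be checked by counting calls rather than timing them. The following is a minimal sketch; `count_operations` is a hypothetical helper introduced only for illustration, and it mirrors the loop analyzed above.

```python
import random


def count_operations(n):
    """Run the reseeding loop and count seed sets and random draws (hypothetical helper)."""
    seed_calls = 0
    draw_calls = 0
    for i in range(n):
        random.seed(i)       # one seed set per iteration
        seed_calls += 1
        _ = random.random()  # one random draw per iteration
        draw_calls += 1
    return seed_calls, draw_calls


for n in (10, 100, 1000):
    seeds, draws = count_operations(n)
    print(f"n={n}: {seeds} seed sets + {draws} random generations")
```

Each count equals n, which matches the rows of the table: the total number of operations is 2n, and constant factors are dropped to give O(n).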
[X] Wrong: "Setting the random seed once at the start will make all iterations equally random and fast."
[OK] Correct: This loop reseeds the generator on every iteration, so it pays the seeding cost n times. Seeding once before the loop would pay that cost a single time instead of n times.
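To make the contrast concrete, here is a rough sketch of both variants side by side. The function names `seed_every_iteration` and `seed_once`, and the default seed of 0, are assumptions made for this example and do not come from the original snippet.

```python
import random


def seed_every_iteration(n):
    """Reseed with the loop index before every draw, as in the snippet above."""
    values = []
    for i in range(n):
        random.seed(i)
        values.append(random.random())
    return values


def seed_once(n, seed=0):
    """Seed a single time, then draw n numbers from the same sequence."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


# Both variants are reproducible: repeated calls return identical lists.
print(seed_every_iteration(3) == seed_every_iteration(3))
print(seed_once(3) == seed_once(3))
```

Both functions do a constant amount of work per iteration, so each is linear in n. Seeding once only removes the n - 1 extra `random.seed` calls, which changes the constant factor rather than the growth rate.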
Understanding how repeated seed setting affects runtime helps you reason about reproducibility and performance in machine learning workflows.
"What if we set the random seed only once before the loop? How would the time complexity change?"
Practice
Solution
Step 1: Understand the role of randomness in experiments
Randomness affects initialization and data shuffling, causing different results each run.
Step 2: Identify the effect of setting a seed
Setting a seed fixes randomness so results are the same every time.
Final Answer:
To make the results reproducible and consistent across runs -> Option A
Quick Check:
Random seed = reproducibility [OK]
- Thinking seed speeds up training
- Believing seed increases randomness
- Confusing seed with dataset size
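As an illustration of seed-driven reproducibility in data shuffling, the sketch below shuffles a list of indices with a seeded generator. `shuffled_indices` is a hypothetical helper, and the use of a local `random.Random` instance instead of the module-level generator is a choice made for this example.

```python
import random


def shuffled_indices(n, seed):
    """Return the indices 0..n-1 in a seed-determined order (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, independent of the global one
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


first_run = shuffled_indices(10, seed=42)
second_run = shuffled_indices(10, seed=42)
print(first_run == second_run)  # the same seed gives the same order
```

The seed does not change how long the shuffle takes or how much data there is; it only fixes which order is produced.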
random and NumPy libraries?
Solution
Step 1: Recall correct seed setting methods
Python's random uses random.seed(value), NumPy uses np.random.seed(value).
Step 2: Check each option's syntax
The snippet
import random
import numpy as np
random.seed(42)
np.random.seed(42)
uses the correct functions. The others use a non-existent set_seed, assign to seed instead of calling it, or call np.seed(42), which doesn't exist.
Final Answer:
import random
import numpy as np
random.seed(42)
np.random.seed(42)
-> Option B
Quick Check:
random.seed() and np.random.seed() are correct [OK]
- Using random.set_seed instead of random.seed
- Assigning seed as a variable instead of calling method
- Calling np.seed instead of np.random.seed
import random
random.seed(123)
print([random.randint(1, 10) for _ in range(3)])
random.seed(123)
print([random.randint(1, 10) for _ in range(3)])
What will be the output?
Solution
Step 1: Understand effect of setting seed before generating numbers
Setting the seed resets the random number generator to a fixed state.
Step 2: Predict output of two identical seed calls
Both lists will be identical because the seed is reset before each list generation.
Final Answer:
[3, 2, 7], [3, 2, 7] -> Option C
Quick Check:
Same seed = same random sequence [OK]
- Assuming different outputs after resetting seed
- Confusing seed effect with random state progression
- Ignoring that seed resets generator state
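The difference between reseeding and state progression can be observed directly with `random.getstate()`. This is a small sketch using only the standard library; the variable names are chosen for illustration.

```python
import random

random.seed(123)
state_after_seed = random.getstate()   # generator state right after seeding
first = [random.randint(1, 10) for _ in range(3)]
state_after_draws = random.getstate()  # drawing numbers advances the state

random.seed(123)                       # reseeding restores the initial state
state_after_reseed = random.getstate()
second = [random.randint(1, 10) for _ in range(3)]

print(first == second)                         # same seed, same sequence
print(state_after_seed == state_after_reseed)  # reseeding resets the state
print(state_after_seed == state_after_draws)   # draws moved the state forward
```

Without the second `random.seed(123)` call, the second list would continue from `state_after_draws` instead of restarting the sequence.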
import random
random.seed(42)
print(random.randint(1, 100))
import numpy as np
np.random.seed(42)
print(np.random.randint(1, 100))
What is the most likely reason for the non-reproducible results?
Solution
Step 1: Analyze seed setting for Python random and NumPy
Seeds are set correctly for both libraries before generating numbers.
Step 2: Consider other sources of randomness
If another library (e.g., TensorFlow, PyTorch) uses randomness but its seed is not set, results vary.
Final Answer:
Seed set only for Python random and NumPy, but another library uses randomness -> Option A
Quick Check:
All libraries need seed set for full reproducibility [OK]
- Thinking seed value size matters
- Believing print affects randomness
- Assuming seed order is wrong here
random, NumPy, and PyTorch. Which of the following code snippets correctly sets seeds for all three libraries and disables nondeterministic behavior in PyTorch?
Solution
Step 1: Set seeds for Python random, NumPy, and PyTorch
Use random.seed(), np.random.seed(), and torch.manual_seed() with the same value.
Step 2: Enable deterministic algorithms in PyTorch
Use torch.use_deterministic_algorithms(True) so that PyTorch uses deterministic algorithms where they are available and raises an error for operations that have none.
Final Answer:
import random
import numpy as np
import torch
random.seed(123)
np.random.seed(123)
torch.manual_seed(123)
torch.use_deterministic_algorithms(True)
-> Option D
Quick Check:
All seeds set + deterministic mode = full reproducibility [OK]
- Using non-existent torch.set_deterministic method
- Assigning torch.deterministic instead of calling function
- Forgetting to enable deterministic algorithms in PyTorch
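In a pipeline, the seeding calls above are often collected in one place. The sketch below is one possible way to do that; `seed_everything` is a hypothetical helper name, and skipping NumPy or PyTorch when they are not installed is an assumption made so the example runs in any environment.

```python
import random


def seed_everything(seed):
    """Seed each available library and return the names that were seeded (hypothetical helper)."""
    seeded = ["random"]
    random.seed(seed)

    try:
        import numpy as np
        np.random.seed(seed)
        seeded.append("numpy")
    except ImportError:
        pass  # NumPy not installed; nothing to seed

    try:
        import torch
        torch.manual_seed(seed)
        # Ask PyTorch for deterministic algorithms; ops without one raise an error.
        torch.use_deterministic_algorithms(True)
        seeded.append("torch")
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

    return seeded


print(seed_everything(123))
```

Calling the helper once at the start of a run keeps every seed in sync. Any other library that draws random numbers would still need its own seeding call added here.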
