
Random seed management in MLOps - Step-by-Step Execution

Process Flow - Random seed management
Set random seed
Initialize random generator
Run stochastic process
Get reproducible output
Use output for training/testing
Repeat with same seed -> same results
Change seed -> different results
This flow shows how setting a random seed fixes the starting point of randomness, making results repeatable across runs.
Execution Sample
import random
random.seed(42)
print(random.randint(1, 10))
print(random.randint(1, 10))
This code sets a random seed and prints two random integers. Because the seed is fixed, the same pair of numbers is printed on every run.
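The same behaviour can be checked without touching the global generator. The sketch below uses separate `random.Random` instances; the helper name `draw_sequence` and the seed value 100 are illustrative assumptions, not part of the lesson's code.

```python
import random


def draw_sequence(seed, count=5):
    """Return `count` floats from a private generator seeded with `seed`."""
    rng = random.Random(seed)  # isolated generator; the global one is untouched
    return [rng.random() for _ in range(count)]


run_a = draw_sequence(42)
run_b = draw_sequence(42)
run_c = draw_sequence(100)

print(run_a == run_b)  # True: same seed, same sequence
print(run_a == run_c)  # expected False: a different seed gives a different sequence
```

Using a private generator per component is one possible design choice; it keeps one part of a pipeline from shifting the random sequence seen by another.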
Process Table
Step | Action | Seed State | Random Number Generated | Output
1 | Set seed to 42 | Seed=42 | N/A | No output
2 | Generate first random int | Seed=42 (updated internally) | Random int between 1-10 | 2
3 | Generate second random int | Seed=updated | Random int between 1-10 | 1
4 | End of execution | Seed=updated | N/A | Outputs: 2, 1
💡 Random numbers generated based on seed 42, ensuring reproducibility.
Status Tracker
Variable | Start | After Step 1 | After Step 2 | After Step 3 | Final
seed | None | 42 | internal updated | internal updated | internal updated
random_number_1 | None | None | 2 | 2 | 2
random_number_2 | None | None | None | 1 | 1
Key Moments - 3 Insights
Why do the random numbers stay the same every time we run the code?
Because setting the seed (Step 1) fixes the starting point of the random number generator, so the sequence of numbers generated (Steps 2 and 3) is always the same.
What happens if we don't set a seed before generating random numbers?
Without setting a seed, the random number generator starts from a different state each time, so the numbers will differ on each run (not shown in table but implied).
Does the seed variable itself change after generating numbers?
No. The seed only initializes the generator. The generator's internal state advances after each number it produces, but the seed value you passed in (42) stays the same (see 'internal updated' in the Status Tracker).
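The difference between the seed and the generator's internal state can be observed directly. The following is a minimal sketch using the standard library's `getstate()` and `setstate()`; the variable names are illustrative.

```python
import random

rng = random.Random(42)          # private generator seeded with 42
state_at_start = rng.getstate()  # snapshot of the internal state

first = rng.randint(1, 10)
state_after_one = rng.getstate()

# Drawing a number advances the internal state.
state_changed = state_at_start != state_after_one

rng.setstate(state_at_start)     # rewind to the snapshot
replayed = rng.randint(1, 10)

print(state_changed)      # True
print(first == replayed)  # True: restoring the state replays the same draw
```

Saving and restoring state like this is one way to resume a stochastic process from a checkpoint rather than from the original seed.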
Visual Quiz - 3 Questions
Test your understanding
Look at the Process Table. What is the first random number generated at Step 2?
A. 2
B. 1
C. 42
D. None
💡 Hint
Check the 'Random Number Generated' and 'Output' columns at Step 2 in the Process Table.
At which step does the seed get set to 42?
A. Step 2
B. Step 1
C. Step 3
D. Step 4
💡 Hint
Look at the 'Action' column in the Process Table to find when the seed is set.
If we change the seed to 100, what will happen to the output numbers?
A. They will be the same as with seed 42
B. No numbers will be generated
C. They will be different from seed 42
D. The seed will reset to 42 automatically
💡 Hint
Refer to the Process Flow, where changing the seed leads to different results.
Concept Snapshot
Random seed management:
- Use random.seed(value) to fix randomness
- Fixing seed makes outputs reproducible
- Same seed = same random sequence
- Different seed = different sequence
- Useful for debugging and consistent experiments
Full Transcript
Random seed management means setting a starting point for random number generation so results repeat every time. We set the seed using random.seed(42). Then, when we generate random numbers, they come out the same on every run. This helps in machine learning experiments to get consistent results. The seed initializes the random generator's state. Each call to generate a random number updates this state internally. If we change the seed, the sequence of random numbers changes. Without setting a seed, the numbers vary each run. This trace showed setting seed 42 and generating two random numbers: 2 and 1, always the same on every run.

Practice

(1/5)
1. What is the main purpose of setting a random seed in machine learning experiments?
easy
A. To make the results reproducible and consistent across runs
B. To speed up the training process
C. To increase the randomness of the model
D. To reduce the size of the dataset

Solution

  1. Step 1: Understand the role of randomness in experiments

    Randomness affects initialization and data shuffling, causing different results each run.
  2. Step 2: Identify the effect of setting a seed

    Setting a seed fixes randomness so results are the same every time.
  3. Final Answer:

    To make the results reproducible and consistent across runs -> Option A
  4. Quick Check:

    Random seed = reproducibility [OK]
Hint: Random seed fixes randomness for repeatable results [OK]
Common Mistakes:
  • Thinking seed speeds up training
  • Believing seed increases randomness
  • Confusing seed with dataset size
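Data shuffling is one place where a fixed seed pays off. The sketch below shows a repeatable train/test split using only the standard library; the function name `split_indices`, the 20% test fraction, and the default seed are assumptions chosen for illustration.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices with a seeded generator and split them."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded, so the order is repeatable
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # (train, test)


train_a, test_a = split_indices(10)
train_b, test_b = split_indices(10)

print(test_a == test_b)    # True: the same samples land in the test set each run
print(train_a == train_b)  # True
```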
2. Which of the following Python code snippets correctly sets the random seed for both Python's random and NumPy libraries?
easy
A.
    import random
    import numpy as np
    random.seed(42)
    np.seed(42)
B.
    import random
    import numpy as np
    random.seed(42)
    np.random.seed(42)
C.
    import random
    import numpy as np
    random.seed = 42
    np.random.seed = 42
D.
    import random
    import numpy as np
    random.set_seed(42)
    np.set_seed(42)

Solution

  1. Step 1: Recall correct seed setting methods

    Python's random uses random.seed(value), NumPy uses np.random.seed(value).
  2. Step 2: Check each option's syntax

    Option B calls random.seed(42) and np.random.seed(42), which are the correct functions. The other options use the non-existent set_seed, assign a value to seed instead of calling it, or call np.seed(42), which does not exist.
  3. Final Answer:

    random.seed(42) with np.random.seed(42) -> Option B
  4. Quick Check:

    random.seed() and np.random.seed() are correct [OK]
Hint: Use .seed() method, not .set_seed or assignment [OK]
Common Mistakes:
  • Using random.set_seed instead of random.seed
  • Assigning a value to seed instead of calling it as a function
  • Calling np.seed instead of np.random.seed
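The two correct calls can be wrapped in a small helper. This is a sketch under stated assumptions: the name `seed_python_and_numpy` is hypothetical, and NumPy is treated as optional so the snippet still runs where it is not installed. Note that `np.random.seed` seeds NumPy's legacy global generator; NumPy's documentation recommends `np.random.default_rng(seed)` for new code, which returns an independent generator object instead.

```python
import random


def seed_python_and_numpy(seed):
    """Seed Python's global generator and, if installed, NumPy's legacy global one."""
    random.seed(seed)
    seeded = ["random"]
    try:
        import numpy as np
    except ImportError:
        return seeded         # NumPy is not installed; nothing more to seed
    np.random.seed(seed)      # note the np.random prefix; np.seed does not exist
    seeded.append("numpy")
    return seeded


libraries = seed_python_and_numpy(42)
print(libraries)
```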
3. Consider the following Python code snippet:
import random
random.seed(123)
print([random.randint(1, 10) for _ in range(3)])
random.seed(123)
print([random.randint(1, 10) for _ in range(3)])
What will be the output?
medium
A. [[3, 2, 7], [4, 5, 6]]
B. [[1, 10, 2], [1, 10, 2]]
C. [[3, 2, 7], [3, 2, 7]]
D. [[1, 10, 2], [4, 5, 6]]

Solution

  1. Step 1: Understand effect of setting seed before generating numbers

    Setting seed resets the random number generator to a fixed state.
  2. Step 2: Predict output of two identical seed calls

    Both lists will be identical because the seed is reset before each list is generated. Each print call outputs its own list, so the second printed line repeats the first.
  3. Final Answer:

    [3, 2, 7] followed by [3, 2, 7] -> Option C
  4. Quick Check:

    Same seed = same random sequence [OK]
Hint: Resetting seed repeats the same random sequence [OK]
Common Mistakes:
  • Assuming different outputs after resetting seed
  • Confusing seed effect with random state progression
  • Ignoring that seed resets generator state
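The difference between re-seeding and simply continuing to draw can be checked with a short sketch. The variable names are illustrative, and no specific integers are assumed here; only the equality check matters.

```python
import random

random.seed(123)
first = [random.randint(1, 10) for _ in range(3)]

# No re-seed here: the generator continues from where it left off.
continued = [random.randint(1, 10) for _ in range(3)]

random.seed(123)  # re-seeding rewinds the generator to its starting state
second = [random.randint(1, 10) for _ in range(3)]

print(first == second)   # True: same seed, same starting state
print(first, continued)  # the continued draws come from later in the sequence
```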
4. You have the following code snippet, part of a larger training pipeline, that aims to fix randomness. The pipeline still produces different results each run:
import random
random.seed(42)
print(random.randint(1, 100))
import numpy as np
np.random.seed(42)
print(np.random.randint(1, 100))
What is the most likely reason for the non-reproducible results?
medium
A. The seed is set only for Python random and NumPy separately, but another library uses randomness
B. The random seed is set after generating random numbers
C. The seed value 42 is too small to fix randomness
D. The print statements cause randomness to reset

Solution

  1. Step 1: Analyze seed setting for Python random and NumPy

    Seeds are set correctly for both libraries before generating numbers.
  2. Step 2: Consider other sources of randomness

    If another library (e.g., TensorFlow, PyTorch) uses randomness but seed is not set there, results vary.
  3. Final Answer:

    Seed set only for Python random and NumPy, but another library uses randomness -> Option A
  4. Quick Check:

    All libraries need seed set for full reproducibility [OK]
Hint: Set seed in all libraries that use randomness [OK]
Common Mistakes:
  • Thinking seed value size matters
  • Believing print affects randomness
  • Assuming seed order is wrong here
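The effect of an unseeded source of randomness can be reproduced with the standard library alone. In the sketch below, `ThirdPartySampler` is a hypothetical stand-in for a library that keeps its own generator; it is not a real package.

```python
import random


class ThirdPartySampler:
    """Hypothetical stand-in for a library that keeps its own generator."""

    def __init__(self):
        # No seed argument: this generator is seeded from the OS, not by random.seed().
        self._rng = random.Random()

    def sample(self):
        return self._rng.random()


def run_once():
    random.seed(42)  # seeds only Python's global generator
    ours = random.random()
    theirs = ThirdPartySampler().sample()
    return ours, theirs


ours_1, theirs_1 = run_once()
ours_2, theirs_2 = run_once()

print(ours_1 == ours_2)      # True: the seeded draw repeats
print(theirs_1 == theirs_2)  # expected False: the unseeded generator differs per run
```

When a pipeline is not reproducible despite seeding, listing every component that draws random numbers and checking how each one is seeded is usually a reasonable first step.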
5. You want to ensure full reproducibility of a machine learning experiment using Python's random, NumPy, and PyTorch. Which of the following code snippets correctly sets seeds for all three libraries and disables nondeterministic behavior in PyTorch?
hard
A.
    import random
    import numpy as np
    import torch
    random.seed(123)
    np.random.seed(123)
    torch.manual_seed(123)
B.
    import random
    import numpy as np
    import torch
    random.seed(123)
    np.random.seed(123)
    torch.manual_seed(123)
    torch.set_deterministic(True)
C.
    import random
    import numpy as np
    import torch
    random.seed(123)
    np.random.seed(123)
    torch.manual_seed(123)
    torch.deterministic = True
D.
    import random
    import numpy as np
    import torch
    random.seed(123)
    np.random.seed(123)
    torch.manual_seed(123)
    torch.use_deterministic_algorithms(True)

Solution

  1. Step 1: Set seeds for Python random, NumPy, and PyTorch

    Use random.seed(), np.random.seed(), and torch.manual_seed() with the same value.
  2. Step 2: Enable deterministic algorithms in PyTorch

    Use torch.use_deterministic_algorithms(True) so PyTorch uses deterministic algorithms where available and raises an error for operations that have none.
  3. Final Answer:

    random.seed(123), np.random.seed(123), torch.manual_seed(123), then torch.use_deterministic_algorithms(True) -> Option D
  4. Quick Check:

    All seeds set + deterministic mode = reproducible runs on the same software and hardware [OK]
Hint: Set all seeds and enable deterministic mode in PyTorch [OK]
Common Mistakes:
  • Using torch.set_deterministic, an older name that was replaced by torch.use_deterministic_algorithms
  • Assigning torch.deterministic instead of calling function
  • Forgetting to enable deterministic algorithms in PyTorch
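The PyTorch part of Option D can be sketched as a guarded helper. The function name `seed_torch` is hypothetical, and PyTorch is treated as optional so the snippet runs even where it is not installed. GPU workloads may need further settings described in PyTorch's reproducibility notes; those are not covered here.

```python
def seed_torch(seed):
    """Seed PyTorch and request deterministic algorithms, if PyTorch is installed."""
    try:
        import torch
    except ImportError:
        return False                 # PyTorch is not available in this environment
    torch.manual_seed(seed)          # seeds PyTorch's random number generators
    # Older PyTorch releases may not provide this function, hence the guard.
    if hasattr(torch, "use_deterministic_algorithms"):
        torch.use_deterministic_algorithms(True)
    return True


torch_seeded = seed_torch(123)
print(torch_seeded)  # True if PyTorch was found and seeded, otherwise False
```

Calling this alongside the seeding of Python's random and NumPy at the very start of a script, before any model or data loader is created, keeps all three libraries aligned on the same seed.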