PyTest · Testing · ~15 mins

Flaky test detection and retry in PyTest - Deep Dive

Overview - Flaky test detection and retry
What is it?
Flaky test detection and retry is a way to find and handle tests that sometimes pass and sometimes fail without any code changes. These tests are called flaky because their results are not stable. The retry part means running the test again automatically if it fails, to see if it was a temporary problem. This helps keep test results trustworthy and reduces false alarms.
Why it matters
Without flaky test detection, developers waste time chasing bugs that aren't real. Flaky tests hide real problems and make teams lose trust in automated testing. Detecting and retrying flaky tests helps keep the test suite reliable and saves time by avoiding unnecessary debugging. It improves confidence in software quality and speeds up development.
Where it fits
Before learning flaky test detection, you should understand basic automated testing and how to write tests in pytest. After this, you can learn advanced test stability techniques, continuous integration setups, and test result analysis tools.
Mental Model
Core Idea
Flaky test detection and retry means catching tests that fail unpredictably and rerunning them to confirm if the failure is real or just a temporary glitch.
Think of it like...
It's like a smoke alarm that sometimes goes off because of steam, not fire. Instead of calling the fire department immediately, you check again to see if the alarm is still sounding before taking action.
┌───────────────┐
│ Run Test      │
└──────┬────────┘
       │
       ▼
┌───────────────┐  Yes  ┌───────────────┐
│ Test Passes?  ├──────►│ Report Pass   │
└──────┬────────┘       └───────────────┘
       │No
       ▼
┌───────────────┐
│ Retry Test    │
└──────┬────────┘
       │
       ▼
┌───────────────┐  Yes  ┌───────────────┐
│ Pass on Retry?├──────►│ Mark Flaky    │
└──────┬────────┘       │ Report Pass   │
       │No              └───────────────┘
       ▼
┌───────────────┐
│ Report Fail   │
└───────────────┘
Build-Up - 6 Steps
1
Foundation: Understanding Flaky Tests
🤔
Concept: Introduce what flaky tests are and why they cause problems.
A flaky test is a test that sometimes passes and sometimes fails without any changes in the code. This can happen because of timing issues, network delays, or shared resources. For example, a test that depends on the current time or external service might fail randomly.
Result
You can identify that some tests are unreliable and cause confusion in test results.
Understanding flaky tests is key to knowing why test results might not always be trustworthy.
2
Foundation: Basic Pytest Test Writing
🤔
Concept: Learn how to write simple tests in pytest to prepare for flaky test handling.
In pytest, you write tests as functions whose names start with 'test_'. For example:

def test_addition():
    assert 1 + 1 == 2

This test always passes because the result is predictable.
Result
You can write and run basic tests using pytest.
Knowing how to write tests is necessary before adding retry logic for flaky tests.
3
Intermediate: Detecting Flaky Tests with pytest-rerunfailures
🤔Before reading on: Do you think retrying a test automatically can hide real bugs or help find flaky tests? Commit to your answer.
Concept: Introduce the pytest plugin 'pytest-rerunfailures' that retries failed tests automatically.
Install the plugin with 'pip install pytest-rerunfailures', then run tests with:

pytest --reruns 3

This command reruns failed tests up to 3 times before marking them as failed. If a test passes on a retry, it is considered flaky.
Result
Tests that fail once but pass on retry are detected as flaky and do not cause a full test suite failure.
Retrying tests helps separate real failures from temporary glitches, improving test reliability.
4
Intermediate: Marking Flaky Tests Explicitly
🤔Before reading on: Should flaky tests be fixed immediately or marked to track their instability? Commit to your answer.
Concept: Learn how to mark tests as flaky using decorators to handle known flaky tests better.
Use the pytest-rerunfailures marker (the separate 'flaky' plugin offers a similar decorator) to mark known flaky tests:

import pytest

@pytest.mark.flaky(reruns=3)
def test_flaky_feature():
    ...  # test code

This reruns the test up to 3 times if it fails, helping isolate flaky behavior.
Result
Known flaky tests are retried automatically, reducing noise in test reports.
Explicitly marking flaky tests helps teams focus on fixing them later without blocking development.
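A sketch of marker usage from pytest-rerunfailures: reruns_delay adds a pause between attempts, and recent plugin versions accept a condition argument to retry only in certain environments (the test names and bodies here are hypothetical placeholders).

```python
import sys

import pytest

# Retry up to 3 times, pausing 2 seconds between attempts.
@pytest.mark.flaky(reruns=3, reruns_delay=2)
def test_network_dependent():
    ...

# Only retry on a platform where the flakiness is known to occur
# (condition is supported in recent pytest-rerunfailures versions).
@pytest.mark.flaky(reruns=3, condition=sys.platform.startswith("win"))
def test_windows_only_flake():
    ...
```

Marking per test keeps retries scoped to the known offenders instead of blanketing the whole suite with a global --reruns flag.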
5
Advanced: Analyzing Flaky Test Causes
🤔Before reading on: Do you think flaky tests are mostly caused by test code or the application code? Commit to your answer.
Concept: Explore common reasons for flaky tests and how to investigate them.
Flaky tests often result from:
- Timing issues (e.g., race conditions)
- External dependencies (network, databases)
- Shared state between tests
- Resource limits (memory, CPU)
To analyze, add logging, isolate tests, and run them repeatedly to find patterns.
Result
You can identify root causes of flakiness and plan fixes.
Knowing why tests are flaky is essential to fix them and improve test suite health.
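One low-tech way to run a suspect check repeatedly, as suggested above, is to measure its pass rate in a loop (a sketch with hypothetical helper names): a rate strictly between 0 and 1 confirms flakiness, while 0 or 1 suggests a stable failure or a stable pass.

```python
def measure_pass_rate(check, runs=50):
    """Run a zero-argument check repeatedly; return the fraction of passes."""
    passes = 0
    for _ in range(runs):
        try:
            check()
            passes += 1
        except AssertionError:
            pass
    return passes / runs

# Deterministic stand-in for a flaky check: passes on every second call,
# mimicking shared-state flakiness.
state = {"n": 0}
def alternating_check():
    state["n"] += 1
    assert state["n"] % 2 == 0

rate = measure_pass_rate(alternating_check, runs=50)
# A rate strictly between 0 and 1 signals flakiness.
```

Logging which attempts fail (and under what conditions) on top of this loop is often enough to spot the pattern behind the flakiness.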
6
Expert: Integrating Flaky Test Detection in CI Pipelines
🤔Before reading on: Should flaky test retries be part of local testing or only in continuous integration? Commit to your answer.
Concept: Learn how to configure CI systems to detect flaky tests and report them properly.
In CI pipelines, configure pytest with retry options and collect flaky test reports separately. For example, use:

pytest --reruns 2 --reruns-delay 5

This retries failed tests twice with a 5-second delay between attempts. CI tools can parse reports to highlight flaky tests for developers.
Result
Flaky tests are detected early in CI, preventing unstable builds and alerting teams to fix issues.
Integrating flaky test detection in CI ensures consistent quality and prevents flaky tests from blocking releases.
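One way to share the retry policy between CI and local runs (a sketch; assumes pytest-rerunfailures is installed) is to put the flags in the project's pytest configuration instead of each CI script:

```ini
; pytest.ini -- sketch; requires the pytest-rerunfailures plugin
[pytest]
addopts = --reruns 2 --reruns-delay 5
```

With this in place, a bare 'pytest' invocation anywhere picks up the same retry behavior, so CI and developers see consistent results.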
Under the Hood
When pytest runs a test with retry enabled, it catches test failures and reruns the test function up to the specified number of times. If the test passes on any retry, pytest marks it as passed but notes it as flaky. This works by wrapping the test call in a loop and tracking results internally. The retry delay option adds a pause between attempts to reduce timing-related flakiness.
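The loop described above can be sketched in plain Python (illustrative only; this is not pytest-rerunfailures' actual internals, and the names are hypothetical):

```python
import time

def run_with_retries(test_func, max_reruns=3, delay=0.0):
    """Sketch of the retry loop a rerun plugin wraps around a test call."""
    attempts = 0
    while True:
        attempts += 1
        try:
            test_func()
            # Passed: flagged flaky if it needed more than one attempt.
            return {"outcome": "passed", "flaky": attempts > 1, "attempts": attempts}
        except AssertionError:
            if attempts > max_reruns:
                return {"outcome": "failed", "flaky": False, "attempts": attempts}
            time.sleep(delay)  # optional pause between attempts

# Demo: a check that fails on its first call and passes on the second.
calls = {"n": 0}
def passes_on_second_attempt():
    calls["n"] += 1
    assert calls["n"] >= 2

result = run_with_retries(passes_on_second_attempt, max_reruns=3)
# result -> {'outcome': 'passed', 'flaky': True, 'attempts': 2}
```

The key design point is visible in the loop: a pass on any attempt ends the retries, while the attempt count distinguishes a clean pass from a flaky one.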
Why designed this way?
Flaky test retry was designed to reduce noise in test results caused by transient failures. Instead of failing the whole suite on a single temporary glitch, retrying helps confirm if the failure is consistent. This approach balances test reliability with developer productivity. Alternatives like ignoring failures were rejected because they hide real problems.
┌───────────────┐
│ Start Test    │
└──────┬────────┘
       │
       ▼
┌───────────────┐
│ Run Test Func │◄───────────────────────────┐
└──────┬────────┘                            │
       │                                     │
       ▼                                     │
┌───────────────┐  Yes  ┌───────────────┐    │
│ Test Pass?    ├──────►│ Mark Passed   │    │
└──────┬────────┘       └───────────────┘    │
       │No                                   │
       ▼                                     │
┌─────────────────┐  Yes  ┌────────────────┐ │
│ Retry Count < N?├──────►│ Wait (optional)├─┘
└──────┬──────────┘       └────────────────┘
       │No
       ▼
┌───────────────┐
│ Mark Failed   │
└───────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does retrying flaky tests always hide real bugs? Commit to yes or no.
Common Belief:Retrying flaky tests just hides real bugs and should be avoided.
Reality:Retrying helps distinguish between real bugs and temporary failures, improving test reliability.
Why it matters:Without retries, teams waste time chasing false failures or ignore flaky tests that mask real issues.
Quick: Are flaky tests always caused by bad test code? Commit to yes or no.
Common Belief:Flaky tests are always due to poorly written test code.
Reality:Flakiness can come from application code, environment issues, or external dependencies, not just test code.
Why it matters:Blaming only test code can lead to ignoring real application problems causing flakiness.
Quick: Should flaky test retries be used in local development? Commit to yes or no.
Common Belief:Retrying flaky tests is only useful in continuous integration, not locally.
Reality:Retrying locally helps developers catch flaky tests early and fix them before pushing code.
Why it matters:Skipping retries locally can delay flaky test detection and cause unstable builds later.
Quick: Does marking a test as flaky mean you can ignore fixing it? Commit to yes or no.
Common Belief:Marking a test as flaky means it's okay to leave it broken indefinitely.
Reality:Marking flaky tests is a temporary measure; they should be fixed to maintain test suite health.
Why it matters:Ignoring flaky tests leads to growing instability and loss of trust in automated tests.
Expert Zone
1
Flaky test retries can mask timing-related bugs if retry counts are too high, so balance is key.
2
Some flaky tests only appear under specific environment conditions, requiring environment-aware retry strategies.
3
Integrating flaky test detection with test analytics tools helps prioritize which flaky tests to fix first.
When NOT to use
Retrying tests is not suitable when failures indicate critical bugs that must be fixed immediately. Instead, use test isolation, mocks, or fix the root cause. Also, avoid retries for tests with side effects or those that modify shared state, as retries can cause inconsistent results.
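For the "use mocks" route, replacing the unstable dependency removes the flakiness outright, so no retry is needed. A sketch with hypothetical names, using the standard library's unittest.mock:

```python
from unittest import mock

class StatusClient:
    def fetch(self):
        # Stand-in for a real network call that fails intermittently
        # (hypothetical -- the usual source of flakiness).
        raise ConnectionError("network is unreliable")

def test_status_without_network():
    client = StatusClient()
    # Patch the unstable dependency so the test is deterministic:
    # the source of flakiness is removed rather than papered over.
    with mock.patch.object(StatusClient, "fetch", return_value="ok"):
        assert client.fetch() == "ok"
```

Fixing the root cause this way is preferable to retries whenever the dependency, not the behavior under test, is what flakes.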
Production Patterns
In production, flaky test detection is integrated into CI pipelines with limited retries and reporting. Teams track flaky tests separately and assign them for fixes. Some use quarantine environments to isolate flaky tests. Advanced setups combine retries with test parallelization and resource monitoring to reduce flakiness.
Connections
Circuit Breaker Pattern (Software Architecture)
Both detect unstable conditions and retry or fallback to maintain system stability.
Understanding flaky test retries helps grasp how systems handle temporary failures gracefully, like circuit breakers prevent cascading failures.
Statistical Sampling (Mathematics)
Retrying tests multiple times is like sampling to confirm if a failure is consistent or random.
Knowing statistical sampling clarifies why multiple test runs reduce false positives in flaky test detection.
Quality Control in Manufacturing
Both involve detecting inconsistent product quality and deciding when to accept or reject based on repeated checks.
Seeing flaky tests as quality control helps understand the importance of repeatability and reliability in software testing.
Common Pitfalls
#1Ignoring flaky tests and letting them fail silently.
Wrong approach:
def test_feature():
    assert some_function() == expected_result  # flaky but no retry or marking
Correct approach:
import pytest

@pytest.mark.flaky(reruns=3)
def test_feature():
    assert some_function() == expected_result
Root cause:Not recognizing flaky tests leads to ignoring retries or markings, causing unstable test suites.
#2Setting retry count too high, hiding real bugs.
Wrong approach:pytest --reruns 10 # retries too many times, masking failures
Correct approach:pytest --reruns 2 # limited retries to balance detection and bug visibility
Root cause:Overusing retries can hide real issues, reducing test effectiveness.
#3Retrying tests with side effects causing inconsistent state.
Wrong approach:
def test_modify_db():
    db.insert(data)
    assert db.count() == expected  # retried without cleanup
Correct approach:
def test_modify_db():
    db.setup_clean_state()
    db.insert(data)
    assert db.count() == expected
Root cause:Not isolating test state causes retries to produce unreliable results.
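In pytest, the clean-state setup from pitfall #3 is usually expressed as a fixture so it cannot be forgotten; each attempt then starts from a fresh state. A sketch in which FakeDb is a hypothetical in-memory stand-in for a real database:

```python
import pytest

class FakeDb:
    """Hypothetical in-memory stand-in for a real database."""
    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def count(self):
        return len(self.rows)

@pytest.fixture
def db():
    # A fresh database per test: retries always start from a clean state.
    return FakeDb()

def test_modify_db(db):
    db.insert({"id": 1})
    assert db.count() == 1
```

Because the fixture rebuilds the state for every run, rerunning this test cannot accumulate rows the way the un-isolated version does.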
Key Takeaways
Flaky tests cause unpredictable test results and reduce trust in automated testing.
Retrying failed tests helps distinguish between real failures and temporary glitches.
Explicitly marking flaky tests allows teams to track and fix them without blocking progress.
Analyzing flaky test causes is essential to improve test reliability and software quality.
Integrating flaky test detection in CI pipelines ensures stable builds and early problem detection.