Root cause #
A Jest test that fails on its first attempt and passes on a retry is, by definition, non-deterministic. In a single-process Jest run the most common drivers are leaked async state between tests, a timer that resolves at a different point relative to an assertion, or shared module state that the previous test left dirty. None of these are fixed by the retry — the retry only proves the outcome depends on timing or order rather than on the code under test.
The danger with jest.retryTimes is that it defaults to silence. If a test passes on attempt two, Jest reports the suite as green and you never learn the test is flaky. The whole point of using it as a detection tool is to invert that: keep the retries so CI stays unblocked, but emit a record every time a retry was needed so the flakiness shows up in your historical flakiness tracking instead of vanishing.
Step-by-step fix #
1. Enable retries with error logging #
Call jest.retryTimes in a setup file so it applies to every suite. The logErrorsBeforeRetry option prints the failing error before each retry, which is what makes the retry observable instead of invisible.
// jest.setup.js — referenced from setupFilesAfterEach in jest.config
// retryTimes is a global; it must run before the tests in each file.
jest.retryTimes(2, { logErrorsBeforeRetry: true });
// Trade-off: 2 retries caps CI cost; higher counts mask deeper bugs
// and inflate runtime, so keep n small and treat retries as a signal.
2. Count retried-but-passed tests with a custom reporter #
A Jest reporter sees each test result, including how many invocations it took. testResult exposes invocations (total attempts) — when a test ends passed but took more than one invocation, it is flaky.
// flaky-reporter.js
const fs = require('fs');
class FlakyReporter {
constructor() { this.flaky = []; }
onTestResult(_test, testResult) {
for (const r of testResult.testResults) {
// r.invocations > 1 means a retry happened; status 'passed' means it recovered.
if (r.status === 'passed' && r.invocations > 1) {
this.flaky.push({
file: testResult.testFilePath,
title: r.fullName,
attempts: r.invocations,
});
}
}
}
onRunComplete() {
// Persist as JSON for later aggregation, not stdout noise.
fs.writeFileSync('flaky-jest.json', JSON.stringify(this.flaky, null, 2));
if (this.flaky.length) {
console.warn(`Detected ${this.flaky.length} flaky test(s) via retry.`);
}
}
}
module.exports = FlakyReporter;
// Trade-off: writing a file per shard is cheap, but you must merge
// shard outputs before computing a real flake rate (CI cost is low).
3. Register the reporter and fail the build on regressions #
Wire the reporter into Jest config alongside the default reporter so you keep normal output, then decide whether a newly flaky test should fail the build.
// jest.config.js
module.exports = {
setupFilesAfterEach: ['<rootDir>/jest.setup.js'],
reporters: ['default', '<rootDir>/flaky-reporter.js'],
};
// Trade-off: keeping CI green on retry keeps velocity, but pair it with
// a budget check (step below) so flakiness cannot grow unbounded.
4. Gate on a flaky budget #
Add a tiny post-run check that compares the detected flaky count against a budget. This keeps retries from becoming a permanent crutch.
// check-flaky-budget.js
const flaky = require('./flaky-jest.json');
const BUDGET = 3; // max tolerated flaky tests per run
if (flaky.length > BUDGET) {
console.error(`Flaky budget exceeded: ${flaky.length} > ${BUDGET}`);
process.exit(1); // fail the job so the regression is addressed
}
// Trade-off: a hard budget surfaces drift early but can block merges;
// start lenient, then ratchet down as you stabilize the suite.
Pitfalls #
- Treating retries as a fix: passing on retry hides the bug. Mitigation: always emit a record and review it; never let a green retry close the loop.
- Setting
retryTimestoo high: 5+ retries can turn a 90%-failing test green. Mitigation: cap at 1-2 and rely on the reporter to flag recurrence. - Losing data across shards: each worker writes its own JSON. Mitigation: merge all
flaky-jest.jsonartifacts before computing rates. - Order-dependent flakiness: a test only flakes after another runs first. Mitigation: run with
--randomizeperiodically so retries surface ordering bugs. - Confusing
invocationswithnumPassingAsserts: onlyinvocations > 1indicates a retry. Mitigation: assert on the attempt count, not assertion count.
Reliability targets #
| Metric | Target |
|---|---|
Retry count (retryTimes n) |
1-2 |
| Flaky tests detected per run | ≤ 3 (budgeted) |
| Pass-on-retry rate per suite | < 1% |
| Time-to-quarantine after detection | < 1 sprint |
| CI pass rate (post-retry) | ≥ 99% |
Frequently Asked Questions #
Q: Does jest.retryTimes hide real failures? A: It can, which is why detection matters. A test that fails all retries still fails the build; only pass-on-retry outcomes are softened, and the reporter records each one so they are never truly hidden.
Q: Where do I call jest.retryTimes?
A: In a setupFilesAfterEach file so it executes before each test file. It is a global and has no effect if called from inside an individual test body.
Q: How do I turn detection into a trend? A: Persist the reporter’s JSON per run and aggregate it over time. See tracking test flakiness trends over time for the rolling-window approach.