Race Conditions in Parallel Test Runs: Detection & Resolution

Per-shard schemas and per-worker contexts remove the contention that makes shared-state suites flaky under parallelism.

Identifying Shared State & Concurrency Conflicts #

Race conditions typically surface when tests assume exclusive access to databases, local storage, or global variables. In modern frontends, the asynchronous patterns covered in Async State Management in E2E Tests compound the problem, because the order in which promises resolve shifts with the load that parallel workers put on the machine. Visual assertions can also fail when DOM Mutation & Rendering Races cause elements to detach or re-render mid-assertion. Reliable parallelization requires strict worker isolation, deterministic data seeding, and explicit synchronization.

Engineering Trade-off: Strict isolation increases per-spec execution overhead but eliminates cross-worker contamination. The optimal balance is achieved by isolating at the browser context and database transaction level rather than spinning up entirely new VMs per test.
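To make the failure mode concrete, the following sketch replays two tests step by step, first against a shared store and then against one store per worker. Everything in it (Store, signupTest, runInterleaved) is a hypothetical model written for illustration, not a framework API, and the interleaving is fixed so the outcome is reproducible.

```typescript
// Hypothetical in-memory stand-in for a database table.
type Store = { users: string[] };
type Step = () => void;

// A test is modelled as a list of steps so an interleaving can be replayed exactly.
function signupTest(store: Store, name: string, results: boolean[]): Step[] {
  return [
    () => { store.users.push(name); },                  // arrange: insert one row
    () => { results.push(store.users.length === 1); },  // assert: "exactly one user exists"
  ];
}

// Alternate between the two tests, one step at a time, like two busy workers.
function runInterleaved(a: Step[], b: Step[]): void {
  const steps = Math.max(a.length, b.length);
  for (let i = 0; i < steps; i++) {
    if (i < a.length) a[i]();
    if (i < b.length) b[i]();
  }
}

// Both tests write to the same store: each assertion sees the other test's row.
function sharedRun(): boolean[] {
  const shared: Store = { users: [] };
  const results: boolean[] = [];
  runInterleaved(
    signupTest(shared, "alice", results),
    signupTest(shared, "bob", results),
  );
  return results;
}

// One store per worker: the same interleaving no longer matters.
function isolatedRun(): boolean[] {
  const results: boolean[] = [];
  runInterleaved(
    signupTest({ users: [] }, "alice", results),
    signupTest({ users: [] }, "bob", results),
  );
  return results;
}
```

In this model the shared run fails both assertions because each test counts two rows, while the isolated run passes both under the identical schedule.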

Framework-Specific Isolation Patterns #

Playwright runs test files in parallel worker processes by default and gives every test its own browser context; setting fullyParallel: true additionally parallelizes the tests within a single file, and the --shard flag splits the suite across CI machines. Cypress parallelizes across machines through Cypress Cloud (cypress run --record --parallel) or a CI-level matrix; each machine runs its own isolated Cypress process. For component-level testing, developers must explicitly mock network layers and reset component state between specs. Resolving Race Conditions in Cypress Component Tests shows how to enforce deterministic rendering before assertions run.

Key Isolation Vectors:

Browser Contexts: Use incognito/private contexts per worker to prevent cookie, localStorage, and session bleed.
Network Mocks: Route API calls through framework-native interceptors scoped to individual test files.
Database State: Implement transactional rollbacks or unique schema prefixes (test_worker_${SHARD_INDEX}) per parallel execution.
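The schema-prefix idea from the list above can be sketched as a small pure function. This is a minimal sketch under stated assumptions: SHARD_INDEX is the variable name used in this guide and has to be exported by your CI job, TEST_PARALLEL_INDEX is the per-worker slot index that Playwright sets in worker processes, and workerSchema is a hypothetical helper name.

```typescript
type Env = Record<string, string | undefined>;

// Build a schema name that is unique per shard and per worker slot.
// The environment is passed in so the function stays easy to test;
// in a real setup you would likely pass process.env.
function workerSchema(env: Env): string {
  const shard = env.SHARD_INDEX ?? "0";           // assumption: exported by the CI job
  const slot = env.TEST_PARALLEL_INDEX ?? "0";    // set by Playwright inside workers
  const raw = `test_worker_${shard}_${slot}`;
  // Keep only characters that are safe in an unquoted SQL identifier.
  return raw.replace(/[^a-zA-Z0-9_]/g, "_").toLowerCase();
}
```

Using the parallel slot rather than a monotonically increasing worker id keeps the number of schemas bounded by the worker count, even when a worker process is restarted after a failure.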

CI Pipeline Configuration & Worker Orchestration #

Effective parallel execution depends on CI infrastructure that supports dynamic worker allocation and artifact caching. Configure your pipeline to split test suites by execution time rather than file count. Use matrix builds to isolate database connections, mock API servers, and browser instances per worker. Implement idempotent setup/teardown scripts that run independently for each shard. Monitor worker health metrics to detect resource contention before it manifests as flaky failures.

GitHub Actions Matrix Example (.github/workflows/ci.yml):

name: Parallel E2E Pipeline
on: [push, pull_request]

jobs:
  e2e-shards:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22', cache: 'npm' }
      - run: npm ci
      - run: npx playwright install --with-deps
      - name: Run Parallel Tests
        run: npx playwright test --shard=${{ matrix.shard }}/4
        env:
          DATABASE_URL: ${{ secrets.DB_URL }}_shard_${{ matrix.shard }}
      - uses: actions/upload-artifact@v4
        if: failure()
        with:
          name: test-results-${{ matrix.shard }}
          path: test-results/

CI Impact & Trade-offs: Sharding by execution time reduces pipeline variance but requires historical test duration data. Fixed-count sharding is simpler but can leave straggler shards when several slow specs land together. Keep fail-fast: false so that one failing shard does not cancel the others and every shard still reports its flakiness data.
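One way to split by execution time is a greedy longest-first assignment: sort specs by historical duration and always hand the next spec to the currently lightest shard. The sketch below assumes you already have per-file durations from an earlier run; SpecTiming, Shard, and splitByDuration are hypothetical names, and the resulting file lists would be passed to the runner as positional arguments in place of a count-based split.

```typescript
interface SpecTiming {
  file: string;
  ms: number; // historical duration, e.g. taken from a previous run's report
}

interface Shard {
  totalMs: number;
  specs: SpecTiming[];
}

// Greedy longest-processing-time split: not optimal, but simple and stable.
function splitByDuration(specs: SpecTiming[], shardCount: number): Shard[] {
  if (shardCount < 1 || Math.floor(shardCount) !== shardCount) {
    throw new RangeError("shardCount must be a positive integer");
  }
  const shards: Shard[] = [];
  for (let i = 0; i < shardCount; i++) {
    shards.push({ totalMs: 0, specs: [] });
  }
  // Longest specs first; ties are broken by file name so the split is repeatable.
  const sorted = specs
    .slice()
    .sort((a, b) => b.ms - a.ms || a.file.localeCompare(b.file));
  for (const spec of sorted) {
    let lightest = shards[0];
    for (const shard of shards) {
      if (shard.totalMs < lightest.totalMs) lightest = shard;
    }
    lightest.specs.push(spec);
    lightest.totalMs += spec.ms;
  }
  return shards;
}
```

The deterministic tie-break matters here: every CI job can compute the same split independently and pick its own slice by matrix index.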

Step-by-Step Implementation Workflow #

Audit Shared State: Scan the test suite for implicit dependencies (localStorage, global mocks, singleton DB fixtures, shared ports).
Enable Native Parallel Mode: Turn on the framework's parallel execution and split the suite across CI jobs (fullyParallel: true plus --shard in Playwright, or a CI-level matrix for Cypress).
Replace Fixed Waits: Eliminate cy.wait(ms) and page.waitForTimeout() in favor of explicit network interception and DOM state assertions.
Implement Transactional Cleanup: Configure per-worker database transaction rollbacks or unique schema prefixes to guarantee state isolation.
Simulate Concurrency Locally: Run the suite locally with --workers=4 and an injected delay (launchOptions.slowMo to slow Playwright actions, a delayed page.route handler to slow responses, or a cy.intercept response delay in Cypress) so that timing-dependent failures surface more reliably.
Integrate Flakiness Tracking: Configure CI to auto-quarantine unstable specs, tag them with failure signatures, and block merges until deterministic fixes are verified.
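The audit and fixed-wait steps above can be started with a simple text scan before investing in real static analysis. The following is a rough sketch: the rule names and regular expressions are assumptions chosen for illustration, they will produce false positives and misses, and auditSpec is a hypothetical helper rather than part of any framework.

```typescript
interface Finding {
  line: number; // 1-based line number in the scanned source
  rule: string;
  text: string;
}

// Illustrative patterns only; extend them for your own suite.
const RULES: { rule: string; pattern: RegExp }[] = [
  // cy.wait(500) is flagged, cy.wait('@alias') is not.
  { rule: "fixed-wait", pattern: /\bcy\.wait\(\s*\d+\s*\)|\bwaitForTimeout\s*\(/ },
  { rule: "browser-storage", pattern: /\b(?:localStorage|sessionStorage|indexedDB)\b/ },
  { rule: "hard-coded-port", pattern: /\b(?:localhost|127\.0\.0\.1):\d{2,5}\b/ },
];

// Scan one spec file's source text and report every line that matches a rule.
function auditSpec(source: string): Finding[] {
  const findings: Finding[] = [];
  const lines = source.split(/\r?\n/);
  lines.forEach((text, index) => {
    for (const { rule, pattern } of RULES) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule, text: text.trim() });
      }
    }
  });
  return findings;
}
```

Running this over every spec file gives a first list of candidates to review by hand; it is a triage aid, not proof that a test is unsafe to parallelize.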

Production Configuration Examples #

Playwright (`playwright.config.ts`) #

import { defineConfig } from '@playwright/test';

export default defineConfig({
  fullyParallel: true,
  workers: process.env.CI ? 4 : 2,
  retries: process.env.CI ? 2 : 0,
  use: {
    // storageState is intentionally left unset: every test then starts in a fresh
    // context with no cookies or auth inherited from another test or worker.
    trace: 'on-first-retry',
    bypassCSP: false,
    ignoreHTTPSErrors: false
  },
  reporter: process.env.CI ? [['github'], ['html', { open: 'never' }]] : 'list'
});

Cypress (`cypress.config.ts`) #

import { defineConfig } from 'cypress';

export default defineConfig({
  e2e: {
    // testIsolation: true (default since Cypress 12) clears state between tests.
    testIsolation: true,
    setupNodeEvents(on, config) {
      on('task', {
        async cleanupDB() {
          // Execute per-worker transaction rollback or schema purge.
          // Replace with your actual DB client logic.
          // Trade-off: Slight latency increase per test vs. guaranteed isolation.
          const workerId = config.env.WORKER_ID ?? 'default';
          console.log(`[task] cleanupDB for worker ${workerId}`);
          return null;
        }
      });
    }
  }
});

Common Pitfalls #

Assuming test file order guarantees execution sequence across workers (CI schedulers distribute specs non-deterministically)
Sharing localStorage, cookies, or IndexedDB across parallel browser contexts
Neglecting database transaction rollbacks between specs
Overusing fixed cy.wait(ms) or page.waitForTimeout() instead of explicit assertions
Running parallel tests against a single shared mock API server without request routing or port isolation
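For the last pitfall, one option is to give every worker its own mock server port derived from its shard and worker slot. This is a sketch under assumptions: mockServerPort is a hypothetical helper, the shard number is 1-based as in the matrix example, and the scheme only matters when several shards or workers share one host.

```typescript
// Derive a distinct port for each (shard, worker slot) pair.
function mockServerPort(
  basePort: number,
  shard: number,           // 1-based shard number from the CI matrix
  workersPerShard: number,
  parallelIndex: number,   // 0-based worker slot within the shard
): number {
  if (shard < 1) {
    throw new RangeError("shard is 1-based");
  }
  if (parallelIndex < 0 || parallelIndex >= workersPerShard) {
    throw new RangeError("parallelIndex must be in [0, workersPerShard)");
  }
  const port = basePort + (shard - 1) * workersPerShard + parallelIndex;
  if (port > 65535) {
    throw new RangeError("computed port exceeds 65535");
  }
  return port;
}
```

With a base port of 4000 and four workers per shard, shard 1 uses ports 4000 to 4003 and shard 2 uses 4004 to 4007, so no two workers bind the same port.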

Reliability Metrics & KPIs #

Metric	Target Threshold	Tracking Method
Flakiness Rate	`< 2%` intermittent failure rate	CI dashboard tracking retry vs. pass rates
CI Timeout per Shard	`15 minutes` max	Pipeline duration alerts & auto-cancellation
Retry Success Rate	`> 85%` on first retry	Test runner analytics (Playwright/Cypress Cloud)
Test Isolation Score	`100%` independent worker contexts	Static analysis + runtime state leak detection
Mean Time to Recovery (MTTR)	`< 4 hours` for quarantined specs	Incident tracking & flaky test auto-quarantine logs

Implementation Note: Track these metrics via CI pipeline exports (e.g., JUnit XML, Playwright JSON reporter, Cypress Cloud API). Integrate with Slack/PagerDuty for automated alerts when flakiness exceeds the 2% threshold.
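As an illustration of how the 2% threshold might be evaluated from exported results, the sketch below computes a flakiness rate and lists specs that exceed it. It assumes each run has already been reduced to one of three outcomes, where "flaky" means the test failed and then passed on a retry; SpecRun, flakinessRate, and specsToQuarantine are hypothetical names, and parsing the reporter output is left out.

```typescript
type Outcome = "passed" | "flaky" | "failed";

interface SpecRun {
  spec: string;
  outcome: Outcome;
}

// Fraction of runs that only passed after a retry.
function flakinessRate(runs: SpecRun[]): number {
  if (runs.length === 0) return 0;
  let flaky = 0;
  for (const run of runs) {
    if (run.outcome === "flaky") flaky++;
  }
  return flaky / runs.length;
}

// Specs whose own flakiness rate exceeds the threshold (2% by default).
function specsToQuarantine(runs: SpecRun[], threshold = 0.02): string[] {
  const bySpec: Record<string, SpecRun[]> = {};
  for (const run of runs) {
    (bySpec[run.spec] = bySpec[run.spec] || []).push(run);
  }
  return Object.keys(bySpec)
    .filter(spec => flakinessRate(bySpec[spec]) > threshold)
    .sort();
}
```

Evaluating the rate per spec rather than per suite keeps one noisy file from hiding behind hundreds of stable ones; a larger history window makes the decision less jumpy.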

FAQ #

How do I differentiate between a true race condition and a network timeout? Race conditions produce non-deterministic failures that vary based on execution order or system load, while network timeouts consistently fail after a fixed duration. Reproduce the test with simulated latency and varying worker counts to isolate timing dependencies.

Can I run Cypress and Playwright tests in parallel on the same CI runner? Yes, but you must isolate their respective browser instances, port allocations, and artifact directories. Use containerized runners or separate VMs per framework to prevent resource contention and port conflicts.

What is the recommended retry strategy for parallel test suites? Limit retries to 1–2 attempts. Excessive retries mask underlying race conditions. Combine retries with automatic flaky test quarantine and root-cause analysis dashboards.

Related guides #

Child guides in this section

Isolating Database State in Parallel Jest Workers

Preventing Sharded Runner Seed Collisions

Resolving Race Conditions in Cypress Component Tests
