Problem
The distributed render plan() stage crashes when headless Chrome encounters a transient frame detachment error during the browser probe, with no retry logic. The plan tarball is never written, and downstream chunk workers fail.
Error: errorCode=INTERNAL_ERROR message=Navigating frame was detached
This is a Puppeteer/Chrome DevTools Protocol error that occurs when Chrome's main frame navigates away or is detached during an operation — typically due to memory pressure, compositor race conditions, or Chrome warmup timing.
Impact
- All chunk workers fail (plan artifact missing)
- The entire distributed render fails after retry pass
- The error is transient — a fresh browser session would succeed
Root Cause
probeStage.ts calls initializeSession() (which calls page.goto() in frameCapture.ts) with zero retry logic for transient browser errors. The error propagates straight through: initializeSession → probeStage → plan() → adapter → failure.
Known transient Puppeteer errors that should be retried:
Navigating frame was detached
Target closed
Protocol error ... Target closed
Session closed
Navigation failed because browser has disconnected
Fix Plan
- Retry logic in
probeStage.ts: Wrap the browser session lifecycle (create → initialize → probe) with retry-on-transient-error. On failure: close the crashed session, create a fresh browser, retry once.
- Typed error classification: Add
isTransientBrowserError() utility in the engine for classifying retryable vs. fatal browser errors.
- Observability: Structured logging at the retry boundary — error class, attempt number, elapsed time.
- Tests: Unit tests for the error classifier + integration test for probe retry.
Problem
The distributed render
plan()stage crashes when headless Chrome encounters a transient frame detachment error during the browser probe, with no retry logic. The plan tarball is never written, and downstream chunk workers fail.Error:
errorCode=INTERNAL_ERROR message=Navigating frame was detachedThis is a Puppeteer/Chrome DevTools Protocol error that occurs when Chrome's main frame navigates away or is detached during an operation — typically due to memory pressure, compositor race conditions, or Chrome warmup timing.
Impact
Root Cause
probeStage.tscallsinitializeSession()(which callspage.goto()inframeCapture.ts) with zero retry logic for transient browser errors. The error propagates straight through:initializeSession→probeStage→plan()→ adapter → failure.Known transient Puppeteer errors that should be retried:
Navigating frame was detachedTarget closedProtocol error ... Target closedSession closedNavigation failed because browser has disconnectedFix Plan
probeStage.ts: Wrap the browser session lifecycle (create → initialize → probe) with retry-on-transient-error. On failure: close the crashed session, create a fresh browser, retry once.isTransientBrowserError()utility in the engine for classifying retryable vs. fatal browser errors.