fix: detect zombie browser context in getSession() to prevent 30s tab timeouts #5416

Open
Love-JourneY wants to merge 1 commit into jo-inc:master from Love-JourneY:fix/zombie-context-detection-v2
Love-JourneY commented Jun 8, 2026

Fix: proactive zombie context detection

Problem

getSession() (server.js lines 1142-1148) uses session.context.pages() as a health probe. That call is synchronous and returns cached data even when the underlying CDP connection is dead, so it cannot detect a zombie context. When newPage() is subsequently called on such a context, it hangs until the 30s tab-create timeout fires.

Evidence

// browser pre-warmed successfully
{"ts":"2026-06-08T09:01:57.960Z","msg":"browser pre-warmed","ms":11505}

// 43 seconds later — first tab creation times out
{"ts":"2026-06-08T09:02:40.897Z","msg":"tab create failed","error":"tab create timed out after 30000ms"}

// Recovery — retry succeeds
{"ts":"2026-06-08T09:03:58.686Z","msg":"tab created","userId":"bbsoyarch"}
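
The gaps between these three log lines can be checked with a few lines of arithmetic. This is only a sketch: the timestamps and the 30000ms timeout are copied from the logs above, the variable names are illustrative, and the estimate of when the request started assumes the failure was logged at the moment the timeout fired.

```javascript
// Timestamps copied verbatim from the log lines above.
const preWarmed = Date.parse('2026-06-08T09:01:57.960Z');
const createFailed = Date.parse('2026-06-08T09:02:40.897Z');
const created = Date.parse('2026-06-08T09:03:58.686Z');

// Taken from the error text "tab create timed out after 30000ms".
const TAB_CREATE_TIMEOUT_MS = 30_000;

// Time from pre-warm to the logged failure.
const failureGapMs = createFailed - preWarmed;

// Assumption: the failure is logged as soon as the timeout fires, so the
// tab-create request itself started one timeout earlier.
const requestStartGapMs = failureGapMs - TAB_CREATE_TIMEOUT_MS;

// Time from the failure to the successful retry.
const recoveryGapMs = created - createFailed;

console.log({ failureGapMs, requestStartGapMs, recoveryGapMs });
// → { failureGapMs: 42937, requestStartGapMs: 12937, recoveryGapMs: 77789 }
```

Under that assumption, the context was already unusable roughly 13 seconds after the pre-warm completed.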

Root Cause

// server.js:1142-1148 — existing check
try {
    session.context.pages();  // sync — returns cached data even if CDP dead
} catch (err) {
    // only catches fully closed contexts
}
// ↓ newPage() hangs 30s on zombie context
const page = await session.context.newPage();  // line 2616
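
The failure mode can be reproduced without a browser. The sketch below uses a hand-written stand-in for a zombie context, not the real browser library: makeZombieContext and withTimeout are hypothetical helpers invented for this illustration, and the timeout is shortened to 100ms so it runs quickly.

```javascript
// Hypothetical stand-in for a context whose CDP connection has died.
function makeZombieContext() {
  const cachedPages = [{ url: 'about:blank' }];
  return {
    // Synchronous: answers from local state, no round trip to the browser.
    pages: () => cachedPages,
    // Needs a round trip, so on a dead connection it never settles.
    newPage: () => new Promise(() => {}),
  };
}

// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

const zombie = makeZombieContext();

// The sync probe passes even though the context is unusable.
console.log('pages() probe returned', zombie.pages().length, 'page(s)');

// The async call is what actually exposes the dead connection.
withTimeout(zombie.newPage(), 100, 'tab create')
  .catch((err) => console.log(err.message));
```

The probe reports one page without throwing, while the newPage() call only fails once the timeout fires, which matches the "tab create timed out" entry in the logs.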

The project already tracks healthState.lastSuccessfulNav (line 633) and runs an active health probe (lines 5949-5980) at 60s intervals. Between probes, however, a zombie context can go undetected.

Fix

Add a staleness check after the existing pages() probe: if there has been no successful navigation in the last 120s, close the session so the context is recreated proactively:

if (session && Date.now() - healthState.lastSuccessfulNav > 120_000) {
  log('warn', 'session context possibly stale, recreating', {
    userId: key,
    lastNavAgeMs: Date.now() - healthState.lastSuccessfulNav,
  });
  await closeSession(key, session, { reason: 'stale_context', ... });
  session = null;
}
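
The staleness rule can also be isolated as a pure function so the 120s boundary is easy to test. This is a minimal sketch, not the code in the PR: isContextStale, dropIfStale, STALE_AFTER_MS and the injected closeFn are hypothetical names, and the clock is passed in so the check is deterministic.

```javascript
// Threshold from the PR description: 120s without a successful navigation.
const STALE_AFTER_MS = 120_000;

// True when the last successful navigation is strictly older than the threshold.
function isContextStale(lastSuccessfulNav, now = Date.now(), staleAfterMs = STALE_AFTER_MS) {
  return now - lastSuccessfulNav > staleAfterMs;
}

// Hypothetical wrapper: closes a stale session via the injected closeFn and
// returns null so the caller creates a fresh one; otherwise returns the session.
async function dropIfStale(session, healthState, closeFn, now = Date.now()) {
  if (session && isContextStale(healthState.lastSuccessfulNav, now)) {
    await closeFn(session, { reason: 'stale_context' });
    return null;
  }
  return session;
}

// Usage with a fixed clock and a fake close function.
const fakeSession = { id: 'demo' };
const closed = [];
dropIfStale(fakeSession, { lastSuccessfulNav: 0 }, async (s) => closed.push(s.id), 120_001)
  .then((result) => console.log(result, closed)); // → null [ 'demo' ]
```

Injecting the clock and the close function keeps the check independent of the rest of server.js. Note that the comparison is strict, so a navigation exactly 120s ago does not count as stale.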

Scope

  • +11 / -0 lines, single file: server.js
  • No new dependencies
  • Leverages existing healthState.lastSuccessfulNav timestamp
  • Backward compatible

Closes #5415

Development

Successfully merging this pull request may close these issues.

[Bug]: getSession() sync pages() check misses zombie CDP connections, causing 30s tab create timeouts
