fix(monitor): treat UND_ERR_HEADERS_TIMEOUT as soft failure in getUpdates long-poll (#75) #187

Open
WilShi wants to merge 1 commit into Tencent:main from WilShi:fix/monitor-und-headers-timeout-soft-failure

@WilShi WilShi commented Jun 4, 2026

Fixes #75

Summary

The getUpdates long-poll intermittently enters a 30s backoff after UND_ERR_HEADERS_TIMEOUT, making the Weixin channel appear unavailable. This patch treats UND_ERR_HEADERS_TIMEOUT as a soft failure in the monitor catch block.

Root Cause

Node.js undici throws HeadersTimeoutError (UND_ERR_HEADERS_TIMEOUT) when the server holds a long-poll connection open waiting for messages and none arrive within the timeout window. This is a normal long-poll boundary event, not a network fault.

In src/monitor/monitor.ts, the catch block treats all errors uniformly: any exception increments consecutiveFailures. After 3 consecutive failures the monitor backs off for 30 seconds, during which no messages are received. If network conditions are unstable, this backoff cycle repeats, making the Weixin channel effectively unavailable.
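
The uniform handling described above can be sketched roughly as follows. This is an illustration only, not the code in src/monitor/monitor.ts. The function name `recordFailure`, the constant names `MAX_CONSECUTIVE_FAILURES` and `BACKOFF_DELAY_MS`, the 2s delay below the threshold, and the counter reset after a backoff are all assumptions. Only the threshold of 3 and the 30s backoff come from the description.

```typescript
// Assumed names and structure; only the values 3 and 30s come from the PR text.
const MAX_CONSECUTIVE_FAILURES = 3;
const BACKOFF_DELAY_MS = 30_000;
const RETRY_DELAY_MS = 2_000; // assumed to apply below the threshold

interface FailureOutcome {
  consecutiveFailures: number; // counter value to carry into the next iteration
  delayMs: number; // how long the loop sleeps before polling again
}

// Every error, whatever its type, goes through this single path.
function recordFailure(consecutiveFailures: number): FailureOutcome {
  const next = consecutiveFailures + 1;
  if (next >= MAX_CONSECUTIVE_FAILURES) {
    // Threshold reached: long backoff, then start counting again (assumed reset).
    return { consecutiveFailures: 0, delayMs: BACKOFF_DELAY_MS };
  }
  return { consecutiveFailures: next, delayMs: RETRY_DELAY_MS };
}
```

Under these assumptions, three idle long-polls in a row are enough to trigger the 30s pause, even though no request actually failed.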

This was previously reported in openclaw/openclaw#67564 and redirected to this repository as issue #75.

Fix

src/monitor/monitor.ts: Added a check in the catch block (after the abortSignal.aborted guard) that detects UND_ERR_HEADERS_TIMEOUT via err.cause.code. When matched, the error is logged at debug level and the loop retries after RETRY_DELAY_MS (2s) without incrementing the failure counter. All other errors continue through the existing hard-failure path unchanged.
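
A minimal sketch of the detection step, assuming the error arrives wrapped so that the undici code sits on `err.cause.code` as described above. The helper names `isHeadersTimeout` and `classifyPollError` are hypothetical; the actual patch may inline the check directly in the catch block.

```typescript
// Hypothetical helpers; only the code string and the err.cause.code lookup
// are taken from the PR description.
const HEADERS_TIMEOUT_CODE = "UND_ERR_HEADERS_TIMEOUT";

// Returns true only when err.cause.code matches the headers-timeout code.
function isHeadersTimeout(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const cause = (err as { cause?: unknown }).cause;
  if (typeof cause !== "object" || cause === null) return false;
  return (cause as { code?: unknown }).code === HEADERS_TIMEOUT_CODE;
}

type FailureKind = "soft" | "hard";

// Soft: log at debug level and retry without touching the failure counter.
// Hard: fall through to the existing counter-and-backoff path.
function classifyPollError(err: unknown): FailureKind {
  return isHeadersTimeout(err) ? "soft" : "hard";
}
```

Taking `unknown` and narrowing step by step keeps the check safe for thrown values that are not Error objects or that carry no cause, all of which stay on the hard-failure path.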

| Error type | Behavior after fix |
| --- | --- |
| UND_ERR_HEADERS_TIMEOUT | Soft failure (debug log + 2s retry, no backoff) |
| UND_ERR_CONNECT_TIMEOUT | Hard failure (unchanged) |
| UND_ERR_SOCKET | Hard failure (unchanged) |
| ENOTFOUND / ECONNREFUSED / EPROTO | Hard failure (unchanged) |

Test Plan

npm run typecheck   # ✓ passed (pre-existing type declaration warnings only)
npm run build       # ✓ passed
npx vitest run      # 371/372 passed (1 pre-existing failure in auth/pairing.test.ts)

…ates long-poll (Tencent#75)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

[Bug] getUpdates long-poll UND_ERR_HEADERS_TIMEOUT treated as hard failure, triggers 30s backoff loop
