
perf: cache comment router issue comments#50

Open
stainlu wants to merge 2 commits into openclaw:main from stainlu:stainlu/perf-comment-router-comment-cache

Conversation

Contributor

@stainlu commented May 6, 2026

Summary

  • add a reusable cached issue-comments lookup with copied return arrays
  • route comment-router targeted discovery, prehydration, and replay/status checks through the same per-run cache
  • avoid repeated identical issue comment pagination without changing command classification or mutation behavior

Validation

  • npx --yes -p node@24 -c 'node -v && pnpm run check'


@ds4psb-ai left a comment


Independent code review

Posted on behalf of @ds4psb-ai (read-only contributor). Maintainer judgment owns merge.

Summary verdict

LGTM after addressing one P1 — the cache itself is well-shaped and the test is good, but the refactor inadvertently downgrades prehydrateCommandLookups from concurrent async fetches to serialized sync subprocess spawns.

Findings

P0 / blocker

  • (none)

P1 / should fix before merge

  • src/repair/comment-router.ts:131-136 + src/repair/comment-router.ts:307-312 — the new cachedIssueComments is wired through sync ghPaged (line 132 in this PR). Pre-PR, prehydrateCommandLookups called fetchIssueCommentsAsync(number) which used ghPagedAsync (ghPagedWithRetryAsync) and so achieved real parallelism under mapLimit(issueNumbers, lookupConcurrency, ...). Post-PR, the mapper body cachedIssueComments(number) contains no await, so all N mapLimit workers serialize their gh subprocess spawns through the event loop. For a repair run with many open issue comments, this is a measurable wall-clock regression that runs counter to the perf goal of the PR.
    • Suggested fix: keep createCachedIssueCommentsLookup for the sync hot-path consumers (hasExistingResponse, hasExistingModeStatusResponse, issueCommentsFor), but introduce a parallel async-aware variant — e.g. createCachedIssueCommentsLookupAsync that takes fetchComments: (n) => Promise<T[]>, populates the same shared map, and is what prehydrateCommandLookups calls. The sync getter then opportunistically returns the warm copy for downstream readers. Alternatively, accept a dual fetcher { sync, async } in one factory.
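A minimal sketch of the suggested async-aware variant (every name here, including createCachedIssueCommentsLookupAsync and lookupSync, is illustrative rather than an actual export of this PR). It shares one result map with a sync getter and deduplicates in-flight fetches, so concurrent lookups for the same key spawn a single underlying fetch:

```typescript
// Illustrative async-aware cached lookup; names are assumptions, not PR exports.
type AsyncCachedLookup<T> = {
  lookupAsync: (key: number) => Promise<T[]>;
  lookupSync: (key: number) => T[];
};

function createCachedIssueCommentsLookupAsync<T>(
  fetchComments: (key: number) => Promise<T[]>,
): AsyncCachedLookup<T> {
  const cache = new Map<number, T[]>();
  const inFlight = new Map<number, Promise<T[]>>();

  const lookupAsync = async (key: number): Promise<T[]> => {
    if (!Number.isInteger(key) || key <= 0) return []; // invalid keys never fetch
    const cached = cache.get(key);
    if (cached) return [...cached]; // copied return array, same as the sync path
    const pending = inFlight.get(key);
    if (pending) return [...(await pending)]; // dedupe concurrent callers
    const promise = fetchComments(key)
      .then((comments) => {
        const safe = Array.isArray(comments) ? comments : [];
        cache.set(key, safe);
        return safe;
      })
      .finally(() => inFlight.delete(key)); // a rejected fetch can be retried
    inFlight.set(key, promise);
    return [...(await promise)];
  };

  // Sync hot-path readers get the warm copy; a cold key yields [].
  const lookupSync = (key: number): T[] => [...(cache.get(key) ?? [])];

  return { lookupAsync, lookupSync };
}
```

Under this shape, prehydrateCommandLookups would await lookupAsync inside its mapLimit mapper (restoring real parallelism) while hasExistingResponse and friends keep calling the sync getter.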

P2 / nice to have / followup

  • src/repair/comment-router-core.ts:117-131 — the cache silently swallows fetchComments returning a non-array (safeComments = Array.isArray(comments) ? comments : []) AND caches the empty result. That hides upstream parser regressions: a future change that breaks the gh JSON shape would be cached as "this issue has no comments" for the rest of the run. Either (a) don't cache non-array fetches, or (b) log the shape mismatch once. Mirrors a pattern createCachedLabelNumberLookup doesn't have (it relies on uniquePositiveIntegers to pre-validate).
  • src/repair/comment-router-core.ts:121-122 — when Number.isInteger(key) && key > 0 is false you return [] without calling fetchComments. Good — but worth a one-line code comment so a maintainer doesn't "fix" the early return later thinking it skips a legitimate fetch.
  • CHANGELOG.md:40-41 — the entry is meaningful (per-run cache scope is observable behavior for anyone reading the bot's runtime budget), so the +2 lines are warranted.
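On the non-array caching point, options (a) and (b) can be combined in a few lines; this is a hypothetical stand-in for the real factory (its name, signature, and warn message are all assumptions):

```typescript
// Illustrative cache body: log a shape mismatch once and skip caching it.
function createGuardedCommentsLookup<T>(fetchComments: (key: number) => T[]) {
  const cache = new Map<number, T[]>();
  let warnedOnce = false;
  return (key: number): T[] => {
    // Invalid keys intentionally short-circuit; they can never have comments.
    if (!Number.isInteger(key) || key <= 0) return [];
    const cached = cache.get(key);
    if (cached) return [...cached];
    const comments = fetchComments(key);
    if (!Array.isArray(comments)) {
      if (!warnedOnce) {
        warnedOnce = true;
        console.warn("issue-comments fetch returned a non-array; not caching");
      }
      return []; // option (a): a parser regression is retried, not cached
    }
    cache.set(key, comments);
    return [...comments]; // copied return array
  };
}
```

The key property is that a transient upstream shape break costs one warning and a re-fetch, instead of silently pinning "no comments" for the rest of the run.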

Test coverage

test/repair/comment-router-core.test.ts:209-225 covers:

  • cold fetch + cached miss → fetch invoked once for key 12 (calls === [12, 13])
  • copied-array semantics (mutating the returned array doesn't pollute the next read)
  • string-vs-number key normalization (lookup("12") returns the cached entry)
  • invalid key short-circuit (lookup(0))

Solid for the core. Gaps worth one followup test (P2):

  • Concurrent calls for the same key — important now that prehydrate is the primary cache populator. Two lookup(12) issued before the first resolves should NOT result in two underlying fetches. The current sync implementation makes this safe by accident; an async variant (per the P1 above) would need an in-flight Map<number, Promise<T[]>> to remain race-free.
  • Empty-array fetch result is cached and returns a fresh [] on each subsequent call (sanity for the spread of an empty array; trivial but locks behavior).
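The empty-array gap could be locked with a test along these lines; the inline factory below is a stand-in mirroring the PR's described behavior (per-run Map, copied return arrays, invalid-key short-circuit), and the real comment-router-core export may differ:

```typescript
// Stand-in for the real cached lookup factory; behavior mirrors the PR's
// description, not its actual implementation.
function createCachedCommentsLookup<T>(fetchComments: (key: number) => T[]) {
  const cache = new Map<number, T[]>();
  return (key: number): T[] => {
    if (!Number.isInteger(key) || key <= 0) return [];
    const cached = cache.get(key);
    if (cached) return [...cached];
    const comments = fetchComments(key);
    cache.set(key, comments);
    return [...comments];
  };
}

// The followup test: an empty fetch result is cached, and every read hands
// back a fresh array rather than the cached instance.
const calls: number[] = [];
const lookup = createCachedCommentsLookup<string>((n) => {
  calls.push(n);
  return [];
});
const first = lookup(12);
const second = lookup(12);
console.assert(calls.length === 1, "empty result should be cached");
first.push("mutated");
console.assert(lookup(12).length === 0, "reads must return fresh copies");
console.assert(second.length === 0);
```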

Risks I considered and dismissed

  • Per-run lifetime correctness: cachedIssueComments is module-scoped, same as the old issueCommentsCache. The router process is per-run under the AGENTS.md operating model, so the cache scope is unchanged.
  • Stale comments mid-run: pre-PR behavior was already "first fetch wins" via Map.get(...) ?? ghPaged(...); post-PR is the same.
  • Loss of hasExistingResponse / hasExistingModeStatusResponse correctness: both still see the same comments as before because prehydrateCommandLookups populates the map (sequentially now, but populated nonetheless) before classifyCommand runs.
  • Removed import ghPagedWithRetryAsync as ghPagedAsync: searched the file — it was only used by fetchIssueCommentsAsync, which the PR also removes. Clean removal.

Nice consolidation pattern overall — once the prehydrate parallelism is restored this is a strict win.

