perf: bound repair plan PR hydration #49

Merged
steipete merged 2 commits into openclaw:main from stainlu:stainlu/perf-bounded-pr-plan-hydration
May 9, 2026

Conversation

Contributor

@stainlu stainlu commented May 6, 2026

Summary

  • add a limited GitHub REST page helper for bounded list hydration
  • bound repair cluster PR files and commits to the counts carried into generated plans
  • preserve full issue comments, PR reviews, review comments, and checks hydration for safety evidence

Validation

  • npx --yes -p node@24 -c 'node -v && pnpm run check'

@stainlu stainlu force-pushed the stainlu/perf-bounded-pr-plan-hydration branch from ac5a8e5 to 9d1b504 on May 6, 2026 04:29
Contributor

@ds4psb-ai ds4psb-ai left a comment

Independent code review

Posted on behalf of @ds4psb-ai (read-only contributor). Maintainer judgment owns merge.

Summary verdict

LGTM with two P2 notes — the bounding is mechanical (only files & commits go through ghPagedLimit; comments, reviews, review-comments, and checks stay on full pagination), the helper math holds at every boundary I checked, and the new counts struct extension is backward-compatible.

Findings

P0 / blocker

  • (none)

P1 / should fix before merge

  • (none)

P2 / nice to have / followup

  • src/repair/plan-cluster.ts:21-22: Number(process.env.CLAWSWEEPER_MAX_FILES_PER_PR ?? 80) silently coerces a mistyped env var (e.g. CLAWSWEEPER_MAX_FILES_PER_PR=eighty) to NaN, which ghPagedLimit then turns into max=0 → empty files array → silent loss of all PR file hydration for the run. The default of 80 protects production, but operators tweaking these knobs can disable hydration without warning. Suggest either: (a) guard at module load and warn-and-fallback if !Number.isFinite(x) || x <= 0; or (b) document the parsing rule next to the constant. The same applies to MAX_COMMITS_PER_PR, and the existing MAX_LINKED_REFS/MAX_COMMENTS_PER_ITEM/MAX_REVIEW_COMMENTS_PER_PR carry the same risk.
  • test/repair/plan-cluster.test.ts:225-283 — the bounded-vs-full divide test only exercises the 'PR has exactly enough on page 1' codepath. The fake gh script returns limit entries on the first per_page request, and because ghPagedLimit then sees out.length === max after page 1, the loop exits before page 2. The multi-page hydration path (e.g. MAX_FILES_PER_PR=150 with the helper splitting into two pages of 100 + 50) is not covered. A small extension to the test that drives limit > 100 and asserts the helper actually issues two paginated calls would lock that in.

Test coverage

test/repair/github-cli.test.ts:21-37 covers githubLimitedPagePath:

  • happy path (limit=80, page=1)
  • existing query param preservation + override of per_page=100 (good — proves the override: true path)
  • zero/zero edge case (limit=0, page=0 → both clamped to 1)
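
The three cases above can be illustrated with a minimal sketch. This is not the PR's githubLimitedPagePath; limitedPagePath and its signature are hypothetical, and the only behaviors assumed are the ones the tests describe: existing query parameters are preserved, per_page is overridden, and limit and page are clamped to at least 1. The cap of 100 reflects GitHub's documented per_page maximum for REST list endpoints.

```typescript
// Hypothetical stand-in for the helper under review; names and signature are assumptions.
const GITHUB_MAX_PER_PAGE = 100; // GitHub REST list endpoints cap per_page at 100

function limitedPagePath(path: string, limit: number, page: number): string {
  // Split off any query string the caller already supplied.
  const [base, query = ""] = path.split("?", 2);
  const params = new URLSearchParams(query);

  // Clamp both values to >= 1; NaN and 0 fall through to 1 via `|| 1`.
  const perPage = Math.min(GITHUB_MAX_PER_PAGE, Math.max(1, Math.floor(limit) || 1));
  const pageNumber = Math.max(1, Math.floor(page) || 1);

  // `set` replaces a caller-supplied per_page in place instead of appending a duplicate.
  params.set("per_page", String(perPage));
  params.set("page", String(pageNumber));
  return `${base}?${params.toString()}`;
}

// Happy path: limit=80, page=1.
console.log(limitedPagePath("repos/o/r/pulls/1/files", 80, 1));
// → repos/o/r/pulls/1/files?per_page=80&page=1

// Caller-supplied per_page is overridden; other params survive.
console.log(limitedPagePath("repos/o/r/pulls/1/files?state=all&per_page=50", 100, 2));
// → repos/o/r/pulls/1/files?state=all&per_page=100&page=2

// Zero/zero edge case: both clamped to 1.
console.log(limitedPagePath("repos/o/r/pulls/1/files", 0, 0));
// → repos/o/r/pulls/1/files?per_page=1&page=1
```

Making the helper the single place that sets per_page means a caller cannot accidentally widen the page size by passing its own value.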

test/repair/plan-cluster.test.ts:225-283 exercises the end-to-end happy path: 120-file/120-commit PR returns hydrated=80, truncated=40 for both. Good shape.

Gaps (P2):

  • Multi-page ghPagedLimit path (per the second P2 above).
  • changed_files returned by GitHub API as null/missing — countValue falls back to files.length, which is good, but no test pins this fallback.
  • commits_truncated and files_truncated survive the round-trip through compactPlanItem (src/repair/lib.ts:380-387). The end-to-end test asserts on cluster-plan.json so this is actually exercised, but it'd be worth a one-line direct unit test for compactPlanItem to lock the new keys.
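
As a rough illustration of the count arithmetic these gaps refer to (120 reported, 80 hydrated, 40 truncated, with a fallback to the hydrated length when the API reports nothing), here is a hedged sketch. hydrationCounts and the HydrationCounts field names are hypothetical and are not taken from the PR; only the fallback rule and the 120/80/40 figures come from the review.

```typescript
// Hypothetical shape; the PR's actual counts struct and countValue helper may differ.
interface HydrationCounts {
  total: number;     // what GitHub reports (e.g. changed_files), or the fallback
  hydrated: number;  // how many entries were actually fetched
  truncated: number; // entries known to exist but not fetched
}

function hydrationCounts(
  reported: number | null | undefined,
  hydratedItems: readonly unknown[],
): HydrationCounts {
  const hydrated = hydratedItems.length;
  // Fall back to the hydrated length when the API value is null, missing, or not finite.
  const total =
    typeof reported === "number" && Number.isFinite(reported) ? reported : hydrated;
  // Never report a negative truncation if the API undercounts.
  return { total, hydrated, truncated: Math.max(0, total - hydrated) };
}

const eightyFiles = new Array(80).fill("file");
console.log(hydrationCounts(120, eightyFiles));  // total 120, hydrated 80, truncated 40
console.log(hydrationCounts(null, eightyFiles)); // total 80, hydrated 80, truncated 0
```

A unit test that pins the null case would cover the fallback gap listed above.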

Risks I considered and dismissed

  • Mechanical preservation claim: grep -rn 'ghPagedLimit' src/ test/ confirms the only consumers are the two lines for files & commits. Issue comments, PR reviews, review comments, and checks all still go through ghPaged / ghPrChecks (src/repair/plan-cluster.ts:189, 197, 198, 201). Future contributors would have to actively replace ghPaged to bound those — not opt-in by accident.
  • ghPagedLimit boundary at max == perPage: outer loop out.length < max exits before issuing page 2 — no over-fetch.
  • ghPagedLimit slice safety at max=120, perPage=100: page 1 returns 100, out.length=100<120, page 2 fetches up to 100, slice trims to 120. Bounded both above and below.
  • githubLimitedPagePath overriding per_page from caller-passed query: this is the intended behavior because the helper is the SSoT for the page size, and the existing ?per_page=50 test case on line 31 of test/repair/github-cli.test.ts proves the override even when the caller passed a different value. Good.
  • CHANGELOG entry meaningfulness: the +2 lines are observable (anyone reviewing a generated cluster plan will now see files_truncated / commits_truncated non-zero on large PRs and correlate to this bound). Warranted.
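
The boundary reasoning above can be checked against a small model of the loop. This is a sketch under stated assumptions, not the PR's ghPagedLimit: pagedLimit, FetchPage, and fakeSource are hypothetical names, the fetcher is synchronous, and the page size is assumed to be fixed at min(max, 100) for every page so that page offsets stay aligned.

```typescript
// Hypothetical model of a bounded pager; the real ghPagedLimit may differ.
type FetchPage<T> = (page: number, perPage: number) => T[];

function pagedLimit<T>(fetchPage: FetchPage<T>, max: number): T[] {
  const out: T[] = [];
  if (!(max > 0)) return out; // 0, negative, or NaN: fetch nothing
  const perPage = Math.min(100, Math.floor(max)); // same size on every page
  for (let page = 1; out.length < max; page += 1) {
    const items = fetchPage(page, perPage);
    out.push(...items);
    if (items.length < perPage) break; // short page: upstream has no more data
  }
  return out.slice(0, max); // trim any over-fetch from the last page
}

// Fake upstream with `total` items that records which pages were requested.
function fakeSource(total: number) {
  const calls: number[] = [];
  const fetchPage: FetchPage<number> = (page, perPage) => {
    calls.push(page);
    const start = (page - 1) * perPage;
    const length = Math.max(0, Math.min(perPage, total - start));
    return Array.from({ length }, (_, i) => start + i);
  };
  return { calls, fetchPage };
}

const exact = fakeSource(300);
console.log(pagedLimit(exact.fetchPage, 100).length, exact.calls); // 100 [ 1 ]

const split = fakeSource(300);
console.log(pagedLimit(split.fetchPage, 120).length, split.calls); // 120 [ 1, 2 ]
```

With max equal to the page size the loop condition fails after page 1, so no second request is issued; with max=120 the second page is fetched and the slice trims the result.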

Clean perf bound. The 'preserve full hydration for safety evidence' framing in the PR description is accurate at the codepath level.

Contributor Author

stainlu commented May 8, 2026

Thanks, addressed in 0f2e9aa.

What changed:

  • parsed all plan-cluster numeric limits through one readNonNegativeIntegerEnv helper, so malformed/negative/non-integer env values fall back to the default instead of collapsing hydration to zero
  • kept explicit 0 as the opt-out behavior for bounded hydration
  • extended the plan-cluster fake gh fixture to simulate a 150-file/commit PR across two GitHub pages, then asserted both file and commit hydration fetch page 1 + page 2
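
The parsing rule described in the first two bullets might look roughly like the sketch below. The actual readNonNegativeIntegerEnv is not shown in this conversation, so parseNonNegativeIntegerEnv, its signature, and the choice to pass the environment as a parameter are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for readNonNegativeIntegerEnv; signature is an assumption.
function parseNonNegativeIntegerEnv(
  env: Record<string, string | undefined>,
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  // Unset or blank: use the default (Number("") would otherwise yield 0).
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  // Malformed (NaN), negative, and non-integer values all fall back to the default.
  return Number.isInteger(value) && value >= 0 ? value : fallback;
}

const name = "CLAWSWEEPER_MAX_FILES_PER_PR";
console.log(parseNonNegativeIntegerEnv({ [name]: "eighty" }, name, 80)); // 80
console.log(parseNonNegativeIntegerEnv({ [name]: "-5" }, name, 80));     // 80
console.log(parseNonNegativeIntegerEnv({ [name]: "1.5" }, name, 80));    // 80
console.log(parseNonNegativeIntegerEnv({ [name]: "150" }, name, 80));    // 150
console.log(parseNonNegativeIntegerEnv({ [name]: "0" }, name, 80));      // 0 (explicit value kept)
console.log(parseNonNegativeIntegerEnv({}, name, 80));                   // 80
```

Routing every limit through one parser means a typo yields the default rather than NaN, while an explicit 0 still passes through unchanged.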

Validation:

  • pnpm run build:repair
  • pnpm run lint:repair
  • pnpm run lint:scripts
  • pnpm run format:check
  • git diff --check
  • node --test test/repair/plan-cluster.test.ts

I also tried pnpm run test:repair; under my local Node 22 shell (package wants >=24) it reaches 243 passing tests including the new plan-cluster coverage, then exits on the existing notifier cancelledByParent pending-promise cases.

Contributor

@steipete steipete left a comment

Reviewed bounded repair plan hydration fix; CI green and targeted local repair tests passed.

@steipete steipete merged commit fc474e2 into openclaw:main May 9, 2026
5 checks passed
3 participants