fix(pr-linking): require word boundary before closing keywords (#151) #152

Closed
dale053 wants to merge 6 commits into MkDev11:main from dale053:fix/issue-151-pr-link-word-boundary

Conversation

Contributor

@dale053 dale053 commented May 25, 2026

Summary

Fix PR→issue closing-keyword linking to require a word boundary before the keyword, and add an authoritative cleanup path for the bad links the old regex already created.

  • Anchor the keyword regex (src/lib/pr-linking.ts): the close/fix/resolve alternation had no leading word boundary, so it matched the keyword as the tail of a larger word. bugfix #42, hotfix #1234, prefix #7, unresolved #5, and discloses #3 all became false "closing" links, although GitHub does not treat any of them as closing references. Added a non-consuming (?<=^|[^\w]) lookbehind; a consuming prefix would eat the separator and break adjacent matches under matchAll.
  • Reconciling backfill (src/lib/refresh.ts): backfillClosingIssuesForRepo now treats GitHub's closingIssuesReferences as the source of truth and delete-then-inserts each PR's rows, so it can finally purge the stale/false bugfix #N-class links the old append-only path could never remove. Untouched PRs do no writes (avoids WAL churn). The per-poll path stays additive (never deletes) so GraphQL-only links survive between reconciles.
  • Re-sweep guard (src/lib/db.ts, src/lib/refresh.ts): added a closing_issues_reconcile_version column + CLOSING_RECONCILE_VERSION = 1, so already-backfilled repos re-run exactly once when reconcile semantics change.
  • Safer batch handling (src/lib/github.ts): fetchPrsClosingIssuesBatch now omits PRs with null/absent nodes from the returned map (deleted PR / partial-data error) instead of asserting "closes nothing", so a failing batch never causes a delete.
  • Test setup (package.json, tsconfig.json): added a test script (node --test) and allowImportingTsExtensions to support the new unit tests.
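
To illustrate the regex bullet above, here is a minimal sketch. Both patterns are simplified stand-ins written for this example, not the project's actual LINK_REGEX, and `closingIssueNumbers` is a hypothetical helper. It assumes a runtime with lookbehind and `matchAll` support (ES2018/ES2020).

```typescript
// Simplified stand-in patterns (assumption: same-repo "#N" references only).
const UNANCHORED = /(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s*:?\s*#(\d+)/gi;
// Same alternation, preceded by a non-consuming lookbehind: the keyword must
// start the string or follow a non-word character.
const ANCHORED = /(?<=^|[^\w])(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s*:?\s*#(\d+)/gi;

// Hypothetical helper: collect the issue numbers a pattern treats as "closing".
function closingIssueNumbers(pattern: RegExp, text: string): number[] {
  return [...text.matchAll(pattern)].map((m) => Number(m[1]));
}

const sample = "bugfix #42, hotfix #1234, prefix #7. Fixes #151, closes #9";

console.log(closingIssueNumbers(UNANCHORED, sample)); // [42, 1234, 7, 151, 9]
console.log(closingIssueNumbers(ANCHORED, sample));   // [151, 9]
```

Because the lookbehind consumes no characters, the separator before each keyword is never part of a match.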

Related Issues

Fixes #151.

Type of Change

  • Bug fix
  • New feature
  • Enhancement
  • Refactor
  • Documentation
  • Other (describe):

Testing

  • pnpm build passes
  • Manual browser smoke test (for UI changes) — N/A, no UI changes
  • N/A — docs / config only
  • pnpm test — new regression suite in src/lib/pr-linking.test.ts covers the bugfix #N-class over-matches plus standalone keyword/tense, punctuation, cross-repo, and de-dup cases

Checklist

  • Self-reviewed the diff
  • Follows existing code patterns and naming
  • No unrelated changes included
  • Documentation updated if behavior changed

Summary by CodeRabbit

  • New Features

    • Switched PR→issue backfill to a reconciliation-based process with a reconcile-version marker tracked per repository and added a repository metadata field to record it.
    • Backfill now skips unchanged PRs and reports added/removed links.
  • Bug Fixes

    • Fixed false positives for closing keywords embedded in larger words.
    • Omit missing/null PR nodes from GitHub responses rather than treating them as empty results.
  • Tests

    • Added automated tests for closing-keyword detection and a package script to run them.
  • Chores

    • Enabled TypeScript option to allow extension-suffixed imports.
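
The "omit missing/null PR nodes" item can be sketched roughly as follows. The `PrNode` shape and the `collectClosingIssues` name are assumptions made for this example; the real types in src/lib/github.ts are not shown on this page.

```typescript
// Hypothetical response shape for one aliased PR in a batched GraphQL query.
interface PrNode {
  number: number;
  closingIssuesReferences: { nodes: Array<{ number: number }> } | null;
}

// Build a PR-number -> closing-issue-numbers map, leaving out any PR whose
// data is missing. A missing entry means "unknown", not "closes nothing".
function collectClosingIssues(
  nodesByAlias: Record<string, PrNode | null | undefined>,
): Map<number, number[]> {
  const result = new Map<number, number[]>();
  for (const node of Object.values(nodesByAlias)) {
    if (!node || !node.closingIssuesReferences) continue; // omit, do not record []
    result.set(node.number, node.closingIssuesReferences.nodes.map((i) => i.number));
  }
  return result;
}
```

A caller that only reconciles PRs present in the map would then never delete rows on the strength of a partial response.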



coderabbitai Bot commented May 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c64aba81-a8d9-4aa3-a276-f890d27119a8

📥 Commits

Reviewing files that changed from the base of the PR and between ca410b4 and 79fd65f.

📒 Files selected for processing (1)
  • package.json

Walkthrough

Enforces correct closing-keyword boundaries with a regex fix and tests, migrates the DB to track reconcile versioning, changes the closing-issues backfill to per-PR reconciliation (delete+insert) with sweep reprocessing, and adds a Node test script.

Changes

PR→Issue Link Reconciliation

| Layer | File(s) | Summary |
|---|---|---|
| Regex and GraphQL boundary fixes | src/lib/pr-linking.ts, src/lib/github.ts | LINK_REGEX now uses a non-consuming lookbehind `(?<=^`… [cell truncated] |
| Test suite for closing-link extraction | src/lib/pr-linking.test.ts | Adds Node tests covering regression corpus (#151), tense variants, punctuation/parentheses/newline boundaries, non-closing #n references, cross-repo and full-URL preservation, same-repo defaulting, title+body scanning, and deduplication. |
| Database schema migration for reconciliation tracking | src/lib/db.ts | Adds conditional migration to create repo_meta.closing_issues_reconcile_version (INTEGER NOT NULL DEFAULT 0) during DB initialization. |
| Delete-then-insert reconciliation for pr_issue_links | src/lib/refresh.ts | backfillClosingIssuesForRepo computes existing vs desired per-PR, skips unchanged PRs, deletes stale rows and inserts reconciled rows, tracks new_links and removed_links, and exports CLOSING_RECONCILE_VERSION = 1; return type updated to include removed_links. |
| Background sweep version-based repo selection and aggregation | src/lib/refresh.ts | runClosingBackfillSweep re-processes repos when closing_issues_reconcile_version is older than CLOSING_RECONCILE_VERSION or closing_issues_backfilled_at is null, and aggregates removed_links in logs. |
| Build configuration for test runner | package.json, tsconfig.json | Adds scripts.test running node --test 'src/**/*.test.ts' and enables compilerOptions.allowImportingTsExtensions in tsconfig.json. |
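
The per-PR "existing vs desired" comparison described in the table could look something like the sketch below. `planReconcile` and `ReconcilePlan` are hypothetical names, and a set difference is only one way to do it; the PR itself describes deleting and re-inserting a PR's rows when they differ.

```typescript
// Hypothetical plan for one PR: which linked issue numbers to remove and add.
interface ReconcilePlan {
  toDelete: number[];
  toInsert: number[];
  unchanged: boolean;
}

// Compare the rows already stored for a PR with the issue numbers the
// authoritative source reports, and derive the minimal set of writes.
function planReconcile(existing: number[], desired: number[]): ReconcilePlan {
  const have = new Set(existing);
  const want = new Set(desired);
  const toDelete = [...have].filter((n) => !want.has(n));
  const toInsert = [...want].filter((n) => !have.has(n));
  return {
    toDelete,
    toInsert,
    // When nothing differs the caller can skip the PR and perform no writes.
    unchanged: toDelete.length === 0 && toInsert.length === 0,
  };
}

// A stale "bugfix #42"-style link is removed while the real link is kept.
console.log(planReconcile([42, 151], [151]));
// { toDelete: [42], toInsert: [], unchanged: false }
```

The lengths of `toInsert` and `toDelete` correspond naturally to counters such as new_links and removed_links.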

Sequence Diagram

sequenceDiagram
  participant BackfillSweep
  participant GitHubGraphQL
  participant BackfillWorker
  participant Database
  BackfillSweep->>BackfillWorker: select repos needing reconcile
  BackfillWorker->>GitHubGraphQL: fetch PRs' closingIssuesReferences batch
  GitHubGraphQL-->>BackfillWorker: PR nodes (some may be null)
  BackfillWorker->>BackfillWorker: compute existing vs want per-PR
  BackfillWorker->>Database: delete stale pr_issue_links for PR (if mismatch)
  BackfillWorker->>Database: insert reconciled pr_issue_links (INSERT OR IGNORE)
  BackfillWorker->>Database: update repo_meta with closing_issues_reconcile_version and backfilled_at
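
The first step of the diagram, selecting repos that need a reconcile, can be sketched as a simple predicate. The column names and the constant come from the PR description; the `RepoMeta` row shape, its `repo` field, and `needsReconcile` are assumptions for illustration.

```typescript
// Value stated in the PR description.
const CLOSING_RECONCILE_VERSION = 1;

// Hypothetical in-memory view of a repo_meta row.
interface RepoMeta {
  repo: string;
  closing_issues_backfilled_at: string | null;
  closing_issues_reconcile_version: number; // the migration defaults this to 0
}

// A repo is swept if it was never backfilled, or was backfilled under older
// reconcile semantics than the current version.
function needsReconcile(meta: RepoMeta): boolean {
  return (
    meta.closing_issues_backfilled_at === null ||
    meta.closing_issues_reconcile_version < CLOSING_RECONCILE_VERSION
  );
}

const repos: RepoMeta[] = [
  { repo: "a/never-backfilled", closing_issues_backfilled_at: null, closing_issues_reconcile_version: 0 },
  { repo: "b/old-semantics", closing_issues_backfilled_at: "2026-05-01", closing_issues_reconcile_version: 0 },
  { repo: "c/up-to-date", closing_issues_backfilled_at: "2026-05-25", closing_issues_reconcile_version: 1 },
];

console.log(repos.filter(needsReconcile).map((r) => r.repo));
// ["a/never-backfilled", "b/old-semantics"]
```

Once a repo's stored version equals the constant it drops out of the selection, which is what gives the run-once behavior after a version bump.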

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐇 I nibbled a regex that matched mid-word,

Now it waits for boundaries, tidy and stirred.
I sweep, delete, and replant links anew—
Tests guard the burrow; the reconcile grew.
Hooray — no more phantom issues to chew!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Linked Issues check | ❓ Inconclusive | The PR fully addresses both linked issues: #151 is resolved by anchoring the regex, adding reconciliation, and including regression tests; #42 is out of scope (no dependency updates present in this PR). | Issue #42 (dependency bumps) is listed but no package.json dependency versions were actually changed; verify whether #42 should remain linked or be separated into a different PR. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title directly reflects the main fix: anchoring the closing-keyword regex with a word boundary to prevent false matches inside larger words, which is the primary correction in the changeset. |
| Out of Scope Changes check | ✅ Passed | All changes directly support issue #151 (regex fix, reconciliation backfill, column tracking, safer batch handling, regression tests, and config). No unrelated changes detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@package.json`:
- Around line 9-11: The package.json scripts block contains a duplicate "lint"
entry and a missing comma after the "test" script which breaks JSON parsing;
remove the duplicated "lint" key so only the original "lint": "next lint"
remains, ensure the "test": "node --test 'src/**/*.test.ts'" entry has a
trailing comma separating it from the next property, and keep the existing
"lint" value (do not replace it with "eslint . --max-warnings=0"); update the
scripts object to contain unique keys ("lint" and "test") with proper comma
separators.

In `@src/lib/pr-linking.ts`:
- Line 11: Remove the stray regex literal
"/\b(?:close[sd]?|fix(?:e[sd])?|resolve[sd]?)\s*:?\s*(?:(?:https?:\/\/github\.com\/)?([\w.-]+\/[\w.-]+))?#(\d+)/gi"
from the codebase (it appears to be a leftover) or, if it was intended as
documentation, replace it with a clear comment explaining its purpose and
reference the actual variable/function that holds the active PR-parsing regex
(e.g., the constant or function that previously used this pattern), then re-run
tests and ensure the committed file no longer contains the orphaned regex
literal.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: b9b493cd-66b5-4f0d-b844-abe9dea3ce9e

📥 Commits

Reviewing files that changed from the base of the PR and between 1d0b23d and 50e06a9.

📒 Files selected for processing (7)
  • package.json
  • src/lib/db.ts
  • src/lib/github.ts
  • src/lib/pr-linking.test.ts
  • src/lib/pr-linking.ts
  • src/lib/refresh.ts
  • tsconfig.json

Comment thread package.json Outdated
Comment thread src/lib/pr-linking.ts Outdated
Owner

MkDev11 commented May 26, 2026

Thanks for the PR. I’m going to close this one as not applicable for #151.

The concrete regex bug from #151 is already fixed on main by #138: LINK_REGEX now has a leading word boundary, and src/lib/db.ts already includes a one-shot purge for the old false-positive pr_issue_links rows. So Fixes #151 is no longer accurate.

This PR also introduces a broader delete-then-insert reconciliation path, but that is risky as written: closingIssuesReferences(first: 20) is not paginated, so valid links after the first 20 could be deleted, and null/missing GraphQL fields can be treated as empty results during reconciliation. There is also an accidental package-lock.json in a pnpm repo.

The parser regression tests are useful, but the PR no longer has a valid target issue and the extra reconciliation behavior would need a separate, current issue with pagination/partial-response handling covered.
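
For reference, the pagination gap mentioned above is usually closed by following the connection's cursor until no pages remain. The sketch below is generic: `fetchPage` is a hypothetical callback standing in for whatever issues the GraphQL request, and only the `pageInfo { hasNextPage, endCursor }` shape is taken from GitHub's connection model.

```typescript
// One page of a GraphQL connection.
interface Page<T> {
  nodes: T[];
  pageInfo: { hasNextPage: boolean; endCursor: string | null };
}

// Hypothetical callback: fetch the page that starts after the given cursor.
type FetchPage<T> = (after: string | null) => Promise<Page<T>>;

// Follow cursors until the connection is exhausted, so a reconcile step sees
// every node rather than only the first page.
async function fetchAllNodes<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const all: T[] = [];
  let cursor: string | null = null;
  do {
    const page: Page<T> = await fetchPage(cursor);
    all.push(...page.nodes);
    cursor = page.pageInfo.hasNextPage ? page.pageInfo.endCursor : null;
  } while (cursor !== null);
  return all;
}
```

If any page request throws, the promise rejects and the caller gets no partial list, so an incomplete fetch cannot be mistaken for the full desired set.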

@MkDev11 MkDev11 closed this May 26, 2026

Development

Successfully merging this pull request may close these issues.

[Bug] PR→issue closing-link regex matches keywords *inside larger words* — merged PRs falsely mark unrelated issues as solved, permanently

2 participants