Standardize public API response-shape validation across maintenance scripts by yanyishuai · Pull Request #1164 · ramimbo/mergework

yanyishuai · 2026-06-28T09:50:00Z

Summary

Implements proposed work for #1143 (and shared fetch helper from #1144).

Add scripts/public_api_json.py with shared JSON fetch, list/object shape checks, bounty list loading, and activity validation.
Migrate check_bounty_issue_states.py, check_live_bounty_closing_refs.py, claim_inventory.py, and proposed_work_triage.py to use the shared helper instead of duplicated urllib blocks.
Add tests/test_public_api_json.py covering shape validation helpers.

Test plan

Module imports cleanly
pytest tests/test_public_api_json.py tests/test_check_bounty_issue_states.py tests/test_claim_inventory.py

Closes #1143

Summary by CodeRabbit

New Features
- Added centralized handling for public API data used by multiple scripts.
Bug Fixes
- Improved reliability when loading bounty, activity, and related issue data.
- Added stronger validation and clearer error handling for unexpected or malformed responses.
Tests
- Added coverage for JSON validation, data filtering, and normalized public API state extraction.

coderabbitai · 2026-06-28T09:50:11Z

Review details

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: d6961596-0f3a-4a42-8a8b-da4c33a838ca

📥 Commits

Reviewing files that changed from the base of the PR and between 52c920f and e8ba33d.

📒 Files selected for processing (4)

scripts/check_bounty_issue_states.py
scripts/check_live_bounty_closing_refs.py
scripts/gh_cli.py
scripts/public_api_json.py

📝 Walkthrough

Walkthrough

Adds scripts/public_api_json.py with centralized HTTP JSON fetching, type-shape validators, and endpoint loaders for the public bounty and activity APIs. Migrates check_bounty_issue_states.py, check_live_bounty_closing_refs.py, claim_inventory.py, and proposed_work_triage.py to use these shared helpers, removing their local urllib and subprocess implementations.

Shared public API module and script migration

Layer / File(s)	Summary
New `public_api_json` module and tests `scripts/public_api_json.py`, `tests/test_public_api_json.py`	Defines `fetch_public_json`, `ensure_json_list`, `ensure_json_object`, `dict_rows`, `load_public_bounty_list`, `validate_public_activity`, `load_public_activity`, `extract_public_api_state`, and `load_public_api_state`. Tests cover type-assertion failures, `dict_rows` filtering, `validate_public_activity` rejection, and `extract_public_api_state` normalization.
`check_bounty_issue_states` migration `scripts/check_bounty_issue_states.py`	Removes local `_fetch_json`, `_load_public_bounties`, and `_run_gh_json`; `load_live_data` calls `load_public_bounty_list`; `GH_TIMEOUT_SECONDS` imported as `DEFAULT_GH_TIMEOUT_SECONDS`; mutating `_run_gh` stays local.
`check_live_bounty_closing_refs` migration `scripts/check_live_bounty_closing_refs.py`	Replaces local urllib fetch and subprocess wrapper with `load_public_bounty_list` and imported `run_gh_json`; analysis and reporting logic unchanged.
`claim_inventory` migration `scripts/claim_inventory.py`	Removes `_get_json` and urllib imports; `load_public_api_state` delegates to `fetch_public_api_state(api_host, limit=GH_LIMIT)`; inventory analysis, surface extraction, and formatting unchanged.
`proposed_work_triage` migration `scripts/proposed_work_triage.py`	`_load_public_bounty_issue` calls `fetch_public_json` for list and detail URLs; error handling switches from urllib/JSON exceptions to `RuntimeError`; fallback and warning behavior preserved.

Possibly related issues

Proposed work: standardize public API response-shape validation across maintenance scripts #1143: This PR directly implements the proposed standardization — adds a shared public_api_json module with consistent shape-validation policy and aligns four maintenance scripts to it, with focused regression tests for malformed response shapes.

Possibly related PRs

ramimbo/mergework#582: Directly connected — claim_inventory.py in that PR uses fetch_public_json/load_public_api_state helpers introduced by this PR's new scripts/public_api_json.py.
ramimbo/mergework#763: Both PRs modify scripts/proposed_work_triage.py; the retrieved PR adds triage report logic that this PR's refactor touches at the HTTP fetch layer.
ramimbo/mergework#1021: Introduced check_bounty_issue_states.py and check_live_bounty_closing_refs.py, both of which this PR migrates to shared helpers.

🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The description has a summary and partial test plan, but it omits the required Evidence, Test Evidence, and MRWK sections.	Add the missing template sections: Evidence bullets, full Test Evidence checklist, and MRWK with the related bounty or issue.

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly names the changed surface: public API response-shape validation across maintenance scripts.
Linked Issues check	✅ Passed	The PR matches `#1143` by centralizing JSON shape validation, migrating affected scripts, and adding focused regression tests.
Out of Scope Changes check	✅ Passed	The diff stays within shared public-API validation and script migrations, with no unrelated feature or lifecycle changes.
Mergework Public Artifact Hygiene	✅ Passed	No README/docs changes were present, and the PR description only describes helper migrations/tests—no investment, price, cash-out/off-ramp, payout, or security claims.
Bounty Pr Focus	✅ Passed	PASS: Diff stays within the stated bounty/public-API maintenance scripts, shared helper, and tests; no unrelated subsystems surfaced.

coderabbitai

Actionable comments posted: 4

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 44b91f10-03c4-4772-94c7-be845249b9b5

📥 Commits

Reviewing files that changed from the base of the PR and between 3bc87d2 and 52c920f.

📒 Files selected for processing (6)

scripts/check_bounty_issue_states.py
scripts/check_live_bounty_closing_refs.py
scripts/claim_inventory.py
scripts/proposed_work_triage.py
scripts/public_api_json.py
tests/test_public_api_json.py

coderabbitai · 2026-06-28T09:58:21Z

+
+from scripts.api_host_args import public_api_host
+from scripts.bounty_refs import BOUNTY_REF_RE
+from scripts.public_api_json import load_public_api_state as fetch_public_api_state


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Keep claim inventory on the best-effort public API path.

fetch_public_api_state() is the strict loader: it calls validate_public_activity() and now raises if /api/v1/activity contains a non-list contributors or recent. claim_inventory can still produce a useful live report from GitHub data plus any valid public rows, so this delegation turns a partial public-API shape problem into a hard failure for every --repo run.

Proposed fix

-from scripts.public_api_json import load_public_api_state as fetch_public_api_state
+from scripts.public_api_json import extract_public_api_state, fetch_public_json
 ...
 def load_public_api_state(api_host: str) -> dict[str, Any]:
-    return fetch_public_api_state(api_host, limit=GH_LIMIT)
+    host = api_host.rstrip("/")
+    bounties = fetch_public_json(f"{host}/api/v1/bounties?limit={GH_LIMIT}")
+    activity = fetch_public_json(f"{host}/api/v1/activity?limit={GH_LIMIT}")
+    return extract_public_api_state(bounties, activity)

Also applies to: 572-573
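
The difference between the strict and the best-effort path can be sketched as follows. This is a minimal illustration, not the PR's code: the function names match the helpers named in this review, but both bodies are assumptions inferred from the review text and from the assertions in tests/test_public_api_json.py.

```python
from typing import Any


def validate_public_activity(activity: Any, *, url: str) -> dict[str, Any]:
    """Strict path (assumed behavior): raise on any unexpected activity shape."""
    if not isinstance(activity, dict):
        raise RuntimeError(f"expected a JSON object from {url}")
    for key in ("contributors", "recent"):
        if not isinstance(activity.get(key), list):
            raise RuntimeError(f"expected a {key} list from {url}")
    return activity


def extract_public_api_state(bounties: Any, activity: Any) -> dict[str, list[dict[str, Any]]]:
    """Best-effort path (assumed behavior): keep valid rows, drop everything else."""

    def rows(value: Any) -> list[dict[str, Any]]:
        if not isinstance(value, list):
            return []
        return [row for row in value if isinstance(row, dict)]

    activity = activity if isinstance(activity, dict) else {}
    return {
        "bounties": rows(bounties),
        "contributors": rows(activity.get("contributors")),
        "recent": rows(activity.get("recent")),
    }


# A partially malformed activity payload: "contributors" is not a list.
malformed = {"contributors": "bad", "recent": [{"id": 2}]}

# The best-effort extractor still returns the usable sections.
state = extract_public_api_state([{"id": 1}], malformed)
```

Under these assumptions, the same payload that makes the strict validator raise still yields the valid bounty and recent rows through the extractor, which is the behavior the comment asks claim_inventory to keep.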

coderabbitai · 2026-06-28T09:58:21Z

+    try:
+        rows = fetch_public_json(list_url)
+    except RuntimeError as exc:
+        warnings.append(
+            "payment_state_incomplete: failed to load public bounty list "
+            f"for issue #{issue_number} ({type(exc).__name__})"
+        )
+        return [], warnings
+    bounties: list[dict[str, Any]] = []
+    for row in rows if isinstance(rows, list) else []:
+        if not isinstance(row, dict):
+            continue
+        bounty_id = row.get("id")
+        if not isinstance(bounty_id, int):
+            continue
+        detail_url = f"{api_host.rstrip('/')}/api/v1/bounties/{bounty_id}"
+        try:
+            detail = fetch_public_json(detail_url)
+        except RuntimeError as exc:
+            warnings.append(
+                "payment_state_incomplete: failed to load public bounty "
+                f"detail for bounty {bounty_id}; using list row only ({type(exc).__name__})"
+            )
+            detail = row
+        if isinstance(detail, dict):
+            bounties.append(detail)
+    return bounties, warnings


🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win

Validate the bounty list/detail shapes before deriving payment state.

This path now uses fetch_public_json() directly, which only guarantees “parseable JSON.” A non-list /api/v1/bounties?... response is silently treated as no rows, and a non-object /api/v1/bounties/{id} response is silently dropped. That turns malformed public API responses into missing payment data and can suppress accepted_pending_payout / proof_backed_paid warnings, which conflicts with the PR’s shared shape-validation goal.

Proposed fix

-from scripts.public_api_json import fetch_public_json
+from scripts.public_api_json import dict_rows, ensure_json_object, fetch_public_json
 ...
     try:
-        rows = fetch_public_json(list_url)
+        rows = dict_rows(fetch_public_json(list_url), url=list_url)
     except RuntimeError as exc:
         warnings.append(
             "payment_state_incomplete: failed to load public bounty list "
             f"for issue #{issue_number} ({type(exc).__name__})"
         )
         return [], warnings
     bounties: list[dict[str, Any]] = []
-    for row in rows if isinstance(rows, list) else []:
-        if not isinstance(row, dict):
-            continue
+    for row in rows:
         bounty_id = row.get("id")
         if not isinstance(bounty_id, int):
             continue
         detail_url = f"{api_host.rstrip('/')}/api/v1/bounties/{bounty_id}"
         try:
-            detail = fetch_public_json(detail_url)
+            detail = ensure_json_object(
+                fetch_public_json(detail_url), url=detail_url, label="bounty detail"
+            )
         except RuntimeError as exc:
             warnings.append(
                 "payment_state_incomplete: failed to load public bounty "
                 f"detail for bounty {bounty_id}; using list row only ({type(exc).__name__})"
             )
             detail = row
-        if isinstance(detail, dict):
-            bounties.append(detail)
+        bounties.append(detail)
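
For readers without the branch checked out, here is a rough sketch of what the shape helpers used in this fix might look like. The names and keyword arguments follow the calls shown in the diff and the tests; the bodies and the exact error wording are assumptions, not the contents of scripts/public_api_json.py.

```python
from typing import Any


def ensure_json_list(payload: Any, *, url: str) -> list[Any]:
    """Return payload unchanged if it is a list, otherwise raise RuntimeError."""
    if not isinstance(payload, list):
        raise RuntimeError(
            f"expected a JSON list from {url}, got {type(payload).__name__}"
        )
    return payload


def ensure_json_object(payload: Any, *, url: str, label: str = "response") -> dict[str, Any]:
    """Return payload unchanged if it is an object, otherwise raise RuntimeError."""
    if not isinstance(payload, dict):
        raise RuntimeError(
            f"expected a JSON object for {label} from {url}, got {type(payload).__name__}"
        )
    return payload


def dict_rows(payload: Any, *, url: str) -> list[dict[str, Any]]:
    """Require a list, then keep only the entries that are JSON objects."""
    return [row for row in ensure_json_list(payload, url=url) if isinstance(row, dict)]


list_url = "https://example.test/api/v1/bounties?limit=5"

# Non-dict entries inside a list are skipped.
rows = dict_rows([{"id": 1}, "skip", {"id": 2}], url=list_url)

# A non-list response is reported instead of being treated as "no rows".
failure = ""
try:
    dict_rows({"error": "unexpected"}, url=list_url)
except RuntimeError as exc:
    failure = str(exc)
```

With helpers of this kind, a malformed list response reaches the existing `except RuntimeError` branch and produces a payment_state_incomplete warning rather than an empty result.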

coderabbitai · 2026-06-28T09:58:21Z

+    try:
+        with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
+            return json.loads(response.read().decode("utf-8"))
+    except (TimeoutError, urllib.error.URLError) as exc:
+        raise RuntimeError(f"public API request failed: {url}") from exc
+    except json.JSONDecodeError as exc:
+        raise RuntimeError(f"public API returned invalid JSON from {url}") from exc


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Normalize invalid UTF-8 into the RuntimeError path.

response.read().decode("utf-8") can raise UnicodeDecodeError before the current handler, so callers that only treat RuntimeError as a soft failure will still see a raw exception here. Wrap decode/parse failures together.
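
One way to wrap the two failures together is to pull decoding and parsing into a single step with one handler. The sketch below is illustrative only: `decode_json_payload` is a hypothetical helper name, and the error message is copied from the snippet above.

```python
import json
from typing import Any


def decode_json_payload(raw: bytes, *, url: str) -> Any:
    """Decode and parse a response body, reporting either failure as RuntimeError."""
    try:
        return json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        # Both invalid UTF-8 and invalid JSON land on the same soft-failure path.
        raise RuntimeError(f"public API returned invalid JSON from {url}") from exc
```

The fetch helper could then read the body inside the `urlopen` block and pass the bytes to this function, so the transport handler and the decode/parse handler stay separate.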

🧰 Tools

🪛 ast-grep (0.44.0)

[warning] 15-15: Request-controlled URL passed to urlopen; validate against an allowlist to prevent SSRF.
Context: urllib.request.urlopen(request, timeout=timeout_seconds)
Note: [CWE-918] Server-Side Request Forgery (SSRF).

(urlopen-unsanitized-data)

coderabbitai · 2026-06-28T09:58:21Z

+def test_ensure_json_list_rejects_object() -> None:
+    with pytest.raises(RuntimeError, match="expected a JSON list"):
+        pai.ensure_json_list({}, url="https://example.test/bounties")
+
+
+def test_ensure_json_object_rejects_list() -> None:
+    with pytest.raises(RuntimeError, match="expected a JSON object"):
+        pai.ensure_json_object([], url="https://example.test/activity")
+
+
+def test_validate_public_activity_rejects_bad_contributors() -> None:
+    with pytest.raises(RuntimeError, match="contributors list"):
+        pai.validate_public_activity({"contributors": "bad", "recent": []}, url="https://example.test/activity")
+
+
+def test_dict_rows_skips_non_dict_entries() -> None:
+    rows = pai.dict_rows([{"id": 1}, "skip", {"id": 2}], url="https://example.test/bounties")
+    assert rows == [{"id": 1}, {"id": 2}]
+
+
+def test_extract_public_api_state_keeps_valid_sections() -> None:
+    state = pai.extract_public_api_state(
+        [{"id": 1}],
+        {"contributors": [{"name": "a"}], "recent": [{"id": 2}], "extra": True},
+    )
+    assert state == {
+        "bounties": [{"id": 1}],
+        "contributors": [{"name": "a"}],
+        "recent": [{"id": 2}],
+    }


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add a regression for the shared fetch-helper failure contract.

This suite never exercises fetch_public_json, even though the migrated scripts now depend on it raising RuntimeError on transport/parse failures. A small monkeypatch-based test for URLError plus malformed UTF-8/JSON would lock that contract down and catch regressions in the shared path. As per coding guidelines, "Add or update tests for changed behavior." As per path instructions, "Focus on whether tests prove the changed behavior and include negative, replay, boundary, or regression cases where relevant."

Sources: Coding guidelines, Path instructions

qingfeng312

Reviewed current head 52c920feab49f49ed78373f03035fe908698bcfe.

The PR currently fails the required quality gate during pytest collection. CI run 28318397540 reports:

tests/test_check_bounty_issue_states.py cannot import scripts.check_bounty_issue_states because scripts.gh_cli is missing.
tests/test_check_live_bounty_closing_refs.py cannot import scripts.check_live_bounty_closing_refs because scripts.gh_cli is missing.

This PR adds scripts/public_api_json.py and migrates several maintenance scripts to shared public API loaders, but the migrated scripts also import scripts.gh_cli, which is not present in the branch. That makes the shared response-shape validation unusable in a clean CI checkout.

Please add the missing helper or keep this refactor self-contained before merge.
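
A collection failure like this can usually be caught locally with a small import smoke check before pushing. The sketch below is a generic illustration; `unimportable` and the placeholder module name are hypothetical, and in a checkout the list would hold the migrated script modules instead.

```python
import importlib


def unimportable(module_names: list[str]) -> dict[str, str]:
    """Return {module name: error message} for every module that fails to import."""
    failures: dict[str, str] = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            # ModuleNotFoundError is a subclass of ImportError, so it is caught here too.
            failures[name] = str(exc)
    return failures


# "json" is importable; the second name is a deliberately missing placeholder.
failures = unimportable(["json", "mergework_missing_helper_example"])
```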

Scope checked: CI log, current PR metadata, CodeRabbit status, and changed-file list only. No wallet, treasury, payout, private data, credentials, or external mutation paths were exercised.

coderabbitai Bot reviewed Jun 28, 2026

qingfeng312 suggested changes Jun 29, 2026

qingfeng312 mentioned this pull request Jun 29, 2026

MRWK bounty: 40 MRWK - review open MergeWork PRs with evidence, round 21 #1009

Open

yanyishuai force-pushed the bounty-1143-public-api-json branch 4 times, most recently from 8337789 to ae50578 Compare July 1, 2026 03:25

fix(scripts): add missing gh_cli helper for public API JSON scripts (r…

e8ba33d

…amimbo#1143)

yanyishuai force-pushed the bounty-1143-public-api-json branch from ae50578 to e8ba33d Compare July 1, 2026 03:29
