Standardize public API response-shape validation across maintenance scripts#1164
Standardize public API response-shape validation across maintenance scripts#1164yanyishuai wants to merge 1 commit into
Conversation
|
Warning Review limit reached
Next review available in: 17 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds Shared public API module and script migration
Possibly related issues
Possibly related PRs
🚥 Pre-merge checks | ✅ 5 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (5 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 44b91f10-03c4-4772-94c7-be845249b9b5
📒 Files selected for processing (6)
scripts/check_bounty_issue_states.pyscripts/check_live_bounty_closing_refs.pyscripts/claim_inventory.pyscripts/proposed_work_triage.pyscripts/public_api_json.pytests/test_public_api_json.py
|
|
||
| from scripts.api_host_args import public_api_host | ||
| from scripts.bounty_refs import BOUNTY_REF_RE | ||
| from scripts.public_api_json import load_public_api_state as fetch_public_api_state |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Keep claim inventory on the best-effort public API path.
fetch_public_api_state() is the strict loader: it calls validate_public_activity() and now raises if /api/v1/activity contains a non-list contributors or recent. claim_inventory can still produce a useful live report from GitHub data plus any valid public rows, so this delegation turns a partial public-API shape problem into a hard failure for every --repo run.
Proposed fix
-from scripts.public_api_json import load_public_api_state as fetch_public_api_state
+from scripts.public_api_json import extract_public_api_state, fetch_public_json
...
def load_public_api_state(api_host: str) -> dict[str, Any]:
- return fetch_public_api_state(api_host, limit=GH_LIMIT)
+ host = api_host.rstrip("/")
+ bounties = fetch_public_json(f"{host}/api/v1/bounties?limit={GH_LIMIT}")
+ activity = fetch_public_json(f"{host}/api/v1/activity?limit={GH_LIMIT}")
+ return extract_public_api_state(bounties, activity)Also applies to: 572-573
| try: | ||
| rows = fetch_public_json(list_url) | ||
| except RuntimeError as exc: | ||
| warnings.append( | ||
| "payment_state_incomplete: failed to load public bounty list " | ||
| f"for issue #{issue_number} ({type(exc).__name__})" | ||
| ) | ||
| return [], warnings | ||
| bounties: list[dict[str, Any]] = [] | ||
| for row in rows if isinstance(rows, list) else []: | ||
| if not isinstance(row, dict): | ||
| continue | ||
| bounty_id = row.get("id") | ||
| if not isinstance(bounty_id, int): | ||
| continue | ||
| detail_url = f"{api_host.rstrip('/')}/api/v1/bounties/{bounty_id}" | ||
| try: | ||
| detail = fetch_public_json(detail_url) | ||
| except RuntimeError as exc: | ||
| warnings.append( | ||
| "payment_state_incomplete: failed to load public bounty " | ||
| f"detail for bounty {bounty_id}; using list row only ({type(exc).__name__})" | ||
| ) | ||
| detail = row | ||
| if isinstance(detail, dict): | ||
| bounties.append(detail) | ||
| return bounties, warnings |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Validate the bounty list/detail shapes before deriving payment state.
This path now uses fetch_public_json() directly, which only guarantees “parseable JSON.” A non-list /api/v1/bounties?... response is silently treated as no rows, and a non-object /api/v1/bounties/{id} response is silently dropped. That turns malformed public API responses into missing payment data and can suppress accepted_pending_payout / proof_backed_paid warnings, which conflicts with the PR’s shared shape-validation goal.
Proposed fix
-from scripts.public_api_json import fetch_public_json
+from scripts.public_api_json import dict_rows, ensure_json_object, fetch_public_json
...
try:
- rows = fetch_public_json(list_url)
+ rows = dict_rows(fetch_public_json(list_url), url=list_url)
except RuntimeError as exc:
warnings.append(
"payment_state_incomplete: failed to load public bounty list "
f"for issue #{issue_number} ({type(exc).__name__})"
)
return [], warnings
bounties: list[dict[str, Any]] = []
- for row in rows if isinstance(rows, list) else []:
- if not isinstance(row, dict):
- continue
+ for row in rows:
bounty_id = row.get("id")
if not isinstance(bounty_id, int):
continue
detail_url = f"{api_host.rstrip('/')}/api/v1/bounties/{bounty_id}"
try:
- detail = fetch_public_json(detail_url)
+ detail = ensure_json_object(
+ fetch_public_json(detail_url), url=detail_url, label="bounty detail"
+ )
except RuntimeError as exc:
warnings.append(
"payment_state_incomplete: failed to load public bounty "
f"detail for bounty {bounty_id}; using list row only ({type(exc).__name__})"
)
detail = row
- if isinstance(detail, dict):
- bounties.append(detail)
+ bounties.append(detail)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| rows = fetch_public_json(list_url) | |
| except RuntimeError as exc: | |
| warnings.append( | |
| "payment_state_incomplete: failed to load public bounty list " | |
| f"for issue #{issue_number} ({type(exc).__name__})" | |
| ) | |
| return [], warnings | |
| bounties: list[dict[str, Any]] = [] | |
| for row in rows if isinstance(rows, list) else []: | |
| if not isinstance(row, dict): | |
| continue | |
| bounty_id = row.get("id") | |
| if not isinstance(bounty_id, int): | |
| continue | |
| detail_url = f"{api_host.rstrip('/')}/api/v1/bounties/{bounty_id}" | |
| try: | |
| detail = fetch_public_json(detail_url) | |
| except RuntimeError as exc: | |
| warnings.append( | |
| "payment_state_incomplete: failed to load public bounty " | |
| f"detail for bounty {bounty_id}; using list row only ({type(exc).__name__})" | |
| ) | |
| detail = row | |
| if isinstance(detail, dict): | |
| bounties.append(detail) | |
| return bounties, warnings | |
| try: | |
| rows = dict_rows(fetch_public_json(list_url), url=list_url) | |
| except RuntimeError as exc: | |
| warnings.append( | |
| "payment_state_incomplete: failed to load public bounty list " | |
| f"for issue #{issue_number} ({type(exc).__name__})" | |
| ) | |
| return [], warnings | |
| bounties: list[dict[str, Any]] = [] | |
| for row in rows: | |
| bounty_id = row.get("id") | |
| if not isinstance(bounty_id, int): | |
| continue | |
| detail_url = f"{api_host.rstrip('/')}/api/v1/bounties/{bounty_id}" | |
| try: | |
| detail = ensure_json_object( | |
| fetch_public_json(detail_url), url=detail_url, label="bounty detail" | |
| ) | |
| except RuntimeError as exc: | |
| warnings.append( | |
| "payment_state_incomplete: failed to load public bounty " | |
| f"detail for bounty {bounty_id}; using list row only ({type(exc).__name__})" | |
| ) | |
| detail = row | |
| bounties.append(detail) | |
| return bounties, warnings |
| try: | ||
| with urllib.request.urlopen(request, timeout=timeout_seconds) as response: | ||
| return json.loads(response.read().decode("utf-8")) | ||
| except (TimeoutError, urllib.error.URLError) as exc: | ||
| raise RuntimeError(f"public API request failed: {url}") from exc | ||
| except json.JSONDecodeError as exc: | ||
| raise RuntimeError(f"public API returned invalid JSON from {url}") from exc |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Normalize invalid UTF-8 into the RuntimeError path. response.read().decode("utf-8") can raise UnicodeDecodeError before the current handler, so callers that only treat RuntimeError as a soft failure will still see a raw exception here. Wrap decode/parse failures together.
🧰 Tools
🪛 ast-grep (0.44.0)
[warning] 15-15: Request-controlled URL passed to urlopen; validate against an allowlist to prevent SSRF.
Context: urllib.request.urlopen(request, timeout=timeout_seconds)
Note: [CWE-918] Server-Side Request Forgery (SSRF).
(urlopen-unsanitized-data)
| def test_ensure_json_list_rejects_object() -> None: | ||
| with pytest.raises(RuntimeError, match="expected a JSON list"): | ||
| pai.ensure_json_list({}, url="https://example.test/bounties") | ||
|
|
||
|
|
||
| def test_ensure_json_object_rejects_list() -> None: | ||
| with pytest.raises(RuntimeError, match="expected a JSON object"): | ||
| pai.ensure_json_object([], url="https://example.test/activity") | ||
|
|
||
|
|
||
| def test_validate_public_activity_rejects_bad_contributors() -> None: | ||
| with pytest.raises(RuntimeError, match="contributors list"): | ||
| pai.validate_public_activity({"contributors": "bad", "recent": []}, url="https://example.test/activity") | ||
|
|
||
|
|
||
| def test_dict_rows_skips_non_dict_entries() -> None: | ||
| rows = pai.dict_rows([{"id": 1}, "skip", {"id": 2}], url="https://example.test/bounties") | ||
| assert rows == [{"id": 1}, {"id": 2}] | ||
|
|
||
|
|
||
| def test_extract_public_api_state_keeps_valid_sections() -> None: | ||
| state = pai.extract_public_api_state( | ||
| [{"id": 1}], | ||
| {"contributors": [{"name": "a"}], "recent": [{"id": 2}], "extra": True}, | ||
| ) | ||
| assert state == { | ||
| "bounties": [{"id": 1}], | ||
| "contributors": [{"name": "a"}], | ||
| "recent": [{"id": 2}], | ||
| } |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Add a regression for the shared fetch-helper failure contract.
This suite never exercises fetch_public_json, even though the migrated scripts now depend on it raising RuntimeError on transport/parse failures. A small monkeypatch-based test for URLError plus malformed UTF-8/JSON would lock that contract down and catch regressions in the shared path. As per coding guidelines, "Add or update tests for changed behavior." As per path instructions, "Focus on whether tests prove the changed behavior and include negative, replay, boundary, or regression cases where relevant."
Sources: Coding guidelines, Path instructions
qingfeng312
left a comment
There was a problem hiding this comment.
Reviewed current head 52c920feab49f49ed78373f03035fe908698bcfe.
The PR currently fails the required quality gate during pytest collection. CI run 28318397540 reports:
tests/test_check_bounty_issue_states.pycannot importscripts.check_bounty_issue_statesbecausescripts.gh_cliis missing.tests/test_check_live_bounty_closing_refs.pycannot importscripts.check_live_bounty_closing_refsbecausescripts.gh_cliis missing.
This PR adds scripts/public_api_json.py and migrates several maintenance scripts to shared public API loaders, but the migrated scripts also import scripts.gh_cli, which is not present in the branch. That makes the shared response-shape validation unusable in a clean CI checkout.
Please add the missing helper or keep this refactor self-contained before merge.
Scope checked: CI log, current PR metadata, CodeRabbit status, and changed-file list only. No wallet, treasury, payout, private data, credentials, or external mutation paths were exercised.
8337789 to
ae50578
Compare
ae50578 to
e8ba33d
Compare
Summary
Implements proposed work for #1143 (and shared fetch helper from #1144).
scripts/public_api_json.pywith shared JSON fetch, list/object shape checks, bounty list loading, and activity validation.check_bounty_issue_states.py,check_live_bounty_closing_refs.py,claim_inventory.py, andproposed_work_triage.pyto use the shared helper instead of duplicated urllib blocks.tests/test_public_api_json.pycovering shape validation helpers.Test plan
pytest tests/test_public_api_json.py tests/test_check_bounty_issue_states.py tests/test_claim_inventory.pyCloses #1143
Summary by CodeRabbit
New Features
Bug Fixes
Tests