fix(scripts): reject empty report source args (#807) by yanyishuai · Pull Request #1186 · ramimbo/mergework

yanyishuai · 2026-06-30T03:24:45Z

Summary

Harden remaining maintainer report scripts so empty or whitespace-padded --input / --repo source arguments fail with bounded argparse errors instead of tracebacks or malformed live-mode calls.

Changes

scripts/proposed_work_triage.py — validate --repo and --input paths
scripts/claim_inventory.py — validate --input / --repo
scripts/pr_queue_health.py — validate --input / --repo
Tests mirror #801 / #805 patterns for empty and whitespace-padded CLI args

Fixes #807

Wallet: Do4v7foHJvRJLpRRoGaVPWX6DDEjX3yTK7J91gpwUQpE

Summary by CodeRabbit

Bug Fixes
- Added stricter CLI input validation to reject empty values and any leading/trailing whitespace for source arguments.
- Improved safety-limit enforcement for GitHub data collection so runs fail fast when result thresholds are reached.
New Features
- Introduced clearer separation of safety thresholds between public API and live repository collection.
Tests
- Expanded automated coverage for invalid CLI inputs and safety-cap behavior across inventory, queue health, and triage workflows.

coderabbitai · 2026-06-30T03:25:10Z

📝 Walkthrough

Walkthrough

Adds safety-cap checks to scripts/claim_inventory.py and strict CLI validation for --input and --repo in three report scripts. New tests cover cap overflow, fixture/live loading guards, and invalid source arguments.

Changes

Safety caps and CLI source validation

Layer / File(s)	Summary
Safety-cap constants and loader guards `scripts/claim_inventory.py`	Defines `GH_PUBLIC_API_SAFETY_CAP`, `GH_ISSUE_SAFETY_CAP`, and `GH_PR_SAFETY_CAP`, then uses them in public API and live inventory loading with `RuntimeError` guards when returned lists reach the caps.
CLI source validation in report scripts `scripts/claim_inventory.py`, `scripts/pr_queue_health.py`, `scripts/proposed_work_triage.py`	Adds strict validation for `--input` and `--repo` values before loading fixture or live data, and updates the main control flow to use the validated arguments.
Regression tests for caps and invalid source args `tests/test_claim_inventory.py`, `tests/test_pr_queue_health.py`, `tests/test_proposed_work_triage.py`	Adds and updates tests for cap constants, overflow failures, and CLI rejection of empty, whitespace-only, or padded source arguments.

Possibly related issues

#978: Directly covers the scripts/claim_inventory.py safety-cap guards and matching regression tests added here.

Possibly related PRs

ramimbo/mergework#324: Shares the scripts/pr_queue_health.py CLI path and related source-argument handling changes.
ramimbo/mergework#582: Closely related to the scripts/claim_inventory.py loader and inventory-report behavior updated here.
ramimbo/mergework#763: Shares the scripts/proposed_work_triage.py CLI entrypoint and source-selection code path.

Suggested labels

mrwk:accepted, mrwk:paid

Suggested reviewers

jakerated-r
elianguitarra
JONASXZB

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Out of Scope Changes check	⚠️ Warning	claim_inventory.py also adds new live/public GitHub safety-cap logic, which is unrelated to the linked validation issue.	Split the safety-cap changes into a separate PR or remove them from this issue-scoped patch.
Docstring Coverage	⚠️ Warning	Docstring coverage is 1.82% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: stricter validation for report source arguments in scripts.
Linked Issues check	✅ Passed	The PR matches `#807` by hardening --input/--repo validation in all three scripts and adding focused tests.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 4

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 37deb7af-35b0-4c21-bf7b-119a18808188

📥 Commits

Reviewing files that changed from the base of the PR and between 3bc87d2 and 59db114.

📒 Files selected for processing (6)

scripts/claim_inventory.py
scripts/pr_queue_health.py
scripts/proposed_work_triage.py
tests/test_claim_inventory.py
tests/test_pr_queue_health.py
tests/test_proposed_work_triage.py

coderabbitai · 2026-06-30T03:31:50Z

+def _require_input_path(parser: argparse.ArgumentParser, path: Path) -> Path:
+    raw = str(path)
+    if not raw.strip():
+        parser.error("--input must be a non-empty path")
+    if raw != raw.strip():
+        parser.error("--input must not include leading or trailing whitespace")
+    return path
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description="Read-only proposed-work intake triage report")
+    source = parser.add_mutually_exclusive_group(required=True)
+    source.add_argument("--input", type=Path, help="Offline JSON fixture with issues/payments")
+    source.add_argument("--repo", help="GitHub repo for read-only gh live mode")


🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash python - <<'PY' from pathlib import Path print(f"Path('') -> {Path('')!r}") print(f"str(Path('')) -> {str(Path(''))!r}") PY rg -n -C2 'source_args|--input' tests/test_proposed_work_triage.py

Repository: ramimbo/mergework

Length of output: 1056

--input "" still bypasses the new check. argparse applies type=Path first, so an empty value becomes Path("."); _require_input_path() then sees "." and the script can fail with an OS error instead of argparse’s exit-2 path. Accept the raw string first, then convert to Path after validation. scripts/proposed_work_triage.py:544-565

coderabbitai · 2026-06-30T03:31:50Z

+@pytest.mark.parametrize(
+    ("source_args", "expected_message"),
+    (
+        (["--repo", ""], "--repo must be a non-empty value"),
+        (["--repo", "   "], "--repo must be a non-empty value"),
+        (["--repo", " ramimbo/mergework "], "--repo must not include"),
+    ),
+)
+def test_proposed_work_triage_rejects_empty_repo_source(
+    source_args: list[str],
+    expected_message: str,
+    capsys,
+) -> None:
+    with pytest.raises(SystemExit) as excinfo:
+        main([*source_args, "--format", "json"])
+
+    assert excinfo.value.code == 2
+    assert expected_message in capsys.readouterr().err


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

This still misses the --input validation branch.

The issue scope here is both source arguments, but this parametrization only exercises --repo. Blank or whitespace-padded --input values can still regress without any test failing.

Suggested test extension

`@pytest.mark.parametrize`( ("source_args", "expected_message"), ( + (["--input", ""], "--input must be a non-empty value"), + (["--input", " "], "--input must be a non-empty value"), + (["--input", " fixture.json "], "--input must not include"), (["--repo", ""], "--repo must be a non-empty value"), (["--repo", " "], "--repo must be a non-empty value"), (["--repo", " ramimbo/mergework "], "--repo must not include"), ), ) -def test_proposed_work_triage_rejects_empty_repo_source( +def test_proposed_work_triage_rejects_empty_source_args(

As per coding guidelines, "Add or update tests for changed behavior." As per path instructions, "Focus on whether tests prove the changed behavior and include negative, replay, boundary, or regression cases where relevant."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@pytest.mark.parametrize(

("source_args", "expected_message"),

(

(["--repo", ""], "--repo must be a non-empty value"),

(["--repo", " "], "--repo must be a non-empty value"),

(["--repo", " ramimbo/mergework "], "--repo must not include"),

),

)

def test_proposed_work_triage_rejects_empty_repo_source(

source_args: list[str],

expected_message: str,

capsys,

) -> None:

with pytest.raises(SystemExit) as excinfo:

main([*source_args, "--format", "json"])

assert excinfo.value.code == 2

assert expected_message in capsys.readouterr().err

`@pytest.mark.parametrize`(

("source_args", "expected_message"),

(

(["--input", ""], "--input must be a non-empty value"),

(["--input", " "], "--input must be a non-empty value"),

(["--input", " fixture.json "], "--input must not include"),

(["--repo", ""], "--repo must be a non-empty value"),

(["--repo", " "], "--repo must be a non-empty value"),

(["--repo", " ramimbo/mergework "], "--repo must not include"),

),

)

def test_proposed_work_triage_rejects_empty_source_args(

source_args: list[str],

expected_message: str,

capsys,

) -> None:

with pytest.raises(SystemExit) as excinfo:

main([*source_args, "--format", "json"])

assert excinfo.value.code == 2

assert expected_message in capsys.readouterr().err

Sources: Coding guidelines, Path instructions

qingfeng312

Reviewed current head 59db114b195cbed357f4ec7eaf65d16101d92a3f. This needs changes before merge: the hosted pytest run fails during collection because scripts/pr_queue_health.py and scripts/proposed_work_triage.py import scripts.gh_cli_constants, but this PR does not add that module. That makes both scripts and their test modules import-fail before the new source-argument assertions can run. Please either add the shared module or keep the timeout constant local, then rerun the full quality/readiness workflow.

I also checked the new argparse coverage: proposed_work_triage should validate raw --input strings before converting to Path, otherwise an empty value can become Path(".") and miss the intended bounded error path.

qingfeng312

Reviewed updated head 6f20f2a088ebd8864cd4f2fa7072e1f26c424283. This still needs changes: hosted pytest fails during collection because scripts/pr_queue_health.py and scripts/proposed_work_triage.py import scripts.gh_cli_constants, but the module is still absent from the PR. That keeps both test modules from importing and prevents validation of the new source-argument tests.

The proposed_work_triage --input path also still needs raw-string validation before Path conversion so --input "" cannot become Path(".") and bypass the intended argparse error path.

coderabbitai

♻️ Duplicate comments (3)

tests/test_proposed_work_triage.py (1)
787-793: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Missing --input validation coverage. This parametrization only exercises --repo; blank or whitespace-padded --input is untested. Note this gap aligns with the --input "" bypass flagged in scripts/proposed_work_triage.py — adding ["--input", ""] here would currently fail (exit code ≠ 2) until that helper is fixed. After the fix, extend this test.
Suggested extension (after fixing the validation bug)
     (
+        (["--input", ""], "--input must be a non-empty value"),
+        (["--input", "   "], "--input must be a non-empty value"),
+        (["--input", " fixture.json "], "--input must not include"),
         (["--repo", ""], "--repo must be a non-empty value"),
         (["--repo", "   "], "--repo must be a non-empty value"),
         (["--repo", " ramimbo/mergework "], "--repo must not include"),
     ),
 )
-def test_proposed_work_triage_rejects_empty_repo_source(
+def test_proposed_work_triage_rejects_empty_source_args(
As per coding guidelines, "Add or update tests for changed behavior."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_proposed_work_triage.py` around lines 787 - 793, The current
validation test coverage in test_proposed_work_triage_rejects_empty_repo_source
only checks --repo and misses the known --input empty-value bypass. First fix
the validation in the proposed_work_triage helper in
scripts/proposed_work_triage.py so blank or whitespace-padded --input is
rejected with the same exit code and message pattern as --repo, then extend the
parametrization in test_proposed_work_triage_rejects_empty_repo_source to
include empty and whitespace-only --input cases and assert they fail as
expected.
Source: Coding guidelines
scripts/proposed_work_triage.py (1)
533-539: 🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

--input "" still bypasses validation and produces a traceback. argparse applies type=Path before _require_input_path runs, so an empty value becomes Path("") whose str() is ".". That value is non-empty and whitespace-free, so it passes both checks, and read_text(".") then raises IsADirectoryError instead of argparse's exit-2 path — defeating the PR's goal of bounded argparse-style errors.

The other two scripts avoid this by keeping --input a plain string and validating the raw value. Mirror that here: drop type=Path, validate the raw string with _require_non_empty_arg, then convert to Path.
🐛 Proposed fix
-    source.add_argument("--input", type=Path, help="Offline JSON fixture with issues/payments")
+    source.add_argument("--input", help="Offline JSON fixture with issues/payments")
     if args.input is not None:
-        input_path = _require_input_path(parser, args.input)
+        input_path = Path(_require_non_empty_arg(parser, "--input", args.input))
         data = json.loads(input_path.read_text(encoding="utf-8"))
_require_input_path can then be removed.
Also applies to: 564-566
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/proposed_work_triage.py` around lines 533 - 539, The input-path
validation is happening too late because _require_input_path receives a Path
object, so `--input ""` is converted to `Path("")` and slips through as `"."`,
leading to a traceback instead of an argparse error. Update the argument
handling in the relevant parser setup and the helper used by the triage script
so `--input` is accepted as a raw string, validated with
`_require_non_empty_arg`, and only then converted to Path; if that flow is used,
`_require_input_path` can be removed.
tests/test_pr_queue_health.py (1)
553-559: 📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Add the padded --input case to lock the whitespace branch. Empty/whitespace --input is covered, but the "must not include leading or trailing whitespace" branch is only proven for --repo. A regression in --input whitespace handling would pass silently.
Suggested extension
         (["--input", ""], "--input must be a non-empty value"),
         (["--input", "   "], "--input must be a non-empty value"),
+        (["--input", " queue.json "], "--input must not include"),
         (["--repo", ""], "--repo must be a non-empty value"),
As per coding guidelines, "Add or update tests for changed behavior."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_pr_queue_health.py` around lines 553 - 559, Add a test case in
test_pr_queue_health.py alongside the existing argument validation table to
cover a padded --input value (for example, a non-empty string with
leading/trailing spaces) and assert it hits the same whitespace-rejection branch
as --repo. Use the existing CLI validation fixture or parameterized cases around
the --input checks so the behavior is locked in for the input parser and any
regression in whitespace handling is caught.
Source: Coding guidelines

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@scripts/proposed_work_triage.py`:
- Around line 533-539: The input-path validation is happening too late because
_require_input_path receives a Path object, so `--input ""` is converted to
`Path("")` and slips through as `"."`, leading to a traceback instead of an
argparse error. Update the argument handling in the relevant parser setup and
the helper used by the triage script so `--input` is accepted as a raw string,
validated with `_require_non_empty_arg`, and only then converted to Path; if
that flow is used, `_require_input_path` can be removed.

In `@tests/test_pr_queue_health.py`:
- Around line 553-559: Add a test case in test_pr_queue_health.py alongside the
existing argument validation table to cover a padded --input value (for example,
a non-empty string with leading/trailing spaces) and assert it hits the same
whitespace-rejection branch as --repo. Use the existing CLI validation fixture
or parameterized cases around the --input checks so the behavior is locked in
for the input parser and any regression in whitespace handling is caught.

In `@tests/test_proposed_work_triage.py`:
- Around line 787-793: The current validation test coverage in
test_proposed_work_triage_rejects_empty_repo_source only checks --repo and
misses the known --input empty-value bypass. First fix the validation in the
proposed_work_triage helper in scripts/proposed_work_triage.py so blank or
whitespace-padded --input is rejected with the same exit code and message
pattern as --repo, then extend the parametrization in
test_proposed_work_triage_rejects_empty_repo_source to include empty and
whitespace-only --input cases and assert they fail as expected.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: d4878be6-3fa0-4e6a-b10e-2c0015ed2a77

📥 Commits

Reviewing files that changed from the base of the PR and between 59db114 and c3fa196.

📒 Files selected for processing (6)

scripts/claim_inventory.py
scripts/pr_queue_health.py
scripts/proposed_work_triage.py
tests/test_claim_inventory.py
tests/test_pr_queue_health.py
tests/test_proposed_work_triage.py

yanyishuai · 2026-06-30T14:12:21Z

@qingfeng312 #807 report source-arg validation is green on latest head. Scoped argparse hardening; ready for merge.

Wallet: Do4v7foHJvRJLpRRoGaVPWX6DDEjX3yTK7J91gpwUQpE

qingfeng312

Reviewed updated head c3fa1965b2f124aed53f555bf299caad66502c57.

The import/CI blocker is fixed, but one source-argument path still needs changes before merge. In scripts/proposed_work_triage.py, --input is still declared with type=Path and then passed to _require_input_path(). That means --input "" is converted by argparse into Path("") / . before validation, so it can fall through to read_text() and raise an unbounded directory/error path instead of the intended argparse exit-2 message.

Please keep --input as a raw string for this script too, validate it with the same _require_non_empty_arg(parser, "--input", args.input) helper, then convert to Path only after validation. Add regression coverage for empty, whitespace-only, and padded --input values in tests/test_proposed_work_triage.py; adding the padded --input case to tests/test_pr_queue_health.py would also lock the whitespace branch.

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

qingfeng312 suggested changes Jun 30, 2026

View reviewed changes

qingfeng312 mentioned this pull request Jun 30, 2026

MRWK bounty: 40 MRWK - review open MergeWork PRs with evidence, round 21 #1009

Open

yanyishuai force-pushed the fix/issue-807-report-source-validation branch from 59db114 to 6f20f2a Compare June 30, 2026 05:49

qingfeng312 suggested changes Jun 30, 2026

View reviewed changes

fix(scripts): reject empty report source args (ramimbo#807)

c3fa196

yanyishuai force-pushed the fix/issue-807-report-source-validation branch from 6f20f2a to c3fa196 Compare June 30, 2026 13:53

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

qingfeng312 suggested changes Jun 30, 2026

View reviewed changes

Uh oh!

Conversation

yanyishuai commented Jun 30, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Possibly related issues

Possibly related PRs

Suggested labels

Suggested reviewers

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

qingfeng312 left a comment

Choose a reason for hiding this comment

Uh oh!

qingfeng312 left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

yanyishuai commented Jun 30, 2026

Uh oh!

qingfeng312 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

yanyishuai commented Jun 30, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 30, 2026 •

edited

Loading