Skip to content

feat(enable): add --limit flag to cap issues imported from beads/GitHub#252

Open
visigoth wants to merge 1 commit intofrankbria:mainfrom
visigoth:fix/beads-limit
Open

feat(enable): add --limit flag to cap issues imported from beads/GitHub#252
visigoth wants to merge 1 commit intofrankbria:mainfrom
visigoth:fix/beads-limit

Conversation

@visigoth
Copy link
Copy Markdown
Contributor

@visigoth visigoth commented Apr 11, 2026

Summary

Adds a --limit <n> flag to ralph-enable and ralph-enable-ci that bounds the number of issues imported from each task source (beads, GitHub) during project enablement. Default is 0, meaning "import all" — preserving today's behavior.

Motivation: on repos with large beads or GitHub issue backlogs, the enable wizard currently dumps every open issue into fix_plan.md, producing an unwieldy initial task list. --limit 25 lets the user bring in just the top of the queue and grow from there.

Changes

  • lib/task_sources.sh
    • fetch_beads_tasks now takes a second limit argument and forwards --limit to bd list (both JSON and text fallback paths).
    • get_beads_count accepts a limit argument symmetrically.
    • fetch_github_tasks honors the limit argument, mapping 01000 since gh issue list treats 0 differently from "all".
    • import_tasks_from_sources plumbs the limit through to both source fetchers.
  • ralph_enable.sh — adds --limit <n> CLI parsing with non-negative-integer validation, default 0, plumbed into fetch_beads_tasks / fetch_github_tasks.
  • ralph_enable_ci.sh — same --limit <n> flag for the non-interactive path, with output_error validation.

Why it's safe

  • Default of 0 is a no-op for existing users.
  • Validation rejects non-numeric / negative values before any side effects.
  • The change is purely additive at the CLI surface — no existing flags or argument positions move.

Test plan

  • ralph-enable --from beads --limit 5 imports at most 5 beads issues into fix_plan.md
  • ralph-enable --from github --limit 10 imports at most 10 GitHub issues
  • ralph-enable --from beads (no --limit) still imports the full backlog
  • ralph-enable --limit -3 and ralph-enable --limit foo exit with the validation error
  • ralph-enable-ci --from beads --limit 5 --json produces the limited list and a clean JSON envelope
  • npm test passes locally

Summary by CodeRabbit

  • New Features
    • Added a global --limit option (default 0 = import all) to cap how many tasks/issues are imported per source (GitHub and Beats).
  • Documentation
    • Help/usage updated to document the new --limit option and its default behavior.
  • Tests
    • Added unit tests covering GitHub fetch behavior for limit=0 vs positive limits, pagination selection, label passing, and PR filtering.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f45a3283-004d-4eca-9a9c-35b63d1ec2a9

📥 Commits

Reviewing files that changed from the base of the PR and between 2cce79c and 9775288.

📒 Files selected for processing (4)
  • lib/task_sources.sh
  • ralph_enable.sh
  • ralph_enable_ci.sh
  • tests/unit/test_task_sources.bats
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/task_sources.sh

Walkthrough

Added an optional per-source limit for task imports. lib/task_sources.sh functions now accept/propagate limit for Beads and GitHub; limit=="0" makes GitHub use gh api --paginate and filters out PRs. ralph_enable.sh and ralph_enable_ci.sh add a --limit <n> CLI option and pass it into imports; tests for GitHub behavior were added.

Changes

Cohort / File(s) Summary
Task Source Library
lib/task_sources.sh
Added limit parameter to fetch_beads_tasks, get_beads_count, fetch_github_tasks, and import_tasks_from_sources. Passes --limit to bd list and uses gh api --paginate when limit=="0"; non-zero limits use gh issue list --limit. PRs are filtered out in paginated flow.
Enable Script (interactive)
ralph_enable.sh
Introduced IMPORT_LIMIT default "0", --limit <n> CLI parsing and numeric validation, updated help text, and forward $IMPORT_LIMIT into fetch_beads_tasks and fetch_github_tasks.
Enable Script (CI)
ralph_enable_ci.sh
Added --limit <n> parsing with validation, IMPORT_LIMIT variable, pass limit into beads/github fetch calls; introduced local vars for fetched task lists in main.
Tests
tests/unit/test_task_sources.bats
Added tests and helpers to mock gh; verify fetch_github_tasks uses gh api --paginate when limit=0, uses gh issue list --limit for positive limits, checks PR filtering, and asserts label argument forwarding.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant User as "User (CLI)"
    participant Script as "ralph_enable / ralph_enable_ci"
    participant BeadsCLI as "bd CLI"
    participant GitHubCLI as "gh CLI / API"
    participant Parser as "jq / local parser"

    User->>Script: run import with --limit N
    Script->>BeadsCLI: bd list --limit N
    BeadsCLI->>Parser: emit JSON / lines
    Parser->>Script: parsed Beads tasks

    alt limit == "0"
        Script->>GitHubCLI: gh api --paginate /repos/.../issues?state=open&per_page=100
        GitHubCLI->>Parser: emit pages
        Parser->>Script: filter out items with "pull_request"
    else limit > 0
        Script->>GitHubCLI: gh issue list --limit N --json ...
        GitHubCLI->>Parser: emit JSON
    end
    Parser->>Script: parsed GitHub tasks
    Script->>Script: merge/import tasks into pipeline
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐇 I hopped through flags and CLI light,
I nudged each fetch to mind the bite,
"Zero" lets pages stream so free,
A number trims the harvest tree,
Hooray — neat imports, carrot for me! 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the primary change: adding a --limit flag to cap issues imported from beads and GitHub sources in the ralph-enable tooling.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/task_sources.sh (1)

173-175: ⚠️ Potential issue | 🟡 Minor

Update stale parameter docs for fetch_github_tasks.

Line 174 still says default is 50, but Line 186 now defaults to 0. Please align the function comment with implementation.

As per coding guidelines "Update inline comments in bash scripts immediately when implementation changes, remove outdated comments".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/task_sources.sh` around lines 173 - 175, The parameter documentation for
fetch_github_tasks is out of sync: the comment for $2 (limit) says default 50
but the implementation defaults to 0; update the inline comment block above
fetch_github_tasks so the $2 (limit) description matches the actual default used
in the function (change "default: 50" to "default: 0" or adjust the function to
use 50 if that was intended), and ensure any surrounding explanatory text is
consistent with the implementation.
🧹 Nitpick comments (1)
ralph_enable_ci.sh (1)

338-347: Declare task buffers as local in main() to avoid global leakage.

Line 338 and Line 345 assign beads_tasks / github_tasks without local declaration, which can leak state in longer scripts.

Scoped variable cleanup
-    local imported_tasks=""
+    local imported_tasks=""
+    local beads_tasks=""
+    local github_tasks=""
+    local prd_tasks=""

     case "$TASK_SOURCE" in
         beads)
             if beads_tasks=$(fetch_beads_tasks "open" "$IMPORT_LIMIT" 2>/dev/null); then

Based on learnings "Bash script naming conventions: use snake_case for function names and local variables ... ensure local variables inside functions are declared with the local keyword".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_enable_ci.sh` around lines 338 - 347, The temporary task buffers
beads_tasks and github_tasks are being assigned inside main() without local,
leaking into global scope; change their assignments to declare them as local
(e.g., use local beads_tasks=$(fetch_beads_tasks "open" "$IMPORT_LIMIT"
2>/dev/null) and local github_tasks=$(fetch_github_tasks "$GITHUB_LABEL"
"$IMPORT_LIMIT" 2>/dev/null)) or add local declarations immediately before their
current assignments so the variables are function-scoped and do not persist
after main() returns.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/task_sources.sh`:
- Around line 186-189: The current code remaps limit=0 to 1000 which silently
truncates imports; update the function to honor "0 = all" by removing the
hard-cap mapping of the local variable limit and instead fetch issues with
automatic pagination using gh api /repos/OWNER/REPO/issues?state=open --paginate
(or equivalent gh api call) when limit is 0, and use gh issue list --limit when
a positive numeric limit is supplied; also adjust the help text that currently
says "default: 50" to reflect the real default "0" (all) so the CLI help and the
variable behavior match.

---

Outside diff comments:
In `@lib/task_sources.sh`:
- Around line 173-175: The parameter documentation for fetch_github_tasks is out
of sync: the comment for $2 (limit) says default 50 but the implementation
defaults to 0; update the inline comment block above fetch_github_tasks so the
$2 (limit) description matches the actual default used in the function (change
"default: 50" to "default: 0" or adjust the function to use 50 if that was
intended), and ensure any surrounding explanatory text is consistent with the
implementation.

---

Nitpick comments:
In `@ralph_enable_ci.sh`:
- Around line 338-347: The temporary task buffers beads_tasks and github_tasks
are being assigned inside main() without local, leaking into global scope;
change their assignments to declare them as local (e.g., use local
beads_tasks=$(fetch_beads_tasks "open" "$IMPORT_LIMIT" 2>/dev/null) and local
github_tasks=$(fetch_github_tasks "$GITHUB_LABEL" "$IMPORT_LIMIT" 2>/dev/null))
or add local declarations immediately before their current assignments so the
variables are function-scoped and do not persist after main() returns.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 98423a79-cd8f-479f-a7e6-041a6acd8a34

📥 Commits

Reviewing files that changed from the base of the PR and between 2265569 and 65c3299.

📒 Files selected for processing (3)
  • lib/task_sources.sh
  • ralph_enable.sh
  • ralph_enable_ci.sh

Comment thread lib/task_sources.sh
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
ralph_enable_ci.sh (1)

339-359: ⚠️ Potential issue | 🟠 Major

Fix zero-task counting before emitting CI output.

On Lines 343, 350, and 358, grep -c already prints 0 when nothing matches, so || echo "0" turns TASKS_IMPORTED into 0\n0. With --limit or a label filter that yields no tasks, the CI summary and --json output become malformed.

Suggested fix
-                TASKS_IMPORTED=$(echo "$imported_tasks" | grep -c '^\- \[' || echo "0")
+                TASKS_IMPORTED=$(printf '%s\n' "$imported_tasks" | awk '/^- \[/{count++} END{print count+0}')
@@
-                TASKS_IMPORTED=$(echo "$imported_tasks" | grep -c '^\- \[' || echo "0")
+                TASKS_IMPORTED=$(printf '%s\n' "$imported_tasks" | awk '/^- \[/{count++} END{print count+0}')
@@
-                    TASKS_IMPORTED=$(echo "$imported_tasks" | grep -c '^\- \[' || echo "0")
+                    TASKS_IMPORTED=$(printf '%s\n' "$imported_tasks" | awk '/^- \[/{count++} END{print count+0}')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ralph_enable_ci.sh` around lines 339 - 359, The TASKS_IMPORTED assignment in
the TASK_SOURCE case blocks uses "grep -c ... || echo '0'", causing duplicate
"0\n0" when grep -c already returns 0; remove the "|| echo '0'" fallback so
TASKS_IMPORTED is set directly from the grep output. Update the three locations
where TASKS_IMPORTED is assigned (the beads, github, and prd branches that use
fetch_beads_tasks, fetch_github_tasks, and extract_prd_tasks) to use
TASKS_IMPORTED=$(echo "$imported_tasks" | grep -c '^\- \[') and leave the rest
of the output_message calls unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/task_sources.sh`:
- Around line 221-235: The code can silently truncate results when a label
filter is used because `gh issue list` switches to the Search API which is
capped at 1000; update the logic around the gh_args construction (the label
variable and limit variable usage) to validate that when label is non-empty the
requested limit is <=1000 and fail fast with a clear error (return non-zero and
message) if it exceeds that, or alternatively replace the bounded `gh issue
list` branch with a paginated `gh api graphql` cursor-based fetch to honor
larger limits; implement the simpler fix by adding a check after the `if [[ -n
"$label" ]]; then` block that inspects `limit` and returns 1 with an explanatory
message if limit > 1000.

---

Outside diff comments:
In `@ralph_enable_ci.sh`:
- Around line 339-359: The TASKS_IMPORTED assignment in the TASK_SOURCE case
blocks uses "grep -c ... || echo '0'", causing duplicate "0\n0" when grep -c
already returns 0; remove the "|| echo '0'" fallback so TASKS_IMPORTED is set
directly from the grep output. Update the three locations where TASKS_IMPORTED
is assigned (the beads, github, and prd branches that use fetch_beads_tasks,
fetch_github_tasks, and extract_prd_tasks) to use TASKS_IMPORTED=$(echo
"$imported_tasks" | grep -c '^\- \[') and leave the rest of the output_message
calls unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ac8a1e6-71a6-4fd4-bc9b-4172574df143

📥 Commits

Reviewing files that changed from the base of the PR and between 65c3299 and 2cce79c.

📒 Files selected for processing (3)
  • lib/task_sources.sh
  • ralph_enable.sh
  • ralph_enable_ci.sh

Comment thread lib/task_sources.sh
Comment on lines +221 to +235
else
# Bounded fetch via `gh issue list` which already excludes PRs
local gh_args=("issue" "list" "--state" "open" "--limit" "$limit" "--json" "number,title,labels")
if [[ -n "$label" ]]; then
gh_args+=("--label" "$label")
fi
if ! json_output=$(gh "${gh_args[@]}" 2>/dev/null); then
return 1
fi
if command -v jq &>/dev/null; then
tasks=$(echo "$json_output" | jq -r '
.[] |
"- [ ] [#\(.number)] \(.title)"
' 2>/dev/null)
fi
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In GitHub CLI, is gh issue list --limit Nhard-capped at 1000 even whenN > 1000? If so, what is the recommended way to fetch exactly the first N open issues when N may exceed 1000?

💡 Result:

No, gh issue list --limit N is not hard-capped at 1000 when N > 1000 for basic usage without filters that trigger the Search API (like --label or --search). It uses the GitHub REST API's List issues endpoint, which supports pagination up to the specified limit via GraphQL or repeated REST calls. However, certain filters switch to the Search API, which caps at 1000 results total. The recommended way to fetch exactly the first N open issues (sorted by most recent first, default) when N may exceed 1000 is to use gh issue list --state open --limit N. For repos with more than 1000 open issues, set N large enough to cover all (e.g., estimate or use a very high number like 10000), as there is no built-in unbounded fetch; users must specify a limit intentionally. To fetch ALL open issues reliably, use gh api graphql --paginate with a query like: query($owner: String!, $repo: String!, $cursor: String) { repository(owner: $owner, name: $repo) { issues(first: 100, after: $cursor, states: OPEN, orderBy: {field: CREATED_AT, direction: DESC}) { pageInfo { hasNextPage endCursor } nodes { number title } } } } This handles cursor pagination without limits other than API constraints.

Citations:


Guard against silent result truncation when --label filtering is combined with high limits.

When --label is present in the query (lines 224-226), gh issue list switches to the Search API, which is hard-capped at 1000 results regardless of the --limit value. A call requesting --limit 5000 with a label filter will silently return only 1000 issues. Either validate that limit <= 1000 when a label filter is active, or use paginated gh api graphql with cursor pagination to fetch beyond 1000 results when needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/task_sources.sh` around lines 221 - 235, The code can silently truncate
results when a label filter is used because `gh issue list` switches to the
Search API which is capped at 1000; update the logic around the gh_args
construction (the label variable and limit variable usage) to validate that when
label is non-empty the requested limit is <=1000 and fail fast with a clear
error (return non-zero and message) if it exceeds that, or alternatively replace
the bounded `gh issue list` branch with a paginated `gh api graphql`
cursor-based fetch to honor larger limits; implement the simpler fix by adding a
check after the `if [[ -n "$label" ]]; then` block that inspects `limit` and
returns 1 with an explanatory message if limit > 1000.

@visigoth
Copy link
Copy Markdown
Contributor Author

All three CodeRabbit findings addressed in the force-pushed commit:

  1. lib/task_sources.sh:173-175 — stale docstring (outside diff range). Updated the parameter block for fetch_github_tasks to say default: "0" = all and dropped the obsolete "default: 50" reference.

  2. lib/task_sources.sh:186-189 — hard-cap at 1000 issues (major). The limit=0 path no longer silently caps at 1000. It now falls back to gh api --method GET repos/{owner}/{repo}/issues --paginate with automatic pagination, so large repos are fetched in full. The positive-limit path still uses gh issue list --limit N (which already excludes PRs). The paginated path filters out pull requests via .pull_request | not, and label values are passed via -f "labels=$label" so they're URL-encoded safely.

  3. ralph_enable_ci.sh:338-347 — missing local (nitpick). beads_tasks, github_tasks, and prd_tasks are now declared local at the top of the import block so they don't leak out of main().

Full test suite passes (the two failing tests — BSD stat fallback in test_log_rotation.bats and Linux notify-send in test_notifications.bats — are pre-existing environmental failures on macOS and exist on upstream/main independent of this change).

Adds a --limit <n> flag to ralph-enable and ralph-enable-ci that
controls the maximum number of issues fetched from any source.
Default is 0 (all issues), which fixes the previous behavior where
bd list silently capped at 50 items and gh issue list at 50.

The limit is threaded through fetch_beads_tasks, fetch_github_tasks,
get_beads_count, and import_tasks_from_sources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@visigoth
Copy link
Copy Markdown
Contributor Author

Also added 4 regression tests for fetch_github_tasks in tests/unit/test_task_sources.bats (27 total, up from 23) covering the new pagination path:

  1. limit=0 uses gh api --paginate — verifies the paginated code path is taken and the bounded issue list path is NOT, and that state=open / per_page=100 form fields are passed.
  2. limit=0 filters out pull requests — mixed response of issues + a PR (with .pull_request field); confirms the PR is stripped from output.
  3. Positive limit uses gh issue list --limit N — confirms bounded fetches still take the fast path and do not use --paginate.
  4. Label is plumbed as a -f labels=... form field — confirms the URL-encoding-safe passing of label values.

Tests use a mock gh binary on PATH that records its args and emits fixture JSON per subcommand, following the same convention as tests/unit/test_notifications.bats. All 27 tests pass locally.

Copy link
Copy Markdown
Owner

@frankbria frankbria left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --limit feature is a good addition. A few issues to fix before it can merge:

Blocking

Same grep -c || echo "0" bug in newly added code
The PR being fixed by #251 is grep -c ... || echo "0" producing "0\n0". The new code in ralph_enable_ci.sh (~lines 343, 350, 358) introduces the same pattern for TASKS_IMPORTED. Since #251 fixes the pattern elsewhere, this branch needs to use the safe form too:

# Instead of:
TASKS_IMPORTED=$(echo "$imported_tasks" | grep -c '^\- \[' || echo "0")
# Use:
TASKS_IMPORTED=$(echo "$imported_tasks" | grep -c '^\- \[') || TASKS_IMPORTED=0

gh issue list silently truncates when using label filters
When limit=0 ("import all") is combined with a label filter, the code maps 0 → 1000 and uses gh issue list --limit 1000. GitHub's label-filtered list endpoint caps at 1000, so repos with more than 1000 matching issues will be silently truncated. Either document this cap clearly, or use paginated gh api calls when limit=0.

Minor (please fix)

Stale comment in fetch_github_tasks
Line ~174 in lib/task_sources.sh still says the $2 (limit) default is 50, but the implementation defaults to 0. Update the comment to match.

local declarations missing in main()
beads_tasks and github_tasks are assigned in ralph_enable_ci.sh:main() without local, leaking into global scope. Add local beads_tasks="" / local github_tasks="" before their first use.

Also: please rebase

PR #256 (a 1-line crash fix in ralph_loop.sh) is being merged first. Please rebase this branch onto main after that lands.

Once these are addressed, happy to re-review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants