stack5:feat: add task source expansion and conversion wiring #164
VX1D wants to merge 9 commits into michaelshimeles:main from
Conversation
@VX1D is attempting to deploy a commit to the Goshen Labs Team on Vercel. A member of the Team first needs to authorize it.
Related Documentation
3 document(s) may need updating based on files changed in this PR (Goshen Labs's Space).

Multi-File PRD (Product Requirements Document) Support. Suggested changes:
@@ -36,10 +36,33 @@
3,Setup CI/CD,0,0,Configure GitHub Actions
```
-CSV PRD files can be created manually or generated from existing markdown, YAML, or JSON PRDs using the built-in conversion utility. The CLI provides `--convert-from` and `--convert-to` flags to convert PRD files between formats (YAML/MD/JSON → CSV). For example:
+CSV PRD files can be created manually or generated from existing markdown, YAML, or JSON PRDs using the built-in conversion utility. This conversion feature provides a convenient way to leverage CSV's token savings benefits when working with existing PRD files in other formats.
-```
-ralphy --convert-from PRD.md --convert-to tasks.csv
+### Converting Task Files
+
+Ralphy provides CLI flags to convert task files from supported formats to CSV:
+
+- `--convert-from <file>`: Specifies the source file to convert
+- `--convert-to <file>`: Specifies the output file path (optional)
+
+If `--convert-to` is omitted, the output defaults to a `.csv` file with the same name as the input file.
+
+Supported input formats include:
+- `.yaml`, `.yml` (YAML task files)
+- `.md`, `.markdown` (Markdown task files)
+- `.json` (JSON task files)
+
+**Examples:**
+
+```bash
+# Convert markdown to CSV (auto-generates PRD.csv)
+ralphy --convert-from PRD.md
+
+# Convert YAML to CSV with explicit output path
+ralphy --convert-from tasks.yaml --convert-to tasks.csv
+
+# Convert JSON to CSV
+ralphy --convert-from backlog.json --convert-to backlog.csv
```
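The default-output rule above (omitting `--convert-to`) can be sketched in a few lines; this is a minimal illustration, and the `defaultOutput` helper name is hypothetical rather than ralphy's actual code:

```typescript
import { extname } from "node:path";

// Hypothetical sketch: derive the default .csv output path from the input
// file by swapping its extension, as the docs above describe.
function defaultOutput(from: string): string {
  return from.slice(0, from.length - extname(from).length) + ".csv";
}

console.log(defaultOutput("PRD.md"));     // "PRD.csv"
console.log(defaultOutput("tasks.yaml")); // "tasks.csv"
```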
This approach allows for efficient, structured workflows and easy integration with external systems. Each task from a CSV PRD file is assigned a unique identifier based on its row, ensuring accurate tracking and synchronization. CSV PRD support is recommended for large projects or when minimizing token usage is important.

Multi-File PRD Support. Suggested changes:
@@ -30,9 +30,28 @@
Each task is assigned a unique ID that combines the source filename (basename) and the line number where the task appears, in the format `filename.md:lineNumber`. This ensures that every task can be traced back to its exact location in the original markdown file, even when tasks are spread across multiple files ([source](https://github.com/michaelshimeles/ralphy/blob/fc2df589969b5fe16d31eccb4e7ff91314e31776/cli/src/tasks/markdown-folder.ts#L6-L106)).
-**Example task IDs:**
+### Task ID Validation
+
+Task IDs are validated to ensure they reference valid markdown files within the PRD folder. The validation enforces the following constraints:
+
+- **Filename validation**: Task IDs must contain only valid filenames (basename validation) without directory separators
+- **File extension**: Filenames must end with the `.md` extension
+- **Line numbers**: Line numbers must be finite, positive integers (greater than zero)
+- **Path traversal protection**: Task IDs are validated to ensure they don't escape the PRD folder
+
+**Security Note**: The system validates task IDs to prevent path traversal attacks and ensures they reference valid markdown files within the PRD folder.
+
+**Valid task IDs:**
- `backend.md:3`
- `frontend.md:7`
+- `feature.md:42`
+
+**Invalid task IDs:**
+- `../outside.md:1` — path traversal attempt
+- `subdir/task.md:5` — contains directory separator
+- `task.txt:5` — wrong file extension
+- `doc.md:abc` — invalid line number (not a number)
+- `doc.md:0` — invalid line number (not positive)
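The constraints listed above can be sketched as a single check; this is a minimal illustration under the stated rules, and `isValidTaskId` is a hypothetical name, not ralphy's actual function:

```typescript
import { basename } from "node:path";

// Hypothetical sketch of the task ID rules described above:
// basename only, .md extension, finite positive integer line number.
function isValidTaskId(taskId: string): boolean {
  const sep = taskId.lastIndexOf(":");
  if (sep === -1) return false;
  const file = taskId.slice(0, sep);
  const lineNo = Number(taskId.slice(sep + 1));
  if (basename(file) !== file) return false; // rejects separators and traversal
  if (!file.endsWith(".md")) return false;   // markdown files only
  if (!Number.isInteger(lineNo) || lineNo <= 0) return false; // positive integer
  return true;
}

console.log(isValidTaskId("backend.md:3"));    // true
console.log(isValidTaskId("../outside.md:1")); // false
```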
## Sorting Behavior for Consistent Task Ordering
PRD Folder Support. Suggested changes:
@@ -1,4 +1,4 @@
-Ralphy supports using a folder of markdown files as your Product Requirements Document (PRD) source, enabling scalable management of requirements and tasks for large projects. With recent performance improvements, ralphy efficiently aggregates, tracks, and updates tasks across multiple files with minimal overhead and enhanced visibility.
+Ralphy supports using a folder of markdown files as your Product Requirements Document (PRD) source, enabling scalable management of requirements and tasks for large projects.
## Using a PRD Folder
@@ -16,15 +16,25 @@
When marking a task as complete, ralphy parses the task ID to locate the correct file and line, updates the line by replacing `- [ ]` with `- [x]`, and writes the change back to the file. This ensures updates are always applied to the correct source, preserving the integrity of your PRD documentation.
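The update step described above can be sketched on file content directly; this is a minimal illustration, and `markCompleteInContent` is a hypothetical helper (ralphy's real implementation also performs the file lookup and I/O):

```typescript
// Hypothetical sketch: parse the line number from a "file.md:line" task ID
// and flip that line's checkbox, as described above.
function markCompleteInContent(content: string, taskId: string): string {
  const lineNo = Number(taskId.split(":")[1]); // "backend.md:3" -> 3 (1-based)
  const lines = content.split("\n");
  lines[lineNo - 1] = lines[lineNo - 1].replace("- [ ]", "- [x]");
  return lines.join("\n");
}

console.log(markCompleteInContent("- [ ] a\n- [ ] b", "x.md:2")); // "- [ ] a\n- [x] b"
```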
-## Performance Optimizations
+## Task ID Validation and Security
-Ralphy now includes several enhancements for handling large PRD folders and high task volumes:
+Task IDs are strictly validated to ensure security and correctness when updating tasks across multiple markdown files:
-- **File Content Caching:** All markdown files in the PRD folder are cached, including parsed tasks and completion counts. This avoids redundant file reads and speeds up repeated operations such as listing tasks or counting remaining/completed items.
-- **Automatic Cache Invalidation:** If any file is modified externally or by ralphy, the cache is invalidated and refreshed to ensure accuracy.
-- **Batched Progress Logging:** Progress updates are written asynchronously and batched within short windows, reducing filesystem contention and improving responsiveness during parallel execution.
-- **Operation Timing Visibility:** Task execution steps and batch operations are tracked with timing breakdowns, allowing users to see which phases are slow and monitor overall progress.
-- **Consistent Task Ordering:** Files are always processed in sorted order, ensuring predictable aggregation and updates.
+- **Path Traversal Protection:** Task IDs are checked to ensure they reference files within the PRD folder. Attempts to access parent directories or arbitrary filesystem locations are rejected.
+- **Filename Validation:** Task IDs must reference files with `.md` extension and cannot contain path separators or relative path components.
+- **Line Number Validation:** Line numbers must be finite, positive integers.
+
+### Validation Examples
+
+Valid task IDs:
+- `feature.md:10`
+- `backend.md:25`
+
+Invalid task IDs:
+- `../secret.md:1` - path traversal attempt
+- `doc.txt:5` - wrong file extension
+- `task.md:abc` - invalid line number
+- `task.md:0` - line number must be positive
## Example Folder Structure
@@ -35,8 +45,8 @@
infra.md # - [ ] setup CI/CD
```
-Ralphy will aggregate all tasks from `backend.md`, `frontend.md`, and `infra.md`, sort them by filename, and track completion status per file and line. With the new performance features, even large, modular PRDs are managed efficiently, with accurate updates and clear progress visibility.
+Ralphy will aggregate all tasks from `backend.md`, `frontend.md`, and `infra.md`, sort them by filename, and track completion status per file and line.
## Summary
-These improvements make ralphy ideal for managing complex requirements across many markdown files. Users benefit from fast aggregation, reliable task tracking, and real-time feedback on execution and progress, even in large-scale projects.
+Ralphy's folder-based PRD support makes it ideal for managing complex requirements across many markdown files. The strict task ID validation ensures secure and reliable task tracking, while consistent file ordering guarantees predictable task aggregation and updates across large-scale projects.
Greptile Summary
This PR wires a full task-format conversion pipeline. Key changes:
Issues found:
Confidence Score: 2/5
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
CLI["CLI: index.ts\n--convert-from / --convert-to"] --> CONV_CMD["runConvert()\nconvert.ts"]
CLI --> LOOP["runLoop()\nrun.ts"]
CONV_CMD --> EXT{File extension?}
EXT -- ".yaml/.yml" --> YTC["yamlToCsv()\nconverter.ts"]
EXT -- ".md/.markdown" --> MTC["mdToCsv()\nconverter.ts"]
EXT -- ".json" --> JTC["jsonToCsv()\nconverter.ts"]
EXT -- "other" --> ERR["throw: Unsupported format"]
YTC --> PP["hasPrototypePollution()\nyaml.ts"]
PP -- "polluted" --> ERR2["throw: malicious keys"]
PP -- "clean" --> ROWS["rowsToCsv()\ncsv.ts"]
MTC --> ROWS
JTC --> ROWS
ROWS --> WRITE["writeFileSync(to, csv)"]
LOOP --> SRC["createTaskSource()\ntasks/index.ts"]
SRC -- "csv" --> CSV_SRC["CsvTaskSource\ncsv.ts"]
SRC -- "yaml" --> YAML_SRC["YamlTaskSource\nyaml.ts"]
SRC -- "markdown-folder" --> MF_SRC["MarkdownFolderTaskSource\nmarkdown-folder.ts"]
SRC -- "github" --> GH_SRC["GitHubTaskSource\ngithub.ts"]
LOOP --> PAR["runParallel()\nparallel.ts"]
PAR --> COPY{prdSource?}
COPY -- "csv/yaml/json/md" --> CP_FILE["copyFileSync PRD → worktree/sandbox"]
COPY -- "markdown-folder" --> CP_DIR["cpSync PRD dir → worktree/sandbox"]
Last reviewed commit: 813f447
cli/src/tasks/yaml.ts
Outdated

```typescript
function hasPrototypePollution(obj: unknown): boolean {
  const MAX_DEPTH = 20;
  const MAX_NODES = 10000;
  const dangerousKeys = new Set(["__proto__", "constructor", "prototype"]);
```
The `dangerousKeys` set includes `"prototype"` and `"constructor"`, which are ordinary, valid JavaScript/YAML property names. A task file containing a field named `prototype` (e.g., from auto-generated YAML or a task titled "Refactor prototype layer") would be incorrectly rejected as malicious.
Only `__proto__` is the actual attack vector in modern JS runtimes; `constructor` and `prototype` cannot cause prototype pollution through standard `YAML.parse()` output, since they become regular object keys. Consider narrowing the guard:
```diff
- const dangerousKeys = new Set(["__proto__", "constructor", "prototype"]);
+ const dangerousKeys = new Set(["__proto__"]);
```
If you choose to keep constructor and prototype for extra caution, at minimum update the error message to clarify which key triggered the rejection so users can diagnose false positives.
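A quick standalone check of why `constructor` and `prototype` are harmless as plain data keys (shown here with `JSON.parse`; plain mappings from `YAML.parse` behave the same way in this respect):

```typescript
// After a standard parse, "constructor" and "prototype" become ordinary own
// properties on the result; the built-ins are shadowed, not modified.
const parsed = JSON.parse('{"constructor": "a", "prototype": "b"}');

console.log(Object.getOwnPropertyNames(parsed)); // ["constructor", "prototype"]
console.log(parsed.constructor); // "a"
```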
```typescript
  skills.push({
    name: parsed.name || entry.name,
    desc: parsed.description,
    content: parsed.content,
  });
}
```
🟡 Medium tasks/converter.ts:199
When a `SKILL.md` file inside a directory lacks frontmatter or fails parsing, `parseSkillFile` returns `"SKILL"` as the default name (line 169). In `skillsToCsv`, this truthy string overrides the intended fallback to `entry.name` (the directory name), so a skill in folder "python" is named "SKILL" instead of "python". Consider changing the fallback logic to prefer `entry.name` when `parsed.name` is empty or the placeholder `"SKILL"`.
```diff
- if (parsed) {
-   skills.push({
-     name: parsed.name || entry.name,
-     desc: parsed.description,
-     content: parsed.content,
-   });
- }
+ if (parsed) {
+   const name = parsed.name && parsed.name !== "SKILL" ? parsed.name : entry.name;
+   skills.push({
+     name,
+     desc: parsed.description,
+     content: parsed.content,
+   });
+ }
```
```typescript
function parseSkillFile(filePath: string): { name: string; description: string; content: string } | null {
  try {
    const content = readFileSync(filePath, "utf-8");
    const frontmatterMatch = content.match(/^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$/);
```
🟢 Low tasks/converter.ts:155
The frontmatter regex `^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$` strictly requires a trailing newline after the closing `---` fence. When a file contains only frontmatter with no body content, the match fails and the function falls back to treating the entire frontmatter block as body content, losing the metadata. Consider making the trailing newline optional by changing `\n` to `\n?` after the closing fence.
```diff
- const frontmatterMatch = content.match(/^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$/);
+ const frontmatterMatch = content.match(/^---\s*\n([\s\S]*?)\n---\s*\n?([\s\S]*)$/);
```
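The behavior can be checked in isolation: a frontmatter-only file (no newline after the closing fence) fails the strict regex but matches once the trailing newline is optional.

```typescript
// Strict variant requires a newline after the closing fence; lenient makes it optional.
const strict = /^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$/;
const lenient = /^---\s*\n([\s\S]*?)\n---\s*\n?([\s\S]*)$/;

const frontmatterOnly = "---\nname: python\n---"; // no body, no trailing newline

console.log(strict.test(frontmatterOnly));  // false
console.log(lenient.test(frontmatterOnly)); // true
```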
```typescript
nodesVisited++;
if (nodesVisited > MAX_NODES) {
  throw new Error("YAML file too complex to validate safely");
}

if (current.depth > MAX_DEPTH) {
  throw new Error("YAML file nesting exceeds safety limits");
}

if (typeof current.value !== "object" || current.value === null) {
  continue;
}

if (visited.has(current.value)) {
  continue;
}
visited.add(current.value);
```
🟡 Medium tasks/yaml.ts:31
`nodesVisited` counts every queue item including primitive values (strings, numbers, booleans), so a moderate YAML file with ~2,500 tasks each having 3-4 properties triggers the 10,000 limit and throws. Move the increment after the primitive check so only actual objects contribute to the node count.
```diff
- nodesVisited++;
- if (nodesVisited > MAX_NODES) {
-   throw new Error("YAML file too complex to validate safely");
- }
-
  if (current.depth > MAX_DEPTH) {
    throw new Error("YAML file nesting exceeds safety limits");
  }

  if (typeof current.value !== "object" || current.value === null) {
    continue;
  }
+
+ if (++nodesVisited > MAX_NODES) {
+   throw new Error("YAML file too complex to validate safely");
+ }
```
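The counting difference can be demonstrated standalone (this is an illustrative `countNodes` walker, not ralphy's code):

```typescript
// Walk a parsed document, counting either every queued value or only objects.
function countNodes(root: unknown, objectsOnly: boolean): number {
  let n = 0;
  const queue: unknown[] = [root];
  while (queue.length > 0) {
    const value = queue.pop();
    const isObject = typeof value === "object" && value !== null;
    if (!objectsOnly || isObject) n++; // counting primitives inflates n fast
    if (isObject) {
      for (const key of Object.keys(value)) {
        queue.push((value as Record<string, unknown>)[key]);
      }
    }
  }
  return n;
}

const doc = {
  tasks: [
    { id: 1, title: "a", done: false },
    { id: 2, title: "b", done: false },
  ],
};

console.log(countNodes(doc, false)); // 10 — every primitive counted
console.log(countNodes(doc, true));  // 4 — root, array, and two task objects
```

Counting only objects keeps the limit proportional to structural complexity rather than to the number of scalar fields.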
```typescript
const line = lines[i];

// Match incomplete tasks
const incompleteMatch = line.match(/^- \[ \] (.+)$/);
```
🟡 Medium tasks/markdown-folder.ts:97
Inconsistent handling of empty tasks: `getAllTasks` ignores `- [ ] ` while `countRemaining` counts them, leading to mismatched totals. Suggest aligning the regexes so both treat empty descriptions the same (either include or exclude).
```diff
- const incompleteMatch = line.match(/^- \[ \] (.+)$/);
+ const incompleteMatch = line.match(/^- \[ \] (.*)$/);
```

Also found in 1 other location(s):
cli/src/tasks/converter.ts:90
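The mismatch in isolation: an empty checkbox line is skipped by the capture regex but counted by the prefix regex.

```typescript
const emptyTask = "- [ ] "; // checkbox with no description

console.log(/^- \[ \] (.+)$/.test(emptyTask)); // false — getAllTasks skips it
console.log(/^- \[ \] /.test(emptyTask));      // true — countRemaining counts it
```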
The `runConvert` function performs a division by zero check for `originalTokens` but not for the division logic itself in all cases. While line 67 handles `originalTokens === 0`, it does not handle the case where `csvTokens` is larger than `originalTokens`, which can happen if the CSV format adds overhead (e.g., lots of quoting/escaping) to a small file. This would result in a negative savings percentage, which is mathematically correct but potentially confusing if the intent is to show "savings". The more critical point is that `content.length` is used to estimate `originalTokens` (`Math.ceil(content.length / 4)`): if `content` is an empty string, `originalTokens` is 0, and the ternary `originalTokens === 0 ? 0 : ...` handles the division by zero.
However, looking at the logic: `writeFileSync` is called *before* the logging. If `writeFileSync` fails (e.g. invalid path, permission denied), the program throws and the user sees no output; if it succeeds, we log. The issue is in `const savings = originalTokens === 0 ? 0 : Math.round((1 - csvTokens / originalTokens) * 100);`. If `originalTokens` is 0 (empty file), `savings` is 0. If `csvContent` is not empty (e.g. just headers), `csvTokens` will be non-zero. The logic seems safe from crashing, but the calculation `1 - csvTokens / originalTokens` implies that we expect CSV to be smaller; if CSV is larger, savings is negative.
Re-reading the code in `runConvert`:

```typescript
const taskCount = csvContent.split("\n").length - 1;
```

If `csvContent` is empty (which happens if `yamlToCsv` or others return an empty string or just a header), splitting by `\n` results in `[""]` (length 1), so `taskCount` is 0. If `csvContent` is just the header `id,title...`, the split length is 1 (no newline at end) or 2 (newline at end).
`yamlToCsv` returns `toCSV(rows)`, and `toCSV` joins by `\n`. If `rows` has just the header, it returns one line; `split('\n').length` is 1, and `1 - 1 = 0`. Correct.
If `csvContent` had a trailing newline (often preferred in files), `split` would return an empty string at the end, so the length would be `rows + 1`. But `toCSV` joins with `\n`, and `join` only puts separators between elements, so the `toCSV` result usually has no trailing newline, and `writeFileSync` writes exactly that.
Example: 1 header, 0 tasks. `rows` = `[header]`. `toCSV` returns `header_string`. `split('\n')` returns `[header_string]`. Length 1, so `taskCount` = 0. Correct.
Example: 1 header, 1 task. `rows` = `[header, task]`. `toCSV` returns `header\ntask`. `split` returns `[header, task]`. Length 2, so `taskCount` = 1. Correct.
The real issue is potentially in the `mdToCsv` loop logic: `const incompleteMatch = line.match(/^- \[ \] (.+)$/);`. The regex requires a single space after `[ ]`; standard markdown checklists are `- [ ] text`, but multiple spaces or tabs fail. The bigger issue, though, is in `runConvert`.
In `runConvert`, `const content = readFileSync(from, "utf-8");` reads the whole file. If the file is large, this is usually fine for a CLI tool. The issue is here:

```typescript
49: } catch (err) {
50:   const message = err instanceof Error ? err.message : String(err);
51:   logError(`Error parsing ${from}: ${message}`);
52:   throw err;
53: }
```

If an error occurs, it logs it and re-throws it. If the caller (`await main()` in the usage example) doesn't catch it, the unhandled rejection/error trace is printed, duplicating the error message; the CLI entry point usually handles this, however.
A more subtle point: `extname` includes the dot.

```typescript
26: switch (fromExt) {
27:   case ".yaml":
```

This is correct.
Checking `mdToCsv` in `cli/src/tasks/converter.ts`:

```typescript
84: const lines = mdContent.replace(/\r\n/g, "\n").split("\n");
...
90: const incompleteMatch = line.match(/^- \[ \] (.+)$/);
```

If `mdContent` is empty, `lines` is `[""]`, the loop runs once with no match, and the header is returned. Correct.
If `mdContent` has an indented `- [ ] task`, the regex `^- ` expects the dash at the start of the line and does not handle indentation. Markdown lists are often indented, so the regex will miss any indented tasks.
To verify the reachable path, suppose the user provides a file `tasks.md`:

```markdown
# My Tasks
  - [ ] Indented task
- [ ] Normal task
```

The code executes `mdToCsv`: line 84 splits the content, line 88 loops, and line 90 tests `^- \[ \] (.+)$`. `"  - [ ] Indented task"` fails the regex because of the leading spaces; `"- [ ] Normal task"` passes. Result: the indented task is silently dropped.
This is a bug in the Markdown parser logic. The intent is clearly to parse tasks, and markdown tasks are frequently indented.
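The indentation gap in isolation:

```typescript
// The anchored regex silently drops indented tasks.
const re = /^- \[ \] (.+)$/;

console.log(re.test("- [ ] Normal task"));     // true
console.log(re.test("  - [ ] Indented task")); // false
```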
```typescript
function truncateDesc(desc: string | undefined, maxLen = 80): string {
  if (!desc) return "";
  const clean = desc.replace(/\s+/g, " ").trim();
  if (clean.length <= maxLen) return clean;
  return `${clean.slice(0, maxLen - 3)}...`;
}
```
🟡 Medium tasks/converter.ts:41
`truncateDesc` receives `desc` from parsed JSON/YAML without runtime validation, so a numeric description like `{"description": 2023}` passes the `if (!desc)` check (since `2023` is truthy) and then crashes on `desc.replace` with `TypeError: desc.replace is not a function`. Consider adding a type check: `const str = typeof desc === 'string' ? desc : String(desc ?? '')` before calling string methods.
```diff
  function truncateDesc(desc: string | undefined, maxLen = 80): string {
    if (!desc) return "";
-   const clean = desc.replace(/\s+/g, " ").trim();
+   const str = typeof desc === "string" ? desc : String(desc ?? "");
+   const clean = str.replace(/\s+/g, " ").trim();
    if (clean.length <= maxLen) return clean;
    return `${clean.slice(0, maxLen - 3)}...`;
  }
```
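The suggested guard, exercised in a standalone copy of the patched function (not ralphy's module):

```typescript
// Patched truncateDesc: coerce non-string input before using string methods.
function truncateDesc(desc: unknown, maxLen = 80): string {
  if (!desc) return "";
  const str = typeof desc === "string" ? desc : String(desc ?? "");
  const clean = str.replace(/\s+/g, " ").trim();
  if (clean.length <= maxLen) return clean;
  return `${clean.slice(0, maxLen - 3)}...`;
}

console.log(truncateDesc(2023));        // "2023" instead of a TypeError
console.log(truncateDesc("  a   b  ")); // "a b"
```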
```typescript
// Handle unhandled promise rejections globally
process.on("unhandledRejection", (reason, promise) => {
  // BUG FIX: Preserve error stack traces for better debugging
  const errorMessage =
    reason instanceof Error ? `${reason.message}\n${reason.stack}` : String(reason);
  logError(`Unhandled Promise Rejection: ${errorMessage}`);
  logError(`Promise: ${promise}`);
  // Don't crash, but log and continue - prevent uncaught exception
});
```
🟡 Medium src/index.ts:15
The `unhandledRejection` handler swallows all promise rejections without terminating the process. When `main()` rejects due to a cleanup failure, the rejection is caught and logged, but the process exits with code 0 when the event loop drains, hiding the failure from CI/CD. Consider re-throwing or calling `process.exit(1)` after logging to ensure a non-zero exit on unhandled rejections.
```diff
  process.on("unhandledRejection", (reason, promise) => {
    // BUG FIX: Preserve error stack traces for better debugging
    const errorMessage =
      reason instanceof Error ? `${reason.message}\n${reason.stack}` : String(reason);
    logError(`Unhandled Promise Rejection: ${errorMessage}`);
    logError(`Promise: ${promise}`);
-   // Don't crash, but log and continue - prevent uncaught exception
+   process.exit(1);
  });
```
```typescript
const GITHUB_TOKEN_PATTERNS = [
  /^ghp_[A-Za-z0-9]{36}$/,         // Classic PAT
  /^github_pat_[A-Za-z0-9_]{82}$/, // Fine-grained PAT
  /^gho_[A-Za-z0-9]{52}$/,         // OAuth token
];
```
🟠 High tasks/github.ts:8
Token validation is too strict. validateGitHubToken rejects valid GITHUB_TOKENs (e.g., ghs_*) and uses the wrong gho_* length, causing CI failures. Suggest removing client-side format checks and relying on GitHub to reject invalid tokens, or broadening GITHUB_TOKEN_PATTERNS to include ghs_* and correcting the gho_* length.
```diff
-const GITHUB_TOKEN_PATTERNS = [
-  /^ghp_[A-Za-z0-9]{36}$/, // Classic PAT
-  /^github_pat_[A-Za-z0-9_]{82}$/, // Fine-grained PAT
-  /^gho_[A-Za-z0-9]{52}$/, // OAuth token
-];
```
Evidence trail:
cli/src/tasks/github.ts lines 8-12 (GITHUB_TOKEN_PATTERNS definition), lines 17-19 (validateGitHubToken function), lines 47-51 (where validation is called and error thrown). GitHub documentation: https://developer.github.com/changes/2021-04-05-behind-githubs-new-authentication-token-formats confirms ghs_ prefix for server-to-server tokens. Gist https://gist.github.com/magnetikonline/073afe7909ffdd6f10ef06a00bc3bc88 confirms `^ghs_[a-zA-Z0-9]{36}$` format for GitHub Actions tokens.
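As an illustration of the broadened-patterns option, a sketch follows. The `ghs_` and corrected `gho_` lengths (36 characters after the prefix) are assumptions drawn from the token-format announcement linked above; `looksLikeGitHubToken` is a hypothetical helper, not the project's API:

```typescript
// Hypothetical broadened pattern set; lengths assumed from GitHub's
// 2021 token-format announcement (36 random chars after most prefixes).
const GITHUB_TOKEN_PATTERNS = [
  /^ghp_[A-Za-z0-9]{36}$/, // classic PAT
  /^github_pat_[A-Za-z0-9_]{82}$/, // fine-grained PAT
  /^gho_[A-Za-z0-9]{36}$/, // OAuth token (assumed corrected length)
  /^ghs_[A-Za-z0-9]{36}$/, // server-to-server / Actions token
];

function looksLikeGitHubToken(token: string): boolean {
  return GITHUB_TOKEN_PATTERNS.some((pattern) => pattern.test(token));
}
```

Even with broader patterns, treating a format mismatch as a warning rather than a hard error avoids CI failures when GitHub introduces new prefixes.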
```typescript
interface CsvTask {
  id: string;
  title: string;
  completed: boolean;
  parallelGroup: number;
  description: string;
}
```
🟠 High tasks/csv.ts:84
writeFile reconstructs the CSV with a hardcoded schema (id, title, done, group, desc), discarding any additional columns present in the original file. When markComplete is called, extra metadata like assignee or priority is silently deleted. Consider preserving unknown columns by storing the original row data and passing it through on write.
```typescript
interface CsvTask {
  id: string;
  _title: string;
  completed: boolean;
  parallelGroup: number;
  description: string;
  _extra?: string[];
}
```
Evidence trail:
cli/src/tasks/csv.ts lines 93-112 (readFile() only extracts 5 columns), lines 114-120 (writeFile() writes hardcoded 5-column schema), lines 134-141 (markComplete() flow: read -> modify -> write back, losing extra columns)
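The pass-through idea can be sketched with simplified types (the `RawCsvTask` and `markDone` names are illustrative, not from the codebase): keep each row's raw cells and mutate only the tracked column on write.

```typescript
// Illustrative sketch: keep the original cells so columns like
// "assignee" or "priority" survive a read-modify-write cycle.
interface RawCsvTask {
  cells: string[];
}

function markDone(header: string[], task: RawCsvTask): RawCsvTask {
  const doneIndex = header.indexOf("done");
  const cells = [...task.cells];
  if (doneIndex >= 0) cells[doneIndex] = "1"; // only the tracked column changes
  return { cells };
}
```

Because the writer emits the same cells it read, any column the parser does not understand is still round-tripped intact.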
```diff
 case "csv":
   if (!options.filePath) {
-    throw new Error("filePath is required for json task source");
+    throw new Error("filePath is required for csv task source");
   }
-  return new JsonTaskSource(options.filePath);
+  return new CsvTaskSource(options.filePath);
```
🟠 High tasks/index.ts:50
The createTaskSource function replaces the json case with csv instead of adding csv as a new case. When callers pass type: "json" (still defined in TaskSourceType and used by runLoop), the switch falls through to the default case and throws Unknown task source type. Keep the json case and add csv as a separate case.
```diff
-      case "csv":
+      case "json":
         if (!options.filePath) {
-          throw new Error("filePath is required for csv task source");
+          throw new Error("filePath is required for json task source");
         }
-        return new CsvTaskSource(options.filePath);
+        return new JsonTaskSource(options.filePath);
+
+      case "csv":
+        if (!options.filePath) {
+          throw new Error("filePath is required for csv task source");
+        }
+        return new CsvTaskSource(options.filePath);
```
Evidence trail:
1. `git_diff base=MERGE_BASE head=REVIEWED_COMMIT path=cli/src/tasks/index.ts` shows `case "json":` replaced with `case "csv":` (lines 47-53 of diff)
2. `cli/src/tasks/types.ts` line 20: `export type TaskSourceType = "markdown" | "markdown-folder" | "yaml" | "json" | "github";` - json still in type
3. `cli/src/tasks/json.ts` - JsonTaskSource class still exists
4. `cli/src/tasks/index.ts` current code - switch has no `case "json":` handler, only `case "csv":`
```typescript
export function yamlToCsv(yamlContent: string): string {
  const data = YAML.parse(yamlContent) as YamlTaskFile;
  const tasks = data.tasks || [];
```
🟡 Medium tasks/converter.ts:62
yamlToCsv returns an empty CSV when the YAML contains a root-level array of tasks (e.g. - title: Foo), because it assumes data.tasks exists and never checks if data itself is the array. This is inconsistent with jsonToCsv, which handles both formats. Consider adding the same array check: const tasks = Array.isArray(data) ? data : (data.tasks || []).
```diff
-  const data = YAML.parse(yamlContent) as YamlTaskFile;
-  const tasks = data.tasks || [];
+  const data = YAML.parse(yamlContent) as YamlTaskFile | YamlTask[];
+  const tasks = Array.isArray(data) ? data : (data.tasks || []);
```
Evidence trail:
cli/src/tasks/converter.ts lines 61-63 (yamlToCsv): `const data = YAML.parse(yamlContent) as YamlTaskFile; const tasks = data.tasks || [];` - no Array.isArray check.
cli/src/tasks/converter.ts line 117 (jsonToCsv): `const tasks: JsonTask[] = Array.isArray(data) ? data : (data as { tasks?: JsonTask[] })?.tasks || [];` - correctly handles both array and object formats.
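The shared array check can also be factored into a tiny helper, sketched here with the task shape reduced to a title (the name `normalizeTasks` is illustrative, not from the codebase):

```typescript
type Task = { title: string };

// Accept both a root-level array and a { tasks: [...] } wrapper.
function normalizeTasks(data: unknown): Task[] {
  if (Array.isArray(data)) return data as Task[];
  return (data as { tasks?: Task[] })?.tasks ?? [];
}
```

Both `yamlToCsv` and `jsonToCsv` could then share one normalization path instead of duplicating the check.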
```typescript
function sanitizeCsvCell(value: string): string {
  if (/^[=+\-@]/.test(value)) {
    return `'${value}`;
  }
  return value;
}
```
`sanitizeCsvCell` unconditionally prepends `'` to any string starting with `-`, `+`, `=`, or `@`. While this prevents CSV formula injection, it silently corrupts legitimate task data:

- A task titled `"- Remove deprecated API"` becomes `"'- Remove deprecated API"` in the output CSV, with the apostrophe stored as part of the title.
- Numeric or numeric-like strings such as negative numbers (e.g., `"-1"`, `"+42"`) are unnecessarily mangled.

The check should skip sanitization for values that are already valid numbers, since numeric strings cannot contain formula injections:
```typescript
function sanitizeCsvCell(value: string): string {
  if (/^[=+\-@]/.test(value) && Number.isNaN(Number(value))) {
    return `'${value}`;
  }
  return value;
}
```
Additional Comments (1)

**cli/src/tasks/types.ts, line 20:** `TaskSourceType` does not include `"csv"` but should. `CsvTaskSource` in `csv.ts` declares `type = "csv" as const`, and the factory in `tasks/index.ts` routes on this value. However, `TaskSourceType` still includes the obsolete `"json"` literal and lacks `"csv"`, creating a type mismatch. Update the type to:

```typescript
export type TaskSourceType = "markdown" | "markdown-folder" | "yaml" | "csv" | "github";
```
```diff
  * Task source type
  */
-export type TaskSourceType = "markdown" | "markdown-folder" | "yaml" | "json" | "github";
+export type TaskSourceType = "markdown" | "markdown-folder" | "yaml" | "json" | "csv" | "github";
```
"json" retained in TaskSourceType but has no handler
"json" is still present in the TaskSourceType union, but the createTaskSource switch in cli/src/tasks/index.ts replaced the case "json": with case "csv": and removed the json.ts export entirely. Any existing code or configuration that passes type: "json" to createTaskSource will now hit the default branch and throw Unknown task source type: json at runtime.
If the intent is to fully replace JSON sources with CSV, "json" should be removed from this union (which is a breaking change and warrants a migration note). If json should still be supported alongside csv, the handler and export need to be restored.
```diff
-export type TaskSourceType = "markdown" | "markdown-folder" | "yaml" | "json" | "csv" | "github";
+export type TaskSourceType = "markdown" | "markdown-folder" | "yaml" | "csv" | "github";
```
Additional Comments (1)

**cli/src/cli/commands/convert.ts, lines 118-122: Error is logged twice in the conversion failure path.** `runConvert` logs the error with `logError(...)` and then rethrows it. The caller in `main()` (`cli/src/index.ts`) catches the same error and logs it again via `logError(standardizeError(error).message)`. A user running a failed conversion will see two error lines printed for one failure event. Since the re-throw is sufficient for the caller to handle the error, consider removing the `logError` here and letting `main()` be the single error-reporting site:

```typescript
} catch (err) {
  throw err;
}
```
```diff
     // PRD loop mode
     await runLoop(options);
   } catch (error) {
-    logError(error instanceof Error ? error.message : String(error));
-    process.exitCode = 1;
-  } finally {
-    // Ensure all progress writes are flushed before exit
-    await flushAllProgressWrites();
+    logError(standardizeError(error).message);
+    await runCleanup();
+    process.exit(1);
   }
 }

-main();
+// BUG FIX: Await main() to prevent floating promise and ensure proper error handling
+await main();
```
Cleanup never runs on successful exit
The old code called flushAllProgressWrites() in a finally block, guaranteeing it ran regardless of whether main() succeeded or threw. The replacement moves runCleanup() exclusively into the catch block, so a normal (successful) run completes without ever calling cleanup.
Any registered cleanup functions — such as the JsonFileLogSink.dispose() timer teardown and final flush — will be skipped on success. Structured log entries buffered in JsonFileLogSink but not yet flushed will be silently lost when the process exits.
Consider restoring a finally block (or an explicit call before return) so cleanup always runs:
```typescript
try {
  ...
  await runLoop(options);
} catch (error) {
  logError(standardizeError(error).message);
  await runCleanup();
  process.exit(1);
} finally {
  await runCleanup();
}
```

Or, since `process.on("exit")` runs synchronously, register the cleanup there — but note that async cleanup cannot be awaited from an `exit` listener, so the `finally` approach is safer.
Additional Comments (5)

**cli/src/tasks/converter.ts, lines 566-584: `allSkillsToCsv` breaks multi-line quoted CSV fields.** The function splits the inner `skillsToCsv` output by `"\n"` **before** calling `parseCSV(line)` on each fragment. When a skill description contains an embedded newline — which `rowsToCsv` correctly wraps in double-quotes, e.g. `"line1\nline2"` — the manual `split("\n")` tears the quoted field apart into two malformed fragments. `parseCSV` then receives `"line1` on one iteration and `line2"` on the next, yielding incorrect and potentially corrupted rows in the merged output. The fix is to parse the entire inner CSV at once and skip the header row:

```typescript
export function allSkillsToCsv(skillDirs: string[]): string {
  const allRows: string[][] = [["skill", "desc", "summary"]];
  for (const dir of skillDirs) {
    const csv = skillsToCsv(dir);
    if (csv) {
      const parsedRows = parseCSV(csv);
      for (let i = 1; i < parsedRows.length; i++) {
        allRows.push(parsedRows[i]);
      }
    }
  }
  if (allRows.length <= 1) return "";
  return rowsToCsv(allRows);
}
```
**cli/src/ui/logger.ts, lines 1760-1773: `JsonFileLogSink` writes to file without creating the parent directory.** The constructor calls `validateLogPath(filePath)` which resolves the path, then immediately sets up a periodic flush timer. However, `appendFileSync` (used in `flush()`) will throw `ENOENT` if the `logs/` directory does not exist — there is no `mkdirSync` call before the first write attempt. The periodic flush interval will silently catch-and-log this error repeatedly, eating memory in the error recovery buffer, until the directory is manually created.

Additionally, `validateLogPath` hard-codes an `ALLOWED_LOG_DIR = "logs"` restriction relative to `process.cwd()`. This rejects any absolute path (e.g. `/var/log/ralphy.log`) and any relative path outside `./logs/`, which would break standard production deployment configurations.

Consider creating the directory in the constructor and relaxing the path restriction to prevent path traversal without artificially limiting usable log paths:

```typescript
constructor(filePath: string, ...) {
  this.filePath = validateLogPath(filePath);
  mkdirSync(path.dirname(this.filePath), { recursive: true });
  ...
}
```
**cli/src/cli/commands/convert.ts, line 124: Task count may over-count when CSV fields contain embedded newlines.** `csvContent.split("\n").length - 1` counts raw newlines in the string, including newlines that occur *inside* double-quoted CSV fields (e.g. a task description that spans two lines). The `rowsToCsv` / `escapeCsvValue` functions correctly quote such values, so the CSV is well-formed — but a naïve `split("\n")` does not distinguish quoted-embedded newlines from row separators. For example, a task with `desc = "fix login\nand logout"` produces the CSV row `..., "fix login\nand logout"`, which `split("\n")` counts as two rows. Use `parseCSV` to count actual rows instead:

```typescript
const taskCount = parseCSV(csvContent).length - 1; // exclude header, using proper CSV parser
```
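The over-count is easy to reproduce with a minimal quote-aware counter (illustrative only; in the project, `parseCSV` already handles quoting and would be the real fix):

```typescript
// Count CSV rows, treating newlines inside double-quoted fields as data.
function countCsvRows(content: string): number {
  let rows = 0;
  let inQuotes = false;
  for (const ch of content) {
    if (ch === '"') inQuotes = !inQuotes;
    else if (ch === "\n" && !inQuotes) rows++;
  }
  if (content.length > 0 && !content.endsWith("\n")) rows++; // trailing row without newline
  return rows;
}

const csv = 'id,desc\n1,"fix login\nand logout"\n';
// naive split("\n") sees 3 rows; the quote-aware count is 2
```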
**cli/src/tasks/yaml.ts, lines 1348-1360: BFS queue uses `Array.shift()` — O(n) per dequeue.** `queue.shift()` removes the first element of the array, which requires shifting every remaining element one position to the left. This makes each dequeue O(n), giving the entire traversal an O(n²) worst-case complexity. With `MAX_NODES = 10_000`, that could mean up to ~50 million element shifts for a maximally complex YAML file. Consider using a simple index counter to simulate a queue in O(1):

```typescript
let head = 0;
while (head < queue.length) {
  const current = queue[head++];
  // ... rest of loop body unchanged
}
```

This eliminates the O(n) `shift()` cost while keeping the BFS behaviour identical.
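The index-counter idea can be sketched as a complete traversal (the node shape and `bfsValues` name are illustrative, not the project's YAML node types):

```typescript
interface TreeNode {
  value: number;
  children: TreeNode[];
}

// BFS with an index counter: head only moves forward, so each dequeue is O(1).
function bfsValues(root: TreeNode, maxNodes = 10_000): number[] {
  const out: number[] = [];
  const queue: TreeNode[] = [root];
  let head = 0;
  while (head < queue.length && out.length < maxNodes) {
    const current = queue[head++]; // O(1) "shift"
    out.push(current.value);
    for (const child of current.children) queue.push(child);
  }
  return out;
}
```

The trade-off is that visited nodes stay in the array until traversal ends; with a bounded `maxNodes` that memory cost is fixed and the total time drops to O(n).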
**cli/src/ui/logger.ts, lines 1589-1621: Console output is now very verbose for a CLI tool.** Every log call now emits a full ISO timestamp and component tag, e.g.:

```
[2026-03-04T12:00:00.000Z] [SUCCESS] [ralphy] Converted tasks.yaml → tasks.csv
[2026-03-04T12:00:00.000Z] [INFO] [ralphy] 5 tasks
```

Previously users saw the concise `[OK] Converted …` format. For an interactive CLI, timestamped structured output is typically reserved for file sinks; the console sink is usually kept minimal (matching the old format). Consider keeping `ConsoleLogSink` concise (just a `[LEVEL]` prefix), and restricting the full structured format to `JsonFileLogSink`. This prevents a visible UX regression for end-users.
```typescript
async markComplete(id: string): Promise<void> {
  const content = readFileSync(this.filePath, "utf-8");
  const rows = parseCSV(content);
  if (rows.length < 2) {
    return;
  }

  const header = rows[0].map((cell) => cell.trim().toLowerCase());
  const idIndex = header.indexOf("id");
  const titleIndex = header.indexOf("title");
  const doneIndex = Math.max(header.indexOf("done"), header.indexOf("completed"));
  const resolvedIdIndex = idIndex >= 0 ? idIndex : 0;
  const resolvedTitleIndex = titleIndex >= 0 ? titleIndex : 1;
  const resolvedDoneIndex = doneIndex >= 0 ? doneIndex : 2;
  if (doneIndex < 0) {
    logWarn(
      `CSV markComplete: no 'done'/'completed' header found for ${this.filePath}, falling back to column index 2.`,
    );
  }

  let updated = false;
  for (let i = 1; i < rows.length; i++) {
    const row = rows[i];
    const rowId = row[resolvedIdIndex] || "";
    const rowTitle = row[resolvedTitleIndex] || "";
    if (rowId === id || rowTitle === id) {
      while (row.length <= resolvedDoneIndex) {
        row.push("");
      }
      row[resolvedDoneIndex] = "1";
      updated = true;
      break;
    }
  }

  if (updated) {
    writeFileSync(this.filePath, rowsToCsv(rows), "utf-8");
  }
}
```
**cli/src/tasks/csv.ts, lines 144-182:** `markComplete` silently no-ops when the task ID is not found. When no CSV row matches the given `id` (by ID or title column), `updated` stays `false` and the function returns `void` without signaling any error. The caller in `runLoop` cannot distinguish "task was successfully marked complete" from "no matching task was found". This diverges from the contract of other `TaskSource` implementations, which throw on unrecognized IDs.

In practice, a bug passing the wrong ID will silently proceed, leaving the task perpetually in the queue.
```typescript
if (updated) {
  writeFileSync(this.filePath, rowsToCsv(rows), "utf-8");
} else {
  throw new Error(`Task ID not found in CSV: ${id}`);
}
```
```typescript
export function parseCSV(content: string): string[][] {
  const normalized = content.replace(/\r\n/g, "\n").replace(/\r/g, "\n");
```
**cli/src/tasks/csv.ts, lines 8-9:** `parseCSV` doesn't strip the UTF-8 Byte Order Mark (BOM), breaking header detection. CSV files from Excel, LibreOffice, or Windows tools commonly begin with `\uFEFF`. The parser passes this character through unchanged, so the first header cell becomes `"\uFEFFid"` instead of `"id"`.
Downstream, markComplete calls header.indexOf("id"), which fails to match "\uFEFFid" and falls back to column-index 0 — which may happen to be correct by accident, but the same mismatch means header.indexOf("done") / header.indexOf("completed") could also fail, corrupting the wrong cell.
Strip the BOM at the top of parseCSV:
```typescript
export function parseCSV(content: string): string[][] {
  const normalized = content
    .replace(/^\uFEFF/, "")
    .replace(/\r\n/g, "\n")
    .replace(/\r/g, "\n");
```
```typescript
function parseSkillFile(
  filePath: string,
): { name: string; description: string; content: string } | null {
  try {
    const content = readFileSync(filePath, "utf-8");
    const frontmatterMatch = content.match(/^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$/);

    if (frontmatterMatch) {
      const frontmatter = YAML.parse(frontmatterMatch[1]) as SkillFrontmatter;
      if (hasPrototypePollution(frontmatter)) {
        return null;
      }
      const body = frontmatterMatch[2].trim();

      return {
        name: frontmatter.name || "",
        description: frontmatter.description || "",
        content: truncateDesc(body, 200),
      };
    }

    // No frontmatter, use filename as name
    const name = filePath.split(/[/\\]/).pop()?.replace(".md", "") || "";
    return {
      name,
      description: "",
      content: truncateDesc(content, 200),
    };
  } catch {
    return null;
  }
}
```
**cli/src/tasks/converter.ts, lines 126-157:** `parseSkillFile` applies the frontmatter regex to raw file content without normalizing line endings first. The regex `/^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$/` expects LF (`\n`) line breaks. On Windows or in files with CRLF line endings, the `---` separator fails to match, and the function falls through to the no-frontmatter path, using the entire raw file content (truncated to 200 chars) as the `content` field. This means any YAML-like preamble that fails the regex check ends up visible in the skills CSV.

Normalize line endings before applying the regex:
```typescript
const content = readFileSync(filePath, "utf-8");
const normalized = content.replace(/\r\n/g, "\n");
const frontmatterMatch = normalized.match(/^---\s*\n([\s\S]*?)\n---\s*\n([\s\S]*)$/);
```
PR5: task sources and convert

Summary

This PR expands task ingestion/conversion and wires it into CLI entrypoints. It introduces a CSV conversion flow and hardens task source parsing and validation.

Why this PR exists

What it adds

- `cli/src/cli/commands/convert.ts`
- `cli/src/cli/args.ts` (`--convert-from`, `--convert-to`)
- `cli/src/cli/commands/index.ts`
- `cli/src/index.ts` routing updates
- `cli/src/tasks/csv.ts`
- `cli/src/tasks/converter.ts`
- `cli/src/tasks/index.ts`
- `cli/src/tasks/markdown.ts`
- `cli/src/tasks/markdown-folder.ts`
- `cli/src/tasks/yaml.ts`
- `cli/src/tasks/github.ts`

PR preview

- `ralphy --convert-from tasks.yaml --convert-to tasks.csv`
- `ralphy --convert-from PRD.md` (auto-defaults output to `.csv`)

Concrete scenarios this fixes

- `../x.md:abc`: now rejected immediately.
- `owner/repo`.

Security/reliability hardening

- `owner/repo` format.

Validation

- `bun run check`, `bun tsc --noEmit`, and `bun test` in order.

Note

Add task source expansion and conversion wiring by introducing a CSV task source and a `cli.commands.runConvert` CLI path that converts YAML/Markdown/JSON PRDs to CSV.

Add a CSV-backed task source, a `--convert-from`/`--convert-to` CLI path that runs `cli.commands.runConvert`, and expand parallel execution to accept CSV PRDs. Harden task sources (GitHub, YAML, Markdown, markdown-folder) and sanitization/logging, add a transform pipeline, hooks, metrics, resource manager, and JSON/stream parsers. Install global cleanup and signal handling. Centralize structured logging with pluggable sinks.

📍 Where to Start

Start with `main` in cli/src/index.ts, then review `cli.commands.runConvert` in cli/src/cli/commands/convert.ts and CSV source wiring in cli/src/tasks/index.ts and cli/src/tasks/csv.ts.

Macroscope summarized 813f447.