-
Notifications
You must be signed in to change notification settings - Fork 1.3k
feat: agent-output-validator hook + commander/oracle validation #702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
feat: agent-output-validator hook + commander/oracle validation #702
Conversation
|
Thank you for your contribution! Before we can merge this PR, we need you to sign our Contributor License Agreement (CLA). To sign the CLA, please comment on this PR with: This is a one-time requirement. Once signed, all your future contributions will be automatically accepted. I have read the CLA Document and I hereby sign the CLA opencode seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
|
I have read the CLA Document and I hereby sign the CLA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9 issues found across 27 files
Confidence score: 2/5
src/shared/commander-validator.tscurrently enforces section names/format that diverge from the Commander template, so every valid Commander response will still be rejected by the validator.src/features/builtin-commands/commands.tsandtemplates/commander.mdhard-code an unsupportedsubagent_type="commander", meaning the command invocation will always fail at runtime because only "explore" or "librarian" are accepted.src/shared/reviewer-validator.tscomputeshasTablebut never checks it, allowing malformed CRITERIA CHECK tables to slip through and undermining the reviewer contract.- Pay close attention to
src/shared/commander-validator.ts,src/features/builtin-commands/commands.ts, andsrc/features/builtin-commands/templates/commander.md- they need to align with the actual Commander contract and allowed subagent types.
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="patches/IMPLEMENTATION_SUMMARY.md">
<violation number="1" location="patches/IMPLEMENTATION_SUMMARY.md:82">
P2: Implementation summary is outdated/inaccurate: claims validators are not integrated, but the agent-output-validator hook is wired into runtime execution.</violation>
</file>
<file name="patches/AGENT_RESPONSIBILITY_VERIFICATION.md">
<violation number="1" location="patches/AGENT_RESPONSIBILITY_VERIFICATION.md:193">
P2: Documentation incorrectly claims Commander agent file does not exist even though `src/agents/commander.ts` is present</violation>
</file>
<file name="src/features/builtin-commands/templates/commander.md">
<violation number="1" location="src/features/builtin-commands/templates/commander.md:7">
P2: Template instructs calling task tool with unsupported subagent_type "commander"; runtime only allows "explore" or "librarian", so this command will always be rejected.</violation>
</file>
<file name="src/agents/oracle.ts">
<violation number="1" location="src/agents/oracle.ts:98">
P2: System prompt falsely claims no intermediate processing even though the agent-output-validator hook appends content to outputs on validation failure, causing misaligned instructions.</violation>
<violation number="2" location="src/agents/oracle.ts:119">
P2: Oracle prompt contains contradictory directives (advisor with action plan vs reviewer forbidden to give solutions/forced PASS/FAIL format), making the role unclear and outputs likely non-compliant.</violation>
</file>
<file name="src/features/builtin-commands/commands.ts">
<violation number="1" location="src/features/builtin-commands/commands.ts:52">
P1: Commander command hard-codes subagent_type="commander" but the task tool only allows "explore" or "librarian", so the command will fail at runtime.</violation>
</file>
<file name="src/shared/reviewer-validator.ts">
<violation number="1" location="src/shared/reviewer-validator.ts:45">
P2: CRITERIA CHECK table format is not enforced: `hasTable` is computed but never validated, so malformed tables still pass</violation>
<violation number="2" location="src/shared/reviewer-validator.ts:48">
P2: Criteria row detection is not scoped to the CRITERIA CHECK section, allowing unrelated table rows elsewhere to satisfy the required entry check.</violation>
</file>
<file name="src/shared/commander-validator.ts">
<violation number="1" location="src/shared/commander-validator.ts:34">
P1: Commander validator hardcodes section names/format that contradict the actual Commander template, causing guaranteed false failures</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| commander: { | ||
| description: "(builtin) Call Commander agent for architecture and planning tasks", | ||
| template: `<command-instruction> | ||
| Use task tool with subagent_type="commander" to invoke Commander agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: Commander command hard-codes subagent_type="commander" but the task tool only allows "explore" or "librarian", so the command will fail at runtime.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/builtin-commands/commands.ts, line 52:
<comment>Commander command hard-codes subagent_type="commander" but the task tool only allows "explore" or "librarian", so the command will fail at runtime.</comment>
<file context>
@@ -38,9 +39,29 @@ ${CANCEL_RALPH_TEMPLATE}
+ commander: {
+ description: "(builtin) Call Commander agent for architecture and planning tasks",
+ template: `<command-instruction>
+Use task tool with subagent_type="commander" to invoke Commander agent.
+
+Task will call Commander with the following system prompt:
</file context>
| const duplicates: string[] = []; | ||
|
|
||
| // Define required section markers | ||
| const requiredSections = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P1: Commander validator hardcodes section names/format that contradict the actual Commander template, causing guaranteed false failures
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/shared/commander-validator.ts, line 34:
<comment>Commander validator hardcodes section names/format that contradict the actual Commander template, causing guaranteed false failures</comment>
<file context>
@@ -0,0 +1,91 @@
+ const duplicates: string[] = [];
+
+ // Define required section markers
+ const requiredSections = [
+ "SPEC",
+ "ACCEPTANCE CRITERIA",
</file context>
|
|
||
| 1. **Automatic routing** based on complexity - GLM-4.7 will handle everything by default | ||
| 2. **Escalation logic** - No automatic escalation to Commander/Codex | ||
| 3. **Prompt-level validation** - Validators exist but are not integrated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Implementation summary is outdated/inaccurate: claims validators are not integrated, but the agent-output-validator hook is wired into runtime execution.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At patches/IMPLEMENTATION_SUMMARY.md, line 82:
<comment>Implementation summary is outdated/inaccurate: claims validators are not integrated, but the agent-output-validator hook is wired into runtime execution.</comment>
<file context>
@@ -0,0 +1,181 @@
+
+1. **Automatic routing** based on complexity - GLM-4.7 will handle everything by default
+2. **Escalation logic** - No automatic escalation to Commander/Codex
+3. **Prompt-level validation** - Validators exist but are not integrated
+4. **Structured decision packets** - Not implemented due to encoding issues
+
</file context>
| 3. **Prompt-level validation** - Validators exist but are not integrated | |
| 3. **Prompt-level validation** - Runtime agent-output-validator hook is integrated (Oracle/Commander output validation active) |
|
|
||
| ### Commander Agent Status | ||
|
|
||
| **Status**: ⚠️ Commander agent file does not exist in codebase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Documentation incorrectly claims Commander agent file does not exist even though src/agents/commander.ts is present
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At patches/AGENT_RESPONSIBILITY_VERIFICATION.md, line 193:
<comment>Documentation incorrectly claims Commander agent file does not exist even though `src/agents/commander.ts` is present</comment>
<file context>
@@ -0,0 +1,528 @@
+
+### Commander Agent Status
+
+**Status**: ⚠️ Commander agent file does not exist in codebase
+
+**But**: Commander is configured in `oh-my-opencode.json` and may be invoked via `/commander` slash command
</file context>
| **Status**: ⚠️ Commander agent file does not exist in codebase | |
| **Status**: ✅ Commander agent implemented (`src/agents/commander.ts`) |
|
|
||
| # /commander | ||
|
|
||
| Use task tool with subagent_type="commander" to invoke Commander agent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Template instructs calling task tool with unsupported subagent_type "commander"; runtime only allows "explore" or "librarian", so this command will always be rejected.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/features/builtin-commands/templates/commander.md, line 7:
<comment>Template instructs calling task tool with unsupported subagent_type "commander"; runtime only allows "explore" or "librarian", so this command will always be rejected.</comment>
<file context>
@@ -0,0 +1,33 @@
+
+# /commander
+
+Use task tool with subagent_type="commander" to invoke Commander agent.
+
+Task will call Commander with the following system prompt:
</file context>
| ## Critical Note | ||
| Your response goes directly to the user with no intermediate processing. Make your final message self-contained: a clear recommendation they can act on immediately, covering both what to do and why.` | ||
| Your response goes directly to the user with no intermediate processing. Make your final message self-contained: a clear recommendation they can act on immediately, covering both what to do and why. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: System prompt falsely claims no intermediate processing even though the agent-output-validator hook appends content to outputs on validation failure, causing misaligned instructions.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/agents/oracle.ts, line 98:
<comment>System prompt falsely claims no intermediate processing even though the agent-output-validator hook appends content to outputs on validation failure, causing misaligned instructions.</comment>
<file context>
@@ -83,19 +83,62 @@ Organize your final answer in three tiers:
## Critical Note
-Your response goes directly to the user with no intermediate processing. Make your final message self-contained: a clear recommendation they can act on immediately, covering both what to do and why.`
+Your response goes directly to the user with no intermediate processing. Make your final message self-contained: a clear recommendation they can act on immediately, covering both what to do and why.
+
+---
</file context>
| Your response goes directly to the user with no intermediate processing. Make your final message self-contained: a clear recommendation they can act on immediately, covering both what to do and why. | |
| Your response is validated before delivery; ensure it meets the required format, or validation errors may be appended. Make your final message self-contained: a clear recommendation they can act on immediately, covering both what to do and why. |
| **You MUST**: | ||
| 1. **Provide structured reviews only** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Oracle prompt contains contradictory directives (advisor with action plan vs reviewer forbidden to give solutions/forced PASS/FAIL format), making the role unclear and outputs likely non-compliant.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/agents/oracle.ts, line 119:
<comment>Oracle prompt contains contradictory directives (advisor with action plan vs reviewer forbidden to give solutions/forced PASS/FAIL format), making the role unclear and outputs likely non-compliant.</comment>
<file context>
@@ -83,19 +83,62 @@ Organize your final answer in three tiers:
+
+**You MUST**:
+
+1. **Provide structured reviews only**
+ - Output MUST start with: VERDICT: [PASS|FAIL]
+ - Output MUST include: CRITERIA CHECK table with format:
</file context>
| const hasTable = /\|\s*\d+\s*\|\s*(Yes|No)\s*\|/i.test(output); | ||
|
|
||
| // 5. Check for at least one criteria row | ||
| const criteriaEntries = output.match(/\|\s*\d+\s*\|[^\n]+/gm); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: Criteria row detection is not scoped to the CRITERIA CHECK section, allowing unrelated table rows elsewhere to satisfy the required entry check.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/shared/reviewer-validator.ts, line 48:
<comment>Criteria row detection is not scoped to the CRITERIA CHECK section, allowing unrelated table rows elsewhere to satisfy the required entry check.</comment>
<file context>
@@ -0,0 +1,64 @@
+ const hasTable = /\|\s*\d+\s*\|\s*(Yes|No)\s*\|/i.test(output);
+
+ // 5. Check for at least one criteria row
+ const criteriaEntries = output.match(/\|\s*\d+\s*\|[^\n]+/gm);
+ if (!criteriaEntries) {
+ result.errors.push("CRITERIA CHECK table must have at least one entry");
</file context>
| } | ||
|
|
||
| // 4. Validate CRITERIA CHECK table format | ||
| const hasTable = /\|\s*\d+\s*\|\s*(Yes|No)\s*\|/i.test(output); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P2: CRITERIA CHECK table format is not enforced: hasTable is computed but never validated, so malformed tables still pass
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/shared/reviewer-validator.ts, line 45:
<comment>CRITERIA CHECK table format is not enforced: `hasTable` is computed but never validated, so malformed tables still pass</comment>
<file context>
@@ -0,0 +1,64 @@
+ }
+
+ // 4. Validate CRITERIA CHECK table format
+ const hasTable = /\|\s*\d+\s*\|\s*(Yes|No)\s*\|/i.test(output);
+
+ // 5. Check for at least one criteria row
</file context>
|
recheck |
Summary
This PR introduces a runtime agent output validation mechanism to enforce clear responsibility boundaries between agents (e.g. Commander vs Oracle) and make violations detectable, testable, and auditable.
Key goals:
Changes
Core
agent-output-validatorhook to the runtime execution chainAgent Responsibilities
Validation & Logging
Validating Commander output: PASS/FAILValidating Oracle output: PASS/FAILSlash Command
/commanderbuiltin command templateSchema & Wiring
Testing
Manual verification performed:
bun run buildpasses (571 modules, 0 errors)Example log evidence:
Summary by cubic
Adds a runtime agent-output-validator hook that enforces role boundaries for Commander and Oracle by validating output format and detecting implementation code. Violations are surfaced with deterministic PASS/FAIL logs and an appended error message, preventing bad outputs from reaching users.
New Features
Migration
Written for commit 5a80b7c. Summary will update on new commits.