Skip to content

feat(engine): harness engineering hooks + README currency + BSL 1.1#85

Merged
jeremylongshore merged 3 commits intomainfrom
feat/harness-engineering-hooks
Feb 18, 2026
Merged

feat(engine): harness engineering hooks + README currency + BSL 1.1#85
jeremylongshore merged 3 commits intomainfrom
feat/harness-engineering-hooks

Conversation

@jeremylongshore
Copy link
Contributor

@jeremylongshore jeremylongshore commented Feb 18, 2026

Summary

  • 5 harness engineering hooks for agent quality improvement: budget management, environment onboarding, loop detection, self-test, and trace analysis hooks with full test coverage
  • 12 README currency fixes: version 0.9.0, 8 agents in diagram, mcp-server + sandbox in monorepo tree, gwi-mcp-server deployment service, simulate command marked available, RAG status corrected, security audit date updated, alpha status
  • Enterprise roadmap (Phases 6-8): Enterprise Infrastructure, Analytics & Scale, Enterprise Identity added to Mermaid Gantt + status table
  • License change: MIT to BSL 1.1 (change date 2030-02-17, converts to Apache 2.0)
  • PRD document: 000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md — 8-layer GCP reference architecture, gap analysis, implementation roadmap with OpenTofu resource specs, cost estimates

Test plan

  • npm run build passes
  • npm run typecheck passes
  • npm run test — all new hook tests pass (5 test files, ~1300 lines)
  • npm run arv passes
  • README Mermaid diagrams render correctly on GitHub
  • LICENSE contains valid BSL 1.1 text
  • package.json license field is BUSL-1.1
  • PRD follows 6767 filing standard naming

Summary by CodeRabbit

  • New Features

    • Budget tracking with cost estimates, in-run warnings and optional blocking.
    • Loop detection that flags repeated or highly similar outputs and can nudge or block.
    • Self-test validation reporting test coverage/quality with optional blocking.
    • Environment profiling injected into runs/steps (languages, tools, config files).
    • Trace analysis summarizing failures, metrics, and detected patterns.
  • Tests

    • Comprehensive test suites added for all new monitoring, validation, profiling, and blocking behaviors.
  • Documentation

    • New enterprise architecture roadmap and README updates; license changed to BUSL-1.1.

…vement

Implements patterns from the LangChain DeepAgents "Harness Engineering" post
to close identified gaps in GWI's agent pipeline:

- Trace Analysis (Pattern 5): mines decision traces for failure patterns
- Self-Test (Pattern 1): validates CODER output includes tests
- Environment Onboarding (Pattern 2): profiles repo tech stack on run start
- Loop Detection (Pattern 3): detects semantic loops in CODER output
- Budget Management (Pattern 6): tracks token/cost budgets per run

All hooks follow existing patterns (AgentHook interface, env-var toggles,
factory functions, custom error classes). 125 tests pass across all hook
test suites.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@qodo-code-review
Copy link

Review Summary by Qodo

Add 5 harness engineering hooks for agent pipeline quality

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Implements 5 new harness engineering hooks for agent quality improvement
  - Trace Analysis: mines decision traces for failure patterns and agent failure rates
  - Self-Test: validates CODER output includes tests with framework detection
  - Environment Onboarding: profiles repo tech stack on run start
  - Loop Detection: detects semantic loops in CODER output via Dice coefficient
  - Budget Management: tracks token usage and cost per run with warning levels
• All hooks follow existing patterns with AgentHook interface, env-var toggles, and factory
  functions
• Adds 82 new tests across 5 test suites validating all hook functionality
• Updates hook configuration to register all 8 hooks (3 existing + 5 new)
Diagram
flowchart LR
  A["Agent Pipeline"] --> B["5 New Hooks"]
  B --> C["Trace Analysis"]
  B --> D["Self-Test"]
  B --> E["Environment Onboarding"]
  B --> F["Loop Detection"]
  B --> G["Budget Management"]
  C --> H["Failure Patterns"]
  D --> I["Test Validation"]
  E --> J["Tech Stack Profile"]
  F --> K["Loop Detection"]
  G --> L["Budget Warnings"]
  H --> M["Context Metadata"]
  I --> M
  J --> M
  K --> M
  L --> M
Loading

Grey Divider

File Changes

1. packages/engine/src/hooks/trace-analysis-hook.ts ✨ Enhancement +296/-0

Trace analysis hook for failure pattern detection

packages/engine/src/hooks/trace-analysis-hook.ts


2. packages/engine/src/hooks/self-test-hook.ts ✨ Enhancement +253/-0

Self-test validation hook for test inclusion

packages/engine/src/hooks/self-test-hook.ts


3. packages/engine/src/hooks/environment-onboarding-hook.ts ✨ Enhancement +290/-0

Environment profiling hook for tech stack discovery

packages/engine/src/hooks/environment-onboarding-hook.ts


View more (10)
4. packages/engine/src/hooks/loop-detection-hook.ts ✨ Enhancement +261/-0

Loop detection hook with Dice coefficient similarity

packages/engine/src/hooks/loop-detection-hook.ts


5. packages/engine/src/hooks/budget-management-hook.ts ✨ Enhancement +278/-0

Budget tracking hook for token and cost limits

packages/engine/src/hooks/budget-management-hook.ts


6. packages/engine/src/hooks/__tests__/trace-analysis-hook.test.ts 🧪 Tests +227/-0

Comprehensive tests for trace analysis hook

packages/engine/src/hooks/tests/trace-analysis-hook.test.ts


7. packages/engine/src/hooks/__tests__/self-test-hook.test.ts 🧪 Tests +244/-0

Comprehensive tests for self-test validation hook

packages/engine/src/hooks/tests/self-test-hook.test.ts


8. packages/engine/src/hooks/__tests__/environment-onboarding-hook.test.ts 🧪 Tests +224/-0

Comprehensive tests for environment onboarding hook

packages/engine/src/hooks/tests/environment-onboarding-hook.test.ts


9. packages/engine/src/hooks/__tests__/loop-detection-hook.test.ts 🧪 Tests +264/-0

Comprehensive tests for loop detection hook

packages/engine/src/hooks/tests/loop-detection-hook.test.ts


10. packages/engine/src/hooks/__tests__/budget-management-hook.test.ts 🧪 Tests +287/-0

Comprehensive tests for budget management hook

packages/engine/src/hooks/tests/budget-management-hook.test.ts


11. packages/engine/src/hooks/__tests__/hook-config.test.ts 🧪 Tests +7/-2

Update hook config test to verify all 8 hooks

packages/engine/src/hooks/tests/hook-config.test.ts


12. packages/engine/src/hooks/config.ts ⚙️ Configuration changes +60/-0

Register 5 new hooks in default hook runner

packages/engine/src/hooks/config.ts


13. packages/engine/src/hooks/index.ts ✨ Enhancement +57/-0

Export new hooks and types from public API

packages/engine/src/hooks/index.ts


Grey Divider

Qodo Logo

@coderabbitai
Copy link

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

Walkthrough

Adds five new engine hooks (TraceAnalysis, SelfTest, EnvironmentOnboarding, LoopDetection, BudgetManagement) with TypeScript implementations, tests, registration/exports, lifecycle behaviors (onRunStart/onBeforeStep/onAfterStep/onRunEnd), metadata injection, optional blocking, and related docs/license/readme changes.

Changes

Cohort / File(s) Summary
New Hook Implementations
packages/engine/src/hooks/budget-management-hook.ts, packages/engine/src/hooks/environment-onboarding-hook.ts, packages/engine/src/hooks/loop-detection-hook.ts, packages/engine/src/hooks/self-test-hook.ts, packages/engine/src/hooks/trace-analysis-hook.ts
Adds five hook modules: configs, types, default configs, error classes, core logic (budget/costs, env profiling, similarity-based loop detection, self-test validation, trace analysis), lifecycle handlers, factories, and exports.
Hook Tests
packages/engine/src/hooks/__tests__/*-hook.test.ts
Adds comprehensive tests for the five hooks covering lifecycle flows, metadata injection, warning/blocking behavior, callbacks, config variations, edge cases, and isEnabled toggles.
Hook Integration & Exports
packages/engine/src/hooks/config.ts, packages/engine/src/hooks/index.ts
Registers new hooks in default runner (env-gated), adds debug logging, and expands public exports/types (including calculateSimilarity).
Hook Registration Tests & Updates
packages/engine/src/hooks/__tests__/hook-config.test.ts
Updates registration test to expect the expanded hook set (now 8 hooks) and include the newly added hooks.
Docs, License, Metadata
000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md, README.md, LICENSE, package.json
Adds enterprise architecture roadmap doc, updates README (version bump, roadmap/diagrams/license), changes license to BUSL-1.1 (and package.json license field).

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent Execution
    participant Runner as Hook Runner
    participant Hook as Hook (Budget/Loop/Test/Trace/Env)
    participant Ctx as AgentRunContext
    participant CB as Optional Callbacks

    Agent->>Runner: onRunStart(ctx)
    Runner->>Hook: onRunStart(ctx)
    Hook->>Ctx: initialize per-run state / profile

    Agent->>Runner: onBeforeStep(ctx)
    Runner->>Hook: onBeforeStep(ctx)
    Hook->>Ctx: compute/inject metadata (warnings, env profile, nudges)
    Hook->>CB: call callbacks if thresholds met

    Agent->>Agent: execute step

    Agent->>Runner: onAfterStep(ctx)
    Runner->>Hook: onAfterStep(ctx)
    Hook->>Hook: analyze outputs (tokens, similarity, tests, traces)
    Hook->>Ctx: attach results (budgetStatus, loopDetection, selfTest, traceAnalysis)
    Hook->>CB: invoke on detection/enforcement callbacks

    Agent->>Runner: onRunEnd(ctx, success)
    Runner->>Hook: onRunEnd(ctx, success)
    Hook->>Ctx: emit aggregated results, cleanup state
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Poem

🐰 I hopped through runs at break of day,

counted tokens, nudged loops along the way.
I profiled homes and flagged each test,
rang alarms when budgets failed the quest.
Hooray for hooks — the rabbit says: well-played!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 43.75% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: adding five harness engineering hooks to the engine, plus README updates and license change to BSL 1.1. It is concise, specific, and clearly highlights the primary deliverable.
Description check ✅ Passed The PR description covers all key aspects: detailed summary of the five hooks with their functionality, README updates, enterprise roadmap, license change, and PRD document. It includes a comprehensive test plan with specific validation steps and follows the template structure with Type of Change, Testing, and Checklist sections.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/harness-engineering-hooks

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello @jeremylongshore, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the agent pipeline by integrating five new 'Harness Engineering' patterns. These additions provide advanced capabilities for monitoring, validating, and optimizing agent behavior, ensuring more robust, efficient, and self-aware operations. The new hooks enable proactive identification of issues like repetitive actions, budget overruns, and lack of testing, while also enriching the agent's contextual understanding of the project environment.

Highlights

  • Trace Analysis Hook (Pattern 5): Implemented a new hook that mines the DecisionTraceStore on run end to identify failure patterns, agent failure rates, low-confidence decisions, and slow failures.
  • Self-Test Hook (Pattern 1): Added a hook to validate CODER output, ensuring it includes test files, detecting test frameworks (vitest/jest/pytest/go-testing), and checking the test-to-source ratio.
  • Environment Onboarding Hook (Pattern 2): Introduced a hook that profiles the repository's tech stack (language, framework, test runner, monorepo) at the start of a run and injects this information into all step contexts.
  • Loop Detection Hook (Pattern 3): Developed a hook that tracks CODER output similarity using Dice coefficient, flagging potential loops after multiple similar outputs.
  • Budget Management Hook (Pattern 6): Created a hook to accumulate token usage and cost per run, injecting warnings at 75% and critical alerts at 90% of the budget.
Changelog
  • packages/engine/src/hooks/tests/budget-management-hook.test.ts
    • Added tests for the Budget Management Hook, covering token tracking, cost estimation, warning levels, blocking mode, cost-based budgets, run end cleanup, and enablement.
  • packages/engine/src/hooks/tests/environment-onboarding-hook.test.ts
    • Added tests for the Environment Onboarding Hook, verifying environment profiling from file lists, detection of various project types (Python, Go, Rust), monorepo detection, context injection, and enablement.
  • packages/engine/src/hooks/tests/hook-config.test.ts
    • Updated hook configuration tests to include the registration of the 5 new agent hooks.
  • packages/engine/src/hooks/tests/loop-detection-hook.test.ts
    • Added tests for the Loop Detection Hook, including similarity calculation, role filtering, loop detection with identical and near-identical outputs, blocking mode, callback functionality, run end cleanup, and configuration options.
  • packages/engine/src/hooks/tests/self-test-hook.test.ts
    • Added tests for the Self-Test Hook, covering role filtering, test file detection, framework detection, assertion detection, 'requireTests' mode, blocking mode, minimum test ratio, and various test pattern matching.
  • packages/engine/src/hooks/tests/trace-analysis-hook.test.ts
    • Added tests for the Trace Analysis Hook, verifying trace analysis on run end, detection of repeated failures, low confidence failures, handling of no traces, callback functionality, agent failure rate calculation, token usage aggregation, and enablement.
  • packages/engine/src/hooks/budget-management-hook.ts
    • Implemented the Budget Management Hook, including its configuration, 'BudgetExceededError', and methods for tracking token usage, estimating cost, injecting warnings, and enforcing blocking.
  • packages/engine/src/hooks/config.ts
    • Updated the hook configuration to import and register the new 'TraceAnalysisHook', 'SelfTestHook', 'EnvironmentOnboardingHook', 'LoopDetectionHook', and 'BudgetManagementHook', along with their respective environment variable toggles.
  • packages/engine/src/hooks/environment-onboarding-hook.ts
    • Implemented the Environment Onboarding Hook, including its configuration, 'EnvironmentProfile' interface, detection rules for various tech stacks, and methods for profiling the environment and injecting the profile into contexts.
  • packages/engine/src/hooks/index.ts
    • Exported the new 'TraceAnalysisHook', 'SelfTestHook', 'EnvironmentOnboardingHook', 'LoopDetectionHook', and 'BudgetManagementHook', along with their related types and factory functions.
  • packages/engine/src/hooks/loop-detection-hook.ts
    • Implemented the Loop Detection Hook, including its configuration, 'LoopDetectionError', a 'calculateSimilarity' function, and methods for tracking CODER outputs, detecting semantic loops, and enforcing blocking.
  • packages/engine/src/hooks/self-test-hook.ts
    • Implemented the Self-Test Hook, including its configuration, 'SelfTestError', 'SelfTestValidation' interface, test framework detection patterns, and methods for validating generated code for test inclusion and ratio.
  • packages/engine/src/hooks/trace-analysis-hook.ts
    • Implemented the Trace Analysis Hook, including its configuration, 'FailurePattern' and 'TraceAnalysisResult' interfaces, and methods for analyzing decision traces, detecting failure patterns, and aggregating metrics.
Activity
  • No human activity has been recorded on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions
Copy link

🤖 Gemini 2.5 Flash Code Review

Files Reviewed: 13

Summary

This PR introduces five new harness engineering hooks: Budget Management, Environment Onboarding, Loop Detection, Self-Test, and Trace Analysis. These hooks enhance the agent's capabilities by providing context, detecting issues, and enforcing constraints, aligning with best practices for robust AI agent development. The changes include new hook implementations, comprehensive unit tests, and integration into the default hook runner configuration.

Critical Issues 🔴

packages/engine/src/hooks/budget-management-hook.ts

  • L197: Division by Zero Risk in calculateStatus
    The calculateStatus method (which is implied to exist and perform calculations like tokenBudgetPercent and costBudgetPercent) might face division by zero if `this.config.max

Powered by Gemini 2.5 Flash

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is an excellent pull request that introduces five valuable new agent hooks, significantly enhancing the engine's capabilities with harness engineering patterns. The implementation of each hook is robust, well-structured, and accompanied by a comprehensive suite of tests. The code quality is high, and the separation of concerns is clear. I have two medium-severity suggestions aimed at improving maintainability by reducing code duplication in the hook registration and simplifying one of the new hooks by removing a redundant method. Overall, this is a great contribution.

Comment on lines +98 to +146
// Register trace analysis hook (enabled by default, opt-out via env)
if (process.env.GWI_TRACE_ANALYSIS_ENABLED !== 'false') {
const traceAnalysisHook = new TraceAnalysisHook();
runner.register(traceAnalysisHook);

if (config.debug) {
logger.debug('Trace analysis hook registered');
}
}

// Register self-test hook (enabled by default, opt-out via env)
if (process.env.GWI_SELF_TEST_HOOK_ENABLED !== 'false') {
const selfTestHook = new SelfTestHook();
runner.register(selfTestHook);

if (config.debug) {
logger.debug('Self-test hook registered');
}
}

// Register environment onboarding hook (enabled by default, opt-out via env)
if (process.env.GWI_ENVIRONMENT_ONBOARDING_ENABLED !== 'false') {
const envOnboardingHook = new EnvironmentOnboardingHook();
runner.register(envOnboardingHook);

if (config.debug) {
logger.debug('Environment onboarding hook registered');
}
}

// Register loop detection hook (enabled by default, opt-out via env)
if (process.env.GWI_LOOP_DETECTION_ENABLED !== 'false') {
const loopDetectionHook = new LoopDetectionHook();
runner.register(loopDetectionHook);

if (config.debug) {
logger.debug('Loop detection hook registered');
}
}

// Register budget management hook (enabled by default, opt-out via env)
if (process.env.GWI_BUDGET_MANAGEMENT_ENABLED !== 'false') {
const budgetHook = new BudgetManagementHook();
runner.register(budgetHook);

if (config.debug) {
logger.debug('Budget management hook registered');
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The registration logic for each of the new hooks is duplicated. This pattern, while clear, makes it more verbose to add or manage hooks in the future. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, you could refactor this by using a configuration array and iterating over it to register the hooks. A similar refactoring could also be applied to the existing hook registrations above this block.

  const newHooks: { envVar: string; hook: () => AgentHook; name: string }[] = [
    { envVar: 'GWI_TRACE_ANALYSIS_ENABLED', hook: () => new TraceAnalysisHook(), name: 'Trace analysis' },
    { envVar: 'GWI_SELF_TEST_HOOK_ENABLED', hook: () => new SelfTestHook(), name: 'Self-test' },
    { envVar: 'GWI_ENVIRONMENT_ONBOARDING_ENABLED', hook: () => new EnvironmentOnboardingHook(), name: 'Environment onboarding' },
    { envVar: 'GWI_LOOP_DETECTION_ENABLED', hook: () => new LoopDetectionHook(), name: 'Loop detection' },
    { envVar: 'GWI_BUDGET_MANAGEMENT_ENABLED', hook: () => new BudgetManagementHook(), name: 'Budget management' },
  ];

  for (const { envVar, hook, name } of newHooks) {
    if (process.env[envVar] !== 'false') {
      runner.register(hook());
      if (config.debug) {
        logger.debug(`${name} hook registered`);
      }
    }
  }
References
  1. Don't Repeat Yourself (DRY): Avoid duplicating code by abstracting common patterns. This makes the code easier to maintain and less prone to errors when changes are needed.

Comment on lines +181 to +188
async onAfterStep(ctx: AgentRunContext): Promise<void> {
if (!this.config.injectIntoSteps) return;

const profile = this.runProfiles.get(ctx.runId);
if (profile && ctx.metadata) {
ctx.metadata.environmentProfile = profile;
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The onAfterStep method appears to be redundant. The environment profile is generated in onRunStart and injected into the initial context. For subsequent steps, the onBeforeStep method already injects the profile, ensuring it's available to the agent during its execution. The onAfterStep method re-injects the same static data, which is unnecessary. Removing this method would simplify the hook's implementation without affecting its functionality.

References
  1. Code should be as simple as possible. Remove redundant or unnecessary code to improve clarity and maintainability.

@greptile-apps
Copy link

greptile-apps bot commented Feb 18, 2026

Greptile Summary

Adds 5 new agent pipeline hooks implementing LangChain DeepAgents "Harness Engineering" patterns for production agent quality:

  • Trace Analysis (Pattern 5): Mines DecisionTraceStore on run completion to detect agent failure patterns, low-confidence decisions, and timeouts
  • Self-Test (Pattern 1): Validates CODER output includes tests, detects frameworks (vitest/jest/pytest/go-testing), enforces test-to-source ratios
  • Environment Onboarding (Pattern 2): Profiles repo tech stack on run start from context metadata, injects profile into all step contexts
  • Loop Detection (Pattern 3): Tracks CODER output similarity via Dice coefficient, flags loops after 3+ outputs at 80%+ similarity
  • Budget Management (Pattern 6): Accumulates token usage and cost per run, injects warnings at 75% and critical alerts at 90%

All hooks follow established patterns: AgentHook interface, env-var opt-out toggles (default ON), factory functions, custom error classes, onBlocked callbacks. Implementation quality is excellent with 82 new tests across 5 test suites, comprehensive coverage, and clean integration with existing hook infrastructure.

Confidence Score: 5/5

  • Safe to merge with high confidence
  • All hooks follow established patterns, comprehensive test coverage (82 new tests), clean integration with existing infrastructure, and proper error handling
  • No files require special attention

Important Files Changed

Filename Overview
packages/engine/src/hooks/budget-management-hook.ts Implements Pattern 6 token/cost tracking with warnings at 75%/90% thresholds, clean implementation
packages/engine/src/hooks/environment-onboarding-hook.ts Implements Pattern 2 environment profiling from context metadata, comprehensive config detection
packages/engine/src/hooks/loop-detection-hook.ts Implements Pattern 3 with Dice coefficient similarity, tracks CODER outputs effectively
packages/engine/src/hooks/self-test-hook.ts Implements Pattern 1 test validation with framework detection and test-to-source ratio checks
packages/engine/src/hooks/trace-analysis-hook.ts Implements Pattern 5 trace mining, analyzes failure patterns and agent failure rates from DecisionTraceStore
packages/engine/src/hooks/config.ts Registers all 8 hooks with env-var opt-out toggles, follows existing pattern

Flowchart

flowchart TD
    Start[Run Start] --> EnvOnboard[Environment Onboarding Hook<br/>Pattern 2: Profile tech stack]
    EnvOnboard --> StepLoop{For Each Step}
    
    StepLoop --> BeforeStep[Before Step]
    BeforeStep --> BudgetCheck[Budget Management Hook<br/>Pattern 6: Check token/cost budget]
    BudgetCheck --> RiskCheck[Risk Enforcement Hook<br/>Check operation risk tier]
    RiskCheck -->|Approved| StepExec[Execute Agent Step]
    RiskCheck -->|Blocked| BlockError[Throw Error]
    
    StepExec --> AfterStep[After Step]
    AfterStep --> SelfTest{Is CODER?}
    SelfTest -->|Yes| SelfTestHook[Self-Test Hook<br/>Pattern 1: Validate tests included]
    SelfTest -->|No| LoopDetect
    SelfTestHook --> LoopDetect{Is CODER?}
    
    LoopDetect -->|Yes| LoopHook[Loop Detection Hook<br/>Pattern 3: Check output similarity]
    LoopDetect -->|No| CodeQuality
    LoopHook --> CodeQuality{Is CODER?}
    
    CodeQuality -->|Yes| QualityHook[Code Quality Hook<br/>Check slop score]
    CodeQuality -->|No| BudgetUpdate
    QualityHook --> BudgetUpdate[Budget Management Hook<br/>Update token usage]
    
    BudgetUpdate --> DecisionTrace[Decision Trace Hook<br/>Phase 35: Record trace]
    DecisionTrace --> StepLoop
    
    StepLoop --> RunEnd[Run End]
    RunEnd --> TraceAnalysis[Trace Analysis Hook<br/>Pattern 5: Mine failure patterns]
    TraceAnalysis --> End[End]
    
    style EnvOnboard fill:#e1f5ff
    style SelfTestHook fill:#e1f5ff
    style LoopHook fill:#e1f5ff
    style BudgetCheck fill:#e1f5ff
    style BudgetUpdate fill:#e1f5ff
    style TraceAnalysis fill:#e1f5ff
    style QualityHook fill:#fff9e1
    style RiskCheck fill:#ffe1e1
    style DecisionTrace fill:#f0e1ff
Loading

Last reviewed commit: c554e6f

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

13 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@qodo-code-review
Copy link

qodo-code-review bot commented Feb 18, 2026

Code Review by Qodo

🐞 Bugs (4) 📘 Rule violations (3) 📎 Requirement gaps (0)

Grey Divider


Action required

1. CI gates not validated 📘 Rule violation ⛯ Reliability
Description
The PR description indicates CI has not yet validated the required build, typecheck, test, and
arv gates. This does not meet the compliance requirement to demonstrate passing verification for
the proposed changes.
Code

packages/engine/src/hooks/config.ts[R98-146]

+  // Register trace analysis hook (enabled by default, opt-out via env)
+  if (process.env.GWI_TRACE_ANALYSIS_ENABLED !== 'false') {
+    const traceAnalysisHook = new TraceAnalysisHook();
+    runner.register(traceAnalysisHook);
+
+    if (config.debug) {
+      logger.debug('Trace analysis hook registered');
+    }
+  }
+
+  // Register self-test hook (enabled by default, opt-out via env)
+  if (process.env.GWI_SELF_TEST_HOOK_ENABLED !== 'false') {
+    const selfTestHook = new SelfTestHook();
+    runner.register(selfTestHook);
+
+    if (config.debug) {
+      logger.debug('Self-test hook registered');
+    }
+  }
+
+  // Register environment onboarding hook (enabled by default, opt-out via env)
+  if (process.env.GWI_ENVIRONMENT_ONBOARDING_ENABLED !== 'false') {
+    const envOnboardingHook = new EnvironmentOnboardingHook();
+    runner.register(envOnboardingHook);
+
+    if (config.debug) {
+      logger.debug('Environment onboarding hook registered');
+    }
+  }
+
+  // Register loop detection hook (enabled by default, opt-out via env)
+  if (process.env.GWI_LOOP_DETECTION_ENABLED !== 'false') {
+    const loopDetectionHook = new LoopDetectionHook();
+    runner.register(loopDetectionHook);
+
+    if (config.debug) {
+      logger.debug('Loop detection hook registered');
+    }
+  }
+
+  // Register budget management hook (enabled by default, opt-out via env)
+  if (process.env.GWI_BUDGET_MANAGEMENT_ENABLED !== 'false') {
+    const budgetHook = new BudgetManagementHook();
+    runner.register(budgetHook);
+
+    if (config.debug) {
+      logger.debug('Budget management hook registered');
+    }
+  }
Evidence
Compliance requires CI (or equivalent verification) success for build, typecheck, test, and
arv, but the PR explicitly shows CI is not validated yet while introducing new hook registrations.

CLAUDE.md
packages/engine/src/hooks/config.ts[98-146]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The PR does not currently provide CI (or equivalent) passing results for the required `build`, `typecheck`, `test`, and `arv` gates.
## Issue Context
The PR description explicitly indicates CI validation is still pending.
## Fix Focus Areas
- packages/engine/src/hooks/config.ts[98-146]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


✅ 2. metadata.files not Zod-validated 📘 Rule violation ⛨ Security
Description
ctx.metadata.files and ctx.metadata.repoRoot are used via type casts without Zod validation or
bounds, even though they are external/untrusted inputs carried in context metadata. This can allow
malformed/oversized inputs to affect profiling logic and downstream context injection.
Code

packages/engine/src/hooks/environment-onboarding-hook.ts[R231-234]

+    // Extract file list from metadata (populated by triage or previous steps)
+    const files = (ctx.metadata?.files as string[]) ?? [];
+    const repoRoot = ctx.metadata?.repoRoot as string | undefined;
+
Evidence
The compliance rule requires Zod validation with bounded constraints for external/untrusted inputs,
but the code directly casts ctx.metadata?.files/repoRoot and proceeds without schema parsing or
size limits.

AGENTS.md
packages/engine/src/hooks/environment-onboarding-hook.ts[231-234]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`EnvironmentOnboardingHook` reads `ctx.metadata.files` and `ctx.metadata.repoRoot` via type casts without Zod validation or bounded constraints.
## Issue Context
These values originate from context metadata and should be treated as untrusted/external.
## Fix Focus Areas
- packages/engine/src/hooks/environment-onboarding-hook.ts[231-234]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Self-test block won't block 🐞 Bug ✓ Correctness
Description
SelfTestHook attempts to enforce blocking by throwing from onAfterStep, but
AgentHookRunner.afterStep catches hook errors and continues, so pipeline execution will not be
blocked. This makes enforceBlocking misleading/non-functional in real runs.
Code

packages/engine/src/hooks/self-test-hook.ts[R158-163]

+    const reason = `Self-test validation failed: ${validation.failureReasons.join('; ')}`;
+
+    if (this.config.enforceBlocking) {
+      await this.config.onBlocked?.(ctx, reason);
+      throw new SelfTestError(reason, validation);
+    }
Evidence
SelfTestHook throws onAfterStep when enforceBlocking is enabled, but the hook contract explicitly
says onAfterStep should not throw and the hook runner’s afterStep path executes hooks via a wrapper
that catches errors and returns a result (no rethrow), so the pipeline will continue.

packages/engine/src/hooks/self-test-hook.ts[158-163]
packages/engine/src/hooks/types.ts[151-169]
packages/engine/src/hooks/runner.ts[203-231]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`SelfTestHook` currently enforces blocking by throwing from `onAfterStep`, but the hook runner swallows `onAfterStep` errors, so this does not block the pipeline.
## Issue Context
The hook system contract indicates only `onBeforeStep` is intended to block execution. The autopilot flow uses `beforeStep` for operations like `write_file` (apply patches).
## Fix Focus Areas
- packages/engine/src/hooks/self-test-hook.ts[130-171]
- packages/engine/src/hooks/types.ts[151-169]
- packages/engine/src/hooks/runner.ts[203-237]
- packages/engine/src/run/autopilot-executor.ts[292-302]
## Implementation notes
- Add internal per-run state (e.g., `Map&amp;amp;amp;lt;runId, SelfTestValidation&amp;amp;amp;gt;`)
- In `onAfterStep`, compute &amp;amp;amp;amp; store validation, but do not throw
- Implement `onBeforeStep`:
- if `enforceBlocking` and operation/step implies applying changes (e.g., `ctx.metadata?.operation === &amp;amp;amp;#x27;write_file&amp;amp;amp;#x27;`) and stored validation failed, call `onBlocked` and throw `SelfTestError`
- Clear stored state in `onRunEnd`

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


View more (2)
4. Budget resets on runStart 🐞 Bug ✓ Correctness
Description
BudgetManagementHook overwrites the budget state on every onRunStart call. The engine calls
hookRunner.runStart multiple times per runId (e.g., FOREMAN run-start and later CODER validation),
causing budget tracking to reset mid-run and undercount usage.
Code

packages/engine/src/hooks/budget-management-hook.ts[R125-132]

+  async onRunStart(ctx: AgentRunContext): Promise<void> {
+    this.runBudgets.set(ctx.runId, {
+      inputTokens: 0,
+      outputTokens: 0,
+      stepsCompleted: 0,
+      startTime: Date.now(),
+    });
+  }
Evidence
BudgetManagementHook uses Map.set unconditionally on onRunStart. AutopilotExecutor invokes
hookRunner.runStart more than once for the same runId as part of role validation, so the stored
budget will be replaced and totals reset.

packages/engine/src/hooks/budget-management-hook.ts[125-132]
packages/engine/src/run/autopilot-executor.ts[237-265]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`BudgetManagementHook.onRunStart` resets budget tracking state every time it’s invoked. The engine uses `runStart` more than once per runId, which can reset tracking mid-run.
## Issue Context
Autopilot executes `hookRunner.runStart` for FOREMAN and later CODER role validation using the same `runId`.
## Fix Focus Areas
- packages/engine/src/hooks/budget-management-hook.ts[125-132]
- packages/engine/src/run/autopilot-executor.ts[237-265]
## Implementation notes
- Change `onRunStart` to:
- `if (this.runBudgets.has(ctx.runId)) return;`
- otherwise initialize
- Consider whether `startTime` should be preserved from the first initialization
- Add/adjust unit tests to cover multiple `onRunStart` calls with the same runId

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Trace analysis lacks tenant 🐞 Bug ⛨ Security
Description
TraceAnalysisHook queries traces by runId only, ignoring ctx.tenantId even though the system is
multi-tenant. This can mix traces across tenants (if runId collides or the store is shared) and leak
data via ctx.metadata.traceAnalysis.
Code

packages/engine/src/hooks/trace-analysis-hook.ts[R108-114]

+  async onRunEnd(ctx: AgentRunContext, success: boolean): Promise<void> {
+    try {
+      const traces = await this.store.getTracesForRun(ctx.runId);
+
+      if (traces.length === 0) {
+        logger.debug('No traces found for run, skipping analysis', { runId: ctx.runId });
+        return;
Evidence
The pipeline contract includes tenantId; decision traces are tenant-scoped; and the store supports
tenant filtering via listTraces. TraceAnalysisHook currently uses getTracesForRun(runId) without
tenant filtering, then attaches analysis to ctx.metadata.

packages/engine/src/hooks/trace-analysis-hook.ts[108-114]
packages/core/src/a2a/contracts.ts[460-465]
packages/core/src/context-graph/decision-trace.ts[166-173]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`TraceAnalysisHook` fetches traces by `runId` only and ignores `tenantId`, which can cause cross-tenant mixing/leakage in multi-tenant environments.
## Issue Context
- Pipeline requests include `tenantId`.
- DecisionTraceStore supports filtering by `tenantId`.
## Fix Focus Areas
- packages/engine/src/hooks/trace-analysis-hook.ts[108-116]
- packages/core/src/context-graph/decision-trace.ts[166-209]
## Implementation notes
- Replace `getTracesForRun(ctx.runId)` with:
- `ctx.tenantId ? store.listTraces({ runId: ctx.runId, tenantId: ctx.tenantId }) : store.getTracesForRun(ctx.runId)`
- Add unit test ensuring tenantId is applied when present
- Consider adding tenantId to analysis logs for correlation

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

6. Budget config lacks bounds 📘 Rule violation ⛯ Reliability
Description
Budget percentage calculations can produce Infinity/NaN when maxTotalTokens or maxCostUsd
are configured as 0 or non-finite values. This is an unhandled edge case that can break
warning/blocking behavior.
Code

packages/engine/src/hooks/budget-management-hook.ts[R240-252]

+    const tokenBudgetPercent = (totalTokens / this.config.maxTotalTokens) * 100;
+    const costBudgetPercent = (estimatedCostUsd / this.config.maxCostUsd) * 100;
+
+    const maxPercent = Math.max(tokenBudgetPercent, costBudgetPercent);
+
+    let warningLevel: BudgetStatus['warningLevel'] = 'none';
+    if (maxPercent >= 100) {
+      warningLevel = 'exceeded';
+    } else if (maxPercent >= this.config.criticalThresholdPercent) {
+      warningLevel = 'critical';
+    } else if (maxPercent >= this.config.warningThresholdPercent) {
+      warningLevel = 'approaching';
+    }
Evidence
The compliance rule requires explicit handling of boundary values; dividing by configurable values
without validating they are positive and finite can lead to incorrect warnings or crashes
downstream.

Rule 3: Generic: Robust Error Handling and Edge Case Management
packages/engine/src/hooks/budget-management-hook.ts[240-252]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Budget calculations divide by configurable values without guarding against zero/non-finite inputs.
## Issue Context
`maxTotalTokens` and `maxCostUsd` can be overridden via constructor config.
## Fix Focus Areas
- packages/engine/src/hooks/budget-management-hook.ts[118-120]
- packages/engine/src/hooks/budget-management-hook.ts[240-252]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


7. Budget warnings misleading 🐞 Bug ✓ Correctness
Description
Budget warning level is computed using the max of token and cost budget percentages, but the
injected warning/error messages always report tokenBudgetPercent. If cost is the limiting factor,
users will see an incorrect % in warnings/exceeded errors.
Code

packages/engine/src/hooks/budget-management-hook.ts[R147-155]

+      if (status.warningLevel === 'approaching') {
+        ctx.metadata.budgetWarning =
+          `Budget ${status.tokenBudgetPercent.toFixed(0)}% consumed. ` +
+          `Prioritize completing current work and verifying output.`;
+      } else if (status.warningLevel === 'critical') {
+        ctx.metadata.budgetWarning =
+          `Budget ${status.tokenBudgetPercent.toFixed(0)}% consumed (CRITICAL). ` +
+          `Wrap up immediately. Skip non-essential work. Verify and finalize.`;
+      }
Evidence
calculateStatus chooses warningLevel based on max(tokenBudgetPercent, costBudgetPercent), but
onBeforeStep warning strings and exceeded error message only mention tokenBudgetPercent, which can
diverge from the actual trigger metric.

packages/engine/src/hooks/budget-management-hook.ts[240-252]
packages/engine/src/hooks/budget-management-hook.ts[147-155]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Budget warnings are triggered by either token or cost thresholds (whichever is higher), but the human-readable warning strings always show token percentage, which can be misleading.
## Issue Context
`warningLevel` is based on `maxPercent = max(tokenBudgetPercent, costBudgetPercent)`.
## Fix Focus Areas
- packages/engine/src/hooks/budget-management-hook.ts[147-165]
- packages/engine/src/hooks/budget-management-hook.ts[240-263]
## Implementation notes
- Compute `limiting = tokenBudgetPercent &amp;amp;amp;gt;= costBudgetPercent ? &amp;amp;amp;#x27;tokens&amp;amp;amp;#x27; : &amp;amp;amp;#x27;cost&amp;amp;amp;#x27;`
- Include both values in the warning (e.g., `tokens=20%, cost=92% (limiting)`)
- For exceeded errors, include which budget exceeded and the relevant percentage

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

Comment on lines +98 to +146
// Register trace analysis hook (enabled by default, opt-out via env)
if (process.env.GWI_TRACE_ANALYSIS_ENABLED !== 'false') {
const traceAnalysisHook = new TraceAnalysisHook();
runner.register(traceAnalysisHook);

if (config.debug) {
logger.debug('Trace analysis hook registered');
}
}

// Register self-test hook (enabled by default, opt-out via env)
if (process.env.GWI_SELF_TEST_HOOK_ENABLED !== 'false') {
const selfTestHook = new SelfTestHook();
runner.register(selfTestHook);

if (config.debug) {
logger.debug('Self-test hook registered');
}
}

// Register environment onboarding hook (enabled by default, opt-out via env)
if (process.env.GWI_ENVIRONMENT_ONBOARDING_ENABLED !== 'false') {
const envOnboardingHook = new EnvironmentOnboardingHook();
runner.register(envOnboardingHook);

if (config.debug) {
logger.debug('Environment onboarding hook registered');
}
}

// Register loop detection hook (enabled by default, opt-out via env)
if (process.env.GWI_LOOP_DETECTION_ENABLED !== 'false') {
const loopDetectionHook = new LoopDetectionHook();
runner.register(loopDetectionHook);

if (config.debug) {
logger.debug('Loop detection hook registered');
}
}

// Register budget management hook (enabled by default, opt-out via env)
if (process.env.GWI_BUDGET_MANAGEMENT_ENABLED !== 'false') {
const budgetHook = new BudgetManagementHook();
runner.register(budgetHook);

if (config.debug) {
logger.debug('Budget management hook registered');
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. Ci gates not validated 📘 Rule violation ⛯ Reliability

The PR description indicates CI has not yet validated the required build, typecheck, test, and
arv gates. This does not meet the compliance requirement to demonstrate passing verification for
the proposed changes.
Agent Prompt
## Issue description
The PR does not currently provide CI (or equivalent) passing results for the required `build`, `typecheck`, `test`, and `arv` gates.

## Issue Context
The PR description explicitly indicates CI validation is still pending.

## Fix Focus Areas
- packages/engine/src/hooks/config.ts[98-146]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +231 to +234
// Extract file list from metadata (populated by triage or previous steps)
const files = (ctx.metadata?.files as string[]) ?? [];
const repoRoot = ctx.metadata?.repoRoot as string | undefined;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. metadata.files not zod-validated 📘 Rule violation ⛨ Security

ctx.metadata.files and ctx.metadata.repoRoot are used via type casts without Zod validation or
bounds, even though they are external/untrusted inputs carried in context metadata. This can allow
malformed/oversized inputs to affect profiling logic and downstream context injection.
Agent Prompt
## Issue description
`EnvironmentOnboardingHook` reads `ctx.metadata.files` and `ctx.metadata.repoRoot` via type casts without Zod validation or bounded constraints.

## Issue Context
These values originate from context metadata and should be treated as untrusted/external.

## Fix Focus Areas
- packages/engine/src/hooks/environment-onboarding-hook.ts[231-234]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +158 to +163
const reason = `Self-test validation failed: ${validation.failureReasons.join('; ')}`;

if (this.config.enforceBlocking) {
await this.config.onBlocked?.(ctx, reason);
throw new SelfTestError(reason, validation);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Self-test block won't block 🐞 Bug ✓ Correctness

SelfTestHook attempts to enforce blocking by throwing from onAfterStep, but
AgentHookRunner.afterStep catches hook errors and continues, so pipeline execution will not be
blocked. This makes enforceBlocking misleading/non-functional in real runs.
Agent Prompt
## Issue description
`SelfTestHook` currently enforces blocking by throwing from `onAfterStep`, but the hook runner swallows `onAfterStep` errors, so this does not block the pipeline.

## Issue Context
The hook system contract indicates only `onBeforeStep` is intended to block execution. The autopilot flow uses `beforeStep` for operations like `write_file` (apply patches).

## Fix Focus Areas
- packages/engine/src/hooks/self-test-hook.ts[130-171]
- packages/engine/src/hooks/types.ts[151-169]
- packages/engine/src/hooks/runner.ts[203-237]
- packages/engine/src/run/autopilot-executor.ts[292-302]

## Implementation notes
- Add internal per-run state (e.g., `Map<runId, SelfTestValidation>`)
- In `onAfterStep`, compute & store validation, but do not throw
- Implement `onBeforeStep`:
  - if `enforceBlocking` and operation/step implies applying changes (e.g., `ctx.metadata?.operation === 'write_file'`) and stored validation failed, call `onBlocked` and throw `SelfTestError`
- Clear stored state in `onRunEnd`

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +125 to +132
async onRunStart(ctx: AgentRunContext): Promise<void> {
this.runBudgets.set(ctx.runId, {
inputTokens: 0,
outputTokens: 0,
stepsCompleted: 0,
startTime: Date.now(),
});
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

4. Budget resets on runstart 🐞 Bug ✓ Correctness

BudgetManagementHook overwrites the budget state on every onRunStart call. The engine calls
hookRunner.runStart multiple times per runId (e.g., FOREMAN run-start and later CODER validation),
causing budget tracking to reset mid-run and undercount usage.
Agent Prompt
## Issue description
`BudgetManagementHook.onRunStart` resets budget tracking state every time it’s invoked. The engine uses `runStart` more than once per runId, which can reset tracking mid-run.

## Issue Context
Autopilot executes `hookRunner.runStart` for FOREMAN and later CODER role validation using the same `runId`.

## Fix Focus Areas
- packages/engine/src/hooks/budget-management-hook.ts[125-132]
- packages/engine/src/run/autopilot-executor.ts[237-265]

## Implementation notes
- Change `onRunStart` to:
  - `if (this.runBudgets.has(ctx.runId)) return;`
  - otherwise initialize
- Consider whether `startTime` should be preserved from the first initialization
- Add/adjust unit tests to cover multiple `onRunStart` calls with the same runId

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Comment on lines +108 to +114
async onRunEnd(ctx: AgentRunContext, success: boolean): Promise<void> {
try {
const traces = await this.store.getTracesForRun(ctx.runId);

if (traces.length === 0) {
logger.debug('No traces found for run, skipping analysis', { runId: ctx.runId });
return;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

5. Trace analysis lacks tenant 🐞 Bug ⛨ Security

TraceAnalysisHook queries traces by runId only, ignoring ctx.tenantId even though the system is
multi-tenant. This can mix traces across tenants (if runId collides or the store is shared) and leak
data via ctx.metadata.traceAnalysis.
Agent Prompt
## Issue description
`TraceAnalysisHook` fetches traces by `runId` only and ignores `tenantId`, which can cause cross-tenant mixing/leakage in multi-tenant environments.

## Issue Context
- Pipeline requests include `tenantId`.
- DecisionTraceStore supports filtering by `tenantId`.

## Fix Focus Areas
- packages/engine/src/hooks/trace-analysis-hook.ts[108-116]
- packages/core/src/context-graph/decision-trace.ts[166-209]

## Implementation notes
- Replace `getTracesForRun(ctx.runId)` with:
  - `ctx.tenantId ? store.listTraces({ runId: ctx.runId, tenantId: ctx.tenantId }) : store.getTracesForRun(ctx.runId)`
- Add unit test ensuring tenantId is applied when present
- Consider adding tenantId to analysis logs for correlation

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/engine/src/hooks/environment-onboarding-hook.ts`:
- Around line 78-85: The detection mistakenly tags repos as TypeScript because
the file descriptor for { file: 'package.json', language: 'typescript' } and the
broad pattern '\.[jt]sx?$' force TS; remove the explicit language: 'typescript'
from the package.json descriptor and split the language detection so JS vs TS
are determined by specific file patterns (use /\.tsx?$/ to mark TypeScript and
/\.jsx?$/ to mark JavaScript) and/or presence of tsconfig.json rather than
package.json; apply the same fix to the second occurrence noted around lines
251-258 so both lists no longer assign TypeScript based solely on package.json
or the mixed '\.[jt]sx?$' pattern.
- Around line 48-60: Add Zod validation/schema for ctx.metadata.files before it
is used: define a filesSchema using z.array(z.object({...})) with appropriate
per-file fields and apply .max(1000) to bound entries, import z from "zod",
parse ctx.metadata.files (e.g., const files =
filesSchema.parse(ctx.metadata.files)) and then use the validated files variable
in the rest of the logic; update any code that referenced ctx.metadata.files
directly to use the validated variable so unbounded arrays are never processed
(use names like filesSchema and filesParsed or parseFiles to locate changes).

In `@packages/engine/src/hooks/loop-detection-hook.ts`:
- Around line 39-60: The LoopDetectionConfig is accepted without validation and
the output summary length is unbounded; add a Zod schema for LoopDetectionConfig
and validate the merged config before use (create a zod schema matching
LoopDetectionConfig, enforce similarityThreshold ∈ [0,1], maxSimilarOutputs and
maxTrackedOutputs ≥ 1, and add a new maxOutputChars with sensible min/max
bounds), add maxOutputChars to DEFAULT_LOOP_DETECTION_CONFIG, and truncate any
outputSummary text to maxOutputChars before running the similarity algorithm
(update wherever the config is merged/used and reference LoopDetectionConfig,
DEFAULT_LOOP_DETECTION_CONFIG, and the similarity routine that reads
outputSummary to apply the truncation).

In `@packages/engine/src/hooks/self-test-hook.ts`:
- Around line 189-196: The loop that classifies entries from files into
testFiles and sourceFiles currently treats all non-test files as source, which
inflates sourceFileCount; change the classification to first filter by a
whitelist of code file extensions (e.g., .ts, .js, .tsx, .jsx, .py, .go, .java,
.c, .cpp, .rs, etc.) before deciding test vs source so docs/config and config
files are ignored—implement an allowedExtensions array and check the file.path
extension (or mimetype) before running this.config.testPatterns; apply the same
filtering logic to the second similar classification block (the other files ->
testFiles/sourceFiles loop) so both places only count real source code files
when computing minTestRatio.
- Around line 47-76: Add Zod schema validation for the SelfTestConfig and for
ctx.metadata.generatedFiles: create a zod schema SelfTestConfigSchema that
validates requireTests (boolean), minTestRatio (number between 0 and 1
inclusive), enforceBlocking (boolean), and testPatterns (array of strings or
regex-like strings) and use DEFAULT_SELF_TEST_CONFIG as defaults via
zod.default/transform when parsing the incoming config; also create a
GeneratedFileSchema and GeneratedFilesSchema validating an array of objects with
filename (string), size (number >=0) and enforce bounded constraints (max files
e.g., 1000, max total size e.g., 50_000_000 bytes) and parse
ctx.metadata.generatedFiles through this schema before any use. Replace any
direct merging/usage of the raw config and any direct access to
ctx.metadata.generatedFiles (locations around SelfTestConfig related merges and
the code that reads generatedFiles at the referenced lines) to use the
parsed/validated values and throw or call onBlocked when validation fails.

In `@packages/engine/src/hooks/trace-analysis-hook.ts`:
- Around line 60-96: Add a Zod schema for TraceAnalysisConfig (validate
minPatternCount as an integer >= 1 and logResults as boolean, onAnalysis as
optional function) and use it in the TraceAnalysisHook constructor to validate
the merged config (DEFAULT_TRACE_ANALYSIS_CONFIG + config) before assigning to
this.config; if validation fails, throw or normalize to safe defaults so invalid
values (e.g., minPatternCount <= 0) cannot reach pattern detection logic in
methods that reference minPatternCount. Update types to derive from the Zod
schema so TraceAnalysisConfig aligns with the validator and ensure the
constructor uses the schema parse/parseAsync call to enforce the bounded
constraints.

Comment on lines +48 to +60
export interface EnvironmentOnboardingConfig {
/** Whether to inject profile into all step contexts. @default true */
injectIntoSteps: boolean;
/** Callback when environment is profiled */
onProfiled?: (profile: EnvironmentProfile) => Promise<void>;
}

/**
* Default configuration
*/
export const DEFAULT_ENV_ONBOARDING_CONFIG: EnvironmentOnboardingConfig = {
injectIntoSteps: true,
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and read the environment-onboarding-hook.ts file
find . -name "environment-onboarding-hook.ts" -type f

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 141


🏁 Script executed:

# Get file size first to see if we can read it entirely
wc -l packages/engine/src/hooks/environment-onboarding-hook.ts 2>/dev/null || echo "File not found in expected path"

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 143


🏁 Script executed:

# Check the actual file location and contents
fd "environment-onboarding-hook" --type f

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 211


🏁 Script executed:

# Search more broadly for the file
find . -name "*environment-onboarding*" -type f

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 215


🏁 Script executed:

# Check the repo structure
ls -la packages/engine/src/hooks/ 2>/dev/null | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1100


🏁 Script executed:

# List TypeScript files in hooks to understand available files
fd "\.ts$" packages/engine/src/hooks/ 2>/dev/null | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1132


🏁 Script executed:

# Read the environment-onboarding-hook.ts file with line numbers
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | head -80

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 3002


🏁 Script executed:

# Read around lines 231-233
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | sed -n '225,240p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 839


🏁 Script executed:

# Check for existing Zod usage in the hooks directory
rg -n "zod|Zod" packages/engine/src/hooks --type ts

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

# Check imports at the top of the environment-onboarding-hook.ts file
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | head -30

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1242


🏁 Script executed:

# Search for Zod usage more broadly in the codebase
rg -n "from 'zod'|from \"zod\"" --type ts --type tsx | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 110


🏁 Script executed:

# Check the types.ts file to understand AgentRunContext structure
cat -n packages/engine/src/hooks/types.ts | head -100

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 3074


🏁 Script executed:

# Search for where metadata.files is populated/set
rg -n "metadata\.files\s*=" --type ts --type tsx -B 2 -A 2

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 110


🏁 Script executed:

# Check if there are validation patterns used elsewhere for similar array inputs
rg -n "max.*entries|max.*items|bounded.*array" --type ts --type tsx packages/engine/src/hooks/ | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 110


🏁 Script executed:

# Search for Zod usage without specifying file type
rg -n "from 'zod'|from \"zod\"|import.*zod" packages/engine/src/ | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 512


🏁 Script executed:

# Check where metadata.files gets populated
rg -n "metadata.files\|metadata\['files'\]" --type ts packages/ -B 3 -A 3 | head -50

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

# Check the full environment-onboarding-hook.ts file from line 200 onwards
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | sed -n '200,290p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 3797


🏁 Script executed:

# Search for validation patterns in the codebase
rg -n "z\.array|z\.string|\.max\(" packages/engine/src/ --type ts | grep -i "schema\|validate" | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1933


🏁 Script executed:

# Check where ctx.metadata gets initialized or populated
rg -n "metadata\s*[:=]|\.metadata\s*=" packages/engine/src/ --type ts -B 2 -A 2 | head -60

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 4012


🏁 Script executed:

# Check the AgentRunContext type definition more fully
cat -n packages/engine/src/hooks/types.ts | sed -n '100,160p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1915


🏁 Script executed:

# Look at how other hooks handle ctx.metadata
rg -n "ctx\.metadata" packages/engine/src/hooks/ --type ts -B 1 -A 1 | head -50

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 3934


🏁 Script executed:

# Check validation patterns in step-contract/validation.ts to understand Zod usage patterns in this codebase
cat -n packages/engine/src/step-contract/validation.ts | head -60

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1940


🏁 Script executed:

# Check if there are existing patterns in other hooks for validating array inputs
rg -n "const.*files.*=.*metadata" packages/engine/src/ --type ts -B 2 -A 5

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 926


🏁 Script executed:

# Look at self-test-hook.ts to see how it handles metadata without validation
cat -n packages/engine/src/hooks/self-test-hook.ts | sed -n '130,150p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 844


🏁 Script executed:

# Verify the language detection logic more closely
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | sed -n '75,85p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 636


🏁 Script executed:

# Check if there are input validation guards anywhere in the codebase for similar cases
rg -n "files.*length|files.*size|array.*bound" packages/engine/src/ --type ts | head -10

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1023


🏁 Script executed:

# Verify the exact location of the issue - check lines around 231-233
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | sed -n '220,245p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1450


🏁 Script executed:

# Check if Zod is already imported in this file
head -20 packages/engine/src/hooks/environment-onboarding-hook.ts | grep -n "import"

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 189


🏁 Script executed:

# Look at how Zod schemas with bounded arrays are defined in the codebase
rg -A 2 "z\.array.*max" packages/engine/src/ --type ts | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 298


🏁 Script executed:

# Check if metadata.files is populated from external sources
rg -n "metadata.*files\s*=" packages/ --type ts -B 3 | head -40

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

# Verify where the snippet (lines 48-60) is in the file vs the actual issue
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | sed -n '48,60p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 571


🏁 Script executed:

# Double-check if there's any validation of ctx.metadata.files anywhere in the function
cat -n packages/engine/src/hooks/environment-onboarding-hook.ts | sed -n '222,277p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 2610


🏁 Script executed:

# Check if similar metadata access patterns in other hooks have validation
rg -n "metadata\?" packages/engine/src/hooks/ --type ts -A 1 | grep -E "(validate|schema|z\.|Zod)" | head -10

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

# Verify the test file to see if there are any validation tests for this function
cat -n packages/engine/src/hooks/__tests__/environment-onboarding-hook.test.ts | head -80

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 2799


🏁 Script executed:

# Check if there's a pattern of how Zod is used with string array validation
rg -B 2 -A 2 "z\.array\(z\.string" packages/engine/src/ --type ts | head -30

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 2113


Add Zod validation with bounded constraints for ctx.metadata.files at line 232.

The file list is extracted without validation at line 232 where it's read from ctx.metadata.files. Per the coding guidelines, all external data must use Zod schemas with bounded constraints (max 1000 file entries per constraints). Add validation to prevent unbounded array processing.

Suggested change (schema + parse)
 import { getLogger } from '@gwi/core';
+import { z } from 'zod';
 import type { AgentHook, AgentRunContext } from './types.js';
 
 const logger = getLogger('environment-onboarding-hook');
 
+const FilesSchema = z.array(z.string().max(512)).max(1000);
+
 // =============================================================================
 // Types
 // =============================================================================
@@ -230,7 +231,7 @@
     let isMonorepo = false;
 
     // Extract file list from metadata (populated by triage or previous steps)
-    const files = (ctx.metadata?.files as string[]) ?? [];
+    const files = FilesSchema.parse((ctx.metadata?.files as string[]) ?? []);
     const repoRoot = ctx.metadata?.repoRoot as string | undefined;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/environment-onboarding-hook.ts` around lines 48 -
60, Add Zod validation/schema for ctx.metadata.files before it is used: define a
filesSchema using z.array(z.object({...})) with appropriate per-file fields and
apply .max(1000) to bound entries, import z from "zod", parse ctx.metadata.files
(e.g., const files = filesSchema.parse(ctx.metadata.files)) and then use the
validated files variable in the rest of the logic; update any code that
referenced ctx.metadata.files directly to use the validated variable so
unbounded arrays are never processed (use names like filesSchema and filesParsed
or parseFiles to locate changes).

Comment on lines +78 to +85
// JavaScript/TypeScript ecosystem
{ file: 'package.json', language: 'typescript', packageManager: 'npm' },
{ file: 'package-lock.json', packageManager: 'npm' },
{ file: 'yarn.lock', packageManager: 'yarn' },
{ file: 'pnpm-lock.yaml', packageManager: 'pnpm' },
{ file: 'bun.lockb', packageManager: 'bun' },
{ file: 'tsconfig.json', language: 'typescript' },
{ file: 'vitest.config.ts', testRunner: 'vitest' },
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Language detection misclassifies JS as TypeScript.

package.json and \.[jt]sx?$ both force typescript, so JS-only repos are mislabeled. Split JS/TS detection and avoid assigning TS from package.json.

Suggested change (JS vs TS split)
-  { file: 'package.json', language: 'typescript', packageManager: 'npm' },
+  { file: 'package.json', packageManager: 'npm' },
@@
-      if (/\.[jt]sx?$/.test(filePath)) languages.add('typescript');
+      if (/\.tsx?$/.test(filePath)) languages.add('typescript');
+      else if (/\.jsx?$/.test(filePath)) languages.add('javascript');

Also applies to: 251-258

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/environment-onboarding-hook.ts` around lines 78 -
85, The detection mistakenly tags repos as TypeScript because the file
descriptor for { file: 'package.json', language: 'typescript' } and the broad
pattern '\.[jt]sx?$' force TS; remove the explicit language: 'typescript' from
the package.json descriptor and split the language detection so JS vs TS are
determined by specific file patterns (use /\.tsx?$/ to mark TypeScript and
/\.jsx?$/ to mark JavaScript) and/or presence of tsconfig.json rather than
package.json; apply the same fix to the second occurrence noted around lines
251-258 so both lists no longer assign TypeScript based solely on package.json
or the mixed '\.[jt]sx?$' pattern.

Comment on lines +39 to +60
export interface LoopDetectionConfig {
/** Similarity threshold to consider outputs "the same" (0-1). @default 0.8 */
similarityThreshold: number;
/** Number of similar outputs before flagging. @default 3 */
maxSimilarOutputs: number;
/** Maximum outputs to track per run. @default 10 */
maxTrackedOutputs: number;
/** Block on loop detection (throw) or just warn. @default false */
enforceBlocking: boolean;
/** Callback when loop is detected */
onLoopDetected?: (ctx: AgentRunContext, result: LoopDetectionResult) => Promise<void>;
}

/**
* Default configuration
*/
export const DEFAULT_LOOP_DETECTION_CONFIG: LoopDetectionConfig = {
similarityThreshold: 0.8,
maxSimilarOutputs: 3,
maxTrackedOutputs: 10,
enforceBlocking: false,
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, locate and read the loop-detection-hook.ts file
find . -name "loop-detection-hook.ts" -type f

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 133


🏁 Script executed:

#!/bin/bash
# Check Zod usage patterns in hooks directory
rg -n "zod|Zod|import.*z" packages/engine/src/hooks -A 2 -B 2

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

#!/bin/bash
# Search for ConfigSchema patterns in hooks
rg -n "Schema.*parse|\.parse\(" packages/engine/src/hooks

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

#!/bin/bash
# Read the loop-detection-hook.ts file to examine actual implementation
wc -l packages/engine/src/hooks/loop-detection-hook.ts

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 135


🏁 Script executed:

#!/bin/bash
# Read the file to see actual implementation
cat -n packages/engine/src/hooks/loop-detection-hook.ts

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 9820


🏁 Script executed:

#!/bin/bash
# Check if Zod is used anywhere in the packages/engine/src directory
rg -n "from.*zod|import.*zod" packages/engine/src --type ts

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 512


🏁 Script executed:

#!/bin/bash
# Check what validation patterns are used in other parts of the engine
rg -n "this\.config.*=.*\{" packages/engine/src -A 1 -B 1

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 3747


🏁 Script executed:

#!/bin/bash
# Verify the similarity calculation's performance characteristics
# by looking at how large outputSummary could be
rg -n "outputSummary" packages/engine/src -B 2 -A 2

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 10633


🏁 Script executed:

#!/bin/bash
# Check if any hooks use Zod validation for config
rg -n "Schema.*parse\|z\.object" packages/engine/src/hooks

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

#!/bin/bash
# Look at how config is handled in other components with Zod
grep -A 10 "Schema.*parse" packages/engine/src/run/autopilot-executor.ts | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 534


🏁 Script executed:

#!/bin/bash
# Check the guidelines file to confirm requirements
cat AGENTS.md | head -100

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 3627


Add Zod validation and bound outputSummary length.

Config is merged without validation (line 148), and the similarity algorithm processes unbounded text which can cause performance issues. Add a bounded maxOutputChars parameter and validate the config with Zod (similarity threshold in [0,1], counts ≥ 1, output chars in reasonable bounds).

Suggested change (schema + truncate)
-import { getLogger } from '@gwi/core';
+import { getLogger } from '@gwi/core';
+import { z } from 'zod';
@@
 export interface LoopDetectionConfig {
@@
   /** Block on loop detection (throw) or just warn. `@default` false */
   enforceBlocking: boolean;
+  /** Max chars to keep from outputSummary for similarity. `@default` 20000 */
+  maxOutputChars: number;
@@
 export const DEFAULT_LOOP_DETECTION_CONFIG: LoopDetectionConfig = {
   similarityThreshold: 0.8,
   maxSimilarOutputs: 3,
   maxTrackedOutputs: 10,
   enforceBlocking: false,
+  maxOutputChars: 20000,
 };
+
+const LoopDetectionConfigSchema = z.object({
+  similarityThreshold: z.number().min(0).max(1),
+  maxSimilarOutputs: z.number().int().min(2).max(20),
+  maxTrackedOutputs: z.number().int().min(1).max(50),
+  enforceBlocking: z.boolean(),
+  maxOutputChars: z.number().int().min(100).max(50_000),
+  onLoopDetected: z.any().optional(),
+});
@@
   constructor(config?: Partial<LoopDetectionConfig>) {
-    this.config = { ...DEFAULT_LOOP_DETECTION_CONFIG, ...config };
+    const merged = { ...DEFAULT_LOOP_DETECTION_CONFIG, ...config };
+    this.config = LoopDetectionConfigSchema.parse(merged);
   }
@@
-    const output = ctx.outputSummary ?? '';
+    const output = (ctx.outputSummary ?? '').slice(0, this.config.maxOutputChars);

As per coding guidelines: "Use Zod schemas for validation of all external data" and "Use Zod schemas with bounded constraints for input validation" (lines 39–60, 147–191).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/loop-detection-hook.ts` around lines 39 - 60, The
LoopDetectionConfig is accepted without validation and the output summary length
is unbounded; add a Zod schema for LoopDetectionConfig and validate the merged
config before use (create a zod schema matching LoopDetectionConfig, enforce
similarityThreshold ∈ [0,1], maxSimilarOutputs and maxTrackedOutputs ≥ 1, and
add a new maxOutputChars with sensible min/max bounds), add maxOutputChars to
DEFAULT_LOOP_DETECTION_CONFIG, and truncate any outputSummary text to
maxOutputChars before running the similarity algorithm (update wherever the
config is merged/used and reference LoopDetectionConfig,
DEFAULT_LOOP_DETECTION_CONFIG, and the similarity routine that reads
outputSummary to apply the truncation).

Comment on lines +47 to +76
export interface SelfTestConfig {
/** Require tests for generated code. @default false (warn only) */
requireTests: boolean;
/** Minimum test-to-source ratio. @default 0 (no minimum) */
minTestRatio: number;
/** Block on failure (throw) or just warn. @default false */
enforceBlocking: boolean;
/** File patterns considered test files */
testPatterns: RegExp[];
/** Callback when validation fails */
onBlocked?: (ctx: AgentRunContext, reason: string) => Promise<void>;
}

/**
* Default configuration
*/
export const DEFAULT_SELF_TEST_CONFIG: SelfTestConfig = {
requireTests: false,
minTestRatio: 0,
enforceBlocking: false,
testPatterns: [
/\.test\.[jt]sx?$/,
/\.spec\.[jt]sx?$/,
/__tests__\//,
/\.test\.py$/,
/test_.*\.py$/,
/\.test\.go$/,
/_test\.go$/,
],
};
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check the actual file to see current implementation
cat -n packages/engine/src/hooks/self-test-hook.ts | head -250

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 9271


🏁 Script executed:

# Check for existing Zod usage in hooks directory
rg -n "zod|Zod|Schema" packages/engine/src/hooks -g '*.ts' -A 2

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

# Check for similar validation patterns in related core files
rg -n "SelfTestConfig|generatedFiles" packages/engine/src/hooks/self-test-hook.ts -B 2 -A 5

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1581


🏁 Script executed:

# Check if Zod is used elsewhere in the repository
rg -l "import.*zod|from.*zod" packages/ --type ts --type tsx -A 2 | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 110


🏁 Script executed:

# Check for validation patterns in the codebase
rg "z\.object\(|z\.array\(|SomeSchema" packages/ --type ts --type tsx -B 1 | head -50

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 110


🏁 Script executed:

# Verify the CodeGenerationResult type to understand the structure of generatedFiles
rg "CodeGenerationResult|generatedFiles" packages/core -A 5 --type ts

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 828


🏁 Script executed:

# Check if there's validation happening elsewhere in AgentRunContext or types
rg "AgentRunContext|metadata" packages/engine/src/hooks/types.ts -B 2 -A 10

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 2270


🏁 Script executed:

# Check Zod usage patterns in the entire repository
rg "from 'zod'|import.*zod" packages/ -t ts | head -20

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1460


🏁 Script executed:

# Check if bounded constraints are defined anywhere for file inputs
rg "max.*1000|max.*1048576|max.*1024" packages/ -t ts -B 2 -A 2

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 29191


Add Zod schema validation with bounded constraints for SelfTestConfig and generatedFiles.

External inputs—both the config parameter and ctx.metadata.generatedFiles—require validation per guidelines. Currently unchecked (ctx.metadata is Record<string, unknown> and the config is merged without schema validation). Add Zod schemas with bounded constraints (e.g., ratio 0–1, file count/size limits).

Suggested implementation
-import { getLogger, type CodeGenerationResult } from '@gwi/core';
+import { getLogger, type CodeGenerationResult } from '@gwi/core';
+import { z } from 'zod';
+
+const GeneratedFileSchema = z.object({
+  path: z.string().max(1024),
+  content: z.string().max(200_000),
+  action: z.enum(['create', 'modify', 'delete']),
+  explanation: z.string().max(1000),
+});
+
+const GeneratedFilesSchema = z.array(GeneratedFileSchema).max(1000);
+
+const SelfTestConfigSchema = z.object({
+  requireTests: z.boolean(),
+  minTestRatio: z.number().min(0).max(1),
+  enforceBlocking: z.boolean(),
+  testPatterns: z.array(z.instanceof(RegExp)).min(1),
+  onBlocked: z.any().optional(),
+});

 export interface SelfTestConfig {
   constructor(config?: Partial<SelfTestConfig>) {
-    this.config = { ...DEFAULT_SELF_TEST_CONFIG, ...config };
+    const merged = { ...DEFAULT_SELF_TEST_CONFIG, ...config };
+    this.config = SelfTestConfigSchema.parse(merged);
   }
     const generatedFiles = ctx.metadata?.generatedFiles as CodeGenerationResult['files'] | undefined;
+    const parsed = GeneratedFilesSchema.safeParse(generatedFiles ?? []);
+    if (!parsed.success) {
+      logger.warn('Invalid generatedFiles in metadata', { errors: parsed.error });
+      return;
+    }

-    if (!generatedFiles || generatedFiles.length === 0) {
+    if (!parsed.data || parsed.data.length === 0) {
       return;
     }

-    const validation = this.validate(generatedFiles);
+    const validation = this.validate(parsed.data);

Per guidelines: "Use Zod schemas for validation of all external data" and "Use Zod schemas with bounded constraints for input validation."

Also applies to: lines 123–142, 183–214.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/self-test-hook.ts` around lines 47 - 76, Add Zod
schema validation for the SelfTestConfig and for ctx.metadata.generatedFiles:
create a zod schema SelfTestConfigSchema that validates requireTests (boolean),
minTestRatio (number between 0 and 1 inclusive), enforceBlocking (boolean), and
testPatterns (array of strings or regex-like strings) and use
DEFAULT_SELF_TEST_CONFIG as defaults via zod.default/transform when parsing the
incoming config; also create a GeneratedFileSchema and GeneratedFilesSchema
validating an array of objects with filename (string), size (number >=0) and
enforce bounded constraints (max files e.g., 1000, max total size e.g.,
50_000_000 bytes) and parse ctx.metadata.generatedFiles through this schema
before any use. Replace any direct merging/usage of the raw config and any
direct access to ctx.metadata.generatedFiles (locations around SelfTestConfig
related merges and the code that reads generatedFiles at the referenced lines)
to use the parsed/validated values and throw or call onBlocked when validation
fails.

Comment on lines +189 to +196
for (const file of files) {
const isTest = this.config.testPatterns.some((p) => p.test(file.path));

if (isTest) {
testFiles.push(file.path);
} else {
sourceFiles.push(file.path);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Exclude non-source artifacts from the test ratio.

All non-test files are treated as source, so docs/configs can inflate sourceFileCount and cause false failures when minTestRatio is enforced. Consider filtering to code extensions only.

Suggested change (source file filter)
-      if (isTest) {
+      const isSource = /\.(ts|tsx|js|jsx|py|go|rs|java|kt|rb|cs|php)$/.test(file.path);
+      if (isTest) {
         testFiles.push(file.path);
-      } else {
+      } else if (isSource) {
         sourceFiles.push(file.path);
       }

Also applies to: 211-214

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/self-test-hook.ts` around lines 189 - 196, The loop
that classifies entries from files into testFiles and sourceFiles currently
treats all non-test files as source, which inflates sourceFileCount; change the
classification to first filter by a whitelist of code file extensions (e.g.,
.ts, .js, .tsx, .jsx, .py, .go, .java, .c, .cpp, .rs, etc.) before deciding test
vs source so docs/config and config files are ignored—implement an
allowedExtensions array and check the file.path extension (or mimetype) before
running this.config.testPatterns; apply the same filtering logic to the second
similar classification block (the other files -> testFiles/sourceFiles loop) so
both places only count real source code files when computing minTestRatio.

Comment on lines +60 to +96
export interface TraceAnalysisConfig {
/** Minimum failure count to report a pattern. @default 2 */
minPatternCount: number;
/** Whether to log analysis results. @default true */
logResults: boolean;
/** Callback when analysis completes */
onAnalysis?: (result: TraceAnalysisResult) => Promise<void>;
}

/**
* Default configuration
*/
export const DEFAULT_TRACE_ANALYSIS_CONFIG: TraceAnalysisConfig = {
minPatternCount: 2,
logResults: true,
};

// =============================================================================
// Hook Implementation
// =============================================================================

/**
* Trace Analysis Hook
*
* Runs on onRunEnd to analyze all decision traces from the completed run.
* Identifies failure patterns, calculates agent failure rates, and generates
* improvement suggestions that can feed back into prompt tuning or hook config.
*/
export class TraceAnalysisHook implements AgentHook {
readonly name = 'trace-analysis';
private config: TraceAnalysisConfig;
private store: DecisionTraceStore;

constructor(config?: Partial<TraceAnalysisConfig>, store?: DecisionTraceStore) {
this.config = { ...DEFAULT_TRACE_ANALYSIS_CONFIG, ...config };
this.store = store ?? getDecisionTraceStore();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, check if zod is already a dependency
find . -name "package.json" -type f | xargs grep -l "\"zod\"" | head -5

# Check existing Zod usage patterns in hooks
rg -n "zod|Zod" packages/engine/src/hooks -g '*.ts' -A 2

# Check for any ConfigSchema patterns
rg -n "ConfigSchema|z\\.object|z\\.parse" packages/engine/src/hooks -g '*.ts'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 237


🏁 Script executed:

# Verify the actual code at the specified file and lines
cat -n packages/engine/src/hooks/trace-analysis-hook.ts | sed -n '60,96p'

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1630


🏁 Script executed:

# Check if there are other hook configs to understand the pattern
fd -e ts packages/engine/src/hooks -type f | head -10 | xargs grep -l "Config.*constructor" | head -3

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 253


🏁 Script executed:

# Check imports in the file to confirm current dependencies
head -30 packages/engine/src/hooks/trace-analysis-hook.ts

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 939


🏁 Script executed:

# Find other hook files to see if any use Zod validation
fd -e ts packages/engine/src/hooks -type f | xargs ls -1

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 850


🏁 Script executed:

# Check how other hooks handle config (look for patterns in other hooks)
rg -n "constructor.*config" packages/engine/src/hooks -A 3 -B 1 -g '*.ts' | head -40

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 2862


🏁 Script executed:

# Check if minPatternCount <= 0 is actually used/validated later in the hook
rg -n "minPatternCount" packages/engine/src/hooks/trace-analysis-hook.ts -B 2 -A 2

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1074


🏁 Script executed:

# Check if there are any tests for config validation in hooks
find . -path "*/test/*" -name "*hook*" -type f | xargs grep -l "config.*validation\|Zod\|schema" 2>/dev/null | head -5

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 65


🏁 Script executed:

# Check if TraceAnalysisConfig is used elsewhere to understand impact scope
rg -n "TraceAnalysisConfig" packages/engine/src --type ts -B 1 -A 1

Repository: intent-solutions-io/iam-git-with-intent

Length of output: 1626


Add Zod validation with bounded constraints for TraceAnalysisConfig.

The config parameter is merged without validation in the constructor (line 94), allowing invalid values like minPatternCount ≤ 0 to silently break pattern detection logic at lines 235, 251, and 267. Add Zod schema validation to enforce bounds.

Suggested change
-import { getLogger } from '@gwi/core';
+import { getLogger } from '@gwi/core';
+import { z } from 'zod';
 import type {
   DecisionTraceStore,
   AgentDecisionTrace,
   AgentType,
 } from '@gwi/core';
 import { getDecisionTraceStore } from '@gwi/core';
 
 export interface TraceAnalysisConfig {
   minPatternCount: number;
   logResults: boolean;
   onAnalysis?: (result: TraceAnalysisResult) => Promise<void>;
 }
 
+const TraceAnalysisConfigSchema = z.object({
+  minPatternCount: z.number().int().min(1).max(50),
+  logResults: z.boolean(),
+  onAnalysis: z.function().optional(),
+});
+
 export const DEFAULT_TRACE_ANALYSIS_CONFIG: TraceAnalysisConfig = {
   minPatternCount: 2,
   logResults: true,
 };
 
 export class TraceAnalysisHook implements AgentHook {
   readonly name = 'trace-analysis';
   private config: TraceAnalysisConfig;
   private store: DecisionTraceStore;
 
   constructor(config?: Partial<TraceAnalysisConfig>, store?: DecisionTraceStore) {
     const merged = { ...DEFAULT_TRACE_ANALYSIS_CONFIG, ...config };
     this.config = TraceAnalysisConfigSchema.parse(merged);
     this.store = store ?? getDecisionTraceStore();
   }

Per coding guidelines: "Use Zod schemas for validation of all external data" and "Use Zod schemas with bounded constraints for input validation."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
export interface TraceAnalysisConfig {
/** Minimum failure count to report a pattern. @default 2 */
minPatternCount: number;
/** Whether to log analysis results. @default true */
logResults: boolean;
/** Callback when analysis completes */
onAnalysis?: (result: TraceAnalysisResult) => Promise<void>;
}
/**
* Default configuration
*/
export const DEFAULT_TRACE_ANALYSIS_CONFIG: TraceAnalysisConfig = {
minPatternCount: 2,
logResults: true,
};
// =============================================================================
// Hook Implementation
// =============================================================================
/**
* Trace Analysis Hook
*
* Runs on onRunEnd to analyze all decision traces from the completed run.
* Identifies failure patterns, calculates agent failure rates, and generates
* improvement suggestions that can feed back into prompt tuning or hook config.
*/
export class TraceAnalysisHook implements AgentHook {
readonly name = 'trace-analysis';
private config: TraceAnalysisConfig;
private store: DecisionTraceStore;
constructor(config?: Partial<TraceAnalysisConfig>, store?: DecisionTraceStore) {
this.config = { ...DEFAULT_TRACE_ANALYSIS_CONFIG, ...config };
this.store = store ?? getDecisionTraceStore();
}
import { getLogger } from '@gwi/core';
import { z } from 'zod';
import type {
DecisionTraceStore,
AgentDecisionTrace,
AgentType,
} from '@gwi/core';
import { getDecisionTraceStore } from '@gwi/core';
export interface TraceAnalysisConfig {
/** Minimum failure count to report a pattern. `@default` 2 */
minPatternCount: number;
/** Whether to log analysis results. `@default` true */
logResults: boolean;
/** Callback when analysis completes */
onAnalysis?: (result: TraceAnalysisResult) => Promise<void>;
}
const TraceAnalysisConfigSchema = z.object({
minPatternCount: z.number().int().min(1).max(50),
logResults: z.boolean(),
onAnalysis: z.function().optional(),
});
/**
* Default configuration
*/
export const DEFAULT_TRACE_ANALYSIS_CONFIG: TraceAnalysisConfig = {
minPatternCount: 2,
logResults: true,
};
// =============================================================================
// Hook Implementation
// =============================================================================
/**
* Trace Analysis Hook
*
* Runs on onRunEnd to analyze all decision traces from the completed run.
* Identifies failure patterns, calculates agent failure rates, and generates
* improvement suggestions that can feed back into prompt tuning or hook config.
*/
export class TraceAnalysisHook implements AgentHook {
readonly name = 'trace-analysis';
private config: TraceAnalysisConfig;
private store: DecisionTraceStore;
constructor(config?: Partial<TraceAnalysisConfig>, store?: DecisionTraceStore) {
const merged = { ...DEFAULT_TRACE_ANALYSIS_CONFIG, ...config };
this.config = TraceAnalysisConfigSchema.parse(merged);
this.store = store ?? getDecisionTraceStore();
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/trace-analysis-hook.ts` around lines 60 - 96, Add a
Zod schema for TraceAnalysisConfig (validate minPatternCount as an integer >= 1
and logResults as boolean, onAnalysis as optional function) and use it in the
TraceAnalysisHook constructor to validate the merged config
(DEFAULT_TRACE_ANALYSIS_CONFIG + config) before assigning to this.config; if
validation fails, throw or normalize to safe defaults so invalid values (e.g.,
minPatternCount <= 0) cannot reach pattern detection logic in methods that
reference minPatternCount. Update types to derive from the Zod schema so
TraceAnalysisConfig aligns with the validator and ensure the constructor uses
the schema parse/parseAsync call to enforce the bounded constraints.

- DRY config.ts: refactor 5 hook registrations to array loop
- Fix self-test blocking: move enforcement from onAfterStep to onBeforeStep
  (onAfterStep errors are swallowed by the hook runner)
- Fix budget reset: guard onRunStart with idempotent check per runId
  (autopilot calls runStart multiple times)
- Fix trace analysis tenant isolation: use listTraces with tenantId
  when available (multi-tenant safety)
- Remove redundant onAfterStep from environment-onboarding-hook
- Add Zod-style validation for metadata.files and metadata.repoRoot
- Add tests for all fixes (idempotent budget, tenant-scoped traces,
  onBeforeStep blocking, run-end cleanup)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link

🤖 Gemini 2.5 Flash Code Review

Files Reviewed: 13

Summary

This PR introduces five new "harness engineering hooks" to enhance the reliability, efficiency, and contextual awareness of agent runs: Budget Management, Environment Onboarding, Loop Detection, Self-Test, and Trace Analysis. These hooks integrate seamlessly into the existing hook runner, providing valuable insights and control over agent behavior through context metadata and configurable blocking mechanisms.

Critical Issues 🔴

  1. LoopDetectionHook - Potential Performance/DoS with Large Outputs
    The calculateSimilarity function, which uses Dice's coefficient on character bigrams, processes the outputSummary string. If an agent generates extremely long outputSummary strings, the creation of bigram maps and the subsequent similarity calculation can become computationally expensive. While maxTrackedOutputs limits the history size, the

Powered by Gemini 2.5 Flash

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
packages/engine/src/hooks/budget-management-hook.ts (1)

118-120: Add Zod validation for BudgetManagementConfig.

The config is merged without schema validation. Invalid values like maxTotalTokens ≤ 0, warningThresholdPercent > criticalThresholdPercent, or negative cost rates could cause unexpected behavior or division issues.

Suggested Zod schema
 import { getLogger } from '@gwi/core';
+import { z } from 'zod';
 import type { AgentHook, AgentRunContext } from './types.js';

+const BudgetManagementConfigSchema = z.object({
+  maxTotalTokens: z.number().int().min(1).max(100_000_000),
+  maxCostUsd: z.number().min(0.01).max(1000),
+  warningThresholdPercent: z.number().min(0).max(100),
+  criticalThresholdPercent: z.number().min(0).max(100),
+  enforceBlocking: z.boolean(),
+  costPer1kInputTokens: z.number().min(0),
+  costPer1kOutputTokens: z.number().min(0),
+  onBudgetExceeded: z.function().optional(),
+}).refine(
+  (c) => c.warningThresholdPercent <= c.criticalThresholdPercent,
+  { message: 'warningThresholdPercent must be <= criticalThresholdPercent' }
+);

   constructor(config?: Partial<BudgetManagementConfig>) {
-    this.config = { ...DEFAULT_BUDGET_CONFIG, ...config };
+    const merged = { ...DEFAULT_BUDGET_CONFIG, ...config };
+    this.config = BudgetManagementConfigSchema.parse(merged);
   }

Per coding guidelines: "Use Zod schemas for validation of all external data" and "Use Zod schemas with bounded constraints for input validation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/engine/src/hooks/budget-management-hook.ts` around lines 118 - 120,
Create a Zod schema for BudgetManagementConfig (e.g.,
BudgetManagementConfigSchema) with bounded constraints: maxTotalTokens as
number().int().positive(), warningThresholdPercent and criticalThresholdPercent
as numbers().min(0).max(100) with refinement warning < critical, and cost rates
(e.g., promptCostPerToken, responseCostPerToken) as number().min(0). Then, in
the constructor of the BudgetManagementHook (where DEFAULT_BUDGET_CONFIG is
merged), merge defaults and passed config as now, validate the result with
BudgetManagementConfigSchema.parse (or safeParse and throw a clear Error on
failure), and assign the parsed value to this.config so only validated config
enters the instance. Ensure the schema symbol names match BudgetManagementConfig
and DEFAULT_BUDGET_CONFIG usages for easy location.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@packages/engine/src/hooks/environment-onboarding-hook.ts`:
- Line 79: The current language mapping incorrectly treats package.json as
TypeScript and uses the regex /\.[jt]sx?$/ to map both .js and .ts to
'typescript'; update the mapping so { file: 'package.json', language:
'typescript' } is not hard-coded to TypeScript (remove the language key or set
it to 'json') and change the extension-regex logic to distinguish JS vs TS (use
separate checks: /\.tsx?$/ => 'typescript' and /\.jsx?$/ => 'javascript'), and
for package.json only mark TypeScript when a TS indicator like 'tsconfig.json'
or dependencies referencing TypeScript is present (adjust the code paths that
build this file list / detectLanguage accordingly).

In `@packages/engine/src/hooks/self-test-hook.ts`:
- Around line 126-128: Add Zod schema validation for the SelfTestConfig used in
the constructor and for the ctx.metadata.generatedFiles before casting: define a
zod schema (e.g., SelfTestConfigSchema) that mirrors DEFAULT_SELF_TEST_CONFIG
with bounded constraints (types, ranges, lengths, optionals) and validate the
merged config in the constructor (constructor -> this.config). Also create a
GeneratedFilesSchema for ctx.metadata.generatedFiles (shape, array length
limits, filename/content constraints) and validate ctx.metadata.generatedFiles
where it is read (referenced as ctx.metadata.generatedFiles) before casting or
using it; throw or log a clear error on validation failure to prevent unsafe
inputs.
- Around line 211-218: The loop currently classifies every non-matching path as
source (affecting test ratio); update the logic in the for-loop that builds
testFiles/sourceFiles to first filter by code file extensions (e.g., consult
this.config.codeExtensions or add a default array) and only push into
sourceFiles when the file path has a code extension and isTest is false;
non-code files (docs, configs) should be ignored entirely for the minTestRatio
calculation while still adding true tests to testFiles using the existing
this.config.testPatterns.

In `@packages/engine/src/hooks/trace-analysis-hook.ts`:
- Around line 93-96: The constructor currently merges
DEFAULT_TRACE_ANALYSIS_CONFIG with the provided TraceAnalysisConfig without
validating bounds; add a Zod schema for TraceAnalysisConfig (covering fields
like minPatternCount with constraints such as min(1), and any other bounded
numeric/string fields) and use schema.parse or safeParse to validate the merged
config before assigning to this.config in the constructor of TraceAnalysisHook
(or the class containing the constructor). If validation fails, throw a clear
error (or surface it) so invalid configs cannot silently break pattern detection
used by methods referencing minPatternCount; reference
DEFAULT_TRACE_ANALYSIS_CONFIG, TraceAnalysisConfig, and the constructor when
making the change.

---

Nitpick comments:
In `@packages/engine/src/hooks/budget-management-hook.ts`:
- Around line 118-120: Create a Zod schema for BudgetManagementConfig (e.g.,
BudgetManagementConfigSchema) with bounded constraints: maxTotalTokens as
number().int().positive(), warningThresholdPercent and criticalThresholdPercent
as numbers().min(0).max(100) with refinement warning < critical, and cost rates
(e.g., promptCostPerToken, responseCostPerToken) as number().min(0). Then, in
the constructor of the BudgetManagementHook (where DEFAULT_BUDGET_CONFIG is
merged), merge defaults and passed config as now, validate the result with
BudgetManagementConfigSchema.parse (or safeParse and throw a clear Error on
failure), and assign the parsed value to this.config so only validated config
enters the instance. Ensure the schema symbol names match BudgetManagementConfig
and DEFAULT_BUDGET_CONFIG usages for easy location.

…BSL 1.1

- Fix version (0.8.0 -> 0.9.0), remove phantom IDE Plugins node, update
  agents diagram (4 -> 8 agents), add mcp-server app and sandbox package
  to monorepo structure, mark gwi simulate as available, fix RAG status
  to In Progress, add gwi-mcp-server service, update security audit date
  to Feb 2026, change status from pre-alpha to alpha
- Extend roadmap with Phases 6-8 (Enterprise Infra, Analytics & Scale,
  Enterprise Identity) as Mermaid Gantt + status table
- Change license from MIT to BSL 1.1 (change date 2030-02-17, converts
  to Apache 2.0)
- Add 000-docs/240-PP-RMAP enterprise architecture roadmap PRD

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link

🤖 Gemini 2.5 Flash Code Review

Files Reviewed: 14

Summary

This PR introduces five new "harness engineering hooks" to the engine, focusing on improving agent reliability, efficiency, and awareness. The new hooks cover budget management, environment onboarding, loop detection, self-testing, and trace analysis. The implementation includes robust testing, clear configuration, and appropriate state management for each hook.

Critical Issues 🔴

None. The code appears to be well-structured, tested, and addresses the stated goals without obvious critical flaws in security, correctness, or architecture.

Suggestions 🟡

packages/engine/src/hooks/budget-management-hook.ts

  1. Correctness/Robustness - Handle ctx.tokensUsed being undefined or null
    • **Line 1

Powered by Gemini 2.5 Flash

@jeremylongshore jeremylongshore changed the title feat(engine): add 5 harness engineering hooks feat(engine): harness engineering hooks + README currency + BSL 1.1 Feb 18, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

289-304: ⚠️ Potential issue | 🟡 Minor

Duplicate node ID orchestrator in Mermaid diagram.

The node ID orchestrator is defined twice: once in the "Workflow Engine" subgraph (line 290) and again in the "AI Agents" subgraph (line 296). In Mermaid, node IDs must be unique—reusing the same ID references the existing node rather than creating a new one, which will cause unexpected diagram rendering.

🔧 Proposed fix: Rename one of the node IDs
     subgraph Engine["Workflow Engine"]
-        orchestrator[Orchestrator]
+        engine_orchestrator[Orchestrator]
         runner[Step Runner]
         approval[Approval Gate]
     end
 
     subgraph Agents["AI Agents (8)"]
-        orchestrator[Orchestrator - Gemini Flash]
+        agent_orchestrator[Orchestrator - Gemini Flash]
         foreman[Foreman - Workflow Coordinator]

Also update the edge reference on line 319:

-    orchestrator --> runner
+    engine_orchestrator --> runner
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@README.md` around lines 289 - 304, The Mermaid diagram defines the node ID
orchestrator twice; rename the duplicate in the "AI Agents" subgraph (the node
labeled "Orchestrator - Gemini Flash", currently written as
orchestrator[Orchestrator - Gemini Flash]) to a unique ID such as
orchestrator_agent, and update any edges that reference the old orchestrator ID
(the edge referencing that agent) to use the new ID so all node IDs are unique
and the diagram renders correctly.
🧹 Nitpick comments (1)
000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md (1)

235-248: Add language identifier to fenced code block.

The dependency graph code block should specify a language for better Markdown rendering and accessibility. Since this is a text-based diagram (not executable code), use text or plaintext.

📝 Proposed fix
-```
+```text
 Phase 6.2 (External LB)
   └── Phase 6.1 (Cloud Armor WAF) -- WAF attaches to LB
   └── Phase 6.3 (Memorystore) -- needs VPC connector shared with LB
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md` around lines 235 -
248, The fenced dependency graph block in
000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md should include a
language identifier to improve rendering and accessibility; update the opening
fence for the block that begins with "Phase 6.2 (External LB)" to use a
plaintext identifier (e.g., change ``` to ```text) so the diagram (including
lines referencing "Phase 6.1 (Cloud Armor WAF)", "Phase 6.3 (Memorystore)", and
subsequent phases) is treated as preformatted text by Markdown renderers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@README.md`:
- Around line 289-304: The Mermaid diagram defines the node ID orchestrator
twice; rename the duplicate in the "AI Agents" subgraph (the node labeled
"Orchestrator - Gemini Flash", currently written as orchestrator[Orchestrator -
Gemini Flash]) to a unique ID such as orchestrator_agent, and update any edges
that reference the old orchestrator ID (the edge referencing that agent) to use
the new ID so all node IDs are unique and the diagram renders correctly.

---

Nitpick comments:
In `@000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md`:
- Around line 235-248: The fenced dependency graph block in
000-docs/240-PP-RMAP-enterprise-architecture-roadmap.md should include a
language identifier to improve rendering and accessibility; update the opening
fence for the block that begins with "Phase 6.2 (External LB)" to use a
plaintext identifier (e.g., change ``` to ```text) so the diagram (including
lines referencing "Phase 6.1 (Cloud Armor WAF)", "Phase 6.3 (Memorystore)", and
subsequent phases) is treated as preformatted text by Markdown renderers.

@jeremylongshore jeremylongshore merged commit 7f68e33 into main Feb 18, 2026
11 checks passed
@jeremylongshore jeremylongshore deleted the feat/harness-engineering-hooks branch February 18, 2026 05:18
jeremylongshore added a commit that referenced this pull request Feb 20, 2026
Release prepared by: jeremy
Approval timestamp: 2026-02-20T00:21:16Z
Approval SHA: 10989d0

Changes in this release:
- Scale & Ops Maturity epic: agent-first infrastructure (#86)
- Harness engineering hooks + README currency + BSL 1.1 (#85)
- Code Factory gap closure: incident-to-harness + browser evidence (#84)

Co-Authored-By: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant