feat(agents): modular SDK runtimes + Anthropic fixes + OpenCode SDK #293

Merged
rubenmarcus merged 14 commits into main from feat/anthropic-sdk-agent on Mar 13, 2026

@rubenmarcus rubenmarcus commented Mar 12, 2026

Summary

This PR upgrades agent execution to support SDK-first workflows while keeping existing CLI behavior.

It now includes:

  • A fixed and production-safe Anthropic SDK agent
  • A new OpenCode SDK agent integration
  • A modularized agent runtime structure (including Amp SDK) to keep the code maintainable

What Changed

  1. Added full opencode-sdk agent support using @opencode-ai/sdk.
  2. Hardened anthropic-sdk execution and fixed tool-loop behavior for tool-use responses.
  3. Split monolithic src/loop/agents.ts into focused modules:
    • src/loop/agents/amp-sdk.ts
    • src/loop/agents/anthropic-sdk.ts
    • src/loop/agents/opencode-sdk.ts
    • src/loop/agents/output-collector.ts
  4. Kept src/loop/agents.ts as the central dispatcher/detection layer.
  5. Updated CLI help text to include anthropic-sdk and opencode-sdk in --agent options.
  6. Expanded tests for agent detection and availability logic.

Agent Priority

Current preference order:

claude-code > amp > anthropic-sdk > opencode-sdk > cursor > codex > opencode > openclaw
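
As a rough illustration of how a preference order like this can be applied, the sketch below picks the first available agent from a fixed priority list. The names `AGENT_PRIORITY` and `pickBestAgent` are hypothetical and are not taken from this PR; only the agent names and their order come from the description above.

```typescript
// Hypothetical sketch: agent names and order come from the PR description,
// everything else (type and function names) is assumed for illustration.
type AgentType =
  | 'claude-code'
  | 'amp'
  | 'anthropic-sdk'
  | 'opencode-sdk'
  | 'cursor'
  | 'codex'
  | 'opencode'
  | 'openclaw';

// Highest-priority agent first.
const AGENT_PRIORITY: readonly AgentType[] = [
  'claude-code',
  'amp',
  'anthropic-sdk',
  'opencode-sdk',
  'cursor',
  'codex',
  'opencode',
  'openclaw',
];

// Returns the first agent in priority order that is available, or null if none is.
function pickBestAgent(available: ReadonlySet<AgentType>): AgentType | null {
  return AGENT_PRIORITY.find((agent) => available.has(agent)) ?? null;
}
```

With this shape, adding a new agent only requires placing it at the right position in the list.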

Safety / Correctness Notes

  1. Anthropic file tools resolve each requested path (resolve + realpath) and reject anything outside the workspace.
  2. Anthropic tool-use loop now executes tool calls whenever present, even if stop reason is end_turn.
  3. OpenCode SDK availability now requires both API key presence and opencode binary availability (required by SDK server startup).
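
To make note 2 concrete, here is a minimal sketch of a tool-use loop that decides whether to continue by inspecting the response content for `tool_use` blocks rather than by looking at the stop reason. The block shapes follow the Anthropic Messages API; `runToolLoop`, `CreateMessage`, and `ExecuteTool` are hypothetical names, and the model call is injected so the sketch runs without the SDK. It is not the code from this PR.

```typescript
type TextBlock = { type: 'text'; text: string };
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;
type ToolResultBlock = { type: 'tool_result'; tool_use_id: string; content: string };
type ModelResponse = { stop_reason: string; content: ContentBlock[] };
type Message =
  | { role: 'user'; content: string | ToolResultBlock[] }
  | { role: 'assistant'; content: ContentBlock[] };

// Injected dependencies (assumed shapes, so the loop can be exercised with fakes).
type CreateMessage = (messages: Message[]) => Promise<ModelResponse>;
type ExecuteTool = (name: string, input: unknown) => Promise<string>;

async function runToolLoop(
  prompt: string,
  createMessage: CreateMessage,
  executeTool: ExecuteTool,
  maxTurns = 10,
): Promise<string> {
  const messages: Message[] = [{ role: 'user', content: prompt }];
  let output = '';

  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await createMessage(messages);
    messages.push({ role: 'assistant', content: response.content });

    for (const block of response.content) {
      if (block.type === 'text') output += block.text;
    }

    // Continue based on the blocks themselves, not on response.stop_reason.
    const toolCalls = response.content.filter(
      (block): block is ToolUseBlock => block.type === 'tool_use',
    );
    if (toolCalls.length === 0) break;

    const results: ToolResultBlock[] = [];
    for (const call of toolCalls) {
      results.push({
        type: 'tool_result',
        tool_use_id: call.id,
        content: await executeTool(call.name, call.input),
      });
    }
    messages.push({ role: 'user', content: results });
  }

  return output;
}
```

Injecting the model call keeps the loop testable with a fake response sequence, including one that pairs `end_turn` with a pending tool call.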

Validation

  • pnpm build
  • pnpm test
  • pnpm typecheck

All passing on this branch.

rubenmarcus and others added 2 commits March 12, 2026 23:40
- Add 'anthropic-sdk' to AgentType union
- Add apiKey field to AgentRunOptions and LoopOptions
- Implement runAnthropicSdkAgent using @anthropic-ai/sdk with streaming
- Update detectAvailableAgents/detectBestAgent to accept apiKeys option
- SDK agents are available when API key is provided (no CLI binary needed)
- Enables ralph-starter usage in web applications without CLI dependencies

Amp-Thread-ID: https://ampcode.com/threads/T-019ce458-6e75-7448-b1cd-1839264ca386
Co-authored-by: Amp <amp@ampcode.com>
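
The availability rule in this commit (an SDK agent counts as available when an API key is provided, with no CLI binary involved) can be sketched roughly as follows. The `ApiKeys` shape and the function name are assumptions for illustration; only the `apiKeys` option and the `ANTHROPIC_API_KEY` variable are named in this PR.

```typescript
// Assumed shape for the apiKeys option; the real option may differ.
type ApiKeys = { anthropic?: string };

// True when a non-empty key is supplied explicitly or through the environment.
function isAnthropicSdkAvailable(
  apiKeys: ApiKeys = {},
  env: Record<string, string | undefined> = {},
): boolean {
  const key = apiKeys.anthropic ?? env.ANTHROPIC_API_KEY;
  return typeof key === 'string' && key.trim().length > 0;
}
```

Passing `env` as a parameter makes the check easy to exercise in tests, for example with the key stubbed to an empty string.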
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@github-actions
Contributor

Issue Linking Reminder

This PR doesn't appear to have a linked issue. Consider linking to:

  • This repo: Closes #123
  • ralph-ideas: Closes multivmlabs/ralph-ideas#123

Using Closes, Fixes, or Resolves will auto-close the issue when this PR is merged.


If this PR doesn't need an issue, you can ignore this message.

@github-actions
Contributor

github-actions bot commented Mar 12, 2026

✔️ Bundle Size Analysis

| Metric | Value |
| --- | --- |
| Base | 2590.27 KB |
| PR | 2635.67 KB |
| Diff | 45.39 KB (1.00%) |

Bundle breakdown
156K	dist/auth
80K	dist/automation
4.0K	dist/cli.d.ts
4.0K	dist/cli.d.ts.map
20K	dist/cli.js
12K	dist/cli.js.map
584K	dist/commands
28K	dist/config
4.0K	dist/index.d.ts
4.0K	dist/index.d.ts.map
4.0K	dist/index.js
4.0K	dist/index.js.map
896K	dist/integrations
100K	dist/llm
1.2M	dist/loop
188K	dist/mcp
60K	dist/presets
92K	dist/setup
40K	dist/skills
392K	dist/sources
76K	dist/ui
144K	dist/utils
336K	dist/wizard

@github-actions github-actions bot added the tests label Mar 12, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 12, 2026

Greptile Summary

This PR modularizes the agent runtime by splitting src/loop/agents.ts into focused sub-modules (amp-sdk, anthropic-sdk, opencode-sdk, output-collector), adds a production-hardened Anthropic SDK agent with file tools and path-traversal protection, and introduces a new OpenCode SDK agent that wraps a locally-spawned OpenCode server. The agent dispatcher and CLI help text are updated accordingly.

Key changes:

  • anthropic-sdk.ts: Multi-turn tool-use loop using client.messages.create. Uses execa (v9-compatible, correct API), gates shell execution via allowShellExecution, and performs path traversal checks using realpathSync + sep-based boundary validation. Previously flagged issues (wrong timeout position, max_tokens from maxTurns, execaCommand, path traversal bypass) appear to have been addressed in this version.
  • opencode-sdk.ts: Wraps OpenCode's local server via @opencode-ai/sdk. Handles streaming events incrementally, guards streamPromise against unhandled rejection, warns on model format issues and keyless scenarios. Several issues remain from prior review threads (provider config key mismatch, single-shot HTTP timeout risk, etc.).
  • output-collector.ts: Clean shared helper that correctly buffers partial lines before dispatching to onOutput, fixing a previously flagged contract violation.
  • agents.ts: Refactored dispatcher with correct checkAgentAvailable logic for the two new SDK agents. detectAvailableAgents still does not forward OPENCODE_API_KEY to opencode-sdk availability checks (flagged previously).
  • Tests: beforeEach correctly stubs ANTHROPIC_API_KEY and OPENCODE_API_KEY to empty strings. However, detectBestAgent tests are missing a dedicated execa mock for opencode-sdk's CLI check (7th call), relying on persistent mock state from an earlier test — making them fragile.
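
The line-buffering contract described for output-collector.ts can be illustrated with a small sketch: chunks may end mid-line, so only complete lines are passed to `onOutput`, and the trailing partial line is held until more text arrives or the collector is flushed. `createLineCollector` is a hypothetical name, and the memory limits mentioned in the review are left out here.

```typescript
// Hypothetical helper: buffers streamed text and emits only complete lines.
function createLineCollector(onOutput: (line: string) => void) {
  let pending = ''; // trailing partial line, not yet terminated by '\n'
  let full = ''; // everything appended so far

  return {
    append(chunk: string): void {
      full += chunk;
      pending += chunk;
      const lines = pending.split('\n');
      // The last element is either '' (chunk ended on a newline) or a partial line.
      pending = lines.pop() ?? '';
      for (const line of lines) onOutput(line);
    },
    // Emits any remaining partial line and returns the full collected text.
    flush(): string {
      if (pending.length > 0) {
        onOutput(pending);
        pending = '';
      }
      return full;
    },
  };
}
```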

Confidence Score: 2/5

  • PR introduces a working Anthropic SDK agent but the OpenCode SDK agent carries several unresolved issues from prior review rounds that may cause subtle runtime failures in production.
  • The Anthropic SDK agent looks substantially improved from prior iterations — path traversal is properly guarded, execa v9 API is used correctly, and shell execution is opt-in. However, multiple issues flagged in prior review threads on opencode-sdk.ts and agents.ts remain open (provider config key mismatch, OPENCODE_API_KEY not forwarded in detectAvailableAgents, potential HTTP-layer timeout on single-shot prompt, text duplication on streaming parts). The test suite also has a fragile mock-ordering issue for the new agent priority. Score reflects that existing functionality is unbroken and the Anthropic SDK path is solid, but the OpenCode SDK integration has enough unresolved edge cases to warrant another review pass before merging.
  • Pay close attention to src/loop/agents/opencode-sdk.ts (provider config / auth key routing) and src/loop/agents.ts (OPENCODE_API_KEY not forwarded to opencode-sdk availability check). src/loop/__tests__/agents.test.ts also needs the missing execa mock for opencode-sdk's CLI check in the detectBestAgent test suite.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/loop/agents/anthropic-sdk.ts | New Anthropic SDK agent module with tool support. Uses correct execa v9 API. Path traversal protection is present and uses sep for proper boundary checking. Shell execution is gated behind allowShellExecution flag. run_command combines stdout and stderr correctly with filter/join. |
| src/loop/agents/opencode-sdk.ts | New OpenCode SDK agent with streaming event support. Several issues flagged in prior review (streamPromise unhandled rejection guarded but promptAsync race still exists, provider config key mismatch, text duplication). API key for non-model case is now warned about. Overall complex async flow with remaining risks. |
| src/loop/agents/output-collector.ts | Clean, well-factored output buffering helper. Properly buffers lines before calling onOutput, handles memory limits, and flushes remaining content. No issues found. |
| src/loop/agents/amp-sdk.ts | Clean extraction of the existing Amp SDK + CLI fallback logic from the monolithic agents.ts. No behavioral changes detected; code is functionally equivalent to the original. |
| src/loop/agents.ts | Central dispatcher refactored cleanly. checkAgentAvailable correctly handles anthropic-sdk (env var) and opencode-sdk (CLI + optional key). detectAvailableAgents silently ignores OPENCODE_API_KEY for opencode-sdk (previously flagged). runSubprocessAgent retains correct remaining-buffer flush on close. |
| src/loop/__tests__/agents.test.ts | Tests correctly stub API key env vars in beforeEach and update length assertion to 8. However, detectBestAgent tests are missing an execa mock for opencode-sdk's CLI check (7th call), relying on persistent default mock state from an earlier test to avoid flaking. |
| src/cli.ts | Simple CLI help text update to include the two new agent names. No logic changes. |
| package.json | Adds @opencode-ai/sdk ^1.2.25 dependency and bumps version to 0.4.4. Straightforward dependency addition. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[runAgent called] --> B{agent.type}
    B -->|claude-code / cursor / codex / opencode / openclaw| C[Build CLI args]
    C --> D[runSubprocessAgent\nspawn subprocess]
    D --> E[Stream stdout/stderr\nline-by-line via onOutput]
    E --> F[Resolve on close / timeout]

    B -->|amp| G[runAmpAgent]
    G --> G1{amp-sdk importable?}
    G1 -->|yes| G2[ampSdk.execute\nasync generator]
    G1 -->|no| G3[runAmpCli\nspawn amp --stream-json]

    B -->|anthropic-sdk| H[runAnthropicSdkAgent]
    H --> H1[client.messages.create\nwith tools]
    H1 --> H2{toolCalls.length > 0?}
    H2 -->|yes| H3[executeAnthropicTool\nread_file / write_file / list_directory / run_command]
    H3 --> H4[Append tool_result to messages]
    H4 --> H1
    H2 -->|no| H5[Return output]

    B -->|opencode-sdk| I[runOpencodeSdkAgent]
    I --> I1[createOpencode\nstart local server]
    I1 --> I2[client.session.create]
    I2 --> I3[client.event.subscribe\nSSE stream]
    I3 --> I4[client.session.promptAsync\nfire-and-forget]
    I4 --> I5[await streamPromise\nmessage.part.updated events]
    I5 --> I6{session.idle or\nsession.error}
    I6 --> I7[Return output]

    subgraph OutputCollector
        OC1[append text] --> OC2[Buffer lines]
        OC2 --> OC3[onOutput per line]
    end

    H --> OutputCollector
    I --> OutputCollector

Comments Outside Diff (1)

  1. src/loop/__tests__/agents.test.ts, lines 148-172

    Missing execa mock for opencode-sdk CLI check in detectBestAgent tests

    detectAvailableAgents checks 8 agents. The opencode-sdk agent calls execa('opencode', ['--version'], ...) as its availability check — this is one execa call in addition to the opencode (CLI) agent's check. Both tests below only provide 6 once mocks (one for each of: claude-code, cursor, codex, opencode, openclaw, amp), leaving the 7th call for opencode-sdk without a dedicated mock.

    The tests currently pass because vi.clearAllMocks() clears call history but not mock implementations, so the persistent mockResolvedValue set by should prefer claude-code over others is still active for the 7th call — making opencode-sdk silently available in both tests. Since amp has higher priority than opencode-sdk in the preference order, the outcome is still correct by accident.

    This is fragile: if vi.resetAllMocks() is ever used, or if test execution order changes, the 7th call behaviour changes and the tests may start failing or testing the wrong thing.

    Add a 7th rejection mock for each test to be explicit:

    // should fall back to amp if claude-code is not available
    mockExeca
      .mockRejectedValueOnce(new Error('not found')) // claude-code
      .mockRejectedValueOnce(new Error('not found')) // cursor
      .mockRejectedValueOnce(new Error('not found')) // codex
      .mockRejectedValueOnce(new Error('not found')) // opencode (CLI agent)
      .mockRejectedValueOnce(new Error('not found')) // openclaw
      .mockResolvedValueOnce({ stdout: '1.0.0', exitCode: 0 } as any) // amp
      .mockRejectedValueOnce(new Error('not found')); // opencode-sdk CLI check (opencode binary)

    The same fix applies to should prefer amp over cursor (line 161). The detectAvailableAgents test at line 105 has the same gap and should also include a 7th mock.
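
The fall-through described in this comment can be modelled with a toy mock. This is a simplified stand-in written for illustration, not Vitest's implementation: `clear()` and `reset()` only approximate the behaviour the comment attributes to `vi.clearAllMocks()` and `vi.resetAllMocks()`. It shows why a 7th call with only six one-time values queued picks up a persistent value left behind by an earlier test.

```typescript
// Toy stand-in for a mock function. This is NOT Vitest's implementation.
function createToyMock<T>() {
  const onceQueue: T[] = []; // models mockResolvedValueOnce / mockRejectedValueOnce
  let persistent: { value: T } | undefined; // models mockResolvedValue
  let callCount = 0;

  const fn = (): T | undefined => {
    callCount++;
    if (onceQueue.length > 0) return onceQueue.shift();
    return persistent?.value; // fall through to the persistent value, if any
  };

  return {
    fn,
    returnOnce(value: T): void {
      onceQueue.push(value);
    },
    returnAlways(value: T): void {
      persistent = { value };
    },
    // Models clearing call history only: configured values survive.
    clear(): void {
      callCount = 0;
    },
    // Models a full reset: call history and configured values are dropped.
    reset(): void {
      callCount = 0;
      onceQueue.length = 0;
      persistent = undefined;
    },
    get callCount(): number {
      return callCount;
    },
  };
}
```

In this model, queuing an explicit 7th one-time value removes the dependence on whatever an earlier test configured, which is the change the comment proposes.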

Last reviewed commit: 962d42b

@rubenmarcus
Member Author

@greptileai

- Add file tools (read_file, write_file, list_directory, run_command)
- Implement multi-turn tool-use loop so agent can actually edit files
- Fix timeout: pass as RequestOptions (2nd arg), not in message body
- Fix max_tokens: use fixed 16384 instead of deriving from maxTurns
- Fix system prompt: accurately describes tool capabilities

Amp-Thread-ID: https://ampcode.com/threads/T-019ce458-6e75-7448-b1cd-1839264ca386
Co-authored-by: Amp <amp@ampcode.com>
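
The timeout and max_tokens fixes in this commit can be sketched as a helper that keeps the message body and the per-request options separate. In `@anthropic-ai/sdk`, `client.messages.create(body, options)` takes request options such as `timeout` (in milliseconds) as its second argument. `buildRequest`, the local interfaces, and the model name used in the usage note are hypothetical; only the 16384 value and the body/options split come from the commit message.

```typescript
// Minimal local shapes so the sketch runs without the SDK installed.
interface MessageParams {
  model: string;
  max_tokens: number;
  messages: { role: 'user' | 'assistant'; content: string }[];
}
interface RequestOptions {
  timeout: number; // milliseconds
}

// Fixed output budget, independent of how many turns the loop may take.
const MAX_TOKENS = 16384;

// Returns [body, options]; the timeout lives in options, never in the body.
function buildRequest(
  model: string,
  prompt: string,
  timeoutMs: number,
): [MessageParams, RequestOptions] {
  return [
    { model, max_tokens: MAX_TOKENS, messages: [{ role: 'user', content: prompt }] },
    { timeout: timeoutMs },
  ];
}
```

A caller would then do something like `const [body, options] = buildRequest('example-model', prompt, 30_000)` followed by `await client.messages.create(body, options)`.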

- Prevent path traversal: resolve + realpath guard on all file tools
- Gate run_command behind allowShellExecution option (default: false)
- Exclude run_command from tool list entirely when shell disabled
- Use remaining time budget for per-request timeout instead of full value

Amp-Thread-ID: https://ampcode.com/threads/T-019ce458-6e75-7448-b1cd-1839264ca386
Co-authored-by: Amp <amp@ampcode.com>
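
A minimal sketch of the two guards in this commit, under stated assumptions: `resolveInsideWorkspace` and `enabledTools` are hypothetical names, and the walk up to the nearest existing ancestor (so that not-yet-created files can be checked) is this sketch's own choice, not necessarily what the PR does. The boundary test compares against the workspace root plus `sep`, so a sibling directory that merely shares the root's prefix is rejected. It is not a hardened implementation (for example, dangling symlinks are not handled).

```typescript
import { existsSync, realpathSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { dirname, resolve, sep } from 'node:path';

const FILE_TOOLS = ['read_file', 'write_file', 'list_directory'];

// run_command is only offered when shell execution is explicitly enabled.
function enabledTools(allowShellExecution = false): string[] {
  return allowShellExecution ? [...FILE_TOOLS, 'run_command'] : [...FILE_TOOLS];
}

// Resolves `requested` against the workspace and throws if it escapes it.
function resolveInsideWorkspace(workspace: string, requested: string): string {
  const root = realpathSync(workspace);
  const candidate = resolve(root, requested);

  // realpathSync needs an existing path, so canonicalize the nearest existing ancestor.
  let existing = candidate;
  while (!existsSync(existing) && dirname(existing) !== existing) {
    existing = dirname(existing);
  }
  const real = realpathSync(existing) + candidate.slice(existing.length);

  // Compare against root + separator so "/ws-evil" does not pass as inside "/ws".
  const prefix = root.endsWith(sep) ? root : root + sep;
  if (real !== root && !real.startsWith(prefix)) {
    throw new Error(`Path escapes workspace: ${requested}`);
  }
  return real;
}

// Example: treat the OS temp directory as a stand-in workspace.
const exampleWorkspace = tmpdir();
console.log(resolveInsideWorkspace(exampleWorkspace, 'notes/todo.txt'));
```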

@rubenmarcus rubenmarcus changed the title from "feat: add Anthropic SDK agent for web/serverless usage" to "feat(agents): modular SDK runtimes + Anthropic fixes + OpenCode SDK" Mar 13, 2026

@rubenmarcus rubenmarcus merged commit fa5307a into main Mar 13, 2026
11 checks passed