feat(agents): modular SDK runtimes + Anthropic fixes + OpenCode SDK by rubenmarcus · Pull Request #293 · multivmlabs/ralph-starter

rubenmarcus · 2026-03-12T23:43:23Z

Summary

This PR upgrades agent execution to support SDK-first workflows while keeping existing CLI behavior.

It now includes:

A fixed and production-safe Anthropic SDK agent
A new OpenCode SDK agent integration
A modularized agent runtime structure (including Amp SDK) to keep the code maintainable

What Changed

Added full opencode-sdk agent support using @opencode-ai/sdk.
Hardened anthropic-sdk execution and fixed tool-loop behavior for tool-use responses.
Split monolithic src/loop/agents.ts into focused modules:
- src/loop/agents/amp-sdk.ts
- src/loop/agents/anthropic-sdk.ts
- src/loop/agents/opencode-sdk.ts
- src/loop/agents/output-collector.ts
Kept src/loop/agents.ts as the central dispatcher/detection layer.
Updated CLI help text to include anthropic-sdk and opencode-sdk in --agent options.
Expanded tests for agent detection and availability logic.

Agent Priority

Current preference order:

claude-code > amp > anthropic-sdk > opencode-sdk > cursor > codex > opencode > openclaw
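
The preference order above can be read as a first-match scan over a fixed list. The following is a minimal sketch of that idea, assuming a hypothetical `pickBestAgent` helper and a precomputed set of available agents; the actual `detectBestAgent` in `src/loop/agents.ts` may be structured differently.

```typescript
// Agent names in the preference order stated in this PR description.
const AGENT_PRIORITY = [
  'claude-code',
  'amp',
  'anthropic-sdk',
  'opencode-sdk',
  'cursor',
  'codex',
  'opencode',
  'openclaw',
] as const;

type AgentName = (typeof AGENT_PRIORITY)[number];

// Hypothetical helper (not from the PR): returns the highest-priority
// agent present in `available`, or null when none is available.
function pickBestAgent(available: ReadonlySet<AgentName>): AgentName | null {
  for (const name of AGENT_PRIORITY) {
    if (available.has(name)) return name;
  }
  return null;
}

// Usage: opencode-sdk outranks cursor in the list above.
const best = pickBestAgent(new Set<AgentName>(['cursor', 'opencode-sdk']));
console.log(best); // opencode-sdk
```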

Safety / Correctness Notes

Anthropic file tools resolve every requested path (resolve + realpath) and reject anything that falls outside the workspace.
Anthropic tool-use loop now executes tool calls whenever present, even if stop reason is end_turn.
OpenCode SDK availability now requires both API key presence and opencode binary availability (required by SDK server startup).
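
To illustrate the workspace boundary check, here is a minimal sketch of a realpath plus separator-based guard. The helper names `resolveInsideWorkspace` and `realpathOfNearestExisting` are hypothetical and not taken from the PR; the handling of not-yet-existing files is also an assumption.

```typescript
import { mkdtempSync, realpathSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { basename, dirname, join, resolve, sep } from 'node:path';

// Hypothetical helper: realpath of the nearest existing ancestor, with the
// missing tail re-appended (a write target may not exist yet).
function realpathOfNearestExisting(target: string): string {
  let current = target;
  const missing: string[] = [];
  for (;;) {
    try {
      const real = realpathSync(current);
      return missing.length > 0 ? join(real, ...missing.reverse()) : real;
    } catch {
      const parent = dirname(current);
      if (parent === current) throw new Error(`Cannot resolve ${target}`);
      missing.push(basename(current));
      current = parent;
    }
  }
}

// Hypothetical guard: resolve `requested` against the workspace and throw
// if the result is neither the workspace root nor a path beneath it.
function resolveInsideWorkspace(workspace: string, requested: string): string {
  const root = realpathSync(workspace);
  const real = realpathOfNearestExisting(resolve(root, requested));
  // Compare against "root + separator" so /ws-evil does not match /ws.
  const prefix = root.endsWith(sep) ? root : root + sep;
  if (real !== root && !real.startsWith(prefix)) {
    throw new Error(`Path escapes workspace: ${requested}`);
  }
  return real;
}

// Usage with a throwaway workspace directory.
const ws = mkdtempSync(join(tmpdir(), 'ws-'));
console.log(resolveInsideWorkspace(ws, 'src/index.ts').startsWith(realpathSync(ws))); // true
try {
  resolveInsideWorkspace(ws, '../outside.txt');
} catch (error) {
  console.log((error as Error).message); // Path escapes workspace: ../outside.txt
}
```

Resolving symlinks before comparing matters because a link inside the workspace could otherwise point at a location outside it.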

Validation

pnpm build
pnpm test
pnpm typecheck

All passing on this branch.

- Add 'anthropic-sdk' to AgentType union
- Add apiKey field to AgentRunOptions and LoopOptions
- Implement runAnthropicSdkAgent using @anthropic-ai/sdk with streaming
- Update detectAvailableAgents/detectBestAgent to accept apiKeys option
- SDK agents are available when API key is provided (no CLI binary needed)
- Enables ralph-starter usage in web applications without CLI dependencies

Amp-Thread-ID: https://ampcode.com/threads/T-019ce458-6e75-7448-b1cd-1839264ca386
Co-authored-by: Amp <amp@ampcode.com>

github-actions · 2026-03-12T23:43:55Z

✔️ Bundle Size Analysis

Metric	Value
Base	2590.27 KB
PR	2635.67 KB
Diff	45.39 KB (1.00%)

Bundle breakdown

156K	dist/auth
80K	dist/automation
4.0K	dist/cli.d.ts
4.0K	dist/cli.d.ts.map
20K	dist/cli.js
12K	dist/cli.js.map
584K	dist/commands
28K	dist/config
4.0K	dist/index.d.ts
4.0K	dist/index.d.ts.map
4.0K	dist/index.js
4.0K	dist/index.js.map
896K	dist/integrations
100K	dist/llm
1.2M	dist/loop
188K	dist/mcp
60K	dist/presets
92K	dist/setup
40K	dist/skills
392K	dist/sources
76K	dist/ui
144K	dist/utils
336K	dist/wizard

greptile-apps · 2026-03-12T23:46:56Z

Greptile Summary

This PR modularizes the agent runtime by splitting src/loop/agents.ts into focused sub-modules (amp-sdk, anthropic-sdk, opencode-sdk, output-collector), adds a production-hardened Anthropic SDK agent with file tools and path-traversal protection, and introduces a new OpenCode SDK agent that wraps a locally-spawned OpenCode server. The agent dispatcher and CLI help text are updated accordingly.

Key changes:

anthropic-sdk.ts: Multi-turn tool-use loop using client.messages.create. Uses execa (v9-compatible, correct API), gates shell execution via allowShellExecution, and performs path traversal checks using realpathSync + sep-based boundary validation. Previously flagged issues (wrong timeout position, max_tokens from maxTurns, execaCommand, path traversal bypass) appear to have been addressed in this version.
opencode-sdk.ts: Wraps OpenCode's local server via @opencode-ai/sdk. Handles streaming events incrementally, guards streamPromise against unhandled rejection, warns on model format issues and keyless scenarios. Several issues remain from prior review threads (provider config key mismatch, single-shot HTTP timeout risk, etc.).
output-collector.ts: Clean shared helper that correctly buffers partial lines before dispatching to onOutput, fixing a previously flagged contract violation.
agents.ts: Refactored dispatcher with correct checkAgentAvailable logic for the two new SDK agents. detectAvailableAgents still does not forward OPENCODE_API_KEY to opencode-sdk availability checks (flagged previously).
Tests: beforeEach correctly stubs ANTHROPIC_API_KEY and OPENCODE_API_KEY to empty strings. However, detectBestAgent tests are missing a dedicated execa mock for opencode-sdk's CLI check (7th call), relying on persistent mock state from an earlier test — making them fragile.
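
The line-buffering behaviour described for `output-collector.ts` can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the PR's implementation: the `LineBuffer` class and its method names are hypothetical, and the memory limits mentioned in the review are omitted.

```typescript
// Hypothetical line buffer: collects streamed chunks and only forwards
// complete lines, so a consumer never sees a partial line.
class LineBuffer {
  private pending = '';
  private readonly onLine: (line: string) => void;

  constructor(onLine: (line: string) => void) {
    this.onLine = onLine;
  }

  // Append a chunk; emit every line that is now terminated by '\n'.
  append(chunk: string): void {
    this.pending += chunk;
    const parts = this.pending.split('\n');
    // The last element is the unterminated tail (possibly empty).
    this.pending = parts.pop() ?? '';
    for (const line of parts) this.onLine(line);
  }

  // Emit whatever is left once the stream has ended.
  flush(): void {
    if (this.pending.length > 0) {
      this.onLine(this.pending);
      this.pending = '';
    }
  }
}

// Usage: chunks split mid-line still arrive as whole lines.
const collected: string[] = [];
const buffer = new LineBuffer((line) => collected.push(line));
buffer.append('hel');
buffer.append('lo\nwor');
buffer.append('ld');
buffer.flush();
console.log(collected); // [ 'hello', 'world' ]
```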

Confidence Score: 2/5

PR introduces a working Anthropic SDK agent but the OpenCode SDK agent carries several unresolved issues from prior review rounds that may cause subtle runtime failures in production.
The Anthropic SDK agent looks substantially improved from prior iterations — path traversal is properly guarded, execa v9 API is used correctly, and shell execution is opt-in. However, multiple issues flagged in prior review threads on opencode-sdk.ts and agents.ts remain open (provider config key mismatch, OPENCODE_API_KEY not forwarded in detectAvailableAgents, potential HTTP-layer timeout on single-shot prompt, text duplication on streaming parts). The test suite also has a fragile mock-ordering issue for the new agent priority. Score reflects that existing functionality is unbroken and the Anthropic SDK path is solid, but the OpenCode SDK integration has enough unresolved edge cases to warrant another review pass before merging.
Pay close attention to src/loop/agents/opencode-sdk.ts (provider config / auth key routing) and src/loop/agents.ts (OPENCODE_API_KEY not forwarded to opencode-sdk availability check). src/loop/__tests__/agents.test.ts also needs the missing execa mock for opencode-sdk's CLI check in the detectBestAgent test suite.

Important Files Changed

Filename	Overview
src/loop/agents/anthropic-sdk.ts	New Anthropic SDK agent module with tool support. Uses correct execa v9 API. Path traversal protection is present and uses sep for proper boundary checking. Shell execution is gated behind allowShellExecution flag. run_command combines stdout and stderr correctly with filter/join.
src/loop/agents/opencode-sdk.ts	New OpenCode SDK agent with streaming event support. Several issues flagged in prior review (streamPromise unhandled rejection guarded but promptAsync race still exists, provider config key mismatch, text duplication). API key for non-model case is now warned about. Overall complex async flow with remaining risks.
src/loop/agents/output-collector.ts	Clean, well-factored output buffering helper. Properly buffers lines before calling onOutput, handles memory limits, and flushes remaining content. No issues found.
src/loop/agents/amp-sdk.ts	Clean extraction of the existing Amp SDK + CLI fallback logic from the monolithic agents.ts. No behavioral changes detected; code is functionally equivalent to the original.
src/loop/agents.ts	Central dispatcher refactored cleanly. checkAgentAvailable correctly handles anthropic-sdk (env var) and opencode-sdk (CLI + optional key). detectAvailableAgents silently ignores OPENCODE_API_KEY for opencode-sdk (previously flagged). runSubprocessAgent retains correct remaining-buffer flush on close.
src/loop/__tests__/agents.test.ts	Tests correctly stub API key env vars in beforeEach and update length assertion to 8. However, detectBestAgent tests are missing an execa mock for opencode-sdk's CLI check (7th call), relying on persistent default mock state from an earlier test to avoid flaking.
src/cli.ts	Simple CLI help text update to include the two new agent names. No logic changes.
package.json	Adds @opencode-ai/sdk ^1.2.25 dependency and bumps version to 0.4.4. Straightforward dependency addition.

Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[runAgent called] --> B{agent.type}
    B -->|claude-code / cursor / codex / opencode / openclaw| C[Build CLI args]
    C --> D[runSubprocessAgent\nspawn subprocess]
    D --> E[Stream stdout/stderr\nline-by-line via onOutput]
    E --> F[Resolve on close / timeout]

    B -->|amp| G[runAmpAgent]
    G --> G1{amp-sdk importable?}
    G1 -->|yes| G2[ampSdk.execute\nasync generator]
    G1 -->|no| G3[runAmpCli\nspawn amp --stream-json]

    B -->|anthropic-sdk| H[runAnthropicSdkAgent]
    H --> H1[client.messages.create\nwith tools]
    H1 --> H2{toolCalls.length > 0?}
    H2 -->|yes| H3[executeAnthropicTool\nread_file / write_file / list_directory / run_command]
    H3 --> H4[Append tool_result to messages]
    H4 --> H1
    H2 -->|no| H5[Return output]

    B -->|opencode-sdk| I[runOpencodeSdkAgent]
    I --> I1[createOpencode\nstart local server]
    I1 --> I2[client.session.create]
    I2 --> I3[client.event.subscribe\nSSE stream]
    I3 --> I4[client.session.promptAsync\nfire-and-forget]
    I4 --> I5[await streamPromise\nmessage.part.updated events]
    I5 --> I6{session.idle or\nsession.error}
    I6 --> I7[Return output]

    subgraph OutputCollector
        OC1[append text] --> OC2[Buffer lines]
        OC2 --> OC3[onOutput per line]
    end

    H --> OutputCollector
    I --> OutputCollector
```
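
The anthropic-sdk branch of the flowchart (call the model, run any tool calls, append the results, repeat) can be sketched without the real SDK. Everything below is an assumption for illustration: `runToolLoop`, `callModel`, and `executeTool` are hypothetical names, and the model is a stub rather than `client.messages.create`. The one behaviour taken from the PR description is that the loop continues whenever tool calls are present, regardless of the stop reason.

```typescript
type TextBlock = { type: 'text'; text: string };
type ToolUseBlock = { type: 'tool_use'; id: string; name: string; input: unknown };
type ContentBlock = TextBlock | ToolUseBlock;
type ToolResult = { type: 'tool_result'; tool_use_id: string; content: string };
type ModelResponse = { stop_reason: string; content: ContentBlock[] };
type Message = { role: 'user' | 'assistant'; content: string | ContentBlock[] | ToolResult[] };

// Hypothetical loop: `callModel` and `executeTool` are injected so the
// sketch runs without network access or the real SDK.
async function runToolLoop(
  callModel: (messages: Message[]) => Promise<ModelResponse>,
  executeTool: (name: string, input: unknown) => Promise<string>,
  prompt: string,
  maxTurns = 10,
): Promise<string> {
  const messages: Message[] = [{ role: 'user', content: prompt }];
  let output = '';
  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await callModel(messages);
    for (const block of response.content) {
      if (block.type === 'text') output += block.text;
    }
    const toolCalls = response.content.filter(
      (block): block is ToolUseBlock => block.type === 'tool_use',
    );
    // Branch on the presence of tool calls, not on stop_reason.
    if (toolCalls.length === 0) break;
    messages.push({ role: 'assistant', content: response.content });
    const results: ToolResult[] = [];
    for (const call of toolCalls) {
      const content = await executeTool(call.name, call.input);
      results.push({ type: 'tool_result', tool_use_id: call.id, content });
    }
    messages.push({ role: 'user', content: results });
  }
  return output;
}

// Usage: the stub reports end_turn but still requests a tool on turn one.
let modelCalls = 0;
const fakeModel = async (): Promise<ModelResponse> =>
  ++modelCalls === 1
    ? { stop_reason: 'end_turn', content: [{ type: 'tool_use', id: 't1', name: 'read_file', input: { path: 'a.txt' } }] }
    : { stop_reason: 'end_turn', content: [{ type: 'text', text: 'done' }] };

runToolLoop(fakeModel, async () => 'file contents', 'Read a.txt').then((out) => {
  console.log(out, modelCalls); // done 2
});
```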

Comments Outside Diff (1)

src/loop/__tests__/agents.test.ts, line 148-172 (link)

Missing execa mock for opencode-sdk CLI check in detectBestAgent tests

detectAvailableAgents checks 8 agents. The opencode-sdk agent calls execa('opencode', ['--version'], ...) as its availability check — this is one execa call in addition to the opencode (CLI) agent's check. Both tests below only provide 6 once mocks (one for each of: claude-code, cursor, codex, opencode, openclaw, amp), leaving the 7th call for opencode-sdk without a dedicated mock.

The tests currently pass because vi.clearAllMocks() clears call history but not mock implementations, so the persistent mockResolvedValue set by should prefer claude-code over others is still active for the 7th call — making opencode-sdk silently available in both tests. Since amp has higher priority than opencode-sdk in the preference order, the outcome is still correct by accident.

This is fragile: if vi.resetAllMocks() is ever used, or if test execution order changes, the 7th call behaviour changes and the tests may start failing or testing the wrong thing.

Add a 7th rejection mock for each test to be explicit:
```typescript
// should fall back to amp if claude-code is not available
mockExeca
  .mockRejectedValueOnce(new Error('not found')) // claude-code
  .mockRejectedValueOnce(new Error('not found')) // cursor
  .mockRejectedValueOnce(new Error('not found')) // codex
  .mockRejectedValueOnce(new Error('not found')) // opencode (CLI agent)
  .mockRejectedValueOnce(new Error('not found')) // openclaw
  .mockResolvedValueOnce({ stdout: '1.0.0', exitCode: 0 } as any) // amp
  .mockRejectedValueOnce(new Error('not found')); // opencode-sdk CLI check (opencode binary)
```
The same fix applies to should prefer amp over cursor (line 161). The detectAvailableAgents test at line 105 has the same gap and should also include a 7th mock.
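
The fragility described above comes from how a queue of one-shot values interacts with a persistent default. The toy model below is not vitest code; `ToyMock` and its methods are hypothetical stand-ins that only mimic the behaviour this comment describes (clearing keeps implementations, resetting drops them).

```typescript
// Toy stand-in for a mock function. This is NOT vitest's implementation.
class ToyMock<T> {
  calls = 0;
  private onceQueue: T[] = [];
  private hasDefault = false;
  private defaultValue: T | undefined = undefined;

  // Queue a value for exactly one future call.
  mockReturnValueOnce(value: T): this {
    this.onceQueue.push(value);
    return this;
  }

  // Set a persistent default used once the queue is empty.
  mockReturnValue(value: T): this {
    this.hasDefault = true;
    this.defaultValue = value;
    return this;
  }

  // Like clearing: forget call history, keep implementations.
  clear(): void {
    this.calls = 0;
  }

  // Like resetting: forget history and all implementations.
  reset(): void {
    this.calls = 0;
    this.onceQueue = [];
    this.hasDefault = false;
    this.defaultValue = undefined;
  }

  call(): T | undefined {
    this.calls++;
    if (this.onceQueue.length > 0) return this.onceQueue.shift();
    return this.hasDefault ? this.defaultValue : undefined;
  }
}

// Usage: an earlier test leaves a persistent default behind.
const mock = new ToyMock<string>();
mock.mockReturnValue('available');
mock.clear(); // history cleared, default survives
mock.mockReturnValueOnce('a').mockReturnValueOnce('b');
console.log(mock.call(), mock.call(), mock.call()); // a b available
mock.reset(); // default is gone now
console.log(mock.call()); // undefined
```

Queuing one explicit value per expected call, as the suggested fix does, removes the dependence on whatever default an earlier test happened to install.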

Last reviewed commit: 962d42b

rubenmarcus · 2026-03-12T23:47:07Z

@greptileai

- Add file tools (read_file, write_file, list_directory, run_command)
- Implement multi-turn tool-use loop so agent can actually edit files
- Fix timeout: pass as RequestOptions (2nd arg), not in message body
- Fix max_tokens: use fixed 16384 instead of deriving from maxTurns
- Fix system prompt: accurately describes tool capabilities

Amp-Thread-ID: https://ampcode.com/threads/T-019ce458-6e75-7448-b1cd-1839264ca386
Co-authored-by: Amp <amp@ampcode.com>

rubenmarcus · 2026-03-12T23:56:09Z

@greptileai

- Prevent path traversal: resolve + realpath guard on all file tools
- Gate run_command behind allowShellExecution option (default: false)
- Exclude run_command from tool list entirely when shell disabled
- Use remaining time budget for per-request timeout instead of full value

Amp-Thread-ID: https://ampcode.com/threads/T-019ce458-6e75-7448-b1cd-1839264ca386
Co-authored-by: Amp <amp@ampcode.com>
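
The "remaining time budget" item can be illustrated with a small calculation. This is a sketch only: `remainingBudgetMs` is a hypothetical helper, and how the PR wires the value into each request is not shown in this thread.

```typescript
// Hypothetical helper: milliseconds left before an overall deadline.
// `now` is injectable so the function is easy to test.
function remainingBudgetMs(
  startedAt: number,
  totalTimeoutMs: number,
  now: number = Date.now(),
): number {
  const elapsed = now - startedAt;
  // Never return a negative timeout; 0 means the budget is exhausted.
  return Math.max(0, totalTimeoutMs - elapsed);
}

// Usage: 10 s into a 30 s budget leaves 20 s for the next request.
console.log(remainingBudgetMs(1_000, 30_000, 11_000)); // 20000
// Past the deadline the budget is clamped to zero.
console.log(remainingBudgetMs(1_000, 30_000, 50_000)); // 0
```

A caller would typically stop the loop when the result is 0 rather than issue a request with no time left.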

rubenmarcus · 2026-03-13T00:06:19Z

@greptileai

rubenmarcus · 2026-03-13T00:32:30Z

@greptileai

Amp-Thread-ID: https://ampcode.com/threads/T-019ce48e-f36c-74d2-abf4-e980f10d7bdf Co-authored-by: Amp <amp@ampcode.com>

rubenmarcus · 2026-03-13T00:37:34Z

@greptileai

rubenmarcus · 2026-03-13T00:45:58Z

@greptileai

rubenmarcus · 2026-03-13T01:11:00Z

@greptileai

rubenmarcus · 2026-03-13T01:21:18Z

@greptileai

rubenmarcus · 2026-03-13T01:38:05Z

@greptileai

Amp-Thread-ID: https://ampcode.com/threads/T-019ce4df-e4cc-77c7-b880-50cdc69cd3db Co-authored-by: Amp <amp@ampcode.com>

rubenmarcus · 2026-03-13T01:49:10Z

@greptileai

rubenmarcus · 2026-03-13T01:57:26Z

@greptileai

rubenmarcus and others added 2 commits March 12, 2026 23:40

chore: release v0.4.4

35ced10

github-actions bot assigned rubenmarcus Mar 12, 2026

github-actions bot added candidate-release PR is ready for release config core enhancement labels Mar 12, 2026

fix: update agent test to expect 7 agents (includes anthropic-sdk)

0b2efe0

github-actions bot added the tests label Mar 12, 2026

greptile-apps bot reviewed Mar 12, 2026

fix: buffer onOutput to emit complete lines, matching CLI agent contract

e385916
