This repository was archived by the owner on Mar 20, 2026. It is now read-only.

Add Speech-to-Speech mode via AssemblyAI S2S API #131

Open
alexkroman wants to merge 1 commit into main from server/s2s-support

Conversation

@alexkroman
Owner

Summary

  • Add mode: "s2s" option to defineAgent() that replaces the STT+LLM+TTS pipeline with a single AssemblyAI Speech-to-Speech WebSocket
  • Default remains "pipeline" (existing STT+LLM+TTS behavior)
  • New server/s2s.ts WebSocket client with EventTarget pattern, session resume, and message logging
  • New server/session_s2s.ts implementing the same Session interface as the pipeline session
  • Both transports (WebSocket and Twilio) route to the correct session based on agentConfig.mode
  • executeBuiltinTool() added for direct tool execution in S2S mode (bypasses Vercel AI SDK wrappers)
  • BRAVE_API_KEY now reads from platform env (Fly secrets) instead of agent env
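
The mode-based routing described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the `Session` shape, the two session classes, and `createSession` are hypothetical stand-ins, since the description does not show the real `Session` interface or the transport code.

```typescript
type PipelineMode = "s2s" | "pipeline";

// Hypothetical, minimal Session shape; the real interface is not shown in this PR.
interface Session {
  readonly kind: PipelineMode;
  onAudio(chunk: Uint8Array): void;
  stop(): Promise<void>;
}

// Hypothetical subset of AgentConfig; only the fields needed for routing.
interface AgentConfig {
  name: string;
  mode?: PipelineMode;
}

// Stand-in for the existing STT+LLM+TTS session.
class PipelineSessionSketch implements Session {
  readonly kind = "pipeline" as const;
  bytesReceived = 0;
  onAudio(chunk: Uint8Array): void {
    this.bytesReceived += chunk.byteLength;
  }
  async stop(): Promise<void> {}
}

// Stand-in for the new single-WebSocket S2S session.
class S2SSessionSketch implements Session {
  readonly kind = "s2s" as const;
  bytesReceived = 0;
  onAudio(chunk: Uint8Array): void {
    this.bytesReceived += chunk.byteLength;
  }
  async stop(): Promise<void> {}
}

// Both transports could share one factory so the routing rule lives in one place.
function createSession(agentConfig: AgentConfig): Session {
  return agentConfig.mode === "s2s"
    ? new S2SSessionSketch()
    : new PipelineSessionSketch();
}

const defaultSession = createSession({ name: "legacy-agent" });
const s2sSession = createSession({ name: "my-agent", mode: "s2s" });
console.log(defaultSession.kind, s2sSession.kind);
```

Because both classes implement the same interface, the transport code only needs to know which one to construct, and anything other than `"s2s"` falls back to the pipeline session.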

Usage

export default defineAgent({
  name: "my-agent",
  mode: "s2s",  // opt into S2S
});
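
For illustration, the defaulting behavior could look something like the sketch below. `defineAgentSketch` and the trimmed `AgentOptions`/`AgentDef` shapes are assumptions made for this example; the actual `defineAgent()` in sdk/define_agent.ts presumably accepts more fields.

```typescript
type PipelineMode = "s2s" | "pipeline";

// Hypothetical, trimmed-down option and definition shapes.
interface AgentOptions {
  name: string;
  mode?: PipelineMode; // optional: callers may omit it
}

interface AgentDef {
  name: string;
  mode: PipelineMode; // always resolved after defineAgentSketch()
}

// Resolve the mode once so downstream code never sees `undefined`.
function defineAgentSketch(options: AgentOptions): AgentDef {
  return { name: options.name, mode: options.mode ?? "pipeline" };
}

const legacyAgent = defineAgentSketch({ name: "legacy-agent" });
const s2sAgent = defineAgentSketch({ name: "my-agent", mode: "s2s" });
console.log(legacyAgent.mode, s2sAgent.mode);
```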

Test plan

  • All 173+ tests pass (deno task check)
  • S2S connects, receives session.ready, and streams audio in both directions
  • Tool calls execute and results are sent back to the S2S API
  • Pipeline mode unchanged when mode is omitted
  • Manual test: S2S agent with greeting, speech, tool calls
  • Manual test: Twilio S2S session

🤖 Generated with Claude Code

Agents can opt into S2S mode via `mode: "s2s"` in defineAgent(), which
replaces the STT+LLM+TTS pipeline with a single AssemblyAI S2S WebSocket.
The default remains `"pipeline"` (existing behavior).

New files:
- server/s2s.ts — S2S WebSocket client with EventTarget pattern, session
  resume, and full message logging
- server/session_s2s.ts — S2S session implementation with same Session
  interface as the pipeline session
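
As a rough sketch of the EventTarget pattern mentioned for server/s2s.ts: incoming messages are logged, parsed, and re-dispatched as typed events. The class names, the `handleRawMessage` method, and the `session_id` field are assumptions for this example; only the `session.ready` message type comes from the test plan, and the real wire format may differ.

```typescript
// Event subclass that carries the parsed message alongside the event type.
class S2SMessageEvent extends Event {
  readonly payload: Record<string, unknown>;
  constructor(type: string, payload: Record<string, unknown>) {
    super(type);
    this.payload = payload;
  }
}

class S2SClientSketch extends EventTarget {
  // Remembered so a later reconnect could ask to resume (assumed field name).
  sessionId: string | null = null;
  // Every raw message is kept here as a stand-in for message logging.
  readonly log: string[] = [];

  // In a real client this would be called from the WebSocket "message" handler.
  handleRawMessage(raw: string): void {
    this.log.push(raw);
    const msg = JSON.parse(raw) as Record<string, unknown>;
    const type = typeof msg.type === "string" ? msg.type : "unknown";
    if (type === "session.ready" && typeof msg.session_id === "string") {
      this.sessionId = msg.session_id;
    }
    this.dispatchEvent(new S2SMessageEvent(type, msg));
  }
}

const client = new S2SClientSketch();
client.addEventListener("session.ready", (event) => {
  const ready = event as S2SMessageEvent;
  console.log("ready:", ready.payload.session_id);
});
client.handleRawMessage(
  JSON.stringify({ type: "session.ready", session_id: "demo-session" }),
);
```

Dispatching by message type lets the session layer subscribe only to the events it cares about, without the client knowing anything about sessions or transports.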

Changes:
- sdk/types.ts — Add PipelineMode type ("s2s" | "pipeline") and mode
  field to AgentOptions/AgentDef
- core/types.ts — Add mode to AgentConfig
- server/_schemas.ts — Add mode to AgentConfigSchema
- sdk/define_agent.ts — Default mode to "pipeline"
- cli/_static_config.ts — Extract mode from agent.ts at build time
- server/types.ts — Add S2SConfig type and defaults
- server/config.ts — Add s2sConfig to PlatformConfig
- server/transport_websocket.ts — Route to S2S or pipeline session based
  on agentConfig.mode
- server/transport_twilio.ts — Same routing for Twilio transport
- server/builtin_tools.ts — Add executeBuiltinTool() for direct tool
  execution in S2S mode; read BRAVE_API_KEY from platform env
- server/deno.json — Add ws dependency for S2S WebSocket auth headers
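
A direct tool-execution path along the lines of executeBuiltinTool() might look like the sketch below. Everything here is illustrative: the `web_search` tool name, the registry, the result shapes, and the function signature are assumptions, and the actual Brave request is deliberately omitted. The one detail taken from this PR is that BRAVE_API_KEY is read from the platform env rather than the agent env.

```typescript
type ToolArgs = Record<string, unknown>;
type PlatformEnv = Record<string, string | undefined>;

// Hypothetical tool shape: plain async functions, no SDK wrapper objects.
interface BuiltinTool {
  execute(args: ToolArgs, platformEnv: PlatformEnv): Promise<string>;
}

// Hypothetical registry; a Map avoids accidental hits on Object.prototype keys.
const builtinToolsSketch = new Map<string, BuiltinTool>([
  ["web_search", {
    async execute(args, platformEnv) {
      // The key comes from the platform env, not from the agent's own env.
      const apiKey = platformEnv.BRAVE_API_KEY;
      if (!apiKey) {
        return JSON.stringify({ error: "BRAVE_API_KEY is not set in the platform env" });
      }
      // The real search request is omitted; an empty result set stands in for it.
      return JSON.stringify({ query: String(args.query ?? ""), results: [] });
    },
  }],
]);

// Always resolves to a string so the caller can send it straight back over S2S.
async function executeBuiltinToolSketch(
  name: string,
  args: ToolArgs,
  platformEnv: PlatformEnv,
): Promise<string> {
  const tool = builtinToolsSketch.get(name);
  if (!tool) return JSON.stringify({ error: `Unknown builtin tool: ${name}` });
  try {
    return await tool.execute(args, platformEnv);
  } catch (err) {
    return JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
  }
}

executeBuiltinToolSketch("web_search", { query: "deno" }, { BRAVE_API_KEY: "test-key" })
  .then((result) => console.log(result));
```

Returning errors as JSON strings instead of throwing keeps the S2S session loop simple: every tool call yields something that can be reported back to the model.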

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>