Merged
2 changes: 1 addition & 1 deletion skills/agora/SKILL.md
@@ -1,6 +1,6 @@
---
name: agora
-description: Write code using Agora SDKs (agora.io) for real-time communication. Covers RTC (video/voice calling, live streaming), RTM (signaling, messaging, presence), Conversational AI (voice AI agents), Cloud Recording, and server-side token generation. Use when the user wants to build real-time audio/video applications, integrate Agora SDKs (Web JS/TS, React, iOS Swift, Android Kotlin/Java, Go, Python), manage channels, tracks, tokens, use RTM for messaging/signaling, or build Conversational AI with the agent-toolkit. Triggers on mentions of Agora, agora.io, RTC, RTM, video calling, voice calling, real-time communication, agora-rtc-sdk-ng, agora-rtc-react, agora-rtm, conversational AI with Agora, Agora token generation, Cloud Recording.
+description: Write code using Agora SDKs (agora.io) for real-time communication. Covers RTC (video/voice calling, live streaming), RTM (signaling, messaging, presence), Conversational AI (voice AI agents), Cloud Recording, and server-side token generation. Use when the user wants to build real-time audio/video applications, integrate Agora SDKs (Web JS/TS, React, iOS Swift, Android Kotlin/Java, Go, Python), manage channels, tracks, tokens, use RTM for messaging/signaling, or build Conversational AI with the agent-toolkit. Triggers on mentions of Agora, agora.io, RTC, RTM, video calling, voice calling, real-time communication, agora-rtc-sdk-ng, agora-rtc-react, agora-rtm, conversational AI with Agora, Agora token generation, Cloud Recording, @agora/agent-client-toolkit, @agora/agent-client-toolkit-react, AgoraVoiceAI, useConversationalAI, useTranscript, useAgentState, agent transcript, agent state hook.
metadata:
  author: agora
  version: '1.1.0'
3 changes: 2 additions & 1 deletion skills/agora/references/conversational-ai/README.md
@@ -116,7 +116,8 @@ https://api.agora.io/api/conversational-ai-agent/v2/projects/{appid}
Each file maps to one repo in [AgoraIO-Conversational-AI](https://github.com/AgoraIO-Conversational-AI):

- **[agent-samples.md](agent-samples.md)** — Backend (simple-backend), React clients, profile system, MLLM/Gemini config, deployment
-- **[agent-toolkit.md](agent-toolkit.md)** — `@agora/conversational-ai` SDK: ConversationalAIAPI, RTCHelper, RTMHelper, transcript handling
+- **[agent-toolkit.md](agent-toolkit.md)** — `@agora/agent-client-toolkit` + `@agora/agent-client-toolkit-react`: AgoraVoiceAI, events, transcript, sendText, interrupt, React hooks
+- **[agent-client-toolkit-react.md](agent-client-toolkit-react.md)** — React hooks detail: ConversationalAIProvider, useTranscript, useAgentState, useAgentError, useAgentMetrics, useConversationalAI
- **[agent-ui-kit.md](agent-ui-kit.md)** — `@agora/agent-ui-kit` React components: voice, chat, video, settings
- **[server-custom-llm.md](server-custom-llm.md)** — Custom LLM proxy: RAG, tool calling, conversation memory
- **[server-mcp.md](server-mcp.md)** — MCP memory server: persistent per-user memory via tool calling
@@ -0,0 +1,143 @@
---
name: agora-agent-client-toolkit-react
description: |
  React hooks for Agora Conversational AI client integration. Use when the user is building
  a React app with Agora ConvoAI and needs @agora/agent-client-toolkit-react hooks.
  Triggers on useTranscript, useAgentState, useAgentError, useAgentMetrics,
  ConversationalAIProvider, @agora/agent-client-toolkit-react, React ConvoAI hooks,
  agent transcript React, agent state React hook.
license: MIT
metadata:
  author: agora
  version: '1.0.0'
---

# Agent Client Toolkit — React

React hooks for `@agora/agent-client-toolkit`. Wraps the `AgoraVoiceAI` singleton into React state and effects. Must be used alongside `agora-rtc-react` — this package handles ConvoAI concerns only, not RTC primitives (mic tracks, camera, remote users).

**npm:** `@agora/agent-client-toolkit-react`
**Requires:** `@agora/agent-client-toolkit`, `agora-rtc-react`, React >= 18

## Installation

```bash
npm install @agora/agent-client-toolkit-react @agora/agent-client-toolkit agora-rtc-react agora-rtm
```

## Usage

Use `ConversationalAIProvider` + standalone hooks. The provider manages the `AgoraVoiceAI` lifecycle — standalone hooks connect via context so only the components that need updates re-render.

> For simple single-component cases, `useConversationalAI` is available as a batteries-included alternative. See the [package README](https://github.com/AgoraIO-Conversational-AI/agent-client-toolkit-ts/blob/main/packages/react/README.md) for details.

```tsx
import { useMemo } from 'react';
import AgoraRTC, { AgoraRTCProvider, useJoin, useLocalMicrophoneTrack, usePublish } from 'agora-rtc-react';
import AgoraRTM from 'agora-rtm';
import {
  ConversationalAIProvider,
  useTranscript,
  useAgentState,
  useAgentError,
  useAgentMetrics,
} from '@agora/agent-client-toolkit-react';

const rtcClient = AgoraRTC.createClient({ mode: 'rtc', codec: 'vp8' });
const rtmClient = new AgoraRTM.RTM('APP_ID', 'USER_ID');
await rtmClient.login({ token: 'RTM_TOKEN' });

function App() {
  const config = useMemo(() => ({
    channel: 'my-channel',
    rtmConfig: { rtmEngine: rtmClient },
  }), []);

  return (
    // AgoraRTCProvider (outer) → ConversationalAIProvider (inner)
    <AgoraRTCProvider client={rtcClient}>
      <ConversationalAIProvider config={config}>
        <VoiceSession />
      </ConversationalAIProvider>
    </AgoraRTCProvider>
  );
}

function VoiceSession() {
  // Agora RTC hooks — your existing integration
  useJoin({ appid: 'APP_ID', channel: 'my-channel', token: 'RTC_TOKEN' });
  const { localMicrophoneTrack } = useLocalMicrophoneTrack();
  usePublish([localMicrophoneTrack]);

  // ConvoAI hooks — added on top
  const transcript = useTranscript();
  const { agentState } = useAgentState();
  const { error, clearError } = useAgentError();
  const { metrics } = useAgentMetrics();

  return (
    <div>
      <p>Agent: {agentState ?? 'idle'}</p>
      {error && <p onClick={clearError}>Error: {error.error.message}</p>}
      <ul>{transcript.map((t) => <li key={t.turn_id}>{t.text}</li>)}</ul>
    </div>
  );
}
```

## Hooks Reference

### `useTranscript()`

Subscribe to transcript updates. Returns the full conversation history — replace, don't append.

```typescript
const transcript = useTranscript();
// transcript: TranscriptHelperItem[]
// Each item: { uid, turn_id, text, status, metadata }
```

### `useAgentState()`

Subscribe to `AGENT_STATE_CHANGED` events.

```typescript
const { agentState, stateEvent, agentUserId } = useAgentState();
// agentState: 'idle' | 'listening' | 'thinking' | 'speaking' | 'silent' | null
// stateEvent: { state, turnID, timestamp, reason } | null
```

Only fires when RTM is configured and agent start config includes `advanced_features.enable_rtm: true` + `parameters.data_channel: "rtm"`.
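
Because `agentState` is a small string union that can also be `null`, a status label can be derived with an exhaustive `switch`. The sketch below is illustrative only: the `AgentState` type is redeclared locally from the comment above rather than imported from the package, and `agentStateLabel` plus its label strings are hypothetical names, not part of the toolkit.

```typescript
// Assumption: mirrors the union documented for `agentState`; the real
// exported type name in the package may differ.
type AgentState = 'idle' | 'listening' | 'thinking' | 'speaking' | 'silent';

// Hypothetical helper: map the hook's `agentState` value to display text.
function agentStateLabel(state: AgentState | null): string {
  // `null` means no state event has been received yet.
  if (state === null) return 'Waiting for agent';
  switch (state) {
    case 'idle':
      return 'Idle';
    case 'listening':
      return 'Listening';
    case 'thinking':
      return 'Thinking';
    case 'speaking':
      return 'Speaking';
    case 'silent':
      return 'Silent';
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new
      // state is added to the union without a matching case.
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```

In a component this would typically be called as `agentStateLabel(agentState)` with the value returned by `useAgentState()`.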

### `useAgentError()`

Subscribe to `AGENT_ERROR` and `MESSAGE_ERROR` events. Returns a discriminated union.

```typescript
const { error, clearError } = useAgentError();
// error: { source: 'agent', agentUserId, error: ModuleError }
// | { source: 'message', agentUserId, error: { type, code, message, timestamp } }
// | null
```

Call `clearError()` after dismissing (e.g. closing a toast).
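
The `source` field is what makes the union discriminated, so checking it narrows `error.error` to the matching shape. The following is a minimal sketch under stated assumptions: the types are redeclared locally from the comments in this document, `agentUserId` is assumed to be a string, the `type` field of a message error is assumed to be a string, and `describeAgentError` is a hypothetical helper rather than a toolkit export.

```typescript
// Assumption: shapes copied from the comments in this document; the
// package's own exported types may be named or typed differently.
type ModuleType = 'llm' | 'mllm' | 'tts' | 'context' | 'unknown';

interface ModuleError {
  type: ModuleType;
  code: number;
  message: string;
  timestamp: number;
}

interface MessageError {
  type: string; // assumed; the document does not state this field's type
  code: number;
  message: string;
  timestamp: number;
}

type AgentErrorState =
  | { source: 'agent'; agentUserId: string; error: ModuleError }
  | { source: 'message'; agentUserId: string; error: MessageError }
  | null;

// Hypothetical helper: build a toast message from the hook's `error` value.
function describeAgentError(state: AgentErrorState): string | null {
  if (state === null) return null;
  if (state.source === 'agent') {
    // Narrowed to the agent branch: `error.type` is a ModuleType here.
    return `Agent ${state.error.type} error ${state.error.code}: ${state.error.message}`;
  }
  // Narrowed to the message branch.
  return `Message error ${state.error.code}: ${state.error.message}`;
}
```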

### `useAgentMetrics()`

Subscribe to `AGENT_METRICS` events.

```typescript
const { metrics, agentUserId } = useAgentMetrics();
// metrics: { type: ModuleType, name: string, value: number, timestamp: number } | null
// ModuleType: 'llm' | 'mllm' | 'tts' | 'context' | 'unknown'
```

Only fires when agent start config includes `parameters.enable_metrics: true`.
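
The hook exposes a single `metrics` value at a time, so an app that wants a dashboard has to accumulate values itself. One possible approach, sketched here with hypothetical names (`MetricTable`, `recordMetric`) and a locally redeclared metric type, keeps the most recent value per module and metric name. The metric names used with it are up to the agent and are not specified in this document.

```typescript
// Assumption: shape copied from the `metrics` comment above.
type ModuleType = 'llm' | 'mllm' | 'tts' | 'context' | 'unknown';

interface AgentMetric {
  type: ModuleType;
  name: string;
  value: number;
  timestamp: number;
}

// Hypothetical accumulator keyed by "<module>:<metric name>".
type MetricTable = Record<string, AgentMetric>;

// Returns a new table with `metric` stored, unless the table already
// holds a newer sample for the same key. The input is never mutated,
// which keeps it usable as React state.
function recordMetric(table: MetricTable, metric: AgentMetric): MetricTable {
  const key = `${metric.type}:${metric.name}`;
  const current = table[key];
  if (current && current.timestamp > metric.timestamp) return table;
  return { ...table, [key]: metric };
}
```

Returning the same object when nothing changes lets a state setter skip a re-render for stale samples.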

## Critical Rules

1. **Wrap `config` in `useMemo`** — `ConversationalAIProvider` depends on `config.channel`. An inline object creates a new reference every render, causing unnecessary re-init cycles.
2. **`AgoraRTCProvider` must be the outer wrapper** — `ConversationalAIProvider` calls `useRTCClient()` internally and will throw if rendered outside `AgoraRTCProvider`.
3. **All standalone hooks require `ConversationalAIProvider`** — `useTranscript`, `useAgentState`, `useAgentError`, and `useAgentMetrics` won't receive events without it.
4. **Use `agora-rtc-react` for RTC primitives** — mic tracks, camera, remote users, join, and publish are handled by `agora-rtc-react` hooks. This package covers ConvoAI concerns only.
5. **RTM must be logged in before passing to config** — call `rtmClient.login()` before passing `rtmEngine` into the provider config. The provider does not manage RTM login/logout.
176 changes: 141 additions & 35 deletions skills/agora/references/conversational-ai/agent-toolkit.md
@@ -1,62 +1,168 @@
# Agent Toolkit SDK
---
name: agora-agent-client-toolkit
description: |
  Client-side TypeScript SDK for adding Agora Conversational AI features to applications
  already using the Agora RTC SDK. Use when the user needs to integrate @agora/agent-client-toolkit
  or @agora/agent-client-toolkit-react, receive transcripts, track agent state, send messages
  to an AI agent, handle agent events, or build a ConvoAI front-end client. Triggers on
  @agora/agent-client-toolkit, AgoraVoiceAI, useConversationalAI, useTranscript, useAgentState,
  agent transcript, agent state, TRANSCRIPT_UPDATED, AGENT_STATE_CHANGED, ConversationalAIProvider.
license: MIT
metadata:
  author: agora
  version: '1.0.0'
---

# Agent Client Toolkit

Client-side SDK for adding Agora Conversational AI features to applications already using the Agora RTC SDK. Runs in the browser — adds transcript rendering, agent state tracking, and RTM-based messaging controls on top of `agora-rtc-sdk-ng`.

**npm:** `@agora/agent-client-toolkit` (core) · `@agora/agent-client-toolkit-react` (React hooks)
**Repo:** <https://github.com/AgoraIO-Conversational-AI/agent-client-toolkit-ts>

> This toolkit is a **client add-on** — it does not start agents. Start agents via the ConvoAI REST API first. See [README.md](README.md) for the REST API.

TypeScript SDK (`@agora/conversational-ai`) — framework-agnostic core for ConvoAI clients.
## Installation

**Repo:** <https://github.com/AgoraIO-Conversational-AI/agent-toolkit>
**npm:** `@agora/conversational-ai`
```bash
npm install @agora/agent-client-toolkit agora-rtc-sdk-ng agora-rtm

## Installation
# React
npm install @agora/agent-client-toolkit-react agora-rtc-react
```

> See [README — Quick Start](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#quick-start)
## Initialization

## ConversationalAIAPI
`AgoraVoiceAI.init()` is **async** — always `await` it. Pass the RTC client you already have.

Main orchestration class. Handles init, agent connection, message sending, transcript management.
```typescript
import AgoraRTC from 'agora-rtc-sdk-ng';
import AgoraRTM from 'agora-rtm';
import { AgoraVoiceAI } from '@agora/agent-client-toolkit';

> **[README — ConversationalAIAPI](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#conversationalapiapi)** — init(), getInstance(), sendMessage(), getTranscript(), events
// Your existing RTC + RTM setup
const rtcClient = AgoraRTC.createClient({ mode: 'rtc', codec: 'vp8' });
const rtmClient = new AgoraRTM.RTM('APP_ID', 'USER_ID');
await rtmClient.login({ token: 'RTM_TOKEN' });

## RTCHelper
// Initialize the toolkit — pass your existing clients
const ai = await AgoraVoiceAI.init({
rtcEngine: rtcClient,
rtmConfig: { rtmEngine: rtmClient }, // optional — needed for sendText/interrupt
});

Agora RTC wrapper. Audio/video track management, join/leave, publish/unpublish, volume monitoring.
// Join + publish via RTC directly (toolkit does not wrap join/publish)
await rtcClient.join('APP_ID', 'CHANNEL', 'RTC_TOKEN', null);
const micTrack = await AgoraRTC.createMicrophoneAudioTrack();
await rtcClient.publish([micTrack]);

- `createVideoTrack()`, `setVideoEnabled()`, `getVideoEnabled()` for video lifecycle
- Subscription filters: `shouldSubscribeAudio(uid)`, `shouldSubscribeVideo(uid)` callbacks in init config
// Start receiving agent messages
ai.subscribeMessage('CHANNEL');
```

> **[README — RTCHelper](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#rtchelper)** — join, tracks, shouldSubscribeAudio/Video callbacks
## Configuration

## RTMHelper
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `rtcEngine` | `IAgoraRTCClient` | Yes | Your existing Agora RTC client |
| `rtmConfig` | `{ rtmEngine: RTMClient }` | No | Pass your RTM client for sendText/interrupt |
| `renderMode` | `TranscriptHelperMode` | No | `TEXT`, `WORD`, `CHUNK`, `AUTO` (default: `AUTO`) |
| `enableLog` | `boolean` | No | Debug logging (default: `false`) |
| `enableAgoraMetrics` | `boolean` | No | Load `@agora-js/report` for usage metrics |

Agora RTM wrapper for text messaging alongside voice.
## Events

- Dynamic key auth fix: token now correctly passed for `APP_CERTIFICATE`-enabled projects
Register handlers before calling `subscribeMessage()`. All 9 events:

> **[README — RTMHelper](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#rtmhelper)**
```typescript
import { AgoraVoiceAIEvents } from '@agora/agent-client-toolkit';

## SubRenderController
// Transcript — delivers FULL history every time, replace don't append
ai.on(AgoraVoiceAIEvents.TRANSCRIPT_UPDATED, (transcript) => {
renderTranscript(transcript);
});

Queue-based message processing with PTS sync, deduplication, render modes.
// Agent state — requires RTM + enable_rtm: true in agent start config
ai.on(AgoraVoiceAIEvents.AGENT_STATE_CHANGED, (agentUserId, event) => {
// event.state: 'idle' | 'listening' | 'thinking' | 'speaking' | 'silent'
updateStatusUI(event.state);
});

> **[README — SubRenderController](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#subrendercontroller)**
// Agent interrupted (user cut off agent's response)
ai.on(AgoraVoiceAIEvents.AGENT_INTERRUPTED, (agentUserId, event) => {
// event: { turnID: number, timestamp: number }
});

## React Hooks
// Performance metrics — requires enable_metrics: true in agent start config
ai.on(AgoraVoiceAIEvents.AGENT_METRICS, (agentUserId, metrics) => {
// metrics: { type: ModuleType, name: string, value: number, timestamp: number }
// ModuleType: 'llm' | 'mllm' | 'tts' | 'context' | 'unknown'
});

- `useLocalVideo` — local camera track management
- `useRemoteVideo` — remote video subscription
// Agent pipeline error — requires enable_error_message: true in agent start config
ai.on(AgoraVoiceAIEvents.AGENT_ERROR, (agentUserId, error) => {
// error: { type: ModuleType, code: number, message: string, timestamp: number }
showErrorToast(error.message);
});

> **[README — React Integration](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#react-integration)**
// Message delivery receipt — requires RTM
ai.on(AgoraVoiceAIEvents.MESSAGE_RECEIPT_UPDATED, (agentUserId, receipt) => {});

## Events
// RTM message delivery failure
ai.on(AgoraVoiceAIEvents.MESSAGE_ERROR, (agentUserId, error) => {});

// Speech Activity Level registration status — requires RTM
ai.on(AgoraVoiceAIEvents.MESSAGE_SAL_STATUS, (agentUserId, salStatus) => {});

// Internal debug log
ai.on(AgoraVoiceAIEvents.DEBUG_LOG, (message) => {});
```

## Sending Messages & Interrupting

- `transcript-updated` — new/updated transcript items
- `connection-state-changed` — agent connection lifecycle
- `agent-error` — error events
Requires `rtmConfig` — throws `RTMRequiredError` if called without RTM.

> **[README — Events](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#events)**
```typescript
import { ChatMessageType, ChatMessagePriority } from '@agora/agent-client-toolkit';

## Types
// Send text to the agent
await ai.sendText(agentUserId, {
messageType: ChatMessageType.TEXT,
text: 'What is the weather like today?',
priority: ChatMessagePriority.INTERRUPTED, // interrupts current agent speech
responseInterruptable: true,
});

- `TranscriptItem`, `UserTranscription`, `AgentTranscription`
- `AgentState`, `TurnStatus`, `ConnectionState`
- `MessageInterrupt`, `Word`
// Send image to the agent
await ai.sendImage(agentUserId, {
messageType: ChatMessageType.IMAGE,
uuid: crypto.randomUUID(),
url: 'https://example.com/image.png',
});

// Interrupt the agent's current speech
await ai.interrupt(agentUserId);
```

## Cleanup

```typescript
ai.unsubscribe(); // stop receiving channel messages
ai.destroy(); // remove all event handlers, clear singleton

await rtmClient.logout(); // you manage RTM lifecycle
```

## Critical Rules

1. **`init()` is async** — always `await AgoraVoiceAI.init()`. Missing the await causes `getInstance()` to throw `NotInitializedError`.
2. **Register events before `subscribeMessage()`** — events from messages already in the stream will be missed otherwise.
3. **Transcript replaces, never appends** — `TRANSCRIPT_UPDATED` delivers the complete history every time. Set state to the full array, not `prev.concat(next)`.
4. **`AgoraVoiceAI` is a singleton** — calling `init()` twice replaces the first instance. Call `destroy()` before re-initializing.
5. **RTM is optional but required for several features** — `sendText`, `sendImage`, and `interrupt` throw `RTMRequiredError` without `rtmConfig`. `AGENT_STATE_CHANGED`, `MESSAGE_RECEIPT_UPDATED`, `MESSAGE_ERROR`, `MESSAGE_SAL_STATUS` only fire with RTM.
6. **Agent start config flags are required for some events** — `AGENT_STATE_CHANGED` requires `advanced_features.enable_rtm: true` AND `parameters.data_channel: "rtm"`. `AGENT_METRICS` requires `parameters.enable_metrics: true`. `AGENT_ERROR` requires `parameters.enable_error_message: true`.
7. **Toolkit does not wrap join/publish** — call `rtcClient.join()` and `rtcClient.publish()` yourself before `subscribeMessage()`.
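
Rule 3 is the one most easily broken in practice, so here is a small sketch of the replace-only pattern. `TranscriptStore` and the trimmed `TranscriptItem` shape are hypothetical and declared locally; only `turn_id` and `text` are taken from the item fields listed in this document, and their types are assumed.

```typescript
// Assumption: a trimmed view of a transcript item; field types are guesses.
interface TranscriptItem {
  turn_id: number;
  text: string;
}

// Hypothetical store that holds whatever the latest update delivered.
class TranscriptStore {
  private items: TranscriptItem[] = [];

  // Each update carries the complete history, so the previous array is
  // discarded. Appending here would duplicate every earlier turn.
  update(fullHistory: readonly TranscriptItem[]): void {
    this.items = fullHistory.slice();
  }

  lines(): string[] {
    return this.items.map((item) => item.text);
  }
}
```

A `TRANSCRIPT_UPDATED` handler would then only need to call `store.update(transcript)` and re-render from `store.lines()`.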

## React Hooks

> **[README — Types](https://github.com/AgoraIO-Conversational-AI/agent-toolkit#types)**
For React integration, see **[agent-client-toolkit-react.md](agent-client-toolkit-react.md)**.
6 changes: 3 additions & 3 deletions skills/agora/references/server/README.md
@@ -23,9 +23,9 @@ The `agora-agent-sdk` TypeScript SDK supports both token-based auth and Basic Au
- **Basic Auth (legacy)**: Pass `customerId` + `customerSecret` (from Agora Console → Developer Toolkit → RESTful API).

See the agent SDK READMEs for full examples:
-- [agora-agent-ts-sdk](https://github.com/AgoraIO-Conversational-AI/agora-agent-ts-sdk)
-- [agora-agent-go-sdk](https://github.com/AgoraIO-Conversational-AI/agora-agent-go-sdk)
-- [agora-agent-python-sdk](https://github.com/AgoraIO-Conversational-AI/agora-agent-python-sdk)
+- [agent-server-sdk-ts](https://github.com/AgoraIO-Conversational-AI/agent-server-sdk-ts)
+- [agent-server-sdk-go](https://github.com/AgoraIO-Conversational-AI/agent-server-sdk-go)
+- [agent-server-sdk-python](https://github.com/AgoraIO-Conversational-AI/agent-server-sdk-python)

## Reference Files
