diff --git a/README.md b/README.md index af88933..5e92b26 100644 --- a/README.md +++ b/README.md @@ -1,427 +1,707 @@ -# Agent Framework +# MiniAgent Framework -A platform-agnostic agent framework for building autonomous AI agents with tool execution capabilities. +A TypeScript-first, streaming-only AI agent framework for building autonomous agents with sophisticated tool execution capabilities. ## Install -npm install @continue-reasoning/mini-agent +```bash +pnpm install @continue-reasoning/mini-agent +``` ## Features ### LLM Providers -- [x] Gemini -- [ ] Vercel -- [ ] OpenAI -- [ ] Anthropic +- [x] **Gemini** - Google Gemini 2.0 Flash with native tool calling +- [x] **OpenAI** - GPT-4o with function calling and response caching +- [ ] Anthropic Claude (coming soon) +- [ ] Vercel AI SDK integration (planned) ### Core Features -- [x] ChatHistory -- [x] eventStream -- [x] streaming -- [x] toolScheduler - -### External Feature -- [ ] support mcp +- [x] **Session Management** - Multi-session conversation handling with isolation +- [x] **Event Stream** - Comprehensive real-time event system (20+ event types) +- [x] **Streaming-Only** - All responses are streamed for optimal UX +- [x] **Tool Scheduler** - Sophisticated tool execution with approval workflows +- [x] **Token Tracking** - Real-time usage monitoring with automatic history management +- [x] **BaseTool System** - Extensible tool creation with built-in validation and lifecycle +- [x] **Type Safety** - Full TypeScript support with comprehensive interfaces + +### Advanced Features +- [x] **Tool Confirmation** - User approval workflows for destructive operations +- [x] **Parallel Execution** - Concurrent tool execution with state management +- [x] **Abort Control** - Comprehensive cancellation and timeout support +- [x] **Error Recovery** - Graceful error handling with detailed error events +- [x] **History Management** - Automatic context window management +- [x] **Output Streaming** - Real-time tool output 
updates during execution + +### External Integrations +- [x] **MCP Support** - Model Context Protocol integration (in development) +- [ ] **Plugin System** - External tool discovery and loading ## Key Design Principles -1. **Streaming-First**: We only support streaming because streaming can implement call functionality -2. **Interface-Driven**: Pre-defined interfaces with core implementation by BaseAgent -3. **Platform-Agnostic**: Clean abstractions that work with any LLM provider -4. **Event-Based**: Comprehensive event system for real-time monitoring +1. **Streaming-First**: All responses are streamed by default - non-streaming is implemented by collecting stream chunks +2. **Interface-Driven**: Clean TypeScript interfaces with flexible implementations (BaseAgent, StandardAgent) +3. **Platform-Agnostic**: Provider-agnostic design that works with any LLM (Gemini, OpenAI, etc.) +4. **Event-Based**: Rich event system with 20+ event types for comprehensive monitoring +5. **Type-Safe**: Full TypeScript support with generics for tools and comprehensive interface definitions +6. 
**Session-Aware**: Built-in multi-session management with isolated conversation contexts ## Architecture image ``` -BaseAgent (Core Implementation) -├── IChat (LLM Interface) -│ └── GeminiChat (Gemini Provider Implementation) -├── IToolScheduler (Tool Execution) -│ └── CoreToolScheduler -├── ITokenTracker (Token Monitoring) -│ └── TokenTracker -└── AgentEvent (Event System) +StandardAgent (Session-Aware Agent) +├── BaseAgent (Core Implementation) +│ ├── IChat (LLM Interface) +│ │ ├── GeminiChat (Google Gemini Implementation) +│ │ └── OpenAIChat (OpenAI GPT Implementation) +│ ├── IToolScheduler (Tool Execution) +│ │ └── CoreToolScheduler (Parallel execution with approval workflows) +│ └── ITokenTracker (Token Monitoring) +│ └── TokenTracker (Real-time usage tracking) +├── SessionManager (Multi-session Management) +└── AgentEvent (Event System - 20+ event types) ``` ### Core Components -- **BaseAgent**: Main orchestrator that connects all interfaces -- **IChat**: Streaming-first chat interface for LLM communication -- **IToolScheduler**: Manages tool execution with confirmation workflows -- **ITokenTracker**: Real-time token usage tracking -- **AgentEvent**: Event emission for monitoring agent behavior +- **StandardAgent**: Session-aware agent with multi-conversation management +- **BaseAgent**: Core orchestrator implementing the main agent processing loop +- **IChat**: Streaming-first chat interface supporting multiple LLM providers +- **IToolScheduler**: Advanced tool execution with parallel processing and user confirmation +- **ITokenTracker**: Real-time token usage monitoring with automatic history management +- **SessionManager**: Isolated conversation contexts with persistence support +- **AgentEvent**: Comprehensive event system for real-time monitoring and integration ## Usage Example -### Define Custom Tools +### Quick Start ```typescript -import { BaseTool, ToolResult } from '@gemini-tool/agent'; -import { Type } from '@google/genai'; - -// Define a weather 
tool -export class WeatherTool extends BaseTool<{ latitude: number; longitude: number }> { +import { StandardAgent, AgentEventType, AllConfig } from '@continue-reasoning/mini-agent'; +import { BaseTool, DefaultToolResult, Type } from '@continue-reasoning/mini-agent'; + +// 1. Create a custom tool using BaseTool +export class WeatherTool extends BaseTool< + { latitude: number; longitude: number }, + { temperature: number; location: string } +> { constructor() { super( - 'get_weather', // Tool name - 'Weather Tool', // Display name - 'Get current weather temperature', // Description + 'get_weather', // Tool name + 'Weather Tool', // Display name + 'Get current weather temperature', // Description { type: Type.OBJECT, properties: { - latitude: { - type: Type.NUMBER, - description: 'Latitude coordinate' - }, - longitude: { - type: Type.NUMBER, - description: 'Longitude coordinate' - } + latitude: { type: Type.NUMBER, description: 'Latitude coordinate' }, + longitude: { type: Type.NUMBER, description: 'Longitude coordinate' } }, required: ['latitude', 'longitude'] }, - false, // isOutputMarkdown - true // canUpdateOutput + true, // isOutputMarkdown + true // canUpdateOutput for real-time updates ); } - validateToolParams(params: { latitude: number; longitude: number }): string | null { + override validateToolParams(params: { latitude: number; longitude: number }): string | null { if (params.latitude < -90 || params.latitude > 90) { return 'Latitude must be between -90 and 90'; } + if (params.longitude < -180 || params.longitude > 180) { + return 'Longitude must be between -180 and 180'; + } return null; } async execute( params: { latitude: number; longitude: number }, - abortSignal: AbortSignal, - outputUpdateHandler?: (output: string) => void - ): Promise { - // Fetch weather data... 
- const temperature = await this.fetchWeatherData(params.latitude, params.longitude); - - return this.createResult( - `Weather: ${temperature}°C`, // LLM content - `🌤️ Temperature: ${temperature}°C`, // Display content - `Retrieved weather: ${temperature}°C` // Summary - ); + signal: AbortSignal, + updateOutput?: (output: string) => void + ): Promise> { + try { + // Check for cancellation + this.checkAbortSignal(signal, 'Weather fetch'); + + // Update progress in real-time + if (updateOutput) { + updateOutput(this.formatProgress('Fetching weather', 'Connecting to API...', '🌤️')); + } + + // Simulate API call + const temperature = Math.round(Math.random() * 35 + 5); // 5-40°C + + if (updateOutput) { + updateOutput(this.formatProgress('Weather retrieved', `${temperature}°C`, '✅')); + } + + const result = { temperature, location: `${params.latitude},${params.longitude}` }; + + return new DefaultToolResult(this.createResult( + `Weather: ${temperature}°C at coordinates ${params.latitude}, ${params.longitude}`, + `🌤️ Temperature: **${temperature}°C**`, + `Retrieved weather: ${temperature}°C` + )); + + } catch (error) { + return new DefaultToolResult(this.createErrorResult(error, 'Weather fetch')); + } } } -``` - -### Use Agent with Tools - -```typescript -import { StandardAgent, AgentEventType, AllConfig } from '@gemini-tool/agent'; -// Configure agent with tool execution callbacks +// 2. 
Configure and create the agent const config: AllConfig = { agentConfig: { - model: 'gemini-2.0-flash', + model: 'gpt-4o', // or 'gemini-2.0-flash' workingDirectory: process.cwd(), - apiKey: process.env.GEMINI_API_KEY, - sessionId: 'demo-session', + apiKey: process.env.OPENAI_API_KEY, // or GEMINI_API_KEY maxHistoryTokens: 100000, }, chatConfig: { - apiKey: process.env.GEMINI_API_KEY, - modelName: 'gemini-2.0-flash', - tokenLimit: 100000, - systemPrompt: 'You are a helpful assistant with weather and calculation tools.', + apiKey: process.env.OPENAI_API_KEY, // or GEMINI_API_KEY + modelName: 'gpt-4o', // or 'gemini-2.0-flash' + tokenLimit: 128000, + systemPrompt: 'You are a helpful assistant with weather capabilities.', }, toolSchedulerConfig: { - approvalMode: 'yolo', // Auto-approve for demo - - // Optional: Subscribe to tool execution events - onToolCallsUpdate: (toolCalls) => { - // Called whenever tool state changes - toolCalls.forEach(call => { - console.log(`[${call.request.name}] Status: ${call.status}`); - }); - }, - - outputUpdateHandler: (callId, output) => { - // Called for real-time tool output - console.log(`Tool output: ${output}`); - }, - - onAllToolCallsComplete: (completedCalls) => { - // Called when all tools finish - console.log(`Completed ${completedCalls.length} tool calls`); - } + approvalMode: 'yolo', // Auto-approve for demo }, }; -// Create agent with tools -const agent = new StandardAgent( - [new WeatherTool(), new SubtractionTool()], - config -); - -// Process user input with streaming and event handling -const userInput = 'Get weather for Beijing and Shanghai, then calculate the temperature difference'; -const sessionId = 'demo-session'; -const abortController = new AbortController(); - -// Set a timeout for the operation -setTimeout(() => { - console.log('⏰ Timeout reached, aborting...'); - abortController.abort(); -}, 30000); +// 3. 
Create agent with tools +const agent = new StandardAgent([new WeatherTool()], config); -console.log(`👤 User: ${userInput}`); -console.log('🤖 Assistant: '); +// 4. Process user input with streaming +const userInput = 'What is the weather like in Tokyo (latitude: 35.6762, longitude: 139.6503)?'; -for await (const event of agent.process(userInput, sessionId, abortController.signal)) { +for await (const event of agent.processWithSession(userInput)) { switch (event.type) { - case AgentEventType.AssistantMessage: - // Complete assistant response - console.log('🤖 Assistant Response:', event.data); + case AgentEventType.ResponseChunkTextDelta: + // Real-time text streaming + process.stdout.write(event.data.content.text_delta); break; - case AgentEventType.UserMessage: - // User message processed - console.log('👤 User message:', event.data); + case AgentEventType.ToolExecutionStart: + console.log(`🔧 Executing: ${event.data.toolName}`); break; - case AgentEventType.TurnComplete: - // Conversation turn completed - console.log('🔄 Turn complete:', event.data); + case AgentEventType.ToolExecutionDone: + console.log(`✅ Completed: ${event.data.toolName}`); break; - case AgentEventType.ToolCallRequest: - // Tool execution requested - console.log(`🔧 Tool requested: ${event.data.toolCall.name}`); - console.log(` Args: ${JSON.stringify(event.data.toolCall.args)}`); - break; - - case AgentEventType.ToolCallResponse: - // Tool execution completed - console.log(`🛠️ Tool response: ${event.data}`); - break; - - case AgentEventType.TokenUsage: - // Token usage update - console.log(`📊 Token usage: ${event.data.usage.totalTokens} tokens`); - break; - - case AgentEventType.Error: - // Error occurred - console.error(`❌ Error: ${event.data.message}`); + case AgentEventType.ResponseComplete: + console.log('\n✨ Response complete'); break; } } - -// Get final status after processing -const status = agent.getStatus(); -console.log('\n📊 Final Status:'); -console.log(` • Processing: ${status.isRunning 
? 'Yes' : 'No'}`); -console.log(` • Tokens used: ${status.tokenUsage.totalTokens}`); -console.log(` • Usage: ${status.tokenUsage.usagePercentage.toFixed(2)}%`); - -// Get detailed token usage -const tokenUsage = agent.getTokenUsage(); -console.log('\n📈 Token Usage Summary:'); -console.log(` • Input tokens: ${tokenUsage.inputTokens}`); -console.log(` • Output tokens: ${tokenUsage.outputTokens}`); -console.log(` • Total tokens: ${tokenUsage.totalTokens}`); ``` -## Event System +### Multi-Session Management -The agent emits various events during processing that you can handle: +```typescript +import { StandardAgent } from '@continue-reasoning/mini-agent'; -### Event Types +// Create agent with session management +const agent = new StandardAgent(tools, config); -| Event Type | Description | When Emitted | -|------------|-------------|--------------| -| `AssistantMessage` | Complete assistant response | When the assistant finishes responding | -| `UserMessage` | User message processed | When user input is processed | -| `TurnComplete` | Conversation turn finished | When a complete turn (user + assistant) is done | -| `ToolCallRequest` | Tool execution started | When a tool is requested for execution | -| `ToolCallResponse` | Tool execution finished | When a tool completes execution | -| `TokenUsage` | Token usage update | Periodically during processing | -| `Error` | Error occurred | When an error happens during processing | +// Create multiple conversation sessions +const session1 = agent.createNewSession('Weather Analysis'); +const session2 = agent.createNewSession('Math Calculations'); -### Event Handling Best Practices +// Use different sessions for different conversations +for await (const event of agent.processWithSession('What is the weather in Tokyo?', session1)) { + // Handle weather conversation in session1 +} -1. **Handle all event types** to ensure robust error handling -2. **Use AbortController** for timeout and cancellation control -3. 
**Monitor token usage** to avoid hitting limits -4. **Check status after processing** for final statistics +for await (const event of agent.processWithSession('Calculate 15 + 27', session2)) { + // Handle math conversation in session2 +} -### Complete Event Flow Example +// Switch between sessions +agent.switchToSession(session1); +for await (const event of agent.processWithSession('How about the weather in London?', session1)) { + // Continue weather conversation in session1 +} + +// Get session information +const sessions = agent.getSessions(); +sessions.forEach(session => { + console.log(`Session: ${session.title}`); + console.log(`Messages: ${session.messageHistory.length}`); + console.log(`Tokens: ${session.tokenUsage.totalTokens}`); +}); +``` + +## Event System + +MiniAgent provides a comprehensive event system with 20+ event types for real-time monitoring: + +### Core Event Types + +| Event Type | Description | Data Structure | +|------------|-------------|----------------| +| **LLM Response Events** | +| `ResponseChunkTextDelta` | Real-time text streaming | `{ content: { text_delta: string } }` | +| `ResponseChunkTextDone` | Text complete | `{ content: { text: string } }` | +| `ResponseChunkThinkingDelta` | Thinking process streaming | `{ content: { thinking_delta: string } }` | +| `ResponseChunkThinkingDone` | Thinking process complete | `{ content: { thinking: string } }` | +| `ResponseChunkFunctionCallDelta` | Function call parameters streaming | `{ content: { functionCall: { name: string, args: string } } }` | +| `ResponseChunkFunctionCallDone` | Function call parameters complete | `{ content: { functionCall: { id: string, call_id: string, name: string, args: string } } }` | +| `ResponseComplete` | Response finished | `{ response_id: string, usage: TokenUsage }` | +| **Tool Execution Events** | +| `ToolExecutionStart` | Tool begins execution | `{ toolName: string, callId: string, args: Record, sessionId: string, turn: number }` | +| `ToolExecutionDone` 
| Tool completes | `{ toolName: string, callId: string, result?: unknown, error?: string, duration?: number, sessionId: string, turn: number }` | +| **Session Events** | +| `UserMessage` | User input processed | `{ type: string, content: string, sessionId: string, turn: number, metadata?: any }` | +| `TurnComplete` | Conversation turn done | `{ type: string, sessionId: string, turn: number, hasToolCalls: boolean }` | +| **Error Events** | +| `Error` | General errors | `{ message: string, timestamp: number, turn: number }` | +| `ResponseFailed` | LLM response failed | `{ response_id: string, error: { code?: string, message?: string } }` | +| `ResponseIncomplete` | Response not complete | `{ response_id: string, incomplete_details: { reason: string } }` | + +### Event Handling Best Practices + +1. **Choose between Delta and Done events** - Don't handle both `*Delta` and `*Done` events for the same content type + - **Delta events** (`ResponseChunkTextDelta`, `ResponseChunkThinkingDelta`, `ResponseChunkFunctionCallDelta`) - Use only when real-time streaming UX is critical + - **Done events** (`ResponseChunkTextDone`, `ResponseChunkThinkingDone`, `ResponseChunkFunctionCallDone`) - **Recommended for most cases** - contains complete content + - Done events contain the full aggregated content from all corresponding delta events + +2. **Recommended event handling pattern**: + ```typescript + // ✅ Good - Handle complete content + case AgentEventType.ResponseChunkTextDone: + console.log('Complete response:', event.data.content.text); + break; + + // ❌ Avoid - Don't handle both delta and done + case AgentEventType.ResponseChunkTextDelta: + // Only use when real-time streaming is essential + case AgentEventType.ResponseChunkTextDone: + // Don't handle both - creates duplicate content + ``` + +3. 
**Tool execution monitoring** - **Recommended**: Use `ToolExecutionStart/Done` events for tool tracking + - `ToolExecutionStart/Done` - **Best practice** - High-level tool execution lifecycle + - `ResponseChunkFunctionCallDelta/Done` - **Only when needed** - Low-level function call parameter streaming + - Function call events show LLM preparing tool calls, execution events show actual tool running + +4. **Use AbortController** - Implement timeout and cancellation control +5. **Error handling** - Listen for `Error` and `ResponseFailed` events +6. **Token monitoring** - Track usage with `ResponseComplete` event data + +### Complete Event Flow Examples + +#### Recommended Pattern (Using Done Events) ```typescript -// Set up proper error handling and timeouts +import { AgentEventType } from '@continue-reasoning/mini-agent'; + +// ✅ Recommended - Handle complete content with Done events const abortController = new AbortController(); -setTimeout(() => abortController.abort(), 30000); // 30 second timeout +setTimeout(() => abortController.abort(), 30000); try { - for await (const event of agent.process(userInput, sessionId, abortController.signal)) { + for await (const event of agent.processWithSession(userInput, sessionId, abortController.signal)) { switch (event.type) { case AgentEventType.UserMessage: - // Log user input processing - console.log('Processing:', event.data); + // ✅ Type-safe access to user message data + const userMsgData = event.data as { + content: string; + sessionId: string; + turn: number; + type: string; + }; + console.log(`👤 Processing user input (Turn ${userMsgData.turn}): ${userMsgData.content}`); + break; + + case AgentEventType.ResponseChunkTextDone: + // ✅ Get complete text content - recommended approach + const textData = event.data as { content: { text: string } }; + console.log('🤖 Assistant:', textData.content.text); break; - case AgentEventType.ToolCallRequest: - // Show tool being executed - console.log(`Executing: 
${event.data.toolCall.name}`); + case AgentEventType.ResponseChunkThinkingDone: + // ✅ Get complete thinking process if needed + const thinkingData = event.data as { content: { thinking: string } }; + console.log('🧠 Reasoning:', thinkingData.content.thinking); break; - case AgentEventType.AssistantMessage: - // Display final response - console.log('Response:', event.data); + case AgentEventType.ResponseChunkFunctionCallDone: + // ✅ Get complete function call parameters - useful for debugging + const funcCallData = event.data as { + content: { + functionCall: { + id: string; + call_id: string; + name: string; + args: string; + } + } + }; + const funcCall = funcCallData.content.functionCall; + console.log(`🔧 LLM prepared tool: ${funcCall.name} with args: ${funcCall.args}`); + break; + + case AgentEventType.ToolExecutionStart: + // ✅ Recommended - Track actual tool execution with full context + const toolStartData = event.data as { + toolName: string; + callId: string; + args: Record; + sessionId: string; + turn: number; + }; + console.log(`⚙️ Tool executing: ${toolStartData.toolName} (Turn ${toolStartData.turn})`); + console.log(` Args:`, toolStartData.args); + break; + + case AgentEventType.ToolExecutionDone: + // ✅ Recommended - Track tool completion with full context + const toolDoneData = event.data as { + toolName: string; + callId: string; + result?: unknown; + error?: string; + duration?: number; + sessionId: string; + turn: number; + }; + if (toolDoneData.error) { + console.log(`❌ Tool failed: ${toolDoneData.toolName} - ${toolDoneData.error}`); + } else { + console.log(`✅ Tool completed: ${toolDoneData.toolName} (${toolDoneData.duration}ms)`); + } + break; + + case AgentEventType.ResponseComplete: + // ✅ Access complete response with token usage + const completeData = event.data as { + response_id: string; + usage?: { + inputTokens: number; + outputTokens: number; + totalTokens: number; + } + }; + console.log(`📊 Response complete - Tokens: 
${completeData.usage?.totalTokens || 0}`); break; case AgentEventType.TurnComplete: - // Turn finished, ready for next input - console.log('Turn completed'); + // ✅ Turn completion with context + const turnData = event.data as { + type: string; + sessionId: string; + turn: number; + hasToolCalls: boolean; + }; + console.log(`🔄 Turn ${turnData.turn} completed ${turnData.hasToolCalls ? 'with' : 'without'} tool calls`); break; case AgentEventType.Error: - console.error('Error:', event.data.message); + // ✅ Structured error handling + const errorData = event.data as { + message: string; + timestamp: number; + turn: number; + }; + console.error(`❌ Error (Turn ${errorData.turn}): ${errorData.message}`); + break; + + case AgentEventType.ResponseFailed: + // ✅ Handle response failures + const failedData = event.data as { + response_id: string; + error: { code?: string; message?: string }; + }; + console.error(`❌ Response failed: ${failedData.error.message || 'Unknown error'}`); break; } } } catch (error) { if (abortController.signal.aborted) { - console.log('Operation timed out'); + console.log('⏰ Operation timed out'); } else { - console.error('Unexpected error:', error); + console.error('💥 Unexpected error:', error); } } ``` -## Tool Execution Callbacks - -The agent framework provides three callbacks to monitor tool execution: - -### 1. `onToolCallsUpdate` - State Change Notifications -Called whenever any tool changes state (validating → scheduled → executing → success/error): +#### Real-Time Streaming Pattern (When UX is Critical) ```typescript -onToolCallsUpdate: (toolCalls: IToolCall[]) => { - toolCalls.forEach(call => { - if (call.status === 'awaiting_approval') { - // Handle tool confirmation UI - showConfirmationDialog(call); - } - }); -} -``` +import { AgentEventType } from '@continue-reasoning/mini-agent'; -### 2. 
`outputUpdateHandler` - Real-time Output -Called during tool execution for streaming output: +// 🎯 Only use when real-time streaming UX is essential +let assistantResponse = ''; -```typescript -outputUpdateHandler: (callId: string, output: string) => { - // Stream output to UI or logs - appendToToolOutput(callId, output); +for await (const event of agent.processWithSession(userInput)) { + switch (event.type) { + case AgentEventType.ResponseChunkTextDelta: + // ⚡ Real-time character-by-character streaming + const deltaData = event.data as { content: { text_delta: string } }; + const delta = deltaData.content.text_delta; + process.stdout.write(delta); + assistantResponse += delta; + break; + + case AgentEventType.ResponseChunkFunctionCallDelta: + // 🎯 Real-time function call parameter streaming (if needed for UX) + const funcDeltaData = event.data as { + content: { functionCall: { name: string; args: string } } + }; + const funcDelta = funcDeltaData.content.functionCall; + console.log(`\n🔧 LLM preparing: ${funcDelta.name}...`); + break; + + // ❌ Don't handle corresponding Done events when using Delta events + // case AgentEventType.ResponseChunkTextDone: + // case AgentEventType.ResponseChunkFunctionCallDone: + // // These would duplicate content from delta events + + case AgentEventType.ToolExecutionStart: + // ✅ Type-safe tool execution tracking + const toolStartData = event.data as { + toolName: string; + callId: string; + args: Record; + sessionId: string; + turn: number; + }; + console.log(`\n⚙️ Tool executing: ${toolStartData.toolName} (Turn ${toolStartData.turn})`); + break; + + case AgentEventType.ToolExecutionDone: + // ✅ Type-safe tool completion tracking + const toolDoneData = event.data as { + toolName: string; + callId: string; + result?: unknown; + error?: string; + duration?: number; + sessionId: string; + turn: number; + }; + const status = toolDoneData.error ? 
'❌ Failed' : '✅ Completed'; + console.log(`\n${status}: ${toolDoneData.toolName} (${toolDoneData.duration || 0}ms)`); + break; + + case AgentEventType.ResponseComplete: + // ✅ Final response with token usage + const completeData = event.data as { + response_id: string; + usage?: { totalTokens: number } + }; + console.log(`\n📊 Final response complete (${completeData.usage?.totalTokens || 0} tokens)`); + console.log(`Full response: ${assistantResponse}`); + break; + + case AgentEventType.Error: + // ✅ Handle streaming errors + const errorData = event.data as { + message: string; + timestamp: number; + turn: number; + }; + console.error(`\n❌ Streaming error (Turn ${errorData.turn}): ${errorData.message}`); + break; + } } ``` -### 3. `onAllToolCallsComplete` - Completion Handler -Called once when all tools finish execution: +## Tool Execution System -```typescript -onAllToolCallsComplete: (completedCalls: ICompletedToolCall[]) => { - // Show execution summary - const summary = completedCalls.map(tc => - `${tc.request.name}: ${tc.status} (${tc.durationMs}ms)` - ).join('\n'); - console.log(summary); -} -``` +The framework provides sophisticated tool execution with approval workflows and real-time monitoring: -### Handling Tool Confirmations +### Tool Scheduler Callbacks -When `approvalMode` is not 'yolo', tools may require user confirmation: +Configure callbacks to monitor tool execution lifecycle: ```typescript const config: AllConfig = { - // ... other config toolSchedulerConfig: { - approvalMode: 'default', // Requires confirmation for destructive operations + approvalMode: 'default', // 'yolo' | 'always' | 'default' - onToolCallsUpdate: async (toolCalls) => { - const waitingTools = toolCalls.filter( - tc => tc.status === 'awaiting_approval' - ); - - for (const tool of waitingTools) { - const approved = await showConfirmationUI(tool); + // 1. 
Real-time tool output streaming + outputUpdateHandler: (callId: string, output: string) => { + console.log(`[${callId}] ${output}`); + // Stream to UI in real-time + }, + + // 2. Tool state change notifications + onToolCallsUpdate: (toolCalls: IToolCall[]) => { + toolCalls.forEach(call => { + console.log(`Tool ${call.request.name}: ${call.status}`); - // Respond to the agent - agent.toolScheduler.handleConfirmationResponse( - tool.request.callId, - approved ? ToolConfirmationOutcome.ProceedOnce : ToolConfirmationOutcome.Cancel - ); - } + if (call.status === 'awaiting_approval') { + // Handle confirmation UI + handleToolConfirmation(call); + } + }); + }, + + // 3. Batch completion handler + onAllToolCallsComplete: (completed: ICompletedToolCall[]) => { + const successful = completed.filter(tc => tc.status === 'success').length; + const failed = completed.filter(tc => tc.status === 'error').length; + console.log(`Batch complete: ${successful} successful, ${failed} failed`); } } }; ``` -### Complete Example with Tool Monitoring +### Tool Approval Modes + +```typescript +// Auto-approve all tools (good for demos) +approvalMode: 'yolo' + +// Always require user confirmation +approvalMode: 'always' + +// Let each tool decide (based on tool.shouldConfirmExecute) +approvalMode: 'default' +``` + +### Tool Confirmation Workflow ```typescript -class ToolMonitor { - private toolStates = new Map(); +import { ToolConfirmationOutcome } from '@continue-reasoning/mini-agent'; + +async function handleToolConfirmation(toolCall: IWaitingToolCall) { + const { confirmationDetails } = toolCall; - createConfig(): AllConfig { - return { - // ... 
agent and chat config - toolSchedulerConfig: { - onToolCallsUpdate: (toolCalls) => { - toolCalls.forEach(call => { - const prev = this.toolStates.get(call.request.callId); - if (prev !== call.status) { - console.log(`[${call.request.name}] ${prev || 'new'} → ${call.status}`); - this.toolStates.set(call.request.callId, call.status); - } - }); - }, - - outputUpdateHandler: (callId, output) => { - console.log(`[Output] ${output}`); - }, - - onAllToolCallsComplete: (completed) => { - console.log(`\nExecution Summary:`); - completed.forEach(tc => { - console.log(`- ${tc.request.name}: ${tc.status} (${tc.durationMs}ms)`); - }); - this.toolStates.clear(); - } - } - }; + // Show confirmation UI based on tool type + const userChoice = await showConfirmationDialog({ + title: confirmationDetails.title, + message: confirmationDetails.prompt, + toolName: toolCall.request.name, + args: toolCall.request.args + }); + + // Send response back to scheduler + await agent.getToolScheduler().handleConfirmationResponse( + toolCall.request.callId, + userChoice ? ToolConfirmationOutcome.ProceedOnce : ToolConfirmationOutcome.Cancel + ); +} +``` + +## Getting Started + +### Installation + +```bash +# Install MiniAgent +pnpm install @continue-reasoning/mini-agent + +# Set up environment variables +echo "OPENAI_API_KEY=your_openai_key" >> .env +echo "GEMINI_API_KEY=your_gemini_key" >> .env +``` + +### Basic Example + +```typescript +import { StandardAgent, BaseTool, DefaultToolResult, AllConfig } from '@continue-reasoning/mini-agent'; + +// 1. Create a simple tool +class GreetingTool extends BaseTool<{ name: string }, { greeting: string }> { + constructor() { + super('greet', 'Greeting Tool', 'Generate a personalized greeting', { + type: 'object', + properties: { name: { type: 'string', description: 'Name to greet' } }, + required: ['name'] + }); + } + + async execute(params: { name: string }, signal: AbortSignal) { + return new DefaultToolResult(this.createResult( + `Hello, ${params.name}! 
Nice to meet you.`, + `👋 Hello, **${params.name}**!`, + `Greeted ${params.name}` + )); } } -const monitor = new ToolMonitor(); -const agent = new StandardAgent(tools, monitor.createConfig()); +// 2. Configure agent +const config: AllConfig = { + agentConfig: { + model: 'gpt-4o', + workingDirectory: process.cwd(), + apiKey: process.env.OPENAI_API_KEY + }, + chatConfig: { + apiKey: process.env.OPENAI_API_KEY, + modelName: 'gpt-4o', + tokenLimit: 128000, + systemPrompt: 'You are a helpful assistant with greeting capabilities.' + }, + toolSchedulerConfig: { approvalMode: 'yolo' } +}; + +// 3. Create and use agent +const agent = new StandardAgent([new GreetingTool()], config); + +for await (const event of agent.processWithSession('Please greet Alice')) { + if (event.type === 'response.chunk.text.delta') { + process.stdout.write(event.data.content.text_delta); + } +} +``` + +### Examples + +- **[Basic Example](./examples/basicExample.ts)** - Simple agent setup with weather tools +- **[Session Management](./examples/sessionManagerExample.ts)** - Multi-session conversation handling +- **[Tool Creation](./examples/tools.ts)** - Custom tool implementation examples +- **[Provider Comparison](./examples/comparison.ts)** - OpenAI vs Gemini comparison + +## Documentation + +- **[Integration Guide](./docs/prompts/integration-dev.md)** - Complete integration guide for coding agents +- **[API Reference](./docs/api-reference.md)** - Detailed API documentation +- **[Tool Development](./docs/tool-development.md)** - Guide for creating custom tools +- **[Architecture Overview](./docs/architecture.md)** - Framework design and principles + +## Contributing + +```bash +# Clone and setup +git clone https://github.com/your-org/miniagent.git +cd miniagent +pnpm install + +# Build and test +pnpm build +pnpm test +pnpm lint + +# Run examples +pnpm example:basic +pnpm example:session ``` -For more detailed documentation on tool callbacks, see 
[agent_subscribe_tools.md](./docs/agent_subscribe_tools.md). +## License + +MIT License - see [LICENSE](./LICENSE) for details. -## Directory Structure +## Framework Structure ``` src/ -├── interfaces.ts # Core interface definitions -├── baseAgent.ts # Base agent implementation -├── geminiAgent.ts # Gemini-specific agent -├── geminiChat.ts # Gemini chat provider -├── tokenTracker.ts # Token usage tracking -├── coreToolScheduler.ts # Tool execution scheduler -├── logger.ts # Logging system -├── index.ts # Public API exports -├── tools/ # Built-in tools -│ └── calculator.ts # Calculator tool example -└── test/ # Test files - ├── setup.ts # Test configuration - └── *.test.ts # Unit tests +├── interfaces.ts # Core TypeScript interfaces +├── baseAgent.ts # Core agent implementation +├── standardAgent.ts # Session-aware agent +├── sessionManager.ts # Multi-session management +├── chat/ +│ ├── geminiChat.ts # Google Gemini provider +│ └── openaiChat.ts # OpenAI provider +├── coreToolScheduler.ts # Tool execution engine +├── baseTool.ts # Tool base class +├── tokenTracker.ts # Token usage monitoring +├── agentEvent.ts # Event system +└── examples/ # Working examples + ├── basicExample.ts # Simple agent usage + ├── sessionManagerExample.ts # Multi-session demo + └── tools.ts # Tool implementation examples ``` diff --git a/docs/prompts/agent-dev.md b/docs/prompts/agent-dev.md new file mode 100644 index 0000000..56cda4b --- /dev/null +++ b/docs/prompts/agent-dev.md @@ -0,0 +1,986 @@ +# MiniAgent Framework Development Guide + +> You are a MiniAgent framework developer responsible for developing, extending, and maintaining the core MiniAgent framework. This document focuses on the internal architecture, development patterns, and core system implementation rather than integration usage. + +## Table of Contents + +1. [Framework Architecture Overview](#framework-architecture-overview) +2. [Core Components Deep Dive](#core-components-deep-dive) +3. 
[Event System Architecture](#event-system-architecture) +4. [Tool Definition and Execution Pipeline](#tool-definition-and-execution-pipeline) +5. [Chat History Management System](#chat-history-management-system) +6. [Agent Lifecycle and State Management](#agent-lifecycle-and-state-management) +7. [Streaming Response Architecture](#streaming-response-architecture) +8. [Extension Points and Plugin System](#extension-points-and-plugin-system) +9. [Development Workflow and Best Practices](#development-workflow-and-best-practices) + +## Framework Architecture Overview + +### Core Design Principles + +MiniAgent follows these fundamental architectural principles: + +1. **Interface-Driven Design**: All major components implement TypeScript interfaces for maximum flexibility +2. **Event-Driven Architecture**: Comprehensive event system for real-time monitoring and debugging +3. **Streaming-First Approach**: All responses are streaming by default, non-streaming is implemented as stream collection +4. **Platform Agnostic**: Framework adapts providers to our interfaces, not the other way around +5. 
**Composable Architecture**: Components can be mixed and matched for different use cases + +### High-Level Component Interaction + +```typescript +// Core Framework Flow: +// UserInput -> BaseAgent -> IChat -> LLMResponse Stream -> AgentEvent Stream +// \-> IToolScheduler -> Tool Execution -> AgentEvent Stream + +// Key Files: +// src/interfaces.ts - All TypeScript interfaces and types +// src/baseAgent.ts - Core agent implementation +// src/agentEvent.ts - Event system implementation +// src/baseTool.ts - Tool system foundation +// src/coreToolScheduler.ts - Tool execution orchestration +// src/chat/interfaces.ts - Chat system abstractions +// src/standardAgent.ts - Session management layer +``` + +### Interface Hierarchy + +```typescript +// Core Interfaces defined in src/interfaces.ts +interface IAgent { + // Main processing pipeline + process(messages, sessionId, signal): AsyncGenerator + processOneTurn(sessionId, messages, signal): AsyncGenerator + + // Component access + getChat(): IChat + getToolScheduler(): IToolScheduler + getTokenUsage(): ITokenUsage + + // Configuration + setSystemPrompt(prompt: string): void + clearHistory(): void +} + +interface IChat { + // Core streaming method + sendMessageStream(messages, promptId, tools): Promise> + + // History management + getHistory(): MessageItem[] + addHistory(message: MessageItem): void + setHistory(history: MessageItem[]): void + + // Provider conversion + convertToProviderMessage(message: MessageItem): T + convertFromChunkItems(chunk: ChunkItem, role): MessageItem +} + +interface IToolScheduler { + // Tool execution lifecycle + schedule(requests, signal, callbacks): Promise + handleConfirmationResponse(callId, outcome, payload): Promise + + // Tool management + registerTool(tool: ITool): void + getTool(name: string): ITool | undefined + getCurrentToolCalls(): IToolCall[] +} +``` + +## Core Components Deep Dive + +### BaseAgent Implementation (`src/baseAgent.ts`) + +The `BaseAgent` class is the orchestrator 
that connects all framework components: + +```typescript +// Core processing flow in BaseAgent +export abstract class BaseAgent implements IAgent { + private eventHandlers: Map = new Map(); + private currentTurn = 0; + private isRunning = false; + + constructor( + protected agentConfig: IAgentConfig, + protected chat: IChat, + protected toolScheduler: IToolScheduler, + registry?: SubAgentRegistry + ) { + // Initialize components and setup event handlers + } + + // Main processing pipeline + async *process(userMessages, sessionId, abortSignal): AsyncGenerator { + // 1. Convert user messages to MessageItems + // 2. Process turns with tool execution in loop + // 3. Handle streaming responses and tool calls + // 4. Emit events throughout the process + } + + // Single turn processing with tool execution + private async *processOneTurnWithHistory(sessionId, chatMessages, abortSignal) { + // 1. Get tool declarations from scheduler + // 2. Send messages to chat for LLM processing + // 3. Process streaming response events + // 4. Extract and schedule tool calls asynchronously + // 5. Wait for all tool calls to complete + // 6. Add results to chat history + // 7. Emit turn completion event + } +} +``` + +**Key Responsibilities:** +- Orchestrate conversation flow between IChat and IToolScheduler +- Manage conversation turns and tool execution loops +- Convert LLM responses to AgentEvents via `createAgentEventFromLLMResponse` +- Handle asynchronous tool execution without blocking LLM streaming +- Maintain conversation history through chat integration + +**Critical Implementation Details:** + +1. **Non-blocking Tool Execution**: Tools are scheduled asynchronously and don't block LLM response streaming +2. **History Integration**: Tool calls and responses are properly added to chat history for LLM context +3. **Turn Management**: Multiple turns can occur if tools are executed (up to 10 turns) +4. 
**Event Forwarding**: LLM events are directly forwarded as AgentEvents maintaining consistency + +### StandardAgent Extension (`src/standardAgent.ts`) + +`StandardAgent` extends `BaseAgent` with session management capabilities: + +```typescript +export class StandardAgent extends BaseAgent implements IStandardAgent { + sessionManager: InternalSessionManager; + private mcpManager?: McpManager; + + constructor(tools: ITool[], config: AllConfig, registry?: SubAgentRegistry) { + // 1. Select chat provider (OpenAI or Gemini) + // 2. Create tool scheduler with tools + // 3. Initialize base agent + // 4. Setup session manager + // 5. Initialize MCP if configured + } + + // Session-aware processing + async *processWithSession(userInput, sessionId?, abortSignal?) { + // 1. Handle session switching if needed + // 2. Convert input to BaseAgent format + // 3. Delegate to BaseAgent.process() + } + + // MCP Integration + async addMcpServer(config: McpServerConfig): Promise { + // 1. Connect to MCP server + // 2. Convert MCP tools to ITool implementations + // 3. Register tools with agent + // 4. Handle naming conflicts + } +} +``` + +### Chat System Architecture (`src/chat/interfaces.ts`) + +The chat system provides a unified interface for different LLM providers: + +```typescript +// Universal content representation +interface ContentPart { + type: 'text' | 'thinking' | 'function_call' | 'function_response' | ... 
+ text?: string; + thinking?: string; + functionCall?: { id?, call_id, name, args } + functionResponse?: { id?, call_id, name, result } +} + +// Streaming response events +type LLMResponse = + | LLMStart + | LLMChunkTextDelta | LLMChunkTextDone + | LLMChunkThinking + | LLMFunctionCallDelta | LLMFunctionCallDone + | LLMComplete; + +// Provider implementations +interface IChat { + sendMessageStream(messages, promptId, tools): AsyncGenerator + convertToProviderMessage(message: MessageItem): T // Provider-specific format + convertFromChunkItems(chunk: ChunkItem, role): MessageItem +} +``` + +**Key Features:** +- **Universal Content Format**: Single `ContentPart` interface handles all content types +- **Streaming Events**: Comprehensive event types for all LLM response patterns +- **Provider Abstraction**: Providers implement our interfaces, not vice versa +- **History Management**: Consistent history format across providers + +## Event System Architecture + +### Event Types and Hierarchy (`src/interfaces.ts`) + +```typescript +enum AgentEventType { + // User interaction events + UserMessage = 'user.message', + UserCancelled = 'user.cancelled', + + // LLM Response events (forwarded from IChat) + ResponseStart = 'response.start', + ResponseChunkTextDelta = 'response.chunk.text.delta', + ResponseChunkTextDone = 'response.chunk.text.done', + ResponseChunkThinkingDelta = 'response.chunk.thinking.delta', + ResponseChunkThinkingDone = 'response.chunk.thinking.done', + ResponseChunkFunctionCallDelta = 'response.chunk.function_call.delta', + ResponseChunkFunctionCallDone = 'response.chunk.function_call.done', + ResponseComplete = 'response.complete', + ResponseIncomplete = 'response.incomplete', + ResponseFailed = 'response.failed', + + // Tool execution events (Agent-specific) + ToolExecutionStart = 'tool.call.execution.start', + ToolExecutionDone = 'tool.call.execution.done', + ToolConfirmation = 'tool.confirmation', + + // Agent-level events + TurnComplete = 'turn.complete', 
+ Error = 'agent.error', + ModelFallback = 'agent.model_fallback', +} +``` + +### Event Flow Architecture + +```typescript +// Event Creation and Forwarding Pattern (src/interfaces.ts:405) +function createAgentEventFromLLMResponse( + llmResponse: LLMResponse, + sessionId: string, + turn: number, +): LLMResponseAgentEvent { + // Map LLMResponse types to AgentEventTypes + // Preserve original data while adding agent metadata + // Maintain event stream consistency between IChat and IAgent +} + +// Event Processing in BaseAgent (src/baseAgent.ts:415) +for await (const llmResponse of responseStream) { + // Forward LLM events directly as Agent events + yield createAgentEventFromLLMResponse(llmResponse, sessionId, this.currentTurn); + + // Handle specific response types for tool extraction + if (llmResponse.type === 'response.chunk.function_call.done') { + // Extract tool call and schedule execution + // Add assistant message to history + // Schedule tool asynchronously + } +} +``` + +### Event System Utilities (`src/agentEvent.ts`) + +```typescript +export class AgentEventFactory { + createEvent(type: AgentEventType, data?, customMetadata?): AgentEvent + // Type-safe event creation with consistent metadata +} + +export class AgentEventEmitter { + on(id: string, handler: EventHandler, eventTypes?: AgentEventType[]): void + emit(event: AgentEvent): void + // Error-safe event emission with filtering +} + +export class AgentEventUtils { + static isUserMessageEvent(event: AgentEvent): boolean + static isToolExecutionEvent(event: AgentEvent): boolean + static extractLLMContent(event: AgentEvent): string | null + static formatForLogging(event: AgentEvent): string + // Utility methods for event classification and processing +} +``` + +## Tool Definition and Execution Pipeline + +### Tool Interface Design (`src/interfaces.ts`) + +```typescript +interface ITool { + name: string; + description: string; + schema: ToolDeclaration; + isOutputMarkdown: boolean; + canUpdateOutput: 
boolean; + + // Validation and description + validateToolParams(params: TParams): string | null; + getDescription(params: TParams): string; + + // Confirmation workflow + shouldConfirmExecute(params: TParams, signal: AbortSignal): Promise; + + // Execution + execute(params: TParams, signal: AbortSignal, updateOutput?: (output: string) => void): Promise; +} + +// Tool result interface with custom history rendering +interface IToolResult { + toHistoryStr(): string; +} +``` + +### BaseTool Implementation (`src/baseTool.ts`) + +```typescript +export abstract class BaseTool implements ITool> { + constructor( + readonly name: string, + readonly displayName: string, + readonly description: string, + readonly parameterSchema: Schema, + readonly isOutputMarkdown: boolean = true, + readonly canUpdateOutput: boolean = false, + ) {} + + // Computed property for tool declaration + get schema(): ToolDeclaration { + return { + name: this.name, + description: this.description, + parameters: this.parameterSchema, + }; + } + + // Helper methods for common patterns + protected validateRequiredParams(params: Record, requiredFields: string[]): string | null + protected validateParameterTypes(params: Record, typeMap: Record): string | null + protected createResult(llmContent: string, returnDisplay?: string, summary?: string) + protected createErrorResult(error: Error | string, context?: string) + protected checkAbortSignal(signal: AbortSignal, operation?: string): void + + // Abstract execution method + abstract execute(params: TParams, signal: AbortSignal, updateOutput?: (output: string) => void): Promise>; +} +``` + +### Tool Execution Pipeline (`src/coreToolScheduler.ts`) + +```typescript +export class CoreToolScheduler implements IToolScheduler { + private toolCalls: Map = new Map(); + private toolRegistry: Map = new Map(); + + // Main execution pipeline (src/coreToolScheduler.ts:105) + async schedule(requests: IToolCallRequestInfo[], signal: AbortSignal, callbacks?) 
{ + // Phase 1: Validate all tool calls + await this.validateToolCalls(requests); + + // Phase 2: Handle confirmations for tools that require them + await this.handleConfirmations(); + + // Phase 3: Execute approved tools in parallel + await this.executeApprovedTools(); + + // Phase 4: Wait for completion and cleanup + await this.waitForCompletion(); + } + + // Tool call state management (src/coreToolScheduler.ts:281) + private async validateSingleToolCall(request: IToolCallRequestInfo) { + // 1. Create validating tool call state + // 2. Resolve tool from registry + // 3. Validate tool parameters + // 4. Check if confirmation required + // 5. Transition to scheduled or awaiting_approval state + } + + // Tool execution (src/coreToolScheduler.ts:409) + private async executeToolCall(scheduledCall: IScheduledToolCall) { + // 1. Transition to executing state + // 2. Set up output update handler + // 3. Execute tool with abort signal + // 4. Handle success/error states + // 5. Call lifecycle callbacks + } +} +``` + +### Tool Call State Machine + +```typescript +enum ToolCallStatus { + Validating = 'validating', // Initial validation + Scheduled = 'scheduled', // Ready for execution + Executing = 'executing', // Currently running + Success = 'success', // Completed successfully + Error = 'error', // Failed with error + Cancelled = 'cancelled', // User/system cancelled + AwaitingApproval = 'awaiting_approval', // Waiting for user confirmation +} + +// State-specific interfaces (src/interfaces.ts:592) +interface IValidatingToolCall extends IBaseToolCall { + status: ToolCallStatus.Validating; + tool: ITool; +} + +interface IExecutingToolCall extends IBaseToolCall { + status: ToolCallStatus.Executing; + tool: ITool; + liveOutput?: string; // Real-time output updates +} + +interface ISuccessfulToolCall extends IBaseToolCall { + status: ToolCallStatus.Success; + tool: ITool; + response: IToolCallResponseInfo; + durationMs?: number; +} +``` + +## Chat History Management System + 
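As a rough orientation for this section, the sketch below shows how a function call and its result can sit in history as an assistant message followed by a user message, paired by `call_id`. It is a minimal illustration under stated assumptions: `SketchMessageItem`, `SketchContentPart`, `SketchHistory`, and `findUnansweredCalls` are hypothetical local stand-ins written for this example, not framework exports, and they only mirror the `MessageItem` / `ContentPart` shapes in simplified form.

```typescript
// Simplified local stand-ins for the MessageItem / ContentPart shapes in this guide.
// These types are assumptions for illustration, not imports from the framework.
type SketchRole = 'user' | 'assistant';

interface SketchContentPart {
  type: 'text' | 'function_call' | 'function_response';
  text?: string;
  functionCall?: { call_id: string; name: string; args: string };
  functionResponse?: { call_id: string; name: string; result: string };
}

interface SketchMessageItem {
  role: SketchRole;
  content: SketchContentPart;
  turnIdx?: number;
}

// Hypothetical in-memory history, similar in spirit to addHistory/getHistory.
class SketchHistory {
  private items: SketchMessageItem[] = [];

  addHistory(message: SketchMessageItem): void {
    this.items.push(message);
  }

  getHistory(): SketchMessageItem[] {
    return this.items.slice(); // return a copy so callers cannot mutate state
  }
}

// Hypothetical helper: call_ids of function calls with no matching response yet.
function findUnansweredCalls(history: SketchMessageItem[]): string[] {
  const answered: string[] = [];
  for (const m of history) {
    if (m.content.type === 'function_response' && m.content.functionResponse) {
      answered.push(m.content.functionResponse.call_id);
    }
  }
  const pending: string[] = [];
  for (const m of history) {
    if (m.content.type === 'function_call' && m.content.functionCall) {
      const id = m.content.functionCall.call_id;
      if (answered.indexOf(id) === -1) {
        pending.push(id);
      }
    }
  }
  return pending;
}

const sketchHistory = new SketchHistory();

// User text, then the assistant's function call (args kept as a JSON string).
sketchHistory.addHistory({ role: 'user', content: { type: 'text', text: 'Please greet Alice' }, turnIdx: 0 });
sketchHistory.addHistory({
  role: 'assistant',
  content: {
    type: 'function_call',
    functionCall: { call_id: 'call_1', name: 'greet', args: JSON.stringify({ name: 'Alice' }) },
  },
  turnIdx: 0,
});
const pendingBefore = findUnansweredCalls(sketchHistory.getHistory());

// The tool result goes back in as a user message so the model can read it.
sketchHistory.addHistory({
  role: 'user',
  content: {
    type: 'function_response',
    functionResponse: { call_id: 'call_1', name: 'greet', result: 'Hello, Alice!' },
  },
  turnIdx: 0,
});
const pendingAfter = findUnansweredCalls(sketchHistory.getHistory());
```

A check like `findUnansweredCalls` is one way to confirm that every function call in history has a matching response before the next turn is sent; whether the framework performs such a check internally is not stated in this guide.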
+### Message Structure and Flow + +```typescript +// Universal message format (src/chat/interfaces.ts:65) +interface MessageItem { + role: 'user' | 'assistant'; + content: ContentPart; + turnIdx?: number; // For cache optimization + metadata?: { + sessionId?: string; + timestamp?: number; + turn?: number; + responseId?: string; + }; +} + +// Content types for different message patterns (src/chat/interfaces.ts:19) +interface ContentPart { + type: 'text' | 'thinking' | 'function_call' | 'function_response'; + + // Text content + text?: string; + text_delta?: string; + + // AI reasoning + thinking?: string; + thinking_delta?: string; + + // Function calling + functionCall?: { + id?: string; // OpenAI function ID + call_id: string; // Universal call ID + name: string; + args: string; // JSON string + }; + + functionResponse?: { + id?: string; // OpenAI function ID + call_id: string; // Universal call ID + name: string; + result: string; // Tool result as string + }; +} +``` + +### History Integration in BaseAgent + +```typescript +// Critical history management in processOneTurn (src/baseAgent.ts:418) +for await (const llmResponse of responseStream) { + if (llmResponse.type === 'response.chunk.text.done') { + // Add assistant text to history + const textMessage: MessageItem = { + role: 'assistant', + content: llmResponse.content, + turnIdx: this.currentTurn, + }; + this.chat.addHistory(textMessage); + } + + else if (llmResponse.type === 'response.chunk.function_call.done') { + // Add assistant function call to history (src/baseAgent.ts:457) + const assistantMessage: MessageItem = { + role: 'assistant', + content: { + type: 'function_call', + functionCall: { + id: llmResponse.content.functionCall.id || '', + call_id: llmResponse.content.functionCall.call_id, + name: llmResponse.content.functionCall.name, + args: llmResponse.content.functionCall.args, + }, + }, + turnIdx: this.currentTurn, + }; + this.chat.addHistory(assistantMessage); + + // Schedule tool execution 
asynchronously (src/baseAgent.ts:475) + this.toolScheduler.schedule([toolCall], abortSignal, callbacks); + } +} + +// Tool result integration (src/baseAgent.ts:389) +onExecutionDone: (request: IToolCallRequestInfo, response: IToolCallResponseInfo) => { + // Add tool result as user message so LLM can see the result + const toolResultMessage: MessageItem = { + role: 'user', + content: { + type: 'function_response', + functionResponse: { + ...(request.functionId && { id: request.functionId }), + call_id: request.callId, + name: request.name, + result: response.result ? response.result.toHistoryStr() : (response.error?.message || 'Unknown error'), + }, + }, + turnIdx: this.currentTurn, + }; + this.chat.addHistory(toolResultMessage); +} +``` + +### Session Management in StandardAgent + +```typescript +// Session manager implementation (src/standardAgent.ts:24) +class InternalSessionManager implements ISessionManager { + private sessions: Map = new Map(); + private currentSessionId: string | null = null; + + setCurrentSession(sessionId: string): boolean { + // Save current session state before switching (src/standardAgent.ts:72) + if (this.currentSessionId && this.currentSessionId !== sessionId) { + this.saveCurrentSessionState(); + } + + // Switch to new session + this.currentSessionId = sessionId; + this.restoreSessionState(session); + + return true; + } + + private saveCurrentSessionState(): void { + // Save chat history to session (src/standardAgent.ts:134) + const chat = this.agent.getChat(); + if (chat && chat.getHistory) { + session.messageHistory = chat.getHistory(); + } + + // Save token usage (src/standardAgent.ts:139) + const tokenUsage = this.agent.getTokenUsage(); + if (tokenUsage) { + session.tokenUsage = { + totalInputTokens: tokenUsage.inputTokens || 0, + totalOutputTokens: tokenUsage.outputTokens || 0, + totalTokens: tokenUsage.totalTokens || 0 + }; + } + } + + private restoreSessionState(session: AgentSession): void { + // Restore chat history 
(src/standardAgent.ts:157) + const chat = this.agent.getChat(); + if (chat && chat.clearHistory && chat.setHistory) { + chat.clearHistory(); + if (session.messageHistory.length > 0) { + chat.setHistory(session.messageHistory); + } + } + + // Reset token tracker to avoid stale state (src/standardAgent.ts:166) + const tokenTracker = chat.getTokenTracker(); + if (tokenTracker && tokenTracker.reset) { + tokenTracker.reset(); + } + } +} +``` + +## Agent Lifecycle and State Management + +### Agent State Tracking + +```typescript +interface IAgentStatus { + isRunning: boolean; + currentTurn: number; + historySize: number; + config: IAgentConfig; + lastUpdateTime: number; + tokenUsage: ITokenUsage; + modelInfo: { model: string; tokenLimit: number }; +} + +// State management in BaseAgent (src/baseAgent.ts:76) +export abstract class BaseAgent implements IAgent { + private currentTurn = 0; + private isRunning = false; + private lastUpdateTime = Date.now(); + + async *process(userMessages, sessionId, abortSignal) { + if (this.isRunning) { + yield this.createErrorEvent('Agent is already processing a request'); + return; + } + + this.isRunning = true; + try { + // Process multiple turns with tool execution (src/baseAgent.ts:223) + for (let turnCount = 0; turnCount < 10 && hasToolCallsInTurn && !abortSignal.aborted; turnCount++) { + for await (const event of this.processOneTurn(sessionId, messagesToProcess, abortSignal)) { + yield event; + + if (event.type === AgentEventType.TurnComplete) { + hasToolCallsInTurn = (event.data as {hasToolCalls: boolean}).hasToolCalls; + } + } + } + } finally { + this.isRunning = false; + this.lastUpdateTime = Date.now(); + } + } +} +``` + +### Token Management + +```typescript +// Token usage interface (src/chat/interfaces.ts:298) +interface ITokenUsage { + inputTokens: number; + inputTokenDetails?: { cachedTokens: number }; + outputTokens: number; + outputTokenDetails?: { reasoningTokens: number }; + totalTokens: number; + cumulativeTokens: 
number; + tokenLimit: number; + usagePercentage: number; + + // Cache performance metrics + cacheHitRate?: number; + tokenSavings?: number; + totalCacheableRequests?: number; + actualCacheHits?: number; +} + +// Token tracker interface (src/chat/interfaces.ts:333) +interface ITokenTracker { + updateUsage(usage: { inputTokens, outputTokens, ... }): void; + getUsage(): ITokenUsage; + reset(): void; + isApproachingLimit(threshold?: number): boolean; +} +``` + +## Streaming Response Architecture + +### Streaming Pipeline Design + +```typescript +// Complete streaming flow from IChat to AgentEvent +// 1. IChat.sendMessageStream() -> AsyncGenerator +// 2. BaseAgent processes stream -> converts to AgentEvent via createAgentEventFromLLMResponse +// 3. Tool extraction happens during streaming without blocking +// 4. Tool results are integrated back into history + +// Non-blocking tool execution pattern (src/baseAgent.ts:410) +for await (const llmResponse of responseStream) { + // Immediately forward LLM events as AgentEvents + yield createAgentEventFromLLMResponse(llmResponse, sessionId, this.currentTurn); + + // Extract and schedule tools without blocking the stream + if (llmResponse.type === 'response.chunk.function_call.done') { + // Schedule tool execution asynchronously (src/baseAgent.ts:475) + this.toolScheduler.schedule([toolCall], abortSignal, createToolCallbacks()) + .catch(error => { + this.logger.error(`Tool scheduling failed: ${error}`); + }); + } +} + +// Wait for all tools to complete after LLM stream ends (src/baseAgent.ts:483) +while (pendingToolCalls.size > 0 && !abortSignal.aborted) { + // Emit buffered tool execution events + while (toolExecutionEvents.length > 0) { + yield toolExecutionEvents.shift()!; + } + await new Promise(resolve => setTimeout(resolve, 10)); +} +``` + +### Response Type Processing + +```typescript +// LLM Response types mapped to AgentEvent types (src/interfaces.ts:413) +const LLM_TO_AGENT_EVENT_MAP = { + 'response.start': 
AgentEventType.ResponseStart, + 'response.chunk.text.delta': AgentEventType.ResponseChunkTextDelta, + 'response.chunk.text.done': AgentEventType.ResponseChunkTextDone, + 'response.chunk.thinking.delta': AgentEventType.ResponseChunkThinkingDelta, + 'response.chunk.thinking.done': AgentEventType.ResponseChunkThinkingDone, + 'response.chunk.function_call.delta': AgentEventType.ResponseChunkFunctionCallDelta, + 'response.chunk.function_call.done': AgentEventType.ResponseChunkFunctionCallDone, + 'response.complete': AgentEventType.ResponseComplete, + 'response.incomplete': AgentEventType.ResponseIncomplete, + 'response.failed': AgentEventType.ResponseFailed, +}; + +// Direct event forwarding with metadata addition (src/interfaces.ts:449) +function createAgentEventFromLLMResponse(llmResponse: LLMResponse, sessionId: string, turn: number) { + return { + type: LLM_TO_AGENT_EVENT_MAP[llmResponse.type], + data: llmResponse, + timestamp: Date.now(), + sessionId, + turn, + metadata: { + source: 'llm_response', + originalType: llmResponse.type, + }, + }; +} +``` + +## Extension Points and Plugin System + +### SubAgent System + +```typescript +// SubAgent registry for task delegation (src/interfaces.ts:1217) +interface SubAgentConfig { + name: string; + description: string; + systemPrompt: string; + tools?: string[] | '*'; + whenToUse: string; +} + +export class SubAgentRegistry { + private subagents: Map = new Map(); + + register(config: SubAgentConfig): void { + this.subagents.set(config.name, config); + } + + generateSystemPromptSnippet(): string { + // Generate system prompt addition that informs the agent about available subagents + } +} + +// TaskTool for delegating to subagents (src/baseAgent.ts:833) +export class TaskTool extends BaseTool<{ name: string; description: string }, TaskResponse> { + constructor( + private registry: SubAgentRegistry, + private agentConfig: IAgentConfig, + private createChatInstance: (config: any) => IChat, + private createSchedulerInstance: 
(config: any) => Promise + ) { + super('task', 'Task', 'Delegate complex tasks to specialized subagents', schema); + } + + async execute(params: { name: string; description: string }, signal: AbortSignal) { + // 1. Find appropriate subagent from registry + // 2. Create new agent instance with subagent configuration + // 3. Execute task in stateless mode using processOneTurn + // 4. Return consolidated result + } +} +``` + +### MCP Integration + +```typescript +// Model Context Protocol integration for external tools (src/interfaces.ts:57) +interface McpServerConfig { + name: string; + transport: 'stdio' | 'http' | 'sse'; + command?: string; + args?: string[]; + url?: string; + timeout?: number; + autoConnect?: boolean; +} + +// MCP tool adapter +export class McpToolAdapter implements ITool { + constructor( + private mcpTool: any, + private mcpClient: any, + public name: string, + public description: string + ) {} + + async execute(params: any, signal: AbortSignal): Promise { + // Bridge MCP tool execution to ITool interface + const result = await this.mcpClient.callTool(this.mcpTool.name, params); + return new DefaultToolResult(result); + } +} + +// MCP manager in StandardAgent (src/standardAgent.ts:376) +async addMcpServer(config: McpServerConfig): Promise { + const mcpTools = await this.mcpManager.addServer(config); + const tools = this.convertMcpToolsToITools(mcpTools, config.name); + tools.forEach(tool => this.registerTool(tool)); + return tools; +} +``` + +### Custom Chat Provider Integration + +```typescript +// Example: Adding a new chat provider +export class CustomChat implements IChat { + constructor(private config: IChatConfig) {} + + async sendMessageStream(messages: MessageItem[], promptId: string, tools?: ToolDeclaration[]): Promise> { + // 1. Convert MessageItems to provider format using convertToProviderMessage + // 2. Set up streaming request to provider + // 3. Convert provider response events to LLMResponse format + // 4. 
Yield standardized LLMResponse events + } + + convertToProviderMessage(message: MessageItem): CustomMessage { + // Convert universal MessageItem to provider-specific format + } + + convertFromChunkItems(chunk: ChunkItem, role: 'user' | 'assistant'): MessageItem { + // Convert provider response chunks back to universal format + } + + // Implement other IChat methods... +} + +// Usage in StandardAgent +const chat = new CustomChat(config.chatConfig); +const agent = new StandardAgent(tools, { ...config, chatProvider: 'custom' }); +``` + +## Development Workflow and Best Practices + +### Code Organization + +```typescript +// Recommended project structure for extensions +src/ + interfaces.ts // Core interfaces (extend, don't modify) + baseAgent.ts // Core agent logic (extend via composition) + agentEvent.ts // Event system (use utilities, add custom events) + baseTool.ts // Tool foundation (extend for custom tools) + coreToolScheduler.ts // Tool execution (extend for custom scheduling) + standardAgent.ts // Session management (extend for features) + chat/ +  interfaces.ts // Chat abstractions (implement for new providers) +  geminiChat.ts // Gemini implementation +  openaiChat.ts // OpenAI implementation + tools/ // Custom tool implementations + extensions/ // Framework extensions + tests/ // Comprehensive test suite +``` + +### Extension Development Pattern + +```typescript +// Pattern for extending BaseAgent +export class CustomAgent extends BaseAgent { + private customFeatures: CustomFeatureManager; + + constructor(config: IAgentConfig, chat: IChat, scheduler: IToolScheduler) { + super(config, chat, scheduler); + this.customFeatures = new CustomFeatureManager(); + } + + // Override specific methods while preserving core functionality + protected async processOneTurnWithHistory(sessionId: string, messages: MessageItem[], signal: AbortSignal) { + // Add custom pre-processing + await this.customFeatures.preProcessMessages(messages); + + // Call parent implementation + 
const eventGenerator = super.processOneTurnWithHistory(sessionId, messages, signal); + + // Add custom post-processing + for await (const event of eventGenerator) { + const enhancedEvent = await this.customFeatures.enhanceEvent(event); + yield enhancedEvent; + } + } +} + +// Pattern for custom tool categories +export abstract class CustomToolCategory extends BaseTool { + constructor(name: string, description: string, schema: Schema) { + super(name, `Custom ${name}`, description, schema, true, true); + } + + // Shared validation logic for this tool category + override validateToolParams(params: any): string | null { + const baseValidation = super.validateToolParams(params); + if (baseValidation) return baseValidation; + + // Add category-specific validation + return this.validateCategorySpecific(params); + } + + protected abstract validateCategorySpecific(params: any): string | null; +} +``` + +### Error Handling Best Practices + +```typescript +// Comprehensive error handling in framework development +export class RobustAgent extends BaseAgent { + private errorRecovery: ErrorRecoveryManager; + + async *process(userMessages: UserMessage[], sessionId: string, signal: AbortSignal) { + try { + yield* super.process(userMessages, sessionId, signal); + } catch (error) { + // Log error with context + this.logger.error('Agent processing failed', { + error: error instanceof Error ? error.message : String(error), + stack: error instanceof Error ? error.stack : undefined, + sessionId, + turn: this.currentTurn, + userMessages: userMessages.map(m => ({ role: m.role, content: m.content.text?.substring(0, 100) })) + }); + + // Attempt recovery + const recoveryEvent = await this.errorRecovery.attemptRecovery(error, sessionId); + if (recoveryEvent) { + yield recoveryEvent; + } else { + // Emit error event for handling by client + yield this.createErrorEvent(`Processing failed: ${error instanceof Error ? 
error.message : String(error)}`); + } + } + } +} + +// Error recovery strategies +class ErrorRecoveryManager { + // Resolves to a recovery event, or null when the error is not recoverable + async attemptRecovery(error: unknown, sessionId: string): Promise<AgentEvent | null> { + if (this.isRecoverableError(error)) { + // Reset state and retry + return this.createEvent(AgentEventType.ModelFallback, { + originalError: error, + recoveryAction: 'state_reset', + sessionId + }); + } + return null; + } + + private isRecoverableError(error: unknown): boolean { + // Define recoverable error conditions + return error instanceof Error && error.message.includes('rate limit'); + } +} +``` + +This comprehensive guide provides the foundation for developing and extending the MiniAgent framework. Focus on understanding the interface contracts, event flow patterns, and composition-based extension strategies to build robust, maintainable additions to the framework. \ No newline at end of file diff --git a/docs/prompts/integration-dev.md b/docs/prompts/integration-dev.md new file mode 100644 index 0000000..81314d5 --- /dev/null +++ b/docs/prompts/integration-dev.md @@ -0,0 +1,1971 @@ +# MiniAgent Framework Integration Guide + +> 你是一个 MiniAgent 框架的集成工程师,负责把 MiniAgent 集成到项目里。此文档专注于集成过程,而不是 MiniAgent 的内部运行原理。 + +## 目录 + +1. [快速开始](#快速开始) +2. [完整示例参考](#完整示例参考) +3. [工具系统:创建和管理 Tools](#工具系统创建和管理-tools) +4. [StandardAgent:使用和 Session 管理](#standardagent使用和-session-管理) +5. [流式接口:如何处理 Streaming](#流式接口如何处理-streaming) +6. [事件系统:监听和处理事件](#事件系统监听和处理事件) +7. [配置详解:所有 Config 说明](#配置详解所有-config-说明) +8. [工具执行器:回调和批准流程](#工具执行器回调和批准流程) +9. 
[常见问题和解决方案](#常见问题和解决方案) + +## 快速开始 + +### 安装依赖 + +```bash +pnpm install @continue-reasoning/mini-agent +``` + +### 环境设置 + +```bash +# 创建 .env 文件 +echo "OPENAI_API_KEY=your_openai_api_key" > .env +echo "GEMINI_API_KEY=your_gemini_api_key" >> .env + +# 安装开发依赖 +pnpm install -D typescript tsx @types/node dotenv +``` + +### TypeScript 配置 + +```json +// tsconfig.json +{ + "compilerOptions": { + "target": "ES2022", + "module": "ESNext", + "moduleResolution": "node", + "esModuleInterop": true, + "allowSyntheticDefaultImports": true, + "strict": true, + "skipLibCheck": true, + "forceConsistentCasingInFileNames": true, + "resolveJsonModule": true + } +} +``` + +### 基础使用 + +```typescript +import { StandardAgent, AllConfig, ITool } from '@continue-reasoning/mini-agent'; + +// 1. 创建配置 +const config: AllConfig = { + agentConfig: { + model: 'gpt-4', + workingDirectory: process.cwd(), + apiKey: process.env.OPENAI_API_KEY + }, + chatConfig: { + modelName: 'gpt-4', + apiKey: process.env.OPENAI_API_KEY, + systemPrompt: '你是一个有用的助手', + tokenLimit: 128000 + }, + toolSchedulerConfig: { + approvalMode: 'default' + } +}; + +// 2. 创建 StandardAgent +const agent = new StandardAgent([], config); + +// 3. 处理用户输入 +for await (const event of agent.processWithSession('Hello!')) { + console.log(event); +} +``` + +## 完整示例参考 + +在开始详细集成之前,建议先查看这些完整的工作示例: + +### examples/tools.ts - 工具创建示例 + +这个文件展示了如何创建生产级别的工具: + +```typescript +// 📁 examples/tools.ts +import { createWeatherTool, createSubTool, CITY_COORDINATES } from '@continue-reasoning/mini-agent/examples/tools'; + +// 1. 使用预建工具 +const weatherTool = createWeatherTool(); +const mathTool = createSubTool(); + +// 2. 获取天气数据 +const beijingCoords = CITY_COORDINATES['北京']; // { latitude: 39.9042, longitude: 116.4074 } +const result = await weatherTool.execute(beijingCoords, new AbortController().signal); + +// 3. 
数学计算 +const mathResult = await mathTool.execute({ minuend: 10, subtrahend: 5 }, new AbortController().signal); +console.log(mathResult.data); // { result: 5, operation: "10 - 5 = 5" } +``` + +**关键特性:** +- 完整的 TypeScript 类型定义 +- 实时输出更新支持 +- 错误处理和取消信号 +- 预定义的城市坐标数据 +- 生产就绪的工具验证 + +### examples/sessionManagerExample.ts - 会话管理示例 + +这个文件演示了复杂的多会话管理场景: + +```typescript +// 📁 examples/sessionManagerExample.ts +import { StandardAgent, AgentEventType } from '@continue-reasoning/mini-agent'; +import { createWeatherTool, createSubTool } from '@continue-reasoning/mini-agent/examples/tools'; + +// 运行示例 +async function runExample() { + const tools = [createWeatherTool(), createSubTool()]; + const agent = new StandardAgent(tools, config); + + // 创建多个会话进行温度比较 + const session1 = agent.createNewSession('Beijing vs Shanghai'); + const session2 = agent.createNewSession('Shanghai vs Guangzhou'); + + // 在不同会话中进行独立对话 + await processConversation(agent, '比较北京和上海的温度', session1); + await processConversation(agent, '比较上海和广州的温度', session2); + + // 切换回第一个会话继续对话 + agent.switchToSession(session1); + await processConversation(agent, '现在比较广州和深圳的温度', session1); +} +``` + +**演示功能:** +- 多会话并行处理 +- 会话状态隔离 +- 会话历史保持 +- 实时事件处理 +- Token 使用统计 + +### 如何运行示例 + +```bash +# 设置环境变量 +export OPENAI_API_KEY="your_openai_key" + +# 运行会话管理示例 +npx tsx examples/sessionManagerExample.ts + +# 在你的项目中导入和使用 +import { createWeatherTool } from '@continue-reasoning/mini-agent/examples/tools'; +``` + +## 工具系统:创建和管理 Tools + +### ITool 和 BaseTool 结构详解 + +#### ITool 接口 + +```typescript +import { ITool, IToolResult, DefaultToolResult, ToolDeclaration } from '@continue-reasoning/mini-agent'; + +// ITool 接口定义了所有工具必须实现的方法 +interface ITool { + // 基本属性 + name: string; // 工具名称 + description: string; // 工具描述 + schema: ToolDeclaration; // 工具声明 + isOutputMarkdown: boolean; // 输出是否为 Markdown + canUpdateOutput: boolean; // 是否支持流式输出 + + // 核心方法 + validateToolParams(params: TParams): string | null; + getDescription(params: TParams): string; + 
shouldConfirmExecute(params: TParams, signal: AbortSignal): Promise; + execute(params: TParams, signal: AbortSignal, updateOutput?: (output: string) => void): Promise; +} +``` + +#### BaseTool 类结构详解 + +BaseTool 是一个抽象基类,提供了完整的工具实现框架。理解其结构对于创建高质量工具至关重要: + +```typescript +import { BaseTool, Schema, Type, DefaultToolResult } from '@continue-reasoning/mini-agent'; + +// BaseTool 类的完整结构 +export abstract class BaseTool + implements ITool> { + + // === 核心属性 === + readonly name: string; // 工具内部名称 (API调用用) + readonly displayName: string; // 用户友好的显示名称 + readonly description: string; // 工具功能描述 + readonly parameterSchema: Schema; // 参数验证Schema + readonly isOutputMarkdown: boolean; // 输出是否为Markdown格式 + readonly canUpdateOutput: boolean; // 是否支持流式输出更新 + + // === 自动生成的属性 === + get schema(): ToolDeclaration { // 工具声明,自动由name、description、parameterSchema组合 + return { + name: this.name, + description: this.description, + parameters: this.parameterSchema, + }; + } + + // === 可重写的方法 === + validateToolParams(params: TParams): string | null; // 参数验证 + getDescription(params: TParams): string; // 获取执行描述 + shouldConfirmExecute(params: TParams, signal: AbortSignal): Promise; // 确认执行 + + // === 必须实现的抽象方法 === + abstract execute(params: TParams, signal: AbortSignal, updateOutput?: (output: string) => void): Promise>; + + // === 内置的辅助方法 === + protected createResult(llmContent: string, returnDisplay?: string, summary?: string): {...}; + protected createErrorResult(error: Error | string, context?: string): {...}; + protected createFileDiffResult(fileName: string, fileDiff: string, llmContent: string, summary?: string): {...}; + protected validateRequiredParams(params: Record, requiredFields: string[]): string | null; + protected validateParameterTypes(params: Record, typeMap: Record): string | null; + protected formatProgress(operation: string, progress: string, emoji?: string): string; + protected checkAbortSignal(signal: AbortSignal, operation?: string): void; +} +``` + +#### BaseTool 的设计原则 + +1. 
**类型安全**: 使用泛型类型参数确保参数和结果类型安全 +2. **生命周期管理**: 从验证到执行的完整生命周期控制 +3. **流式输出**: 支持实时输出更新,提升用户体验 +4. **确认机制**: 内置危险操作确认流程 +5. **错误处理**: 标准化的错误处理和结果格式 +6. **取消支持**: 通过 AbortSignal 支持操作取消 + +#### 创建 BaseTool 的完整示例 + +```typescript +import { BaseTool, Type, DefaultToolResult, ToolCallConfirmationDetails } from '@continue-reasoning/mini-agent'; + +// 定义工具结果类型 +export interface MyToolResult { + success: boolean; + output: string; + timestamp: string; +} + +export class MyTool extends BaseTool<{ input: string }, MyToolResult> { + constructor() { + super( + 'my_tool', // name - 工具名称 + 'My Tool', // displayName - 显示名称 + '执行某项任务的示例工具', // description - 工具描述 + { // parameterSchema - 参数 Schema + type: Type.OBJECT, + properties: { + input: { + type: Type.STRING, + description: '输入参数' + } + }, + required: ['input'] + }, + false, // isOutputMarkdown - 输出是否为 Markdown + true // canUpdateOutput - 是否支持流式输出 + ); + } + + // 重写参数验证方法 + override validateToolParams(params: { input: string }): string | null { + // 使用内置的验证辅助方法 + const requiredError = this.validateRequiredParams(params, ['input']); + if (requiredError) return requiredError; + + const typeError = this.validateParameterTypes(params, { + input: 'string' + }); + if (typeError) return typeError; + + // 自定义验证逻辑 + if (!params.input.trim()) { + return '输入不能为空'; + } + + return null; + } + + // 重写描述方法 + override getDescription(params: { input: string }): string { + return `将执行任务,输入: ${params.input}`; + } + + // 可选:重写确认执行方法(用于需要用户确认的工具) + override async shouldConfirmExecute( + params: { input: string }, + signal: AbortSignal + ): Promise<ToolCallConfirmationDetails | false> { + // 对于危险操作,返回确认详情 + if (params.input.includes('delete') || params.input.includes('remove')) { + return { + type: 'info', + title: '确认执行', + prompt: `确定要执行: ${params.input}?`, + onConfirm: async (outcome) => { + console.log('用户选择:', outcome); + } + }; + } + + return false; // 无需确认 + } + + // 实现执行方法 + async execute( + params: { input: string }, + signal: AbortSignal, + updateOutput?: (output: string) => void + ): Promise<DefaultToolResult<MyToolResult>> { + // 输出实时进度 + if (updateOutput) { + updateOutput(this.formatProgress('开始处理', params.input, 
'⚙️')); + } + + try { + // 检查取消信号 + this.checkAbortSignal(signal, '任务处理'); + + // 模拟处理过程 + await new Promise(resolve => setTimeout(resolve, 1000)); + + if (updateOutput) { + updateOutput(this.formatProgress('处理中', '50%', '⚙️')); + } + + this.checkAbortSignal(signal, '任务处理'); + + // 完成处理 + const result: MyToolResult = { + success: true, + output: `处理结果: ${params.input}`, + timestamp: new Date().toISOString() + }; + + if (updateOutput) { + updateOutput(this.formatProgress('处理完成', result.output, '✅')); + } + + return new DefaultToolResult(result); + } catch (error) { + // 使用内置的错误结果创建方法 + const errorResult = this.createErrorResult(error instanceof Error ? error : new Error(String(error))); + return new DefaultToolResult(errorResult as MyToolResult); + } + } +} + +// 注册工具到 Agent +const tool = new MyTool(); +agent.registerTool(tool); +``` + +#### BaseTool 内置辅助方法详解 + +BaseTool 提供了丰富的辅助方法来简化工具开发: + +```typescript +export class AdvancedTool extends BaseTool<{ data: string; operation: string }, any> { + // 使用所有内置辅助方法的示例 + + override validateToolParams(params: { data: string; operation: string }): string | null { + // 1. 验证必需参数 + const requiredError = this.validateRequiredParams(params, ['data', 'operation']); + if (requiredError) return requiredError; + + // 2. 验证参数类型 + const typeError = this.validateParameterTypes(params, { + data: 'string', + operation: 'string' + }); + if (typeError) return typeError; + + // 3. 自定义业务逻辑验证 + const validOperations = ['process', 'analyze', 'transform']; + if (!validOperations.includes(params.operation)) { + return `operation 必须是: ${validOperations.join(', ')}`; + } + + return null; + } + + async execute( + params: { data: string; operation: string }, + signal: AbortSignal, + updateOutput?: (output: string) => void + ) { + try { + // 4. 检查取消信号 + this.checkAbortSignal(signal, '数据处理'); + + // 5. 
格式化进度输出 + if (updateOutput) { + updateOutput(this.formatProgress('开始处理', params.operation, '🚀')); + } + + // 模拟处理过程 + await new Promise(resolve => setTimeout(resolve, 1000)); + this.checkAbortSignal(signal, '数据处理'); + + if (updateOutput) { + updateOutput(this.formatProgress('处理中', '50%', '⚙️')); + } + + // 完成处理 + const result = { + processed_data: `${params.operation}_${params.data}`, + timestamp: new Date().toISOString() + }; + + // 6. 创建标准结果 + const toolResult = this.createResult( + `处理完成: ${result.processed_data}`, // llmContent + `✅ 操作 ${params.operation} 完成`, // returnDisplay + `${params.operation} 操作成功` // summary + ); + + return new DefaultToolResult(toolResult); + + } catch (error) { + // 7. 创建错误结果 + const errorResult = this.createErrorResult( + error instanceof Error ? error : new Error(String(error)), + '数据处理过程中' + ); + return new DefaultToolResult(errorResult); + } + } +} +``` + +#### BaseTool 生命周期方法重写指南 + +```typescript +export class ComprehensiveTool extends BaseTool { + + // === 1. 参数验证方法(可选重写)=== + override validateToolParams(params: ToolParams): string | null { + // 基础验证使用内置方法 + const requiredError = this.validateRequiredParams(params, ['requiredField']); + if (requiredError) return requiredError; + + // 自定义业务逻辑验证 + if (params.value < 0) { + return 'value 必须为正数'; + } + + return null; // 验证通过 + } + + // === 2. 获取执行描述(可选重写)=== + override getDescription(params: ToolParams): string { + return `将执行 ${this.displayName},参数: ${params.action}`; + } + + // === 3. 确认执行检查(可选重写)=== + override async shouldConfirmExecute( + params: ToolParams, + signal: AbortSignal + ): Promise { + // 危险操作需要确认 + if (params.dangerous === true) { + return { + type: 'info', + title: '确认危险操作', + prompt: `确定要执行危险操作 ${params.action}?`, + onConfirm: async (outcome) => { + console.log('用户确认结果:', outcome); + } + }; + } + + return false; // 无需确认 + } + + // === 4. 
执行方法(必须实现)=== + async execute( + params: ToolParams, + signal: AbortSignal, + updateOutput?: (output: string) => void + ): Promise> { + try { + // 使用所有内置辅助方法 + this.checkAbortSignal(signal); + + if (updateOutput) { + updateOutput(this.formatProgress('开始', params.action, '🚀')); + } + + // 实际业务逻辑 + const result = await this.performAction(params, signal, updateOutput); + + // 返回标准结果 + return new DefaultToolResult(this.createResult( + `操作完成: ${result.message}`, + `✅ ${result.status}`, + result.summary + )); + + } catch (error) { + return new DefaultToolResult(this.createErrorResult(error, '工具执行')); + } + } + + private async performAction( + params: ToolParams, + signal: AbortSignal, + updateOutput?: (output: string) => void + ): Promise { + // 业务逻辑实现 + // 定期检查取消信号 + // 发送进度更新 + return { message: 'success', status: 'completed', summary: 'Action performed' }; + } +} +``` + +### 参考现有工具实现 + +查看 `examples/tools.ts` 中的完整 BaseTool 示例: + +```typescript +// 天气工具示例 +import { createWeatherTool, createSubTool } from './examples/tools.js'; + +const weatherTool = createWeatherTool(); +const mathTool = createSubTool(); + +// 使用预定义的城市坐标 +import { CITY_COORDINATES, getCityCoordinates } from './examples/tools.js'; +const beijingCoords = getCityCoordinates('北京'); +console.log(beijingCoords); // { latitude: 39.9042, longitude: 116.4074 } +``` + +### 需要确认的工具 + +```typescript +import { ToolCallConfirmationDetails, ToolConfirmationOutcome } from 'miniagent'; + +class DestructiveTool implements ITool { + // ... 
其他属性 + + async shouldConfirmExecute(params: any, signal: AbortSignal): Promise<ToolCallConfirmationDetails | false> { + // 需要用户确认的危险操作 + return { + type: 'exec', + title: '确认执行危险操作', + command: `rm -rf ${params.path}`, + rootCommand: 'rm', + onConfirm: async (outcome: ToolConfirmationOutcome) => { + console.log('用户选择:', outcome); + } + }; + } +} +``` + +### 工具管理 + +```typescript +// 获取所有工具 +const tools = agent.getToolList(); + +// 获取特定工具 +const tool = agent.getTool('my_tool'); + +// 移除工具 +const removed = agent.removeTool('my_tool'); + +// 获取会话特定工具 +const sessionTools = agent.getToolsForSession('session-123'); +``` + +## StandardAgent:使用和 Session 管理 + +### 创建和使用 StandardAgent + +```typescript +import { StandardAgent } from '@continue-reasoning/mini-agent'; + +// 创建带 session 管理的 agent +const agent = new StandardAgent(tools, { + chatProvider: 'openai', // 或 'gemini' + agentConfig: { + model: 'gpt-4', + workingDirectory: process.cwd(), + sessionId: 'default-session' + }, + chatConfig: { + modelName: 'gpt-4', + apiKey: process.env.OPENAI_API_KEY, + tokenLimit: 128000 + }, + toolSchedulerConfig: { + approvalMode: 'default' + } +}); +``` + +### Session 管理 + +```typescript +// 创建新会话 +const sessionId = agent.createNewSession('我的新对话'); + +// 切换到指定会话 +const switched = agent.switchToSession(sessionId); + +// 获取所有会话 +const sessions = agent.getSessions(); + +// 获取当前会话 +const currentSession = agent.getSessionManager().getCurrentSession(); + +// 更新会话标题 +agent.updateSessionTitle(sessionId, '新的标题'); + +// 删除会话 +agent.deleteSession(sessionId); + +// 获取会话状态 +const status = agent.getSessionStatus(sessionId); +console.log(status.sessionInfo); // 包含会话详细信息 +``` + +### Session 数据结构 + +```typescript +import { AgentSession } from '@continue-reasoning/mini-agent'; + +interface AgentSession { + id: string; // 会话唯一标识 + title?: string; // 会话标题 + createdAt: string; // 创建时间 + lastActiveAt: string; // 最后活动时间 + messageHistory: MessageItem[]; // 消息历史 + tokenUsage: { // Token 使用统计 + totalInputTokens: number; + totalOutputTokens: number; + totalTokens: 
number; + }; + metadata?: Record; // 自定义元数据 +} +``` + +### 多会话处理示例 + +参考 `examples/sessionManagerExample.ts` 的完整实现: + +```typescript +// 参考 examples/sessionManagerExample.ts +import { AgentEventType, AgentEvent } from '@continue-reasoning/mini-agent'; + +async function handleMultipleSessions() { + // 为不同用户创建不同会话 + const userSessions = new Map(); + + function getUserSession(userId: string): string { + if (!userSessions.has(userId)) { + const sessionId = agent.createNewSession(`用户 ${userId} 的对话`); + userSessions.set(userId, sessionId); + } + return userSessions.get(userId)!; + } + + // 处理用户消息 - 基于 sessionManagerExample.ts 的 processConversation 函数 + async function handleUserMessage(userId: string, message: string) { + const sessionId = getUserSession(userId); + + console.log(`💬 Session: ${sessionId}`); + console.log(`👤 User ${userId}: ${message}`); + + let assistantResponse = ''; + + for await (const event of agent.processWithSession(message, sessionId)) { + switch (event.type) { + case AgentEventType.ToolExecutionStart: + const toolStartData = event.data as any; + console.log(`🔧 Tool started: ${toolStartData.toolName}`); + break; + case AgentEventType.ToolExecutionDone: + const toolDoneData = event.data as any; + console.log(`🔧 Tool completed: ${toolDoneData.toolName}`); + break; + case AgentEventType.ResponseChunkTextDelta: + const deltaData = event.data as any; + assistantResponse += deltaData.content.text_delta || ''; + break; + case AgentEventType.ResponseChunkTextDone: + const textDoneData = event.data as any; + console.log(`🤖 Assistant: ${textDoneData.content.text}`); + break; + case AgentEventType.ResponseComplete: + console.log(`✅ Turn complete`); + break; + } + } + + return assistantResponse; + } +} + +// 显示会话状态 - 基于 sessionManagerExample.ts +function showSessionStatus(agent: StandardAgent) { + const sessions = agent.getSessions(); + const currentSessionId = agent.getCurrentSessionId(); + + sessions.forEach((session, index) => { + const isCurrent = 
session.id === currentSessionId; + const indicator = isCurrent ? '👉' : ' '; + console.log(`${indicator} ${index + 1}. ${session.title} (${session.id})`); + console.log(` Created: ${new Date(session.createdAt).toLocaleString()}`); + console.log(` Messages: ${session.messageHistory.length}`); + console.log(` Tokens: ${session.tokenUsage.totalTokens}`); + }); +} +``` + +## 流式接口:如何处理 Streaming + +### 基础流式处理 + +MiniAgent 只支持流式接口,所有响应都通过 AsyncGenerator 返回。以下是所有事件类型的完整数据结构: + +```typescript +import { AgentEventType, AgentEvent } from '@continue-reasoning/mini-agent'; + +// 基础流式处理 +async function handleStreaming(userInput: string) { + const abortController = new AbortController(); + + try { + for await (const event of agent.processWithSession( + userInput, + undefined, // 使用当前会话 + abortController.signal + )) { + switch (event.type) { + // LLM 响应流事件 + case AgentEventType.ResponseStart: + // 响应开始 + const startData = event.data as { id: string; model: string; tools?: any[] }; + console.log('🚀 响应开始:', startData.model); + break; + + case AgentEventType.ResponseChunkTextDelta: + // 实时文本片段 + const textDelta = event.data as { content: { text_delta: string } }; + process.stdout.write(textDelta.content.text_delta); + break; + + case AgentEventType.ResponseChunkTextDone: + // 文本完成 + const textDone = event.data as { content: { text: string } }; + console.log('\n✅ 文本完成:', textDone.content.text); + break; + + case AgentEventType.ResponseChunkThinkingDelta: + // 思考过程片段 + const thinkingDelta = event.data as { content: { thinking_delta: string } }; + console.log('🧠 思考中:', thinkingDelta.content.thinking_delta); + break; + + case AgentEventType.ResponseChunkThinkingDone: + // 思考过程完成 + const thinkingDone = event.data as { content: { thinking: string } }; + console.log('🧠 思考完成:', thinkingDone.content.thinking); + break; + + case AgentEventType.ResponseChunkFunctionCallDelta: + // 函数调用参数增量 + const funcCallDelta = event.data as { + content: { functionCall: { id: string; call_id: string; name: 
string; args: string } } + }; + console.log('🔧 函数调用参数:', funcCallDelta.content.functionCall.name); + break; + + case AgentEventType.ResponseChunkFunctionCallDone: + // 函数调用参数完成 + const funcCallDone = event.data as { + content: { functionCall: { id: string; call_id: string; name: string; args: string } } + }; + console.log('🔧 函数调用准备:', funcCallDone.content.functionCall.name, + JSON.parse(funcCallDone.content.functionCall.args)); + break; + + case AgentEventType.ResponseComplete: + // 响应完成 + const completeData = event.data as { + response_id: string; + usage?: { + inputTokens: number; + outputTokens: number; + totalTokens: number; + inputTokenDetails?: { cachedTokens: number }; + outputTokenDetails?: { reasoningTokens: number }; + } + }; + console.log('✅ 响应完成, Token 使用:', completeData.usage); + break; + + // 工具执行事件 + case AgentEventType.ToolExecutionStart: + // 工具开始执行 + const toolStart = event.data as { + toolName: string; + callId: string; + args: Record; + sessionId: string; + turn: number + }; + console.log('🔧 工具开始执行:', toolStart.toolName, toolStart.args); + break; + + case AgentEventType.ToolExecutionDone: + // 工具执行完成 + const toolDone = event.data as { + toolName: string; + callId: string; + result?: any; + error?: string; + duration?: number; + sessionId: string; + turn: number + }; + if (toolDone.error) { + console.log('❌ 工具执行失败:', toolDone.toolName, toolDone.error); + } else { + console.log('✅ 工具执行完成:', toolDone.toolName, + `耗时: ${toolDone.duration}ms`); + } + break; + + // 用户交互事件 + case AgentEventType.UserMessage: + // 用户消息 + const userMsg = event.data as { + type: string; + content: string; + sessionId: string; + turn: number; + metadata?: any + }; + console.log('👤 用户消息:', userMsg.content); + break; + + case AgentEventType.TurnComplete: + // 轮次完成 + const turnComplete = event.data as { + type: string; + sessionId: string; + turn: number; + hasToolCalls: boolean + }; + console.log('🔄 轮次完成:', `Turn ${turnComplete.turn}`, + turnComplete.hasToolCalls ? 
'包含工具调用' : '无工具调用'); + break; + + // 错误和状态事件 + case AgentEventType.Error: + // 错误事件 + const errorData = event.data as { + message: string; + timestamp: number; + turn: number + }; + console.error('❌ Agent 错误:', errorData.message); + break; + + case AgentEventType.ResponseFailed: + // 响应失败 + const failedData = event.data as { + response_id: string; + error: { code?: string; message?: string } + }; + console.error('❌ 响应失败:', failedData.error.message); + break; + + case AgentEventType.ResponseIncomplete: + // 响应不完整 + const incompleteData = event.data as { + response_id: string; + incomplete_details: { reason: string } + }; + console.warn('⚠️ 响应不完整:', incompleteData.incomplete_details.reason); + break; + + default: + console.log('🔍 其他事件:', event.type, event.data); + } + } + } catch (error) { + console.error('处理错误:', error); + } +} +``` + +### 响应类型分类 + +```typescript +import { AgentEvent, AgentEventType } from '@continue-reasoning/mini-agent'; + +// 根据事件类型返回所属分类 +function categorizeEvents(event: AgentEvent): 'llm' | 'tool' | 'user' | 'agent' | 'other' { + // LLM 响应事件 + const llmEvents = [ + AgentEventType.ResponseStart, + AgentEventType.ResponseChunkTextDelta, + AgentEventType.ResponseChunkTextDone, + AgentEventType.ResponseChunkThinkingDelta, + AgentEventType.ResponseChunkThinkingDone, + AgentEventType.ResponseChunkFunctionCallDelta, + AgentEventType.ResponseChunkFunctionCallDone, + AgentEventType.ResponseComplete, + AgentEventType.ResponseIncomplete, + AgentEventType.ResponseFailed + ]; + + // 工具执行事件 + const toolEvents = [ + AgentEventType.ToolExecutionStart, + AgentEventType.ToolExecutionDone, + AgentEventType.ToolConfirmation + ]; + + // 用户交互事件 + const userEvents = [ + AgentEventType.UserMessage, + AgentEventType.UserCancelled + ]; + + // Agent 级别事件 + const agentEvents = [ + AgentEventType.TurnComplete, + AgentEventType.Error, + AgentEventType.ModelFallback + ]; + + if (llmEvents.includes(event.type)) return 'llm'; + if (toolEvents.includes(event.type)) return 'tool'; + if (userEvents.includes(event.type)) return 'user'; + if (agentEvents.includes(event.type)) return 'agent'; + return 'other'; +} +``` + +### 取消和超时处理 + +```typescript +async function handleWithTimeout(userInput: string, timeoutMs: number = 30000) { + const abortController = new AbortController(); + + // 设置超时 + const timeout = 
setTimeout(() => { + abortController.abort(); + }, timeoutMs); + + try { + for await (const event of agent.processWithSession(userInput, undefined, abortController.signal)) { + // 处理事件... + + // 用户取消(shouldCancel 为应用自行实现的判断函数,此处仅为示例假设) + if (shouldCancel()) { + abortController.abort(); + break; + } + } + } catch (error) { + if (abortController.signal.aborted) { + console.log('操作被取消或超时'); + } else { + console.error('处理错误:', error); + } + } finally { + clearTimeout(timeout); + } +} +``` + +## 事件系统:监听和处理事件 + +### 推荐的事件处理模式 + +```typescript +import { StandardAgent, AgentEvent, AgentEventType } from '@continue-reasoning/mini-agent'; + +class EventProcessor { + private currentResponse = ''; + private toolExecutions = new Map<string, { name: string; startTime: number }>(); + + async processEvents(agent: StandardAgent, input: string) { + for await (const event of agent.processWithSession(input)) { + await this.handleEvent(event); + } + } + + private async handleEvent(event: AgentEvent) { + switch (event.type) { + // 1. 用户消息处理 + case AgentEventType.UserMessage: + this.onUserMessage(event); + break; + + // 2. LLM 响应流处理 + case AgentEventType.ResponseStart: + this.onResponseStart(event); + break; + + case AgentEventType.ResponseChunkTextDelta: + this.onTextDelta(event); + break; + + case AgentEventType.ResponseChunkTextDone: + this.onTextDone(event); + break; + + case AgentEventType.ResponseChunkThinkingDelta: + this.onThinkingDelta(event); + break; + + case AgentEventType.ResponseChunkThinkingDone: + this.onThinkingDone(event); + break; + + // 3. 函数调用处理 + case AgentEventType.ResponseChunkFunctionCallDelta: + this.onFunctionCallDelta(event); + break; + + case AgentEventType.ResponseChunkFunctionCallDone: + this.onFunctionCallDone(event); + break; + + // 4. 工具执行处理 + case AgentEventType.ToolExecutionStart: + this.onToolStart(event); + break; + + case AgentEventType.ToolExecutionDone: + this.onToolDone(event); + break; + + // 5. 
完成和错误处理 + case AgentEventType.TurnComplete: + this.onTurnComplete(event); + break; + + case AgentEventType.ResponseComplete: + this.onResponseComplete(event); + break; + + case AgentEventType.Error: + this.onError(event); + break; + } + } + + private onUserMessage(event: AgentEvent) { + console.log('用户输入:', event.data); + } + + private onResponseStart(event: AgentEvent) { + console.log('开始响应...'); + this.currentResponse = ''; + } + + private onTextDelta(event: AgentEvent) { + const delta = event.data?.content?.text_delta || ''; + this.currentResponse += delta; + process.stdout.write(delta); // 实时显示 + } + + private onTextDone(event: AgentEvent) { + console.log('\n助手回复完成'); + } + + private onThinkingDelta(event: AgentEvent) { + // 可选:显示思考过程 + const thinking = event.data?.content?.thinking_delta || ''; + console.log(`[思考] ${thinking}`); + } + + private onThinkingDone(event: AgentEvent) { + console.log('思考完成'); + } + + private onFunctionCallDelta(event: AgentEvent) { + // 函数调用参数仍在流式生成,通常无需处理 + } + + private onFunctionCallDone(event: AgentEvent) { + const call = event.data?.content?.functionCall; + console.log(`准备调用工具: ${call?.name}`); + } + + private onToolStart(event: AgentEvent) { + const { toolName, callId, args } = event.data as any; + console.log(`🔧 开始执行工具: ${toolName}`); + this.toolExecutions.set(callId, { name: toolName, startTime: Date.now() }); + } + + private onToolDone(event: AgentEvent) { + const { toolName, callId, result, error, duration } = event.data as any; + if (error) { + console.log(`❌ 工具执行失败: ${toolName}: ${error}`); + } else { + console.log(`✅ 工具执行完成: ${toolName} (${duration}ms)`); + } + this.toolExecutions.delete(callId); + } + + private onTurnComplete(event: AgentEvent) { + console.log('对话轮次完成'); + } + + private onResponseComplete(event: AgentEvent) { + console.log('响应完成'); + } + + private onError(event: AgentEvent) { + console.error('错误:', event.data); + } +} +``` + +### 自定义事件监听器 + +```typescript +// 注册全局事件监听器 +agent.onEvent('my-logger', (event: AgentEvent) => { + console.log(`[${new Date().toISOString()}] ${event.type}:`, event.data); +}); + +// 移除事件监听器 +agent.offEvent('my-logger'); +``` + +### 高级事件处理模式 + +```typescript +class AdvancedEventHandler { + private progressBar?: any; + private metrics = { + totalTokens: 0, + toolCalls: 0, + responseTime: 0 + 
}; + + async handleWithProgress(agent: StandardAgent, input: string) { + const startTime = Date.now(); + + for await (const event of agent.processWithSession(input)) { + this.updateMetrics(event); + this.updateUI(event); + } + + this.metrics.responseTime = Date.now() - startTime; + this.showSummary(); + } + + private updateMetrics(event: AgentEvent) { + switch (event.type) { + case AgentEventType.ResponseComplete: + const usage = (event.data as any)?.usage; + if (usage) { + this.metrics.totalTokens = usage.totalTokens; + } + break; + + case AgentEventType.ToolExecutionStart: + this.metrics.toolCalls++; + break; + } + } + + private updateUI(event: AgentEvent) { + // 更新进度条、状态指示器等 + } + + private showSummary() { + console.log('执行摘要:', this.metrics); + } +} +``` + +## 配置详解:所有 Config 说明 + +### AllConfig 结构 + +```typescript +interface AllConfig { + agentConfig: IAgentConfig; // Agent 基础配置 + toolSchedulerConfig: IToolSchedulerConfig; // 工具调度器配置 + chatConfig: IChatConfig; // 聊天提供商配置 +} +``` + +### IAgentConfig - Agent 基础配置 + +```typescript +interface IAgentConfig { + // 必需配置 + model: string; // AI 模型名称 'gpt-4', 'gemini-pro' 等 + workingDirectory: string; // 工作目录 + + // 可选配置 + apiKey?: string; // API 密钥 + sessionId?: string; // 默认会话 ID + systemPrompt?: string; // 系统提示词 + maxHistorySize?: number; // 最大历史记录数 + maxHistoryTokens?: number; // 最大历史 Token 数 + debugMode?: boolean; // 调试模式 + logger?: ILogger; // 自定义日志器 + logLevel?: LogLevel; // 日志级别 + + // MCP 配置 + mcp?: { + enabled: boolean; // 启用 MCP + servers: McpServerConfig[]; // MCP 服务器列表 + autoDiscoverTools?: boolean; // 自动发现工具 + connectionTimeout?: number; // 连接超时 + toolNamingStrategy?: 'prefix' | 'suffix' | 'error'; // 工具命名策略 + toolNamePrefix?: string; // 工具名前缀 + toolNameSuffix?: string; // 工具名后缀 + }; +} +``` + +### IChatConfig - 聊天提供商配置 + +```typescript +interface IChatConfig { + // 基础配置 + modelName: string; // 模型名称 + apiKey: string; // API 密钥 + systemPrompt?: string; // 系统提示词 + tokenLimit: number; // Token 限制 + + // 高级配置 + 
temperature?: number; // 温度参数 + maxTokens?: number; // 最大输出 Token + topP?: number; // Top-P 参数 + frequencyPenalty?: number; // 频率惩罚 + presencePenalty?: number; // 存在惩罚 + + // 历史管理 + initialHistory?: MessageItem[]; // 初始历史 + maxHistorySize?: number; // 最大历史条数 + maxHistoryTokens?: number; // 最大历史 Token +} +``` + +### IToolSchedulerConfig - 工具调度器配置 + +```typescript +interface IToolSchedulerConfig { + tools?: ITool[]; // 工具列表 + approvalMode?: 'default' | 'yolo' | 'always'; // 批准模式 + + // 回调函数 + outputUpdateHandler?: (callId: string, output: string) => void; + onAllToolCallsComplete?: (completed: ICompletedToolCall[]) => void; + onToolCallsUpdate?: (toolCalls: IToolCall[]) => void; + + // 其他配置 + getPreferredEditor?: () => string | undefined; + config?: unknown; // 自定义配置 +} +``` + +### 配置示例 + +```typescript +// 完整配置示例 +const config: AllConfig = { + // Agent 配置 + agentConfig: { + model: 'gpt-4', + workingDirectory: '/project/path', + apiKey: process.env.OPENAI_API_KEY, + sessionId: 'main-session', + systemPrompt: '你是一个专业的编程助手', + maxHistorySize: 100, + maxHistoryTokens: 50000, + debugMode: false, + logLevel: LogLevel.INFO, + + // MCP 配置 + mcp: { + enabled: true, + autoDiscoverTools: true, + connectionTimeout: 5000, + toolNamingStrategy: 'prefix', + toolNamePrefix: 'mcp', + servers: [ + { + name: 'filesystem', + transport: 'stdio', + command: 'mcp-server-filesystem', + args: ['/project/path'] + } + ] + } + }, + + // 聊天配置 + chatConfig: { + modelName: 'gpt-4', + apiKey: process.env.OPENAI_API_KEY, + systemPrompt: '你是一个有用的助手', + tokenLimit: 128000, + temperature: 0.7, + maxTokens: 4000, + topP: 0.9 + }, + + // 工具调度器配置 + toolSchedulerConfig: { + approvalMode: 'default', + outputUpdateHandler: (callId, output) => { + console.log(`工具 ${callId}: ${output}`); + }, + onAllToolCallsComplete: (completed) => { + console.log(`完成 ${completed.length} 个工具调用`); + }, + onToolCallsUpdate: (toolCalls) => { + console.log(`工具状态更新: ${toolCalls.length} 个调用`); + } + } +}; +``` + +## 工具执行器:回调和批准流程 
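
The `approvalMode` values accepted by `IToolSchedulerConfig` can be read as a small decision rule. The sketch below is a hypothetical, self-contained illustration and not MiniAgent's scheduler code: `needsApproval` and `toolRequestsConfirmation` are names invented here. It assumes that `'yolo'` approves automatically, `'always'` always asks the user, and `'default'` defers to whatever the tool's own `shouldConfirmExecute` reports.

```typescript
// Hypothetical helper for illustration only; not part of the MiniAgent API.
type ApprovalMode = 'default' | 'yolo' | 'always';

// Decide whether a tool call should wait for user approval.
// toolRequestsConfirmation stands in for "shouldConfirmExecute returned
// confirmation details rather than false" (an assumption of this sketch).
function needsApproval(mode: ApprovalMode, toolRequestsConfirmation: boolean): boolean {
  if (mode === 'yolo') return false;   // auto-approve every call
  if (mode === 'always') return true;  // always ask the user
  return toolRequestsConfirmation;     // 'default': let the tool decide
}

// Usage: compare how the three modes treat a tool that asks for confirmation.
const modes: ApprovalMode[] = ['default', 'yolo', 'always'];
for (const mode of modes) {
  console.log(mode, needsApproval(mode, true));
}
```

Keeping the rule as a pure function makes it straightforward to unit-test separately from any scheduler state; a real integration would still rely on the framework's own handling.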
+
+### CoreToolScheduler callback mechanism in detail
+
+CoreToolScheduler provides callbacks that cover the full tool execution lifecycle:
+
+```typescript
+import {
+  StandardAgent,
+  IToolScheduler,
+  IToolSchedulerConfig,
+  ToolExecutionStartCallback,
+  ToolExecutionDoneCallback,
+  IToolCallRequestInfo,
+  IToolCallResponseInfo,
+  ICompletedToolCall,
+  IToolCall,
+  ToolCallStatus,
+  ToolConfirmationOutcome
+} from '@continue-reasoning/mini-agent';
+
+// 1. Configure the ToolScheduler callbacks
+const toolSchedulerConfig: IToolSchedulerConfig = {
+  // Approval mode: 'yolo' (auto-approve) | 'always' (always ask for approval) | 'default' (decided per tool)
+  approvalMode: 'default',
+
+  // Live output callback: streamed output while a tool is running
+  outputUpdateHandler: (callId: string, output: string) => {
+    console.log(`📤 Tool output [${callId}]: ${output}`);
+    // The output can be forwarded to a UI or a logging system
+  },
+
+  // Status callback: fired whenever a tool call changes state
+  onToolCallsUpdate: (toolCalls: IToolCall[]) => {
+    console.log('📊 Tool status update:');
+    toolCalls.forEach(call => {
+      const { name } = call.request;
+      const status = call.status;
+
+      switch (status) {
+        case ToolCallStatus.Validating:
+          console.log(`  🔍 ${name}: validating parameters...`);
+          break;
+        case ToolCallStatus.AwaitingApproval:
+          console.log(`  ⏳ ${name}: awaiting user approval`);
+          break;
+        case ToolCallStatus.Scheduled:
+          console.log(`  📅 ${name}: scheduled, ready to run`);
+          break;
+        case ToolCallStatus.Executing:
+          console.log(`  ⚙️ ${name}: executing...`);
+          break;
+        case ToolCallStatus.Success:
+          console.log(`  ✅ ${name}: succeeded`);
+          break;
+        case ToolCallStatus.Error:
+          console.log(`  ❌ ${name}: failed`);
+          break;
+        case ToolCallStatus.Cancelled:
+          console.log(`  🚫 ${name}: cancelled`);
+          break;
+      }
+    });
+  },
+
+  // Completion callback: fired when the whole batch has finished
+  onAllToolCallsComplete: (completed: ICompletedToolCall[]) => {
+    console.log(`✅ Batch complete: ${completed.length} tools finished`);
+
+    // Tally the results by final status
+    const successful = completed.filter(call => call.status === ToolCallStatus.Success);
+    const failed = completed.filter(call => call.status === ToolCallStatus.Error);
+    const cancelled = completed.filter(call => call.status === ToolCallStatus.Cancelled);
+
+    console.log(`   Succeeded: ${successful.length}, failed: ${failed.length}, cancelled: 
${cancelled.length}`);
+
+    // Report the tools that failed
+    failed.forEach(call => {
+      console.error(`   Failed tool ${call.request.name}: ${call.response.error?.message}`);
+    });
+  },
+};
+
+// 2. Execution callbacks passed along while the agent is processing
+const executionCallbacks = {
+  onExecutionStart: (toolCall: IToolCallRequestInfo) => {
+    console.log(`🚀 Starting tool: ${toolCall.name}`);
+    console.log(`   Call ID: ${toolCall.callId}`);
+    console.log(`   Arguments:`, toolCall.args);
+    console.log(`   Prompt ID: ${toolCall.promptId}`);
+  },
+
+  onExecutionDone: (
+    request: IToolCallRequestInfo,
+    response: IToolCallResponseInfo,
+    duration?: number
+  ) => {
+    if (response.success) {
+      console.log(`✅ Tool succeeded: ${request.name}`);
+      console.log(`   Duration: ${duration}ms`);
+      console.log(`   Result:`, response.result?.toHistoryStr());
+    } else {
+      console.log(`❌ Tool failed: ${request.name}`);
+      console.log(`   Error:`, response.error?.message);
+      console.log(`   Duration: ${duration}ms`);
+    }
+  }
+};
+
+// 3. Handle tool approval manually
+class ToolApprovalManager {
+  constructor(private scheduler: IToolScheduler) {}
+
+  // Approve this tool execution once
+  async approveToolExecution(callId: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.ProceedOnce
+    );
+  }
+
+  // Always approve this kind of tool
+  async approveAlways(callId: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.ProceedAlways
+    );
+  }
+
+  // Modify the arguments, then execute
+  async modifyAndExecute(callId: string, newContent: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.ModifyWithEditor,
+      { newContent }
+    );
+  }
+
+  // Cancel the tool execution
+  async cancelToolExecution(callId: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.Cancel
+    );
+  }
+
+  // List the tool calls that are waiting for approval
+  monitorPendingApprovals() {
+    const pendingCalls = this.scheduler.getCurrentToolCalls()
+      .filter(call => call.status === ToolCallStatus.AwaitingApproval);
+
+    pendingCalls.forEach(call => {
+      const details = (call as any).confirmationDetails;
+      console.log(`⏳ Awaiting approval: 
${call.request.name}`);
+      console.log(`   Type: ${details.type}`);
+      console.log(`   Title: ${details.title}`);
+    });
+
+    return pendingCalls;
+  }
+}
+
+// 4. Full agent configuration example
+const agent = new StandardAgent(tools, {
+  chatProvider: 'openai',
+  agentConfig: {
+    model: 'gpt-4',
+    workingDirectory: process.cwd(),
+  },
+  chatConfig: {
+    modelName: 'gpt-4',
+    apiKey: process.env.OPENAI_API_KEY,
+    tokenLimit: 128000,
+  },
+  toolSchedulerConfig: toolSchedulerConfig, // Uses the configuration defined above
+});
+
+// 5. Consume the streamed response while the execution callbacks run
+async function processWithToolCallbacks(userInput: string) {
+  for await (const event of agent.processWithSession(userInput)) {
+    // Handle the various events...
+
+    // Note: executionCallbacks are passed to scheduler.schedule inside agent.process.
+    // They are invoked from the CoreToolScheduler.executeToolCall method.
+  }
+}
+```
+
+### Tool approval modes
+
+```typescript
+// 1. 'yolo' mode - auto-approve every tool
+const yoloConfig = {
+  toolSchedulerConfig: {
+    approvalMode: 'yolo' as const,
+    // Every tool runs automatically, with no user confirmation
+  }
+};
+
+// 2. 'always' mode - always require user confirmation
+const alwaysConfig = {
+  toolSchedulerConfig: {
+    approvalMode: 'always' as const,
+    // Every tool must be confirmed manually by the user
+  }
+};
+
+// 3. 
'default' mode - decided by each tool's shouldConfirmExecute
+const defaultConfig = {
+  toolSchedulerConfig: {
+    approvalMode: 'default' as const,
+    // Each tool's shouldConfirmExecute method decides whether confirmation is needed
+  }
+};
+```
+
+### Handling tool confirmation manually
+
+```typescript
+import {
+  IToolScheduler,
+  ToolConfirmationOutcome,
+  ToolCallConfirmationDetails
+} from '@continue-reasoning/mini-agent';
+
+class ToolApprovalHandler {
+  private scheduler: IToolScheduler;
+
+  constructor(scheduler: IToolScheduler) {
+    this.scheduler = scheduler;
+  }
+
+  // Approve this tool execution once
+  async approveToolExecution(callId: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.ProceedOnce
+    );
+  }
+
+  // Always approve this kind of tool
+  async approveAlways(callId: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.ProceedAlways
+    );
+  }
+
+  // Cancel the tool execution
+  async cancelToolExecution(callId: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.Cancel
+    );
+  }
+
+  // Modify, then execute
+  async modifyAndExecute(callId: string, newContent: string) {
+    await this.scheduler.handleConfirmationResponse(
+      callId,
+      ToolConfirmationOutcome.ModifyWithEditor,
+      { newContent }
+    );
+  }
+}
+```
+
+### Monitoring tool execution status
+
+```typescript
+class ToolExecutionMonitor {
+  // Latest known state of each tool call, keyed by call ID
+  private toolStates = new Map<string, IToolCall>();
+
+  monitorExecution(scheduler: IToolScheduler) {
+    // Fetch every tool call the scheduler currently knows about
+    const currentCalls = scheduler.getCurrentToolCalls();
+
+    currentCalls.forEach(call => {
+      this.toolStates.set(call.request.callId, call);
+      this.logToolState(call);
+    });
+  }
+
+  private logToolState(call: IToolCall) {
+    switch (call.status) {
+      case 'validating':
+        console.log(`🔍 Validating tool: ${call.request.name}`);
+        break;
+      case 'scheduled':
+        console.log(`📅 Tool scheduled: ${call.request.name}`);
+        break;
+      case 'executing':
+        console.log(`⚙️ Executing: ${call.request.name}`);
+        break;
+      case 'success':
+        console.log(`✅ Succeeded: ${call.request.name}`);
+        break;
+      case 'error':
+        console.log(`❌ Failed: ${call.request.name}`);
+        break;
+      case 'cancelled':
+
        console.log(`🚫 Cancelled: ${call.request.name}`);
+        break;
+      case 'awaiting_approval':
+        console.log(`⏳ Awaiting approval: ${call.request.name}`);
+        this.handleAwaitingApproval(call as IWaitingToolCall);
+        break;
+    }
+  }
+
+  private handleAwaitingApproval(call: IWaitingToolCall) {
+    const details = call.confirmationDetails;
+
+    switch (details.type) {
+      case 'exec':
+        console.log(`Confirmation needed to run command: ${details.command}`);
+        break;
+      case 'edit':
+        console.log(`Confirmation needed to edit file: ${details.fileName}`);
+        break;
+      case 'mcp':
+        console.log(`Confirmation needed for MCP tool: ${details.toolDisplayName}`);
+        break;
+      case 'info':
+        console.log(`Confirmation needed to fetch information: ${details.prompt}`);
+        break;
+    }
+  }
+}
+```
+
+### Managing batched tool execution
+
+```typescript
+class BatchToolManager {
+  private agent: StandardAgent;
+  // Tool calls waiting for approval, keyed by call ID
+  private pendingApprovals = new Map<string, IWaitingToolCall>();
+
+  constructor(agent: StandardAgent) {
+    this.agent = agent;
+    this.setupToolCallbacks();
+  }
+
+  private setupToolCallbacks() {
+    const scheduler = this.agent.getToolScheduler();
+
+    // Track tool status updates
+    const config = {
+      onToolCallsUpdate: (toolCalls: IToolCall[]) => {
+        toolCalls.forEach(call => {
+          if (call.status === 'awaiting_approval') {
+            this.pendingApprovals.set(call.request.callId, call as IWaitingToolCall);
+          } else {
+            this.pendingApprovals.delete(call.request.callId);
+          }
+        });
+      },
+
+      onAllToolCallsComplete: (completed: ICompletedToolCall[]) => {
+        console.log(`Batch complete: ${completed.length} tools finished`);
+        this.pendingApprovals.clear();
+      }
+    };
+    // NOTE: as written, `config` is never registered with `scheduler`. For these
+    // callbacks to fire, the same handlers have to be supplied through
+    // toolSchedulerConfig when the agent is constructed.
+  }
+
+  // Approve every tool call that is waiting for approval
+  async approveAllPending() {
+    const scheduler = this.agent.getToolScheduler();
+
+    for (const callId of this.pendingApprovals.keys()) {
+      await scheduler.handleConfirmationResponse(
+        callId,
+        ToolConfirmationOutcome.ProceedOnce
+      );
+    }
+  }
+
+  // Cancel every tool call that is waiting for approval
+  async cancelAllPending() {
+    const scheduler = this.agent.getToolScheduler();
+
+    for (const callId of this.pendingApprovals.keys()) {
+      await scheduler.handleConfirmationResponse(
+        callId,
+        ToolConfirmationOutcome.Cancel
+      );
+    }
+  }
+
+  // Get the list of tool calls waiting for approval
+
  getPendingApprovals(): IWaitingToolCall[] {
+    return Array.from(this.pendingApprovals.values());
+  }
+}
+```
+
+## FAQ and Solutions
+
+### Q1: How do I handle tool execution timeouts?
+
+```typescript
+class TimeoutHandler {
+  async executeWithTimeout(agent: StandardAgent, input: string, timeoutMs: number = 30000) {
+    const abortController = new AbortController();
+
+    const timeout = setTimeout(() => {
+      console.log('Operation timed out, cancelling...');
+      abortController.abort();
+    }, timeoutMs);
+
+    try {
+      for await (const event of agent.processWithSession(input, undefined, abortController.signal)) {
+        // Handle events...
+
+        if (event.type === 'turn.complete') {
+          break;
+        }
+      }
+    } catch (error) {
+      if (abortController.signal.aborted) {
+        console.log('Operation was cancelled');
+      } else {
+        throw error;
+      }
+    } finally {
+      // Always clear the timer so it cannot fire after processing has ended
+      clearTimeout(timeout);
+    }
+  }
+}
+```
+
+### Q2: How do I retry tool execution?
+
+```typescript
+class RetryHandler {
+  async executeWithRetry(
+    scheduler: IToolScheduler,
+    toolCall: IToolCallRequestInfo,
+    maxRetries: number = 3
+  ) {
+    for (let attempt = 1; attempt <= maxRetries; attempt++) {
+      try {
+        await scheduler.schedule([toolCall], new AbortController().signal);
+        break; // Succeeded, leave the loop
+      } catch (error) {
+        console.log(`Attempt ${attempt}/${maxRetries} failed:`, error);
+
+        if (attempt === maxRetries) {
+          throw error; // The last attempt failed as well
+        }
+
+        // Back off linearly before retrying (1s, 2s, ...)
+        await new Promise(resolve => setTimeout(resolve, 1000 * attempt));
+      }
+    }
+  }
+}
+```
+
+### Q3: How do I customize the tool output format?
+
+```typescript
+class CustomToolResult implements IToolResult {
+  constructor(
+    private data: any,
+    private format: 'json' | 'text' | 'markdown' = 'json'
+  ) {}
+
+  toHistoryStr(): string {
+    switch (this.format) {
+      case 'json':
+        return JSON.stringify(this.data, null, 2);
+      case 'markdown':
+        return this.formatAsMarkdown(this.data);
+      case 'text':
+        return String(this.data);
+      default:
+        return JSON.stringify(this.data);
+    }
+  }
+
+  private formatAsMarkdown(data: any): string {
+    if (typeof data === 'string') return data;
+
+    return `\`\`\`json\n${JSON.stringify(data, null, 2)}\n\`\`\``;
+  }
+}
+
+// Using it inside a tool
+class MyTool implements ITool {
+  async execute(params: any, signal: AbortSignal): Promise<IToolResult> {
+    const result = { success: true, data: params };
+    return new CustomToolResult(result, 'markdown');
+  }
+}
+```
+
+### Q4: How do I persist sessions?
+
+```typescript
+import { promises as fs } from 'fs';
+
+class SessionPersistence {
+  private storage: Map<string, AgentSession> = new Map();
+
+  async saveSession(session: AgentSession): Promise<void> {
+    // Save to the file system
+    const filePath = `./sessions/${session.id}.json`;
+    await fs.writeFile(filePath, JSON.stringify(session, null, 2));
+
+    // Or save to a database
+    // await db.sessions.upsert({ where: { id: session.id }, data: session });
+  }
+
+  async loadSession(sessionId: string): Promise<AgentSession | null> {
+    try {
+      // Or load from a database
+      // return await db.sessions.findUnique({ where: { id: sessionId } });
+
+      // Load from the file system
+      const filePath = `./sessions/${sessionId}.json`;
+      const data = await fs.readFile(filePath, 'utf-8');
+      return JSON.parse(data);
+    } catch (error) {
+      // A missing or unreadable file is treated as "no saved session"
+      return null;
+    }
+  }
+
+  async initializeWithPersistence(agent: StandardAgent) {
+    const sessionManager = agent.getSessionManager();
+
+    // Override the save method
+    const originalSaveSession = sessionManager.saveSession.bind(sessionManager);
+    sessionManager.saveSession = async (sessionId: string) => {
+      const session = sessionManager.getSession(sessionId);
+      if (session) {
+        await this.saveSession(session);
+        return true;
+      }
+      return false;
+    };
+
+    // Override the load method
+    const 
originalLoadSession = sessionManager.loadSession.bind(sessionManager);
+    sessionManager.loadSession = async (sessionId: string) => {
+      return await this.loadSession(sessionId);
+    };
+  }
+}
+```
+
+### Q5: How do I implement advanced error handling?
+
+```typescript
+class AdvancedErrorHandler {
+  async handleWithRecovery(agent: StandardAgent, input: string) {
+    const maxRetries = 3;
+    let attempt = 0;
+
+    while (attempt < maxRetries) {
+      try {
+        await this.processWithErrorHandling(agent, input);
+        break; // Completed successfully
+      } catch (error) {
+        attempt++;
+        console.log(`Attempt ${attempt}/${maxRetries} failed:`, error);
+
+        // Caught values are not guaranteed to be Error instances
+        const err = error instanceof Error ? error : new Error(String(error));
+
+        if (attempt < maxRetries) {
+          // Try to recover
+          await this.attemptRecovery(agent, err);
+        } else {
+          // Final failure handling
+          await this.handleFinalFailure(agent, err);
+          throw err;
+        }
+      }
+    }
+  }
+
+  private async processWithErrorHandling(agent: StandardAgent, input: string) {
+    for await (const event of agent.processWithSession(input)) {
+      if (event.type === 'agent.error') {
+        throw new Error(`Agent error: ${event.data}`);
+      }
+
+      if (event.type === 'response.failed') {
+        throw new Error(`Response failed: ${event.data}`);
+      }
+
+      if (event.type === 'tool.call.execution.done') {
+        const { error } = event.data as any;
+        if (error) {
+          console.warn(`Tool execution warning: ${error}`);
+        }
+      }
+    }
+  }
+
+  private async attemptRecovery(agent: StandardAgent, error: Error) {
+    // Clear state
+    agent.clearHistory();
+
+    // Reset the session
+    const newSessionId = agent.createNewSession('Recovery session');
+    agent.switchToSession(newSessionId);
+
+    // Wait a moment before retrying
+    await new Promise(resolve => setTimeout(resolve, 2000));
+  }
+
+  private async handleFinalFailure(agent: StandardAgent, error: Error) {
+    console.error('Final failure, cleaning up...');
+
+    // Cancel all pending operations
+    const scheduler = agent.getToolScheduler();
+    scheduler.cancelAll('System error');
+
+    // Record the error
+    await this.logError(error);
+  }
+
+  private async logError(error: Error) {
+    const errorLog = {
+      timestamp: new Date().toISOString(),
+      message: error.message,
+      stack: error.stack,
+    };
+
+    console.error('Error log:', errorLog);
+    // Save to a file or send to a monitoring system
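+
+  }
+}
+```
+
+---
+
+## Summary
+
+The MiniAgent framework provides a complete solution for integrating AI agents, with support for:
+
+1. **Flexible tool system** - create and manage custom tools easily
+2. **Robust session management** - concurrent multi-session handling and state management
+3. **Streaming-only interface** - real-time responses and event handling
+4. **Comprehensive event system** - fine-grained execution monitoring and control
+5. **Rich configuration options** - adaptable to a wide range of use cases
+6. **Smart tool execution** - confirmation flows and approval mechanisms
+
+With this guide, you should be able to integrate MiniAgent into your project and customize and extend it to fit your needs.
\ No newline at end of file
diff --git a/package.json b/package.json
index bb6201e..2c4b080 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "@continue-reasoning/mini-agent",
-  "version": "0.1.7",
+  "version": "0.2.0",
   "type": "module",
   "private": false,
   "description": "A platform-agnostic AI agent framework for building autonomous AI agents with tool execution capabilities",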