
feat(cli-mode): CLI mode replaces the Agent SDK and uses the interactive quota #2470

Open
tier2tech-tian wants to merge 129 commits into nanocoai:main from tier2tech-tian:feat/cli-mode-impl


@tier2tech-tian

Summary

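  • agent-runner gains a runCliQuery() path: for groups configured with useCliMode: true, each turn spawns claude --print --resume instead of calling the Agent SDK's query()
  • Core idea: clear the CLAUDE_AGENT_SDK_CLIENT_APP environment variable so that no x-client-app header is sent and requests count against the interactive quota
  • Pure functions, with all 40 unit tests passing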
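
The environment handling described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual buildCliEnv: the name buildCliEnvSketch, the EnvLike type, and the merge order are assumptions; only the variable name CLAUDE_AGENT_SDK_CLIENT_APP comes from the description above.

```typescript
// Hypothetical sketch of an env builder for the spawned CLI process.
// The real buildCliEnv in cli-runner.ts may differ in shape and behavior.
type EnvLike = Record<string, string | undefined>;

// Variable named in the PR description; dropping it is the core of CLI mode.
const SDK_CLIENT_APP_VAR = "CLAUDE_AGENT_SDK_CLIENT_APP";

function buildCliEnvSketch(parent: EnvLike, extra: EnvLike = {}): Record<string, string> {
  // Later keys win, so per-group overrides in `extra` replace inherited values.
  const merged: EnvLike = { ...parent, ...extra };
  const env: Record<string, string> = {};
  for (const key of Object.keys(merged)) {
    const value = merged[key];
    // Skip the SDK marker and any unset entries.
    if (key === SDK_CLIENT_APP_VAR || value === undefined) continue;
    env[key] = value;
  }
  return env;
}
```

The result is a plain object, so it could be handed to the env option of Node's child_process.spawn when launching the CLI, assuming the runner spawns the process that way.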

Changed files

  • src/types.ts: adds the useCliMode field to ContainerConfig
  • container/agent-runner/src/cli-runner.ts: new file, core logic
  • container/agent-runner/src/index.ts: useCliMode branch in main()
  • src/container-runner.ts / src/index.ts / src/task-scheduler.ts: pass useCliMode through
  • src/cli-runner.test.ts: 40 unit tests

Test plan

  • 40 pure-function unit tests pass (parseStreamJsonLine/buildCliArgs/buildMcpConfig/mapToContainerOutput/buildCliEnv)
  • Host-side tsc reports zero new errors
  • Codex Review passed (4 Important issues fixed)
  • E2E: verify after setting useCliMode on oc_df0d2dcb8747d8bcc2047c60ddcc7120

🤖 Generated with Claude Code

tianjunjie and others added 30 commits April 4, 2026 22:19
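- saveProfile/loadProfile/loadFacts/storeFacts/enforceMaxFacts take a userId parameter
- INSERT/SELECT SQL includes the user_id column
- ON CONFLICT matches the composite primary key (group_folder, user_id)
- CREATE INDEX moved after migrateAddUserId to avoid migration errors
- E2E verified: 5 facts + profile + CLAUDE.md injection
Queries now filter by user_id only (no longer by group_folder), so a user's memories are shared across groups.
Writes still record group_folder to track the source.
- /account lists all available secrets and the current binding
- /account <name> switches to the named account
- Switching clears the session automatically, so the next conversation uses the new key
- Implemented via OneCLI agents set-secrets
- Detect 429/rate_limit/overloaded → automatically switch to the next OneCLI secret
- 60s debounce + 10min cooldown once all secrets are exhausted
- Rotation state persisted in SQLite and preserved across restarts
- runAgent integrates retry (retries once automatically after rotation)
- 21 new tests cover all of the logic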
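
One possible reading of the rotation rules above, as a pure function with the clock passed in. This is a sketch under stated assumptions: looksRateLimited, RotationState, and nextRotation are hypothetical names, and the rule that "exhausted" means every secret has been tried once since the last cooldown is a guess at the intended behavior. Only the three detection signals, the 60s debounce, and the 10min cooldown come from the commit message.

```typescript
// Matches the three signals named in the commit message (assumed to appear in agent output).
const RATE_LIMIT_PATTERN = /\b429\b|rate_limit|overloaded/i;

function looksRateLimited(output: string): boolean {
  return RATE_LIMIT_PATTERN.test(output);
}

// Hypothetical state shape; the real one is persisted in SQLite.
interface RotationState {
  index: number;            // secret currently in use
  lastRotatedAt: number;    // ms timestamp of the last rotation, 0 if never
  rotationsInCycle: number; // rotations since the cycle started
  cooldownUntil: number;    // ms timestamp until which rotation is paused
}

type RotationResult =
  | { action: "rotate"; state: RotationState }
  | { action: "skip"; reason: "cooldown" | "debounce" | "exhausted"; state: RotationState };

const DEBOUNCE_MS = 60 * 1000;
const COOLDOWN_MS = 10 * 60 * 1000;

function nextRotation(state: RotationState, secretCount: number, now: number): RotationResult {
  if (now < state.cooldownUntil) {
    return { action: "skip", reason: "cooldown", state };
  }
  if (state.lastRotatedAt > 0 && now - state.lastRotatedAt < DEBOUNCE_MS) {
    return { action: "skip", reason: "debounce", state };
  }
  // After secretCount - 1 rotations every secret has been tried once.
  if (state.rotationsInCycle >= secretCount - 1) {
    const cooled = { ...state, rotationsInCycle: 0, cooldownUntil: now + COOLDOWN_MS };
    return { action: "skip", reason: "exhausted", state: cooled };
  }
  return {
    action: "rotate",
    state: {
      index: (state.index + 1) % secretCount,
      lastRotatedAt: now,
      rotationsInCycle: state.rotationsInCycle + 1,
      cooldownUntil: 0,
    },
  };
}
```

Keeping the decision free of I/O makes the debounce and cooldown paths easy to unit test with fixed timestamps.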
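validateAdditionalMounts automatically prepends the /workspace/extra/ prefix,
so the config only needs relative paths. Absolute paths are rejected by the security check.
Previously only the reference was removed from the in-memory Map, so the yellow card in Feishu stayed at 「⏳ 处理中...」 ("Processing...") forever.
Now im.message.delete is called to delete the progress card when the final reply arrives.
- agent-runner: tool_use messages are emitted via writeOutput(status='progress')
- container-runner: ContainerOutput.status supports 'progress'
- index.ts: progress messages are forwarded to channel.sendMessage
- feishu channel: progress messages are aggregated into a yellow card, which is deleted once the final reply arrives
- Emoji icons for tool types such as Bash/Read/Write/Edit/WebSearch
Final replies (including code blocks, tables, etc.) use an untitled card for a cleaner look.
Progress cards keep the yellow 「⏳ 处理中...」 title to set them apart.
+ agent-runner: add debug logging to inspect the assistant message structure
+ Read tool_use blocks from message.message.content (the Claude Code SDK structure)
- index.ts: extract memorySenderId from lastUserMsg.sender
- queue.ts: add() takes userId; QueueEntry includes userId
- updater.ts: updateMemory/applyUpdates pass userId
- inject.ts: injectMemory passes userId to loadProfile/loadFacts/MemoryStore
- Full call chain: processGroupMessages → queue.add → updater → storage passes userId end to end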
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
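- Receive image messages: downloaded to group/images/, readable by the Agent
- Receive merged forwards: recursively parse child messages (text/image/post/merge_forward)
- Extract images from Post rich text
- Send images: detect image paths in replies, upload to Feishu, and send
- Safety limits: MAX_MERGE_TEXT_LEN=8000, MAX_MERGE_IMAGES=5, MAX_MERGE_DEPTH=1
- 5 new tests cover the parsing logic

Translated to TypeScript with Nine adapter.py as the reference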
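- sendPlainOrCard accepts a usage parameter and appends a gray footnote
- Plain-text replies are upgraded to a card when usage is present
- The completion card also keeps usage (shown in both places)
- buildProgressCard gains a startTime parameter; the header shows elapsed time (Xs/XmYs)
- setTyping(true) starts a setInterval that patches the card every 3s to refresh the spinner
- Triple cleanup against leaks: setTyping(false) + card completion + a 10-minute hard cap
- progressCards entries gain a startTime field recording creation time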

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
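
The Xs/XmYs elapsed-time header mentioned in the commit above could be produced by a helper along these lines. formatElapsed is a hypothetical name, and clamping negative durations to zero is an assumption; the commit only states the two output shapes.

```typescript
// Hypothetical helper: formats the time since startTime as "Xs" or "XmYs".
// Both arguments are millisecond timestamps, e.g. from Date.now().
function formatElapsed(startTime: number, now: number): string {
  // Clamp so clock skew never yields a negative duration.
  const totalSeconds = Math.max(0, Math.floor((now - startTime) / 1000));
  if (totalSeconds < 60) return `${totalSeconds}s`;
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}m${seconds}s`;
}
```

Taking now as a parameter rather than reading the clock inside keeps the function pure, which suits a card that is re-rendered on a 3s interval.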
Spec for replacing Docker container agent runtime with Node.js
child process spawn. Includes proposal, requirements (R1-R8),
design, tasks, and test plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace all hardcoded /workspace/* paths with configurable PATHS
object from ContainerInput.workspacePaths. Add NANOCLAW_IPC_DIR
to mcpServers.nanoclaw.env for MCP server path injection.

Changes:
- index.ts: 5 independent + 1 derived path references replaced
- ipc-mcp-stdio.ts: IPC_DIR reads from NANOCLAW_IPC_DIR env
- Remove Docker-specific /tmp/input.json cleanup
- Handle undefined global/extra paths gracefully

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrite container-runner.ts to spawn agent-runner as a local
Node.js child process instead of Docker container.

New functions:
- resolveWorkspacePaths: compute host paths per group
- prepareGroupSession: per-group .claude dir + skills sync
- buildLocalEnv: filtered env for child process
- getCredentials: OneCLI CLI → process.env → .env fallback
- getFeishuToken: User Token → Tenant Token fallback
- parseEnvOutput: parse KEY=VALUE format
- checkAgentRunnerDist: verify dist/index.js exists
- killProcessTree: SIGTERM → 5s → SIGKILL process group

Removed:
- Docker spawn, buildContainerArgs, buildVolumeMounts
- @onecli-sh/sdk import (replaced with CLI mode)
- container-runtime.js import
- mount-security.js import
- ensureContainerSystemRunning from index.ts

Preserved: detectRateLimit, rotateAccount, writeTasksSnapshot,
writeGroupsSnapshot

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
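
Of the helpers listed in this commit, parseEnvOutput is described only as "parse KEY=VALUE format". A minimal sketch of such a parser is shown below; the name parseEnvOutputSketch is hypothetical, and the handling of comments, blank lines, and surrounding quotes is an assumption rather than something the commit message states.

```typescript
// Hypothetical KEY=VALUE parser; the PR's parseEnvOutput may behave differently.
function parseEnvOutputSketch(output: string): Record<string, string> {
  const result: Record<string, string> = {};
  for (const rawLine of output.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Assumption: blank lines and "#" comments are ignored.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    // Lines without a key before "=" are skipped.
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Assumption: one pair of matching surrounding quotes is stripped.
    const first = value.charAt(0);
    if (value.length >= 2 && (first === '"' || first === "'") && value.charAt(value.length - 1) === first) {
      value = value.slice(1, -1);
    }
    result[key] = value;
  }
  return result;
}
```

Splitting on the first "=" only means values that themselves contain "=" (common in tokens) survive intact.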
- Delete container-runtime.ts, mount-security.ts
- Delete container/Dockerfile, container/build.sh
- Remove CONTAINER_IMAGE, CONTAINER_TIMEOUT, CONTAINER_MAX_OUTPUT_SIZE,
  MOUNT_ALLOWLIST_PATH from config.ts
- Rename MAX_CONCURRENT_CONTAINERS → MAX_CONCURRENT_AGENTS
- Add "build:agent" npm script for agent-runner compilation
- Update comments to remove Docker references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update Quick Context: containers → local child processes
- Update Key Files: container-runner description
- Update Credentials: OneCLI CLI mode
- Update Development: add build:agent, security note, limitations
- Replace Container Build Cache with Docker Rollback section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rewrite container-runner.test.ts: remove Docker mocks, verify
  node spawn with detached flag, workspacePaths in stdin, cwd,
  parseEnvOutput, checkAgentRunnerDist, resolveWorkspacePaths,
  prepareGroupSession, timeout behavior
- Fix account-rotate.test.ts: remove container-runtime and
  mount-security mocks, add env and group-folder mocks

All 322 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Imported Mem0 data has user_id=''; the previous WHERE user_id=? only matched the current user,
so 1904 global memories could not be found. Changed to WHERE user_id=? OR user_id=''.
tianjunjie and others added 26 commits April 18, 2026 09:16
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
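The old logic advanced the cursor before the agent started; when the process was killed, the rollback
had no chance to run, so after a restart the messages were treated as processed and lost.

Changed to:
- Success path: advance the cursor after the bot reply is stored
- error + reply already sent: advance the cursor (prevents duplicate replies)
- error + no reply sent: do not advance (allows a retry after restart)
- Remove the previousCursor rollback mechanism, which is no longer needed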

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
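
The three cursor cases in the commit above reduce to a small decision function. The sketch below uses hypothetical names (RunStatus, shouldAdvanceCursor) and assumes the caller invokes it only after any bot reply has been stored.

```typescript
// Hypothetical outcome type for one agent run.
type RunStatus = "success" | "error";

// Returns true when the message cursor may move past the processed messages.
function shouldAdvanceCursor(status: RunStatus, replySent: boolean): boolean {
  // Success: the reply is stored, so the messages count as handled.
  if (status === "success") return true;
  // Error after a reply went out: advance anyway to avoid a duplicate reply.
  // Error with no reply: hold the cursor so a restart can retry.
  return replySent;
}
```

Because the cursor only ever moves after the fact, a process kill before this point leaves it untouched, which is why the rollback path is no longer needed.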
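- send_message MCP tool gains a target_chat_jid parameter, available to the main group only
- Automatically call syncGroups at startup to sync Feishu group names to the DB
- Add the patrol.sh inspection script so scheduled tasks can read each group's status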

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Review adds a testability dimension; a P0/P1/P2 tiered test plan is required before reporting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
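The old logic counted user messages after the agent's last reply, which was inaccurate (messages the user had read were never marked as such).
The new logic checks whether the last message came from the user or the agent, plus the idle duration.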

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
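Previously, messages the agent sent via the send_message MCP tool went to Feishu only and were not stored in the DB,
so the patrol script could not see bot replies and misjudged "who is waiting for whom".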

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Progress events from agent tool calls (e.g. "🔧 Bash: ls -la") are stored in the DB
for patrol and search, making it visible what the agent is doing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- patrol.sh: turn-detection SQL changed from MAX(timestamp) to MAX(rowid) to avoid matching multiple rows with the same timestamp
- ipc.ts: send_message storage uses ASSISTANT_NAME instead of the hardcoded '二狗'

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
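- IpcDeps gains enqueueMessageCheck; after a cross-group message is sent, the target group is enqueued
- Cross-group messages are stored in the DB as non-bot messages (is_from_me=false), so the target agent treats them as user instructions
- Same-group message behavior is unchanged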

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
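Root cause: the IPC poll interval is 1 second; when the sendMessage Feishu API call takes longer than 1 second,
the next cycle reads the same file and processes it again.
Fix: call unlinkSync immediately after readFileSync, then run the async work.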

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Root cause: after launchctl restart, Claude SDK session resume (resumeAt=latest)
re-executes tool calls interrupted by the kill, so send_message is called more than once.
Fix: the host-side IPC watcher deduplicates on a chatJid+content hash within a 30-second window.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
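
The 30-second deduplication window described in the commit above might look roughly like this. It is a sketch, not the watcher's actual code: createSendDeduper and fnv1a are hypothetical names, and a small non-cryptographic FNV-1a hash stands in for whatever hash the PR uses so the block needs no imports. A production version would more likely use a cryptographic hash to keep collisions negligible.

```typescript
// 32-bit FNV-1a over UTF-16 code units; a stand-in for the PR's unspecified hash.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Returns a checker that reports true when the same (chatJid, content) pair
// was already seen within windowMs. `now` is a millisecond timestamp.
function createSendDeduper(windowMs: number = 30 * 1000) {
  const seen = new Map<string, number>();
  return function isDuplicate(chatJid: string, content: string, now: number): boolean {
    // Drop entries whose window has elapsed so the map stays small.
    seen.forEach((firstSeenAt, key) => {
      if (now - firstSeenAt >= windowMs) seen.delete(key);
    });
    // The NUL separator keeps ("ab", "c") distinct from ("a", "bc").
    const key = fnv1a(chatJid + "\u0000" + content);
    if (seen.has(key)) return true;
    seen.set(key, now);
    return false;
  };
}
```

A duplicate does not refresh the stored timestamp, so the window is measured from the first send rather than the most recent attempt.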
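- is_from_me is now an OR condition that bypasses the entire trigger check (previously it sat inside the AND and only bypassed sender validation)
- enqueueMessageCheck calls closeStdin on idle containers to wake them for new messages
- Trigger pattern drops the ^ prefix anchor, so @bot in the middle of a message is supported
- IPC send_message 30s-window deduplication to prevent resends on session resume
- IPC error handler adds an existsSync guard to prevent rename errors on already-deleted files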

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
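The old logic looked up the sender's name in the mentions array, but mentions only contains the people who were @-mentioned,
so it fell back to open_id and the agent could not identify the user.
Changed to call the Feishu contact.user.get API with an in-memory cache.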

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
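Even with the contact:user.base:readonly permission, the contact API does not return the name field
to a tenant_access_token. Switched to the im/v1/chats/{chat_id}/members endpoint, which fetches all
group members in one call and caches them, so later messages in the same group hit the cache directly.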

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
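Alongside the Feishu reply, the content is stripped of media markers, summarized by an LLM, and pushed to Pushover;
the iOS "Announce Notifications" feature then reads it aloud automatically. Fire-and-forget, so the main Feishu flow is not blocked.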

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
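- src/container-runner.ts: default model in settings.json for newly created groups
- src/index.ts: Opus routing triggered by the + prefix (including the piped path)
- container/agent-runner/src/index.ts: runner fallback defaultModel

Sonnet routing is unchanged. The settings.json of 10 existing sessions has been refreshed as well.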

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When the agent passes an oc_xxx-format JID, it is automatically normalized to fs:oc_xxx,
preventing findChannel match failures that silently drop messages.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
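
The normalization in the commit above amounts to a prefix check. A minimal sketch, where normalizeChatJid is a hypothetical name and leaving every other input untouched is an assumption:

```typescript
// Hypothetical helper: adds the "fs:" channel prefix to bare Feishu chat IDs.
function normalizeChatJid(jid: string): string {
  // Bare IDs look like "oc_xxx"; already-prefixed or other JIDs pass through unchanged.
  return jid.startsWith("oc_") ? `fs:${jid}` : jid;
}
```

Because already-prefixed values pass through unchanged, the function is idempotent and safe to apply on both the agent-runner side and the host side.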
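Three-phase overhaul:
1. Timestamps include the date (pretty: MM-DD HH:MM:SS.mmm, json: ISO 8601) + automatic JSON/Pretty switching
2. AsyncLocalStorage automatically injects traceId/chatJid (zero changes across 376 call sites)
3. RotatingFileStream rotates by size (default 10MB × 7 archives)

New files: log-context.ts, logger.test.ts (15 tests)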

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The avatar generation logic is no longer needed; simplified to only call im.chat.update to update the group name.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
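The agent-runner side already has the fix (7ec2fa9), but the watcher in host-side ipc.ts
also needs to add the fs: prefix to oc_xxx-format JIDs; otherwise
findChannel cannot match the target channel and messages are silently dropped.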

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
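sendPlainOrCard/sendMessage return the Feishu API's message_id, and when a bot reply is stored
the Feishu message_id is preferred as the DB primary key. fetchReplyContext queries the DB first, then falls back
to the Feishu API, fixing the issue where quoting a card message showed only the [互动卡片] ("interactive card") placeholder.

Changes:
- sendPlainOrCard: void → string | undefined (returns the Feishu message_id)
- sendMessage/extractAndSendMedia: pass message_id through
- Channel interface + task-scheduler/ipc types kept in sync
- fetchReplyContext: DB-first lookup + static import of getMessageById
- 16 new unit tests (DB hit/miss/exception/truncation/card/token failure)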

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: Pushover voice notifications + IPC cross-group messages + logging system upgrade
agent-runner gains a runCliQuery() path: for groups configured with useCliMode: true,
each turn spawns claude --print --resume instead of calling the Agent SDK's query().
Core idea: clear the CLAUDE_AGENT_SDK_CLIENT_APP environment variable so no x-client-app header is sent.

Changes:
- types.ts: ContainerConfig gains the useCliMode field
- cli-runner.ts: pure functions (parseStreamJsonLine/buildCliArgs/buildMcpConfig/
  mapToContainerOutput/buildCliEnv) + the runCliQuery main function
- index.ts (agent-runner): useCliMode branch in main()
- container-runner.ts/index.ts/task-scheduler.ts: pass useCliMode through
- cli-runner.test.ts: all 38 unit tests pass

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
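Review found 5 Important issues; 4 are fixed:
- #1: ContainerOutput gains the missing modelContextWindows field
- #3: typeof guard added for the system error message
- #4: mapToContainerOutput returns an array so intermediate output is not lost
- #5: remove the duplicate success output in the CLI loop

2 new test cases (40 total):
- Fallback when the system error message is not a string
- Mixed assistant text+tool_use is returned in full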

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
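
To illustrate the kind of pure function this PR tests, here is a hedged sketch of a line parser with the typeof guard from fix #3. The names parseStreamJsonLineSketch, StreamEvent, and systemErrorText are hypothetical, and the assumptions that each output line is one JSON object with a string type field and that system errors carry a message field are guesses; the PR's real parseStreamJsonLine may differ.

```typescript
// Hypothetical event shape: one JSON object per line with a string "type".
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

// Returns the parsed event, or null for blank, malformed, or untyped lines.
function parseStreamJsonLineSketch(line: string): StreamEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch (err) {
    // Non-JSON noise on stdout is ignored rather than treated as fatal.
    return null;
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return null;
  const candidate = parsed as Record<string, unknown>;
  if (typeof candidate.type !== "string") return null;
  return candidate as StreamEvent;
}

// typeof guard in the spirit of fix #3: never assume message is a string.
function systemErrorText(event: StreamEvent): string {
  return typeof event.message === "string" ? event.message : "Unknown CLI error";
}
```

Returning null instead of throwing lets the caller skip bad lines and keep reading the stream, which matters when the CLI interleaves other output.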
