feat(memory): add MEMORY.md index file and mem_index_refresh tool #1122
shiloong wants to merge 1 commit into
Forrest-ly
left a comment
PR #1122 Review — feat(memory): add MEMORY.md index file and mem_index_refresh tool
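This PR adds a MEMORY.md index file and the mem_index_refresh tool. The index format is - [title](path) — description, capped at ≤200 entries and ≤25KB. It provides a full rebuild (refresh_index), single-entry upsert (update_index_entry), and removal (remove_index_entry), and is compatible with Claude Code's .claude/memory/MEMORY.md format.
320 lines added across 5 files. The code is clearly structured, and the tests cover the basic parse/truncate/extract scenarios.
Findings
- src/agent-memory/src/tools/memory_index.rs:~51 - the docs claim "oldest/least-accessed entries are evicted", but the implementation only truncates in alphabetical path order (CONFIRMED, medium)
Module doc comment: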
▎ Capacity: ≤200 lines, ≤25KB. Oldest/least-accessed entries are evicted when limits are reached.
Actual code:
entries.sort_by(|a, b| a.path.cmp(&b.path));
entries.truncate(MAX_LINES);
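The entries are sorted by path a→z and then truncated directly, so paths that sort late alphabetically (e.g. z-misc/...) are evicted first, regardless of access frequency or creation time. With 201 files, the alphabetically last path is dropped; as the count grows further, paths starting with w-z are the first to go, whether or not they were just used. The implementation should sort by mtime or access_count before truncating, or the docs should be corrected.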
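A minimal sketch of the first option, eviction by modification time. The `Entry` struct and its `mtime_secs` field are hypothetical stand-ins, not the PR's actual types; the sketch assumes a per-file timestamp is available (for example from filesystem metadata). It keeps the newest `max` entries and then restores path order so the written index stays deterministic.

```rust
/// Hypothetical stand-in for the PR's index entry.
/// `mtime_secs` (seconds since the Unix epoch) is an assumed field.
#[derive(Debug, Clone, PartialEq)]
struct Entry {
    path: String,
    mtime_secs: u64,
}

/// Keep the `max` most recently modified entries, then sort the survivors
/// by path so the resulting index order is stable across rebuilds.
fn evict_oldest(mut entries: Vec<Entry>, max: usize) -> Vec<Entry> {
    // Newest first; ties are broken by path so the result is reproducible.
    entries.sort_by(|a, b| {
        b.mtime_secs
            .cmp(&a.mtime_secs)
            .then_with(|| a.path.cmp(&b.path))
    });
    // Drop everything beyond the capacity limit (the oldest entries).
    entries.truncate(max);
    // Restore alphabetical order for the on-disk index.
    entries.sort_by(|a, b| a.path.cmp(&b.path));
    entries
}
```

With this ordering, a recently touched file under z-misc/ survives while an older file under a/ is evicted.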
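- src/agent-memory/src/tools/memory_index.rs:~215 - update_index_entry and remove_index_entry are dead code, not wired into the write/observe paths (CONFIRMED, medium)
The comment claims "Called after memory_observe or mem_write to keep the index current", but the diff contains no callers. Tools such as mem_write and memory_observe were not modified to call these functions. This means MEMORY.md starts going stale immediately after the first mem_index_refresh: no subsequent write, observe, or remove operation updates the index. Users must call mem_index_refresh manually again and again.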
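The wiring this finding asks for can be illustrated with an in-memory model. Everything here is hypothetical: `Index`, `on_write`, and `on_remove` are illustrative names, not the PR's API, and the actual file I/O is left out. The point is only that each mutating operation updates the index as its last step.

```rust
use std::collections::BTreeMap;

/// Hypothetical in-memory model of MEMORY.md: path -> one-line summary.
/// A BTreeMap keeps entries in path order, matching the PR's sort.
#[derive(Default)]
struct Index {
    entries: BTreeMap<String, String>,
}

impl Index {
    /// Insert a new entry or overwrite the existing one for `path`.
    fn upsert(&mut self, path: &str, summary: &str) {
        self.entries.insert(path.to_string(), summary.to_string());
    }

    /// Remove the entry for `path`, if any.
    fn remove(&mut self, path: &str) {
        self.entries.remove(path);
    }
}

/// Hypothetical write hook: after the memory file is written (elided here),
/// the index entry is upserted so the index never lags behind.
fn on_write(index: &mut Index, path: &str, summary: &str) {
    index.upsert(path, summary);
}

/// Hypothetical remove hook: after the memory file is deleted (elided here),
/// its index entry is dropped.
fn on_remove(index: &mut Index, path: &str) {
    index.remove(path);
}
```

A full rebuild then remains useful as a repair step rather than a routine requirement.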
- src/agent-memory/src/tools/memory_index.rs:~77 - to_line() compares byte length against the MAX_ENTRY_CHARS constant (CONFIRMED, low)
const MAX_ENTRY_CHARS: usize = 150;
// ...
if line.len() > MAX_ENTRY_CHARS {
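String::len() returns the number of bytes, not characters. For ASCII content (1 byte per character) this is equivalent to 150 characters, but for CJK content (3 bytes per character) truncation triggers after only about 50 Chinese characters. The "CHARS" in the constant name is misleading. Either use line.chars().count() or rename the constant to MAX_ENTRY_BYTES.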
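A minimal sketch of the character-counting variant. `truncate_chars` is a hypothetical helper, not the PR's `to_line()`; it counts Unicode scalar values, so a 60-character CJK line (180 bytes) passes the 150-character limit untouched.

```rust
const MAX_ENTRY_CHARS: usize = 150;

/// Truncate `line` to at most `max_chars` Unicode scalar values.
/// When the line is cut, the last kept position is replaced with '…'.
fn truncate_chars(line: &str, max_chars: usize) -> String {
    // Count characters, not bytes: `str::len()` would report bytes.
    if line.chars().count() <= max_chars {
        return line.to_string();
    }
    if max_chars == 0 {
        return String::new();
    }
    // Reserve one character for the ellipsis.
    let mut out: String = line.chars().take(max_chars - 1).collect();
    out.push('…');
    out
}
```

Note that this counts scalar values, not grapheme clusters, so combining sequences can still be split; that trade-off is an assumption of the sketch.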
- src/agent-memory/src/tools/memory_index.rs:~281 - body extraction is inconsistent with frontmatter detection; a file with no body gets --- as its description (CONFIRMED, low)
let body = content
.find("\n---\n")
.map(|pos| &content[pos + 5..])
.unwrap_or(content);
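If a file has only frontmatter and no body (e.g. it ends with --- and has no trailing newline), content.find("\n---\n") returns None, the fallback uses the whole file as the body, and first_line picks up the frontmatter's opening --- line. The description becomes "---" instead of empty.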
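One way to handle the no-body case is to scan the frontmatter line by line and accept a closing --- at end of file. `extract_body` and `first_line` below are hypothetical helpers written for illustration; the sketch assumes LF line endings on the opening delimiter and treats unterminated frontmatter as plain content.

```rust
/// Return the body that follows a leading `---` frontmatter block.
/// Files without frontmatter are returned whole; files whose frontmatter
/// has no body yield an empty string.
fn extract_body(content: &str) -> &str {
    // Frontmatter must open on the very first line.
    let rest = match content.strip_prefix("---\n") {
        Some(rest) => rest,
        None => return content,
    };
    // Walk the remaining lines, tracking the byte offset after each one.
    let mut offset = 0;
    for line in rest.split_inclusive('\n') {
        offset += line.len();
        // The closing delimiter may or may not be followed by a newline.
        if line.trim_end_matches(&['\r', '\n'][..]) == "---" {
            return &rest[offset..];
        }
    }
    // Unterminated frontmatter: fall back to treating the file as body.
    content
}

/// First non-empty line of the body, trimmed; empty if there is none.
fn first_line(body: &str) -> &str {
    body.lines()
        .map(str::trim)
        .find(|l| !l.is_empty())
        .unwrap_or("")
}
```

For a file consisting of only a frontmatter block, the description derived this way is empty rather than "---".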
- src/agent-memory/src/tools/memory_index.rs:~99 - parse_index misparses a title containing ]( or a path containing ) (CONFIRMED, low)
if let Some(bracket_end) = rest.find("](") {
// ...
if let Some(paren_end) = after_bracket.find(')') {
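The parser uses a plain find for the first ]( and the first ). If the title contains ]( (e.g. fix](issue) or the path contains ) (e.g. notes/fix(bug).md), parsing yields the wrong title/path. The chance of hitting this in practice is low, but it can be avoided by escaping on write or by using a more robust regex-based parser.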
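A regex is not strictly needed; anchoring on the description separator and searching from the right already covers both examples. The sketch below is an assumption-laden alternative, not the PR's `parse_index`: `IndexEntry` and `parse_line` are hypothetical names, and a path that itself contains `](` would still be misparsed.

```rust
/// Hypothetical parsed form of one index line.
#[derive(Debug, PartialEq)]
struct IndexEntry {
    title: String,
    path: String,
    description: String,
}

/// Parse a line of the form `- [title](path) — description`.
/// The description part is optional.
fn parse_line(line: &str) -> Option<IndexEntry> {
    const SEP: &str = ") — ";
    let rest = line.strip_prefix("- [")?;
    // Split at the first ") — " so that ')' inside the path is tolerated.
    let (link, description) = match rest.find(SEP) {
        Some(pos) => (&rest[..pos], &rest[pos + SEP.len()..]),
        // No description: the link must simply end with ')'.
        None => (rest.strip_suffix(')')?, ""),
    };
    // Use the last "](" so that "](" inside the title is tolerated.
    let split = link.rfind("](")?;
    Some(IndexEntry {
        title: link[..split].to_string(),
        path: link[split + 2..].to_string(),
        description: description.to_string(),
    })
}
```

Escaping on write remains the more thorough fix, since it removes the ambiguity instead of guessing around it.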
MEMORY.md index file:
- Compact table of contents for all memory files (≤200 entries, ≤25KB)
- Format: '- [title](path) — description' (≤150 chars per line)
- Compatible with Claude Code's .claude/memory/MEMORY.md format
- Auto-generated from frontmatter title/hint/category fields

Index management:
- build_index(): scan mount root, extract title+description from each .md
- write_index(): write with dual capacity protection (lines + bytes)
- refresh_index(): full rebuild (MCP tool: mem_index_refresh)
- update_index_entry(): upsert single entry after memory_observe
- remove_index_entry(): remove entry after mem_remove

UTF-8 safe truncation:
- Char-boundary-aware truncation for multi-byte characters
- '…' (3 bytes UTF-8) properly accounted for in byte budget

Tests: 5 new index tests (parse, build, truncate, extract, fallback)
Total: 0 failures across all suites
Tools: 21 total (was 20)

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>
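The char-boundary-aware truncation described in the commit message might look roughly like the following. This is a sketch under assumptions, not the commit's code: `truncate_to_bytes` is a hypothetical name, and returning an empty string when the budget cannot even hold the ellipsis is a choice made here.

```rust
/// Truncate `s` so the result, including a trailing '…', fits in `max_bytes`.
/// The cut always lands on a UTF-8 character boundary.
fn truncate_to_bytes(s: &str, max_bytes: usize) -> String {
    const ELLIPSIS: &str = "…"; // 3 bytes in UTF-8
    if s.len() <= max_bytes {
        return s.to_string();
    }
    // Budget too small to hold even the ellipsis.
    if max_bytes < ELLIPSIS.len() {
        return String::new();
    }
    // Reserve room for the ellipsis, then back up to a char boundary.
    let mut end = max_bytes - ELLIPSIS.len();
    while !s.is_char_boundary(end) {
        end -= 1; // index 0 is always a boundary, so this terminates
    }
    format!("{}{}", &s[..end], ELLIPSIS)
}
```

For example, four 3-byte characters cut to a 10-byte budget leave two characters plus the ellipsis, 9 bytes in total.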
886ff98 to 506bfc5
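Review fix reply
1. Sorting logic inconsistent with the docs: ✅ fixed. The docs originally claimed "Oldest/least-accessed entries are evicted", but truncation actually follows alphabetical path order. The docs have been corrected to: Sorting by mtime/access_count would require reading each file's frontmatter metadata, which conflicts with the design goals of a compact index (fast generation, low I/O). Alphabetical ordering guarantees deterministic and predictable truncation behavior.
2. update_index_entry/remove_index_entry are dead code -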
Description
Add compact MEMORY.md index file and refresh tool:
Supports indexed context retrieval as an alternative to full mtime-based scanning.
Related Issue
no-issue: MEMORY.md index file for compact context injection
Scope
memory (agent-memory)
Checklist
cargo clippy --all-targets -- -D warnings passes
cargo test passes (149 tests)