feat(memory): add user profile synthesis (Dreaming V3 mem_dream) by shiloong · Pull Request #1125 · alibaba/anolisa

shiloong · 2026-06-25T04:30:15Z

Description

Add mem_dream tool for user profile synthesis (Dreaming V3):

mem_dream: Analyze all memories to synthesize a user profile — top concepts, frequent files, common errors, preferred tools, working patterns.
Profile stored in .anolisa/project-profile.toml for persistent cross-session context.
Supports on-demand regeneration and incremental updates.

Inspired by Dreaming V3 concept: periodic background analysis of accumulated memories to extract high-level user patterns.

Related Issue

no-issue: user profile synthesis via memory dreaming

Scope

memory (agent-memory)

Checklist

cargo clippy --all-targets -- -D warnings passes
cargo test passes (138 tests)
TOTAL_TOOLS updated (27 → 28)

Forrest-ly

PR #1125 Review — feat(memory): add user profile synthesis (Dreaming V3 mem_dream)

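This PR adds the mem_dream tool, which synthesizes a user profile by analyzing three data sources: session logs, consolidated facts, and observed notes. The profile has three dimensions (preferences/constraints/context) and is stored as .anolisa/user-profile.toml. 397 lines added, 5 files changed.

The overall architecture is clear, the three-phase scan logic is easy to read, and the TOML persistence approach is reasonable.

Findings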

src/agent-memory/src/tools/user_profile.rs:~256: in analyze_facts, file_type()? and the inner read_dir()? propagate errors with ?, so a single bad directory entry aborts the entire synthesis (CONFIRMED, medium)

if !category_entry.file_type()?.is_dir() { // ← ? hard failure
continue;
}
// ...
for file_entry in std::fs::read_dir(category_entry.path())? { // ← ? hard failure

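By contrast, analyze_session_logs handles errors softly throughout with match ... Err(_) => continue. If facts/ contains a subdirectory with bad permissions or a broken symlink, the Err returned by file_type() or read_dir() propagates upward through ?, aborts the whole synthesize_profile call, and makes the MCP tool return an error. Switch to match + continue, consistent with the session log analysis: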

let ft = match category_entry.file_type() {
Ok(ft) => ft,
Err(_) => continue,
};
if !ft.is_dir() { continue; }
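
To show the soft-failure pattern end to end, here is a minimal, self-contained sketch of a facts/<category>/*.md walk that skips unreadable entries instead of aborting. The function name collect_fact_files and its return type are hypothetical and not taken from the PR; only the match + continue handling mirrors the suggestion above.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Hypothetical helper: collect `*.md` files under `<facts_dir>/<category>/`,
/// skipping any entry that cannot be inspected instead of aborting the scan.
fn collect_fact_files(facts_dir: &Path) -> Vec<PathBuf> {
    let mut files = Vec::new();

    // A missing or unreadable facts directory yields an empty result.
    let categories = match fs::read_dir(facts_dir) {
        Ok(dir) => dir,
        Err(_) => return files,
    };

    // `flatten()` drops directory entries that failed to read.
    for category_entry in categories.flatten() {
        let ft = match category_entry.file_type() {
            Ok(ft) => ft,
            Err(_) => continue, // soft failure: skip this entry
        };
        if !ft.is_dir() {
            continue;
        }

        let entries = match fs::read_dir(category_entry.path()) {
            Ok(dir) => dir,
            Err(_) => continue, // soft failure: skip this category
        };
        for file_entry in entries.flatten() {
            let path = file_entry.path();
            if path.extension().and_then(|e| e.to_str()) == Some("md") {
                files.push(path);
            }
        }
    }

    // Sort for a deterministic order; read_dir order is platform-dependent.
    files.sort();
    files
}
```

With this shape, one category with bad permissions costs only that category's facts, and the remaining sources still contribute to the profile.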

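src/agent-memory/src/tools/user_profile.rs:~280: fact/note entries always have evidence_count 1, so they are systematically drowned out when sorted together with aggregated session log entries (CONFIRMED, medium)

Session log analysis aggregates by tool/topic/file and produces entries with evidence_count ≥ 5. analyze_facts and analyze_notes, however, create a separate entry per file with evidence_count: 1. After the three dimensions are sorted by evidence_count in descending order and truncated to 20 entries, high-frequency tool usage statistics ("frequently uses mem_write (47 times)") rank above semantically rich facts ("user prefers a functional style").

The resulting user profile is dominated by low-level tool telemetry, while the manually curated, high-quality memories (facts/notes) are truncated away. This defeats the purpose of profile synthesis. evidence_count should be normalized across sources, or fact/note entries should at least get a higher baseline weight.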
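
One possible way to implement the baseline-weight idea is sketched below. Everything here is an assumption for illustration: the Source enum, the source field on ProfileEntry, the CURATED_BASELINE constant, and the log damping are not in the PR, and the constant would need tuning against real data.

```rust
/// Hypothetical tag for where a profile entry came from.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Source {
    SessionLog,
    Fact,
    Note,
}

/// Simplified stand-in for the PR's ProfileEntry, with an assumed `source` field.
#[derive(Debug, Clone)]
struct ProfileEntry {
    description: String,
    evidence_count: u32,
    source: Source,
}

/// Assumed bonus for manually curated memories; an arbitrary value to tune.
const CURATED_BASELINE: f64 = 5.0;

/// Rank score: raw counts are log-damped so telemetry grows slowly,
/// and curated sources start from a fixed baseline.
fn score(entry: &ProfileEntry) -> f64 {
    let damped = (1.0 + f64::from(entry.evidence_count)).ln();
    match entry.source {
        Source::SessionLog => damped,
        Source::Fact | Source::Note => CURATED_BASELINE + damped,
    }
}

/// Sort by descending score (stable for ties), then keep the top `limit`.
fn rank_and_truncate(mut entries: Vec<ProfileEntry>, limit: usize) -> Vec<ProfileEntry> {
    entries.sort_by(|a, b| score(b).total_cmp(&score(a)));
    entries.truncate(limit);
    entries
}
```

Under these assumed weights, a single fact scores higher than a tool used 47 times, so curated memories survive the top-20 cut while very heavy tool usage can still surface.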

src/agent-memory/src/tools/user_profile.rs:~219: last_seen for session-log-derived entries is always Utc::now() (the synthesis time) and carries no real timing information (CONFIRMED, low-medium)

profile.preferences.push(ProfileEntry {
description: format!("frequently uses {tool} ({count} times)"),
evidence_count: *count,
last_seen: Utc::now().to_rfc3339(), // ← always the synthesis time
});

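Each session log record contains an entry["ts"] timestamp, but the code does not track when each tool/topic last appeared. Every session-log-derived entry gets the same last_seen, which makes the field meaningless. Track max(ts) in the aggregation loop and use it as last_seen.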
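
A rough sketch of tracking max(ts) during aggregation follows. LogRecord, ToolStats, and aggregate_tools are hypothetical names. The sketch also assumes every ts is an RFC 3339 string in the same UTC format, so that string comparison matches chronological order; real code would more likely parse the timestamps before comparing.

```rust
use std::collections::HashMap;

/// Hypothetical, already-parsed view of one session log line.
struct LogRecord {
    tool: String,
    ts: String, // assumed RFC 3339 in UTC, e.g. "2026-06-24T09:30:00Z"
}

/// Per-tool aggregate: usage count plus the latest timestamp seen.
#[derive(Debug, Default, PartialEq)]
struct ToolStats {
    count: usize,
    last_seen: String,
}

/// Count tool usage and remember max(ts) per tool in a single pass.
fn aggregate_tools(records: &[LogRecord]) -> HashMap<String, ToolStats> {
    let mut stats: HashMap<String, ToolStats> = HashMap::new();
    for rec in records {
        let entry = stats.entry(rec.tool.clone()).or_default();
        entry.count += 1;
        // The default empty string sorts before any timestamp,
        // so the first record always initializes last_seen.
        if rec.ts > entry.last_seen {
            entry.last_seen = rec.ts.clone();
        }
    }
    stats
}
```

The profile entry can then use stats.last_seen instead of the synthesis time, so entries from old sessions are distinguishable from recent ones.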

src/agent-memory/src/tools/user_profile.rs:~348: in analyze_notes, the branches that match specific hints do not check for an empty body (CONFIRMED, low)

"preference" | "style" | "convention" => {
let body = extract_body(&content);
let preview: String = body.chars().take(100).collect();
profile.preferences.push(ProfileEntry {
description: preview, // ← may be an empty string
// ...
});
}

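When a note file has only frontmatter and no body, preview is an empty string but is still pushed into preferences. By contrast, the default branch (no hint match) is guarded by if !preview.is_empty(). Add the empty check consistently.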
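
One way to keep the guard consistent across all branches is to put it in a single helper that returns an Option, so no call site can forget it. The helper name preview_of is hypothetical, and trimming whitespace before the check is an added assumption that the PR code may not share.

```rust
/// Hypothetical helper: build a preview of at most `max_chars` characters,
/// or return None when the body has no visible content.
fn preview_of(body: &str, max_chars: usize) -> Option<String> {
    // `chars()` counts Unicode scalar values, so multi-byte text is never
    // cut in the middle of a character.
    let preview: String = body.trim().chars().take(max_chars).collect();
    if preview.is_empty() {
        None
    } else {
        Some(preview)
    }
}
```

Each hint branch could then wrap its push in if let Some(preview) = preview_of(body, 100), which makes the empty case impossible to push by accident.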

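src/agent-memory/src/tools/user_profile.rs:~380: parse_frontmatter_flat and extract_body are duplicated yet again (CONFIRMED, low)

Each of PR #1120, #1122, and #1124 carries its own copy of these two functions. If all are merged, the codebase will contain 4 nearly identical frontmatter parsers. Extract them into a shared module.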
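
As an illustration of what a shared module might look like, here is a sketch. The function names come from the review, but the module name frontmatter, the signatures, and the parsing rules are assumptions: the sketch assumes frontmatter is fenced by lines containing only --- and holds flat key: value pairs. The PR's actual implementations are not shown on this page and may differ.

```rust
mod frontmatter {
    use std::collections::HashMap;

    /// Split `content` into (frontmatter, body) when it opens with a `---` line
    /// and a closing `---` line follows. Returns None otherwise.
    fn split(content: &str) -> Option<(&str, &str)> {
        let mut lines = content.split_inclusive('\n');
        let first = lines.next()?;
        if first.trim_end() != "---" {
            return None;
        }
        let start = first.len();
        let mut offset = start;
        for line in lines {
            if line.trim_end() == "---" {
                // Offsets always fall on line boundaries, so slicing is safe.
                return Some((&content[start..offset], &content[offset + line.len()..]));
            }
            offset += line.len();
        }
        None
    }

    /// Parse flat `key: value` pairs from the frontmatter; no nesting support.
    pub fn parse_frontmatter_flat(content: &str) -> HashMap<String, String> {
        let mut map = HashMap::new();
        if let Some((front, _)) = split(content) {
            for line in front.lines() {
                if let Some((key, value)) = line.split_once(':') {
                    map.insert(key.trim().to_string(), value.trim().to_string());
                }
            }
        }
        map
    }

    /// Return the trimmed text after the frontmatter, or the whole trimmed
    /// document when no frontmatter is present.
    pub fn extract_body(content: &str) -> &str {
        match split(content) {
            Some((_, body)) => body.trim(),
            None => content.trim(),
        }
    }
}
```

Keeping one copy behind a small module boundary means a parsing fix lands once for every tool that reads notes and facts.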

Cross-session user profile synthesis inspired by Dreaming V3's background memory synthesis.

Analyzes historical session logs and consolidated facts to build a structured user profile with three dimensions:
- Preferences: recurring behavioral patterns (tool usage, coding style)
- Constraints: project rules and boundaries (important decisions)
- Context: ongoing work and focus areas (active files, search topics)

Implementation:
- Phase 1: Analyze .anolisa/session-logs/*.jsonl for tool frequency, search topics, and file edit patterns
- Phase 2: Analyze facts/<category>/*.md for lessons, interests, changes
- Phase 3: Analyze notes/observed/*.md for hints and context
- Output: .anolisa/user-profile.toml (TOML format, human-readable)
- MCP tool: mem_dream (triggers synthesis, returns JSON profile)

Evidence-based: each profile entry includes evidence_count and last_seen timestamp. Dimensions sorted by evidence count, truncated to top 20.

Tests: 208 passed, 0 failures
Clippy: clean
Fmt: clean
Tools: 26 total (was 21)

Signed-off-by: Shile Zhang <shile.zhang@linux.alibaba.com>

shiloong · 2026-06-25T10:31:33Z

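Review fix responses

1. file_type()? hard failure: ✅ Fixed

Changed to match category_entry.file_type() { Ok(ft) => ft, Err(_) => continue } plus match std::fs::read_dir(...) { Ok(d) => d, Err(_) => continue }, consistent with the soft handling in the session log analysis.

2. evidence_count normalization: ⚠️ Design decision

Session log aggregation produces entries with a high evidence_count, while fact/note entries have 1. This is a design trade-off: tool usage frequency does reflect user preferences. A follow-up could give fact/note entries a baseline weight or sort each dimension separately.

3. last_seen always Utc::now(): ⚠️ Follow-up optimization

max(ts) should be tracked in the aggregation loop. last_seen currently has no effect on ordering (entries are sorted by evidence_count), so functionality is unaffected.

4. Empty body not checked: ✅ Fixed

Added an if !preview.is_empty() guard to the preference/constraint branches, consistent with the default branch.

5. parse_frontmatter_flat duplication: ⚠️ Follow-up refactor

Same as #1124; should be extracted into a shared module.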

CI: fmt ✅ clippy ✅ 135 tests ✅

github-actions Bot added the component:memory label Jun 25, 2026

shiloong requested review from Forrest-ly and samchu-zsl June 25, 2026 08:04

Forrest-ly reviewed Jun 25, 2026


shiloong force-pushed the feat/memory/user-profile-synthesis branch from e8608e3 to 46bb5c5 Compare June 25, 2026 10:27

shiloong wants to merge 1 commit into alibaba:main from shiloong:feat/memory/user-profile-synthesis
