feat: Add memory diagnostics tool #1665
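Walkthrough
This PR adds a periodic memory diagnostics subsystem: configuration and dependencies, the MemoryDiagnosticsTask implementation, multi-dimensional collectors, binary-size estimation and sampling, JSONL snapshot persistence and rotation, startup integration, unit tests, and an operations guide.
Changes
Complete implementation of the memory diagnostics service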
Sequence Diagram(s)
sequenceDiagram
participant Scheduler as AsyncTaskManager
participant Task as MemoryDiagnosticsTask
participant Collector as _collect_snapshot
participant Proc as ProcessMetrics
participant HF as HeartflowCollector
participant Trace as TracemallocCollector
participant Writer as _write_snapshot
Scheduler->>Task: Schedule periodic run
Task->>Collector: Aggregate metrics from each subsystem
Collector->>Proc: Collect process/child-process metrics (psutil)
Collector->>HF: Collect session/message binary-size estimates
Collector->>Trace: Optional tracemalloc diff
Collector-->>Task: Return snapshot
Task->>Writer: Persist as JSONL and trigger rotation/cleanup
Writer-->>Task: Finish and log summary/alerts
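The collect-then-persist flow in the diagram can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the PR's actual code: `collect_snapshot` and `write_snapshot` are hypothetical stand-ins for `_collect_snapshot` and `_write_snapshot`, the snapshot keys are assumptions, and the psutil and Heartflow collectors are left out.

```python
import json
import time
import tracemalloc
from pathlib import Path


def collect_snapshot(previous=None):
    """Build one snapshot dict (hypothetical layout).

    `previous` is an earlier tracemalloc snapshot, or None on the first run.
    Returns (snapshot_dict, current_tracemalloc_snapshot_or_None).
    """
    current = tracemalloc.take_snapshot() if tracemalloc.is_tracing() else None
    snapshot = {"timestamp": time.time(), "tracemalloc_top": []}
    if current is not None and previous is not None:
        # Optional tracemalloc diff: keep the three largest changes by line.
        for stat in current.compare_to(previous, "lineno")[:3]:
            snapshot["tracemalloc_top"].append(
                {"location": str(stat.traceback), "size_diff": stat.size_diff}
            )
    return snapshot, current


def write_snapshot(path, snapshot):
    """Append one snapshot as a single JSON line, creating the directory if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(snapshot, ensure_ascii=False) + "\n")
```

A scheduler would call `collect_snapshot` on each tick, pass the returned tracemalloc snapshot into the next call, and hand the dict to `write_snapshot`.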
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/memory_diagnostics_guide.md`:
- Around line 103-123: The documentation shows an inconsistent default path: the
canonical file is "logs/memory_diagnostics/memory_diagnostics.jsonl" but the
PowerShell examples and rotation example omit the "memory_diagnostics" folder
and base filename; update all occurrences (including the PowerShell examples at
the two places called out and the rotation example) to use the full path and
filename—e.g., change "Get-Content logs\memory_diagnostics.jsonl -Tail 20" to
"Get-Content logs\memory_diagnostics\memory_diagnostics.jsonl -Tail 20" and make
the rotated example include the directory and base file name like
"logs/memory_diagnostics/memory_diagnostics.20260509-153000.jsonl"; scan the doc
for other instances in the 423-439 range and make them consistent as well.
In `@src/services/memory_diagnostics_service.py`:
- Around line 1002-1012: The current code appends full child process cmdlines
into child_items using _safe_process_cmdline, which may leak private paths or
secrets; modify _safe_process_cmdline (or wrap its use where child_items is
built) to return a sanitized value containing only the executable basename and a
short hashed/length-limited summary (or a flag like "<redacted>") instead of the
raw cmdline, and ensure the JSONL output uses that sanitized string for the
"cmdline" field so sensitive arguments are never written out.
- Around line 595-603: The config memory_diagnostics_jsonl_max_total_size_mb is
treated as a per-file threshold in _rotate_snapshot_file_if_needed (rotating
when the active file exceeds the value) but the code still keeps
DEFAULT_JSONL_ROTATED_FILE_KEEP rotated files, so directory total can reach
~(keep+1)*threshold; fix by enforcing a true "max total size" during cleanup: in
_cleanup_rotated_snapshot_files (called from _rotate_snapshot_file_if_needed)
compute total bytes across the active file, rotated files returned by
_build_rotated_snapshot_path pattern, and then delete the oldest rotated files
until total_bytes <= max_total_size_bytes; alternatively if you intend a
single-file limit, rename the config to indicate "per-file" limit—apply the
former change to implement the documented "max total size" semantics.
- Line 21: Replace the direct object import "global_config" in this module with
a module-level import and/or a live accessor so reads always reflect
hot-reloads; specifically stop using "from src.config.config import
global_config" and instead import the config module (e.g., "from src.config
import config") and change all uses of the symbol "global_config" in
memory_diagnostics_service to access the live object via the module
(config.global_config...) or implement a small helper function
get_global_config() that returns src.config.config.global_config; update every
place that reads configuration (sampling interval, output path, thresholds,
tracemalloc switch, etc.) and ensure ConfigManager.reload_config() will affect
those reads at runtime.
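The "max total size" cleanup suggested for lines 595-603 above could look roughly like this. It is a sketch under stated assumptions rather than the service's real `_cleanup_rotated_snapshot_files`: the function name, its parameters, and the choice to order rotated files by modification time are all hypothetical.

```python
from pathlib import Path


def cleanup_rotated_files(active_file, rotated_files, max_total_bytes):
    """Delete the oldest rotated files until active + rotated bytes fit the budget.

    Returns the list of deleted paths, oldest first.
    """
    active = Path(active_file)
    total = active.stat().st_size if active.exists() else 0

    # Assumption: the oldest rotated file is the one with the smallest mtime.
    candidates = sorted((Path(p) for p in rotated_files), key=lambda p: p.stat().st_mtime)
    sizes = {p: p.stat().st_size for p in candidates}
    total += sum(sizes.values())

    deleted = []
    for path in candidates:
        if total <= max_total_bytes:
            break
        path.unlink()
        total -= sizes[path]
        deleted.append(path)
    return deleted
```

The active file is counted but never deleted, so the directory can still exceed the budget if the active file alone is larger than it. Rotation has to keep the per-file threshold below the total budget for the limit to hold.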
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c324c86f-5e03-4511-8e0b-ce005246c210
⛔ Files ignored due to path filters (1)
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
- docs/memory_diagnostics_guide.md
- pyproject.toml
- requirements.txt
- src/config/config.py
- src/config/official_configs.py
- src/main.py
- src/services/memory_diagnostics_service.py
- tests/test_memory_diagnostics_service.py
c2cc76e to 621ab6f
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/memory_diagnostics_guide.md`:
- Around line 31-47: Add a short dependency note at the top of the memory
diagnostics config section in docs/memory_diagnostics_guide.md stating that the
memory diagnostics feature requires psutil>=6.0.0 (as declared in pyproject.toml
and requirements.txt), and include a one-line install instruction (e.g. pip
install "psutil>=6.0.0"); keep the existing configuration example (keys like
enable_memory_diagnostics, memory_diagnostics_interval_seconds,
memory_diagnostics_top_sessions, etc.) unchanged and clarify that psutil is
required for runtime memory metrics used by the diagnostics feature.
- Around line 125-335: The docs omit the memory_automation module exposed by
_collect_memory_automation_metrics(); add a new diagnostic step (e.g., "第十步:看
memory_automation 队列与工作器", i.e. "Step 10: check the memory_automation queues
and workers") that lists the fields memory_automation.started,
memory_automation.fact_writeback_queue,
memory_automation.fact_writeback_worker_active,
memory_automation.chat_summary_queue,
memory_automation.chat_summary_worker_active, and
memory_automation.chat_summary_states, and give concise guidance to check for
queue backlog, inactive/blocked workers, and long-running summary state entries
referencing those field names so operators know to inspect the automation queues
and workers when memory rises.
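The operator check described for the memory_automation fields can be illustrated with a small helper. This is only a sketch: it assumes each snapshot nests these fields under a `memory_automation` key, that the `*_queue` fields are integer backlog sizes, and that the `*_worker_active` fields are booleans. The review does not confirm any of these shapes. The helper name and the `queue_warn` threshold are hypothetical, and `chat_summary_states` is left out because its structure is not described.

```python
def check_memory_automation(snapshot, queue_warn=100):
    """Return human-readable warnings for one snapshot dict (assumed layout)."""
    auto = snapshot.get("memory_automation") or {}
    warnings = []
    if not auto.get("started", False):
        warnings.append("memory_automation not started")
    for name in ("fact_writeback", "chat_summary"):
        backlog = auto.get(f"{name}_queue", 0)
        active = auto.get(f"{name}_worker_active", False)
        # A large backlog suggests the queue is holding memory.
        if backlog > queue_warn:
            warnings.append(f"{name}_queue backlog: {backlog}")
        # Queued work with no active worker suggests a stalled consumer.
        if backlog > 0 and not active:
            warnings.append(f"{name}_worker inactive with {backlog} queued")
    return warnings
```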
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3ed66af9-5129-46c7-961a-21d790ac3536
📒 Files selected for processing (4)
- docs/memory_diagnostics_guide.md
- src/config/official_configs.py
- src/services/memory_diagnostics_service.py
- tests/test_memory_diagnostics_service.py
🚧 Files skipped from review as they are similar to previous changes (3)
- src/config/official_configs.py
- tests/test_memory_diagnostics_service.py
- src/services/memory_diagnostics_service.py
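The ModifiedBy enum values are uppercase "AI"/"USER", but the frontend getModifierBadge compares against lowercase 'ai', so every automatically reviewed expression was wrongly labeled "manual". The value is now lowercased during serialization to match the frontend convention. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>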
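The fix in the commit message above amounts to lowercasing the enum value at the serialization boundary. A minimal sketch, assuming a plain `Enum` with the two values named in the message; the helper name `serialize_modified_by` is hypothetical and the real serializer may be structured differently:

```python
from enum import Enum


class ModifiedBy(Enum):
    AI = "AI"
    USER = "USER"


def serialize_modified_by(value):
    """Return the lowercase string the frontend compares against, or None."""
    if value is None:
        return None
    raw = value.value if isinstance(value, ModifiedBy) else str(value)
    return raw.lower()
```

Keeping the enum itself uppercase and converting only on output leaves stored values untouched.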
feat(A_memorix): Optimize the chat summary window and history review
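The WebUI ban and update endpoints only changed the database is_banned field and did not remove the corresponding entry from the in-memory emoji_manager.emojis list, so consumers such as plugins could still pick a banned emoji until the service restarted. This also fixes emoji_manager.ban_emoji(), which relied on identity comparison (MaiEmoji does not define __eq__), so removal failed silently when called across instances; it now filters by file_hash. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
_emoji_num was not updated after the list was filtered, so later capacity checks used a stale value. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>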
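The two emoji fixes above (filter by file_hash, then refresh the count) can be sketched as follows. The class bodies are simplified stand-ins, not the project's real `MaiEmoji` or emoji manager. Only the names `emojis`, `_emoji_num`, `ban_emoji`, and `file_hash` come from the commit messages.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # eq=False keeps identity comparison, like a class without __eq__
class MaiEmoji:
    file_hash: str


class EmojiManager:
    def __init__(self, emojis):
        self.emojis = list(emojis)
        self._emoji_num = len(self.emojis)

    def ban_emoji(self, file_hash):
        """Drop every in-memory emoji with this hash; return True if any was removed."""
        before = len(self.emojis)
        # Compare by hash, not by object identity, so a different instance
        # describing the same file still matches.
        self.emojis = [e for e in self.emojis if e.file_hash != file_hash]
        # Keep the cached count in sync so capacity checks see the new size.
        self._emoji_num = len(self.emojis)
        return before != self._emoji_num
```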
…ved-from-memory fix: Emoji banned via the WebUI was not removed from the in-memory list
feat(A_memorix): Further optimized iteration mechanism for person profiles

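zh-CN target translations as a regular GitHub editing surface; regular translations follow the Crowdin -> l10n_* PR return flow, see docs/i18n.md
Please fill in the following
(Delete the space inside the square brackets and replace it with a lowercase x)
The main branch must not be modified; please confirm that the branch for this submission is not the main branch
src/A_memorix, I confirm that I have read src/A_memorix/MODIFICATION_POLICY.md; leave unchecked if not applicable
Other information
Related Issue: Close #
Screenshots/GIF:
Additional information: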
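Adds a memory diagnostics tool for long-running processes, used to investigate memory growth after extended uptime.
The diagnostics task can be enabled through a [debug] configuration switch. It periodically collects relevant state and writes JSONL snapshot logs, which makes it easier to compare trends and locate the source of anomalies. A beginner-level usage guide was also written with AI.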
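Comparing trends across the JSONL snapshots can be done with a few lines of Python. This is a sketch only: the helper names are hypothetical, and the metric key is supplied by the caller because the snapshot schema is not shown here (`rss_bytes` in the usage line below is a made-up example key).

```python
import json


def load_snapshots(path):
    """Read one JSON object per line, skipping blank or partially written lines."""
    snapshots = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            try:
                snapshots.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # e.g. a line truncated by a crash mid-write
    return snapshots


def growth(snapshots, key):
    """Return last minus first value of a numeric top-level field, or None."""
    values = [s[key] for s in snapshots if isinstance(s.get(key), (int, float))]
    if len(values) < 2:
        return None
    return values[-1] - values[0]
```

For example, `growth(load_snapshots("logs/memory_diagnostics/memory_diagnostics.jsonl"), "rss_bytes")` would report the change between the first and last snapshot, assuming such a top-level field exists.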
Summary by CodeRabbit
New Features
Documentation
Dependency Updates
Tests