## Overview
Hermes Agent's memory tool currently supports only substring matching (old_text parameter) for finding entries. There is no semantic search — "What database are we using?" won't find a memory about "We migrated to MySQL" unless the exact keywords overlap. The separate session_search tool does FTS5 keyword search over past transcripts, but structured memories have no retrieval beyond exact match.
This issue adds a recall action to the memory tool: semantic vector search with composite scoring that blends similarity, recency, and importance.
Inspired by CrewAI's RecallFlow (MIT licensed).
Parent tracking issue: #509
Depends on: Memory storage migration (SQLite), Embedding infrastructure
## What to Build

### New `recall` Action
```python
memory(action="recall", content="What database are we using?", limit=5)
```
Returns the top-N memories ranked by composite score, with metadata:
```json
{
  "results": [
    {
      "content": "We migrated from PostgreSQL to MySQL last week",
      "scope": "/infrastructure/database",
      "importance": 0.8,
      "score": 0.82,
      "match_reasons": ["semantic", "importance"],
      "evidence_gaps": [],
      "created_at": "2026-03-01T..."
    }
  ]
}
```

### Composite Scoring
```
score = (semantic_weight * similarity) + (recency_weight * decay) + (importance_weight * importance)
decay = 0.5 ^ (age_days / recency_half_life_days)
```
Default weights (configurable):
- semantic_weight: 0.5
- recency_weight: 0.3
- importance_weight: 0.2
- recency_half_life_days: 30
The intent: a critical architecture decision from 6 months ago (high importance, strong semantic match, near-zero recency term after six half-lives) can still outrank a trivial note from yesterday (low importance, maximal recency) that merely mentions similar keywords, because the recency term contributes at most recency_weight while similarity and importance carry the rest of the score.
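As a sketch, the scoring above might look like the following (the function name and signature are illustrative, not a committed API; the defaults mirror the proposed config keys):

```python
# Hypothetical sketch of the composite score; weight defaults mirror the
# proposed configuration keys above.

def composite_score(
    similarity: float,             # cosine similarity of query vs. memory, in [0, 1]
    importance: float,             # stored importance, in [0, 1]
    age_days: float,               # age of the memory in days
    semantic_weight: float = 0.5,
    recency_weight: float = 0.3,
    importance_weight: float = 0.2,
    recency_half_life_days: float = 30.0,
) -> float:
    # Exponential decay: the recency contribution halves every half-life.
    decay = 0.5 ** (age_days / recency_half_life_days)
    return (
        semantic_weight * similarity
        + recency_weight * decay
        + importance_weight * importance
    )

# A six-month-old architecture decision (strong match, high importance)
# vs. a fresh trivial note that only loosely matches:
assert composite_score(0.85, 0.9, 180) > composite_score(0.55, 0.1, 1)
```

Note that with these defaults the old memory needs a similarity edge as well as higher importance: at equal similarity, the fresh note's recency term (close to 0.3) outweighs a 0.2 importance gap.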
### Match Metadata
Each result includes:
- `match_reasons`: why it scored high (e.g. `["semantic", "recency", "importance"]`)
- `evidence_gaps`: what information might be missing (populated when confidence is low)
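One way to derive both fields from the per-component scores (a sketch; the threshold value and function name are illustrative, not specified by this issue):

```python
# Hypothetical derivation of match_reasons / evidence_gaps from the three
# score components. The 0.5 threshold is an illustrative assumption.

def explain_match(
    similarity: float, decay: float, importance: float, threshold: float = 0.5
) -> tuple[list[str], list[str]]:
    reasons = []
    if similarity >= threshold:
        reasons.append("semantic")
    if decay >= threshold:
        reasons.append("recency")
    if importance >= threshold:
        reasons.append("importance")
    # No strong component means low confidence: surface that as a gap.
    gaps = [] if reasons else ["no strong semantic, recency, or importance signal"]
    return reasons, gaps
```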
### Adaptive Depth (Optional Enhancement)

- Short queries (<200 chars): pure vector search, no LLM call (fast path)
- Complex queries (≥200 chars): an LLM distills the query into targeted sub-queries and searches across multiple scopes
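The routing could be as simple as a length check. In this sketch, `recall_vector` and `distill_subqueries` are hypothetical stubs standing in for the real vector search and LLM distillation:

```python
# Illustrative adaptive-depth routing. Only the 200-char threshold comes from
# the issue; the helper functions below are hypothetical stubs.

SHORT_QUERY_CHARS = 200

def recall_vector(query: str, limit: int) -> list[dict]:
    # Stub: the real version embeds the query and runs a vector search.
    return [{"content": f"match for {query[:20]!r}", "score": 0.9}][:limit]

def distill_subqueries(query: str) -> list[str]:
    # Stub: the real version asks the LLM for targeted sub-queries.
    return [query[:50], query[50:100]]

def recall(query: str, limit: int = 5) -> list[dict]:
    if len(query) < SHORT_QUERY_CHARS:
        # Fast path: one vector search, no LLM call.
        return recall_vector(query, limit=limit)
    # Complex query: distill into sub-queries, search each, merge by score.
    results: list[dict] = []
    for sub in distill_subqueries(query):
        results.extend(recall_vector(sub, limit=limit))
    results.sort(key=lambda r: r["score"], reverse=True)
    return results[:limit]
```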
## Configuration
```yaml
# ~/.hermes/config.yaml
memory:
  recall:
    semantic_weight: 0.5
    recency_weight: 0.3
    importance_weight: 0.2
    recency_half_life_days: 30
```

## Files to Change
- `tools/memory_tool.py` — Add `recall` action, composite scoring logic
- `agent/embeddings.py` — Use embedder for query embedding
- `tests/tools/test_memory_tool.py` — Tests for recall with mocked embeddings
## Acceptance Criteria
- `recall` action returns semantically relevant memories
- Composite scoring blends similarity + recency + importance
- Weights are configurable via config.yaml
- Results include `match_reasons` and `evidence_gaps`
- Short queries skip LLM analysis (fast path)
- Returns results even when embedding infrastructure is degraded (falls back to FTS)
- Memory tool schema updated with `recall` action documentation
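The degraded-mode criterion above could be satisfied with a simple try/except around the embedding call. This is a sketch with hypothetical helper names (`embed_query`, `fts_search`, `vector_search` are stand-ins, stubbed here to simulate a downed embedding backend):

```python
# Hypothetical fallback path: if embeddings are unavailable, recall still
# answers via FTS5 keyword search instead of erroring out.

class EmbeddingUnavailable(RuntimeError):
    """Raised when the embedding backend cannot be reached."""

def embed_query(query: str) -> list[float]:
    # Stub: simulate a degraded embedding backend.
    raise EmbeddingUnavailable("embedding service down")

def fts_search(query: str, limit: int) -> list[dict]:
    # Stub: the real version would run an FTS5 MATCH query over memories.
    return [{"content": "FTS hit", "score": 0.5, "match_reasons": ["keyword"]}][:limit]

def recall_with_fallback(query: str, limit: int = 5) -> list[dict]:
    try:
        vec = embed_query(query)
    except EmbeddingUnavailable:
        # Degraded mode: keyword search keeps recall functional.
        return fts_search(query, limit=limit)
    return vector_search(vec, limit=limit)  # hypothetical vector path
```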