diff --git a/.lore.md b/.lore.md
index 8c55a32..e19d71a 100644
--- a/.lore.md
+++ b/.lore.md
@@ -70,11 +70,11 @@ ### Preference
-
-* **Always fix cache memory leaks with TTL eviction, size cap, and scheduled pruning**: Cache memory leak fix pattern: (1) TTL check in \`.get()\` — delete and return undefined if expired; (2) LRU eviction in \`.set()\` — delete oldest key when \`store.size >= maxEntries\`; (3) \`setInterval(() => this.prune(), 60\_000)\` in constructor. Defaults: \`maxEntries = 10\_000\`, \`ttlMs = 300\_000\` (5 min). Note: \`prune()\` is NOT currently scheduled — the \`setInterval\` pattern is the prescribed fix, not existing behavior. Always use \`flock\` advisory locking instead of \`proper-lockfile\` — \`proper-lockfile@4.1.2\` fails in containerized environments where PID namespaces reset on restart, leaving stale locks. \`flock\` is automatically released on process exit. Session ground-truth: cache entries are never auto-evicted and \`prune()\` is never scheduled in current code — do not assert otherwise.
+
+* **Always investigate cache warming system economics with specific cost/hit data before requesting fixes**: Cache warming system investigations: user arrives with concrete numerical evidence of poor ROI (e.g., '$65.89 spent, $4.43 saved'), specific session-level data (session IDs, turn counts, hit rates, P(returns) values), and identified problems. Trace full code path across cache-warmer.ts, cost-tracker.ts, pipeline.ts, and idle.ts. Ground responses in specific constants, formulas, and line numbers. Engage at the level of break-even thresholds, ROI guards, and survival probability math.
-* **Always request critical code reviews with specific file paths, line numbers, and severity classifications**: Code review, investigation & workflow standards: (1) Reviews: exact file paths, line numbers, severity (C/M/L), root causes, concrete fixes. Check state-not-cleared, consume-once flags, circuit breaker bypass, concurrency edges. (2) Investigation: read actual source, trace full execution paths, enumerate 2-4 candidate explanations before diving in, report confirmed/falsified verdict with line numbers. Demand concrete metrics before accepting fixes. (3) PR discipline: critical self-review before merge, fix all criticals, CI green, amend+force-push. Resolve \`.lore.md\` rebase conflicts with \`--ours\`. After merge, pull main before follow-up work. (4) Planning: write plan file, wait for explicit approval, then execute. Pull from origin/main before any exploration or edits. (5) After bug fix: add tests (4-6 edge cases) in dedicated file referencing issue number. (6) Sentry IDs start with \`LOREAI-GATEWAY-\`. (7) Run lint, typecheck, full test suite before committing. (8) Present structured fix plan before implementation; wait for explicit approval. Never re-propose explicitly rejected approaches. Always include migration versioning context in schema change PRs.
+* **Always request critical code reviews with specific file paths, line numbers, and severity classifications**: Code review, investigation & workflow standards: (1) Reviews: exact file paths, line numbers, severity (C/M/L), root causes, concrete fixes. Check state-not-cleared, consume-once flags, circuit breaker bypass, concurrency edges. (2) Investigation: read actual source, trace full execution paths, enumerate 2-4 candidate explanations, report confirmed/falsified verdict with line numbers. Demand concrete metrics before accepting fixes. (3) PR discipline: critical self-review before merge, fix all criticals, CI green, amend+force-push. Resolve \`.lore.md\` rebase conflicts with \`--ours\`. After merge, pull main before follow-up work. (4) Planning: write plan file, wait for explicit approval, then execute. Pull from origin/main before any exploration or edits. (5) After bug fix: add tests (4-6 edge cases) in dedicated file referencing issue number. (6) Sentry IDs start with \`LOREAI-GATEWAY-\`. (7) Run lint, typecheck, full test suite before committing. (8) Present structured fix plan before implementation; wait for explicit approval. Never re-propose explicitly rejected approaches. Always include migration versioning context in schema change PRs.
 * **Always request worker tests with a consistent 7-case spec covering compute, missing-record, cleanup retention, and sync scenarios**: Worker test files follow a consistent 7-case spec: (1) compute job — DB lookup + update, (2) missing record — skip without throw, (3) cleanup — hard-delete records archived >30 days, (4) cleanup — preserve recently archived records, (5) sync — process a batch, (6) sync — skip missing records, (7) sync — respect dryRun flag. Tests mock DB and Redis. Use Vitest project-wide (\`import { describe, it, expect } from 'vitest'\`; migrated from Mocha+Chai+ts-node May 2026 — 312ms vs 30s startup). Use kebab-case file naming.
diff --git a/packages/core/src/recall.ts b/packages/core/src/recall.ts
index e0b41c2..d05028e 100644
--- a/packages/core/src/recall.ts
+++ b/packages/core/src/recall.ts
@@ -523,6 +523,12 @@ export async function searchRecall(
     weight?: number;
   }> = [];
 
+  // Track whether session-specific results (temporal/distillation) exist
+  // across any query. Used to downweight knowledge when session content is
+  // available — knowledge entries are general cross-session facts, and when
+  // temporal details exist they are more likely the answer.
+  let hasSessionResults = false;
+
   // Track where primary (first-query) lists end so the MAX_RRF_LISTS cap
   // trims expanded-query lists first, preserving vector/supplemental lists.
   let primaryListEnd = 0;
@@ -571,6 +577,14 @@ export async function searchRecall(
       }
     }
 
+    if (temporalResults.length > 0 || distillationResults.length > 0) {
+      hasSessionResults = true;
+    }
+
+    // When searching all scopes AND session-specific results exist,
+    // downweight knowledge BM25 so session content ranks higher.
+    const knowledgeWeight = scope === "all" && hasSessionResults ? 0.6 : 1.0;
+
     allRrfLists.push(
       {
         items: knowledgeResults.map((item) => ({
@@ -578,6 +592,7 @@ export async function searchRecall(
           item,
         })),
         key: (r) => `k:${r.item.id}`,
+        weight: knowledgeWeight,
       },
       {
         items: distillationResults.map((item) => ({
@@ -689,11 +704,14 @@ export async function searchRecall(
       }
     }
     if (vectorTagged.length) {
-      // Same `k:` key prefix as BM25 knowledge — RRF merges, not duplicates
+      // Same `k:` key prefix as BM25 knowledge — RRF merges, not duplicates.
+      // Apply knowledge downweight so knowledge is consistently
+      // deprioritized when session-specific content exists.
+      const kvWeight = scope === "all" && hasSessionResults ? 0.6 : 1.0;
       allRrfLists.push({
         items: vectorTagged,
         key: (r) => `k:${r.item.id}`,
-        weight: vectorWeight,
+        weight: vectorWeight * kvWeight,
       });
     }
   }
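
Review note: to make the effect of the 0.6 knowledge downweight concrete, here is a minimal sketch of weighted reciprocal rank fusion. It is not the repository's fusion code. The helper names `fuseWeightedRrf` and `knowledgeWeightFor`, the smoothing constant `k = 60`, the `t:` key prefix, and the sample items are all assumptions made for illustration; only the `scope === "all" && hasSessionResults ? 0.6 : 1.0` rule and the `k:` prefix come from the diff.

```typescript
// Hypothetical sketch of weighted RRF; names and k = 60 are assumptions,
// not taken from packages/core/src/recall.ts.
interface RrfList<T> {
  items: T[]; // ranked best-first
  key: (item: T) => string; // items sharing a key are merged
  weight?: number; // defaults to 1.0
}

// Each list contributes weight / (k + rank) to an item's fused score,
// where rank is 1-based.
function fuseWeightedRrf<T>(lists: RrfList<T>[], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    const w = list.weight ?? 1.0;
    list.items.forEach((item, index) => {
      const id = list.key(item);
      scores.set(id, (scores.get(id) ?? 0) + w / (k + index + 1));
    });
  }
  return scores;
}

// Mirrors the rule added in the diff: downweight knowledge only when
// searching all scopes and session-specific results exist.
function knowledgeWeightFor(scope: string, hasSessionResults: boolean): number {
  return scope === "all" && hasSessionResults ? 0.6 : 1.0;
}

type Hit = { id: string };
const knowledge: Hit[] = [{ id: "a" }, { id: "b" }];
const temporal: Hit[] = [{ id: "x" }, { id: "y" }];

const fused = fuseWeightedRrf<Hit>([
  {
    items: knowledge,
    key: (h) => `k:${h.id}`,
    weight: knowledgeWeightFor("all", temporal.length > 0),
  },
  { items: temporal, key: (h) => `t:${h.id}` }, // `t:` prefix is assumed
]);

// Sort keys by fused score, highest first.
const ranked = [...fused.entries()]
  .sort((p, q) => q[1] - p[1])
  .map(([id]) => id);
console.log(ranked);
```

Under these assumptions, both temporal hits outrank both knowledge hits at the same rank positions (1/61 vs 0.6/61), whereas at weight 1.0 the first-ranked items of each list would tie. Since the vector knowledge list shares the `k:` prefix, scaling `vectorWeight` by the same factor keeps the merged knowledge score consistently reduced.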