BYK · BYK · May 20, 2026 · May 20, 2026
diff --git a/.lore.md b/.lore.md
@@ -70,11 +70,11 @@
 
 ### Preference
 
-<!-- lore:019e44c8-e3b2-70c1-afb6-d3acf24c531a -->
-* **Always fix cache memory leaks with TTL eviction, size cap, and scheduled pruning**: Cache memory leak fix pattern: (1) TTL check in \`.get()\` — delete and return undefined if expired; (2) LRU eviction in \`.set()\` — delete oldest key when \`store.size >= maxEntries\`; (3) \`setInterval(() => this.prune(), 60\_000)\` in constructor. Defaults: \`maxEntries = 10\_000\`, \`ttlMs = 300\_000\` (5 min). Note: \`prune()\` is NOT currently scheduled — the \`setInterval\` pattern is the prescribed fix, not existing behavior. Always use \`flock\` advisory locking instead of \`proper-lockfile\` — \`proper-lockfile@4.1.2\` fails in containerized environments where PID namespaces reset on restart, leaving stale locks. \`flock\` is automatically released on process exit. Session ground-truth: cache entries are never auto-evicted and \`prune()\` is never scheduled in current code — do not assert otherwise.
+<!-- lore:019e473a-2e06-7911-af1c-78191b4e3a4e -->
+* **Always investigate cache warming system economics with specific cost/hit data before requesting fixes**: Cache warming system investigations: user arrives with concrete numerical evidence of poor ROI (e.g., '$65.89 spent, $4.43 saved'), specific session-level data (session IDs, turn counts, hit rates, P(returns) values), and identified problems. Trace full code path across cache-warmer.ts, cost-tracker.ts, pipeline.ts, and idle.ts. Ground responses in specific constants, formulas, and line numbers. Engage at the level of break-even thresholds, ROI guards, and survival probability math.
 
 <!-- lore:019e4422-5b29-77a8-8956-488233ef16a4 -->
-* **Always request critical code reviews with specific file paths, line numbers, and severity classifications**: Code review, investigation & workflow standards: (1) Reviews: exact file paths, line numbers, severity (C/M/L), root causes, concrete fixes. Check state-not-cleared, consume-once flags, circuit breaker bypass, concurrency edges. (2) Investigation: read actual source, trace full execution paths, enumerate 2-4 candidate explanations before diving in, report confirmed/falsified verdict with line numbers. Demand concrete metrics before accepting fixes. (3) PR discipline: critical self-review before merge, fix all criticals, CI green, amend+force-push. Resolve \`.lore.md\` rebase conflicts with \`--ours\`. After merge, pull main before follow-up work. (4) Planning: write plan file, wait for explicit approval, then execute. Pull from origin/main before any exploration or edits. (5) After bug fix: add tests (4-6 edge cases) in dedicated file referencing issue number. (6) Sentry IDs start with \`LOREAI-GATEWAY-\`. (7) Run lint, typecheck, full test suite before committing. (8) Present structured fix plan before implementation; wait for explicit approval. Never re-propose explicitly rejected approaches. Always include migration versioning context in schema change PRs.
+* **Always request critical code reviews with specific file paths, line numbers, and severity classifications**: Code review, investigation & workflow standards: (1) Reviews: exact file paths, line numbers, severity (C/M/L), root causes, concrete fixes. Check state-not-cleared, consume-once flags, circuit breaker bypass, concurrency edges. (2) Investigation: read actual source, trace full execution paths, enumerate 2-4 candidate explanations, report confirmed/falsified verdict with line numbers. Demand concrete metrics before accepting fixes. (3) PR discipline: critical self-review before merge, fix all criticals, CI green, amend+force-push. Resolve \`.lore.md\` rebase conflicts with \`--ours\`. After merge, pull main before follow-up work. (4) Planning: write plan file, wait for explicit approval, then execute. Pull from origin/main before any exploration or edits. (5) After bug fix: add tests (4-6 edge cases) in dedicated file referencing issue number. (6) Sentry IDs start with \`LOREAI-GATEWAY-\`. (7) Run lint, typecheck, full test suite before committing. (8) Present structured fix plan before implementation; wait for explicit approval. Never re-propose explicitly rejected approaches. Always include migration versioning context in schema change PRs.
 
 <!-- lore:019e44c8-4e3f-7835-972f-02ed2033a842 -->
 * **Always request worker tests with a consistent 7-case spec covering compute, missing-record, cleanup retention, and sync scenarios**: Worker test files follow a consistent 7-case spec: (1) compute job — DB lookup + update, (2) missing record — skip without throw, (3) cleanup — hard-delete records archived >30 days, (4) cleanup — preserve recently archived records, (5) sync — process a batch, (6) sync — skip missing records, (7) sync — respect dryRun flag. Tests mock DB and Redis. Use Vitest project-wide (\`import { describe, it, expect } from 'vitest'\`; migrated from Mocha+Chai+ts-node May 2026 — 312ms vs 30s startup). Use kebab-case file naming.

diff --git a/packages/core/src/recall.ts b/packages/core/src/recall.ts
@@ -523,6 +523,12 @@ export async function searchRecall(
     weight?: number;
   }> = [];
 
+  // Track whether session-specific results (temporal/distillation) exist
+  // across any query. Used to downweight knowledge when session content is
+  // available — knowledge entries are general cross-session facts, and when
+  // temporal details exist they are more likely the answer.
+  let hasSessionResults = false;
+
   // Track where primary (first-query) lists end so the MAX_RRF_LISTS cap
   // trims expanded-query lists first, preserving vector/supplemental lists.
   let primaryListEnd = 0;
@@ -571,13 +577,22 @@ export async function searchRecall(
       }
     }
 
+    if (temporalResults.length > 0 || distillationResults.length > 0) {
+      hasSessionResults = true;
+    }
+
+    // When searching all scopes AND session-specific results exist,
+    // downweight knowledge BM25 so session content ranks higher.
+    const knowledgeWeight = scope === "all" && hasSessionResults ? 0.6 : 1.0;
+
     allRrfLists.push(
       {
         items: knowledgeResults.map((item) => ({
           source: "knowledge" as const,
           item,
         })),
         key: (r) => `k:${r.item.id}`,
+        weight: knowledgeWeight,
       },
       {
         items: distillationResults.map((item) => ({
@@ -689,11 +704,14 @@ export async function searchRecall(
           }
         }
         if (vectorTagged.length) {
-          // Same `k:` key prefix as BM25 knowledge — RRF merges, not duplicates
+          // Same `k:` key prefix as BM25 knowledge — RRF merges, not duplicates.
+          // Apply knowledge downweight so knowledge is consistently
+          // deprioritized when session-specific content exists.
+          const kvWeight = scope === "all" && hasSessionResults ? 0.6 : 1.0;
           allRrfLists.push({
             items: vectorTagged,
             key: (r) => `k:${r.item.id}`,
-            weight: vectorWeight,
+            weight: vectorWeight * kvWeight,
           });
         }
       }