Fix: /memory/ingest hardcodes source_mode=chat, partially defeats #877 partition by mimeding · Pull Request #4 · mimeding/osaurus

mimeding · 2026-05-26T01:50:41Z

Summary

Why this matters (business)

PR osaurus-ai#877 introduced one of the more important UX guarantees in the memory subsystem: facts learned while an agent had tools available (Work mode, sandbox, etc.) should not seep back into pure-chat sessions and make the model believe it has tools it doesn't. Without that partition, agents in chat-only mode confidently reference file paths, sandbox commands, and other agentic affordances that won't actually work for the user.

The HTTP ingest endpoint — which is the path users hit for:

seeding memory from existing chat logs,
migrating from another assistant,
bulk-loading conversations for offline benchmarking (LoCoMo runs use this exactly),
programmatic admin tooling,

was tagging every ingested turn as pure .chat, regardless of what mode produced the source data. Tool-flavoured turns therefore landed in the partition that chat-only recall reads from, and the recent partition guarantee silently degraded the moment a real batch was loaded.

This is the kind of regression that doesn't surface in unit tests but shows up as "the agent keeps offering to run shell commands when I never enabled tools" in user reports.

What's wrong (technical)

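// Chunk insert: the mode is hardcoded.
try? db.insertChunk(
    ...
    sourceMode: .chat
)
...
    // recordConversationTurn call: the mode is hardcoded here too.
    sourceMode: .chat,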

Both the chunk insert and the recordConversationTurn call hardcoded .chat. MemorySourceMode is already Codable (see Work/WorkExecutionMode.swift:38), and both downstream APIs already accept an optional MemorySourceMode parameter — only the HTTP request decoder was missing the field.

Fix

Add an optional source_mode field at two levels:

source_mode on MemoryIngestRequest: batch-level default applied to every turn that does not override it.
source_mode on MemoryIngestTurn: per-turn override for migrated logs that mix modes.

When both are omitted, the mode falls back to .chat, so existing callers (and the example in docs/MEMORY.md) behave exactly as before. Tagging now flows correctly through db.insertChunk and MemoryService.recordConversationTurn.
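The two fields and their precedence can be sketched in Swift as follows. This is a minimal sketch, not the code from this PR: the enum is a stand-in for the real MemorySourceMode (only its four raw values come from the description), the struct layouts are inferred from the JSON in the test plan, and resolvedMode(for:) is a hypothetical helper name.

```swift
import Foundation

// Stand-in for the real MemorySourceMode; raw values taken from the PR description.
enum MemorySourceMode: String, Codable {
    case chat
    case chatSandbox = "chat_sandbox"
    case workHost = "work_host"
    case workSandbox = "work_sandbox"
}

// Assumed shape: only the keys visible in the curl examples are modelled.
struct MemoryIngestTurn: Decodable {
    let user: String
    let assistant: String
    let sourceMode: MemorySourceMode?   // per-turn override, nil when omitted

    enum CodingKeys: String, CodingKey {
        case user, assistant
        case sourceMode = "source_mode"
    }
}

struct MemoryIngestRequest: Decodable {
    let agentId: String
    let conversationId: String
    let sourceMode: MemorySourceMode?   // batch-level default, nil when omitted
    let turns: [MemoryIngestTurn]

    enum CodingKeys: String, CodingKey {
        case agentId = "agent_id"
        case conversationId = "conversation_id"
        case sourceMode = "source_mode"
        case turns
    }

    // Hypothetical helper: the per-turn value wins, then the batch default, then .chat.
    func resolvedMode(for turn: MemoryIngestTurn) -> MemorySourceMode {
        turn.sourceMode ?? sourceMode ?? .chat
    }
}

// Decode a batch that sets a default and overrides it on the second turn.
let json = """
{"agent_id":"a","conversation_id":"c3","source_mode":"work_host",
 "turns":[{"user":"hi","assistant":"hello"},
          {"user":"sandbox","assistant":"$ ls","source_mode":"chat_sandbox"}]}
"""
let request = try JSONDecoder().decode(MemoryIngestRequest.self, from: Data(json.utf8))
let modes = request.turns.map { request.resolvedMode(for: $0) }
```

Keeping both fields optional in the decoder and resolving the fallback in one place means a request that omits source_mode entirely still decodes and still resolves to .chat.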

Also:

Updated the validation error string to reflect the new fields, so malformed requests get a useful hint.
Updated docs/MEMORY.md parameter table with the new fields and a one-line note on why tagging matters.

Changes

Behavior change (additive — default unchanged)
UI change
Refactor / chore
Tests (no HTTP-handler integration harness for /memory/ingest; downstream insertChunk(sourceMode:) already has coverage in MemoryDatabaseTests)
Docs (docs/MEMORY.md updated)

Test Plan

Default behavior unchanged:

curl http://127.0.0.1:1337/memory/ingest -H 'Content-Type: application/json' \
  -d '{"agent_id":"a","conversation_id":"c","turns":[{"user":"hi","assistant":"hello"}]}'

Inspect the resulting conversation_chunks row — source_mode = 'chat' as before.

Per-batch override:

curl http://127.0.0.1:1337/memory/ingest -H 'Content-Type: application/json' \
  -d '{"agent_id":"a","conversation_id":"c2","source_mode":"work_host",
       "turns":[{"user":"run ls","assistant":"file1\nfile2"}]}'

Expect source_mode = 'work_host' on every chunk. Then open the agent in pure-chat mode and confirm via Insights that those rows are filtered out of chatOnly recall.

Per-turn override:

curl http://127.0.0.1:1337/memory/ingest -H 'Content-Type: application/json' \
  -d '{"agent_id":"a","conversation_id":"c3","source_mode":"chat",
       "turns":[
         {"user":"hi","assistant":"hello"},
         {"user":"sandbox","assistant":"$ ls","source_mode":"chat_sandbox"}
       ]}'

First chunk pair tagged chat, second tagged chat_sandbox.

Checklist

I have read CONTRIBUTING.md
I added/updated tests where reasonable (see test plan; downstream layers already covered)
I updated docs/README as needed
I verified build on macOS with Xcode 16.4+ (authored in a Linux CI sandbox)

PR osaurus-ai#877 partitioned the memory store by execution mode (chat, chat_sandbox, work_host, work_sandbox) and made pure-chat recall filter out tool-using contributions to prevent phantom-tool priming.

The HTTP ingest endpoint, however, was hardcoded to tag every turn as .chat, regardless of where the source turns actually came from. That means anyone seeding memory from existing logs, migrating from another system, or running offline batch ingestion (LoCoMo benchmark runs are exactly this) ends up writing tool-flavoured turns under the chat partition. When the agent later runs in pure-chat mode, the chatOnly filter happily surfaces those rows -- the very leak the partition was designed to prevent.

Fix by accepting an optional source_mode at both the request level (batch default) and per-turn (override). Both fields default to .chat so existing callers keep working byte-for-byte. MemorySourceMode is already Codable with the right string raw values, so callers send 'chat' / 'chat_sandbox' / 'work_host' / 'work_sandbox' as JSON strings.

Docs/MEMORY.md updated with the new fields and a short note on why tagging matters.

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>

ModelManager.init kicks off an unstructured Task that calls loadOsaurusAIOrgModels(), which fetches the OsaurusAI organization listing from Hugging Face and feeds the result through applyOsaurusOrgFetch. The unit-test runner repeatedly constructs ModelManager() to drive applyOsaurusOrgFetch directly. The background launch-time fetch races with those test calls: whichever finishes last wins, and the merge result is non-deterministic. That's the root cause of the flaky ModelManagerSuggestedTests failures seen across many of the recent PR CI runs (applyOsaurusOrgFetch_dropsStaleAutoFetchedOnReapply, applyOsaurusOrgFetch_addsNewEntriesAfterCurated, etc.).

Gate the launch-time fetch on a small isRunningInTestEnvironment helper that checks for any of XCTestConfigurationFilePath, XCTestBundlePath, or XCTestSessionIdentifier in the process environment. Those variables are only present inside an xctest host process; production app launches still get the HF fetch exactly as before.

This is a network call, so removing it under tests also has the side benefit of making the test suite work offline / on hermetic CI runners.

Co-authored-by: Michael Meding <mimeding@users.noreply.github.com>
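The gate described in this commit message could look roughly like the following. This is a sketch under assumptions: only the three environment variable names come from the text above; the function signature and the injectable environment parameter are illustrative choices, not the actual implementation.

```swift
import Foundation

// Hypothetical helper. The environment is a parameter (defaulting to the real
// process environment) so the check can be exercised without an xctest host.
func isRunningInTestEnvironment(
    _ environment: [String: String] = ProcessInfo.processInfo.environment
) -> Bool {
    // Variable names taken from the commit message.
    let markers = [
        "XCTestConfigurationFilePath",
        "XCTestBundlePath",
        "XCTestSessionIdentifier",
    ]
    // True as soon as any one of the markers is set.
    return markers.contains { environment[$0] != nil }
}

// Assumed call-site shape: skip the launch-time fetch when under tests.
// if !isRunningInTestEnvironment() { Task { await loadOsaurusAIOrgModels() } }
```

Checking for the presence of any one marker, rather than all three, keeps the gate effective even if a given test runner sets only a subset of them.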

cursoragent and others added 2 commits May 26, 2026 01:49

mimeding wants to merge 2 commits into main from cursor/memory-ingest-source-mode-2812
