fix(store): flush remote upserts for immediate search#543

Open
dommonkhouse wants to merge 1 commit into zilliztech:main from dommonkhouse:fix/flush-remote-upserts

@dommonkhouse

Summary

  • Flush remote Milvus collections after upsert so newly captured chunks are immediately searchable
  • Keep Milvus Lite behaviour unchanged
  • Add regression coverage for remote and local upsert behaviour
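
A minimal sketch of the flush-after-upsert behaviour summarised above, not the actual diff. It assumes a pymilvus `MilvusClient`-style object exposing `upsert(collection_name=..., data=...)` and `flush(collection_name=...)`. The helper names `is_lite_uri` and `upsert_chunks`, and the rule that a URI ending in `.db` means Milvus Lite, are illustrative assumptions rather than memsearch's real detection logic.

```python
from typing import Any


def is_lite_uri(uri: str) -> bool:
    """Hypothetical check: treat a local ``.db`` file path as Milvus Lite."""
    return uri.endswith(".db")


def upsert_chunks(client: Any, uri: str, collection: str,
                  chunks: list[dict[str, Any]]) -> int:
    """Upsert chunks and, on a remote server only, flush afterwards.

    ``client`` is anything with MilvusClient-style ``upsert`` and ``flush``
    methods. Returns the number of chunks written.
    """
    if not chunks:
        return 0
    client.upsert(collection_name=collection, data=chunks)
    if not is_lite_uri(uri):
        # Remote Milvus: flush so the new rows are visible to the next
        # search or stats call. Milvus Lite is left unchanged.
        client.flush(collection_name=collection)
    return len(chunks)
```

A regression test in this shape can use a recording fake client and assert that `flush` is called for a remote URI and skipped for a local `.db` path.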

Verification

  • uv run pytest tests/test_store.py
  • uv run ruff check src/memsearch/store.py tests/test_store.py
  • git diff --check

Note: the full suite currently has one unrelated failure in tests/test_embeddings_openai.py::test_embed_deterministic, due to drift in an exact float comparison.

@dommonkhouse

Author

Fresh reproduction from a remote Milvus setup using upstream vanilla memsearch[onnx]==0.4.3:

  • Temporary collection: ms_vanilla_probe_<timestamp>
  • memsearch index reported Indexed 1 chunks.
  • Immediate memsearch stats for that collection reported Total indexed chunks: 0
  • Searches through the agreed 5-second window returned []
  • The same test shape using my fork with this PR's flush patch reported Total indexed chunks: 1 immediately and search found the probe at poll 0

So Milvus auto-flush is not enough for the write-then-search workflow on remote Milvus; an explicit flush after upsert is still needed for immediate visibility.
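
The polling part of that reproduction can be sketched roughly as follows. This is not the script that was actually run: `first_visible_poll` and its parameters are hypothetical, and `search` stands in for whatever callable performs the memsearch query against the probe collection. It returns the index of the first poll that found a result, or `None` if nothing appeared within the window.

```python
import time
from typing import Callable, Optional, Sequence


def first_visible_poll(
    search: Callable[[], Sequence[object]],
    timeout_s: float = 5.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[int]:
    """Poll ``search`` until it returns a non-empty result or time runs out.

    Poll 0 happens immediately after the write, so a return value of 0 means
    the data was visible straight away.
    """
    deadline = time.monotonic() + timeout_s
    poll = 0
    while True:
        if search():
            return poll
        if time.monotonic() >= deadline:
            return None
        sleep(interval_s)
        poll += 1
```

With this shape, the vanilla run described above would correspond to a `None` result for the 5-second window, and the patched run to a result of 0.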

@dommonkhouse

Author

Ready for maintainer review and merge. I verified this PR is non-draft, mergeable, and has no configured checks reporting failures.

wombatfish added a commit to wombatfish/memsearch that referenced this pull request Jun 3, 2026
…istency_level

Supersedes the per-upsert flush approach (PR zilliztech#543). Per-upsert flush seals a
tiny segment on every watcher write -> fragmentation + compaction pressure, and
is the wrong lever: cross-process read-after-write (watcher writes, recall
searches) only misses fresh data because Milvus default consistency is Bounded.

- MilvusStore: add consistency_level (ctor + config/core/cli passthrough),
  applied to hybrid_search/dense_search/query as a per-call override. Sent only
  on remote (Lite is already strong-consistent), verified honoured in pymilvus
  2.6.8 via kwargs, works on existing collections (no recreate).
- Cold-start guard: get_collection_stats counts only sealed segments, so a fresh
  remote collection reports row_count=0 and the BM25 empty-guard would drop
  unsealed writes. When a consistency override is active, confirm emptiness with
  a consistency-honouring point query before short-circuiting.
- search CLI gains --consistency; memory-recall skill opts into Strong so notes
  written earlier in a session are immediately recallable on remote Milvus.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
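
A rough sketch of the two ideas in that commit message: a per-call consistency override sent only on remote, and a cold-start guard that does not trust `row_count` alone. It is not the referenced commit's code. It assumes a `MilvusClient`-style object whose `get_collection_stats` returns a dict with `row_count` and whose `query` accepts `consistency_level` as a keyword argument (as the commit message reports for pymilvus 2.6.8). The names `read_kwargs`, `collection_is_empty`, `PK_FIELD` and the filter expression are hypothetical.

```python
from typing import Any, Optional

# Hypothetical primary-key field name, used only for the probe query.
PK_FIELD = "chunk_hash"
PROBE_FILTER = f'{PK_FIELD} != ""'


def read_kwargs(is_remote: bool, consistency_level: Optional[str]) -> dict[str, Any]:
    """Extra keyword arguments for search/query calls.

    The override is sent only to a remote server; Lite calls are unchanged.
    """
    if is_remote and consistency_level:
        return {"consistency_level": consistency_level}
    return {}


def collection_is_empty(client: Any, collection: str, is_remote: bool,
                        consistency_level: Optional[str]) -> bool:
    """Decide whether a search can short-circuit on an empty collection."""
    stats = client.get_collection_stats(collection_name=collection)
    if int(stats.get("row_count", 0)) > 0:
        return False
    extra = read_kwargs(is_remote, consistency_level)
    if not extra:
        # No override active: keep the stats-only behaviour.
        return True
    # Stats may report 0 while unsealed writes exist, so confirm with a
    # one-row query that carries the consistency override.
    rows = client.query(
        collection_name=collection,
        filter=PROBE_FILTER,
        output_fields=[PK_FIELD],
        limit=1,
        **extra,
    )
    return len(rows) == 0
```

The trade-off against the per-upsert flush is that the cost moves from every write to the reads that opt in, for example a recall search passing `Strong`.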