3 changes: 3 additions & 0 deletions .gitignore
@@ -120,3 +120,6 @@ test-code/
localtestmcp/
*.csv
*.pickle

# Personal dev notes (not tracked)
docs/dev/
7 changes: 6 additions & 1 deletion .vscode/settings.json
@@ -18,5 +18,10 @@
"**/*.egg-info/**": true,
"**/build/**": true,
"**/dist/**": true
-}
+},
+"accessibility.signals.terminalBell": {
+"sound": "on",
+"announcement": "auto"
+},
+"cmake.sourceDirectory": "/Users/yichuan/Desktop/code/LEANN/leann/packages/leann-backend-hnsw"
}
27 changes: 27 additions & 0 deletions docs/CHANGELOG.md
@@ -0,0 +1,27 @@
# Changelog

All notable changes to LEANN are documented here. Append-only, newest entries at the bottom.

Format: `## YYYY-MM-DD: <short summary>` followed by bullet points.

## 2026-03-05: IVF backend incremental update support

- Added `leann-backend-ivf` with FAISS IndexIVFFlat + DirectMap.Hashtable.
- IVF supports in-place `add_vectors` and `remove_ids` without full rebuild.
- `leann build` is now idempotent: re-running it on an existing index performs an incremental update (adds new files, removes deleted files, re-indexes modified files).
- Fixed incremental build chunking inconsistency and shared metadata dict bug.
- Fixed duplicate chunks in IVF incremental updates caused by a stale `passages.jsonl`.
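
The add/remove/re-index classification described above could look roughly like the sketch below. It is an illustration only, not the code in this PR: the function name `plan_incremental_update` and the `{path: digest}` snapshot format are assumptions.

```python
from __future__ import annotations


def plan_incremental_update(
    previous: dict[str, str], current: dict[str, str]
) -> tuple[list[str], list[str], list[str]]:
    """Classify paths by comparing two hypothetical {path: content digest} snapshots.

    Returns (added, removed, modified), each sorted for stable output.
    """
    # Present now but not in the previous snapshot: needs to be indexed.
    added = sorted(p for p in current if p not in previous)
    # Present before but gone now: its chunks should be removed.
    removed = sorted(p for p in previous if p not in current)
    # Present in both with a different digest: remove old chunks, then re-index.
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    return added, removed, modified


previous = {"a.py": "h1", "b.py": "h2", "c.py": "h3"}
current = {"a.py": "h1", "b.py": "h2-changed", "d.py": "h4"}
added, removed, modified = plan_incremental_update(previous, current)
```

Running the same plan twice against an unchanged corpus yields three empty lists, which is what makes a rebuild idempotent under these assumptions.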

## 2026-03-05: MCP server v2 — build, status, and structured search

- Added `leann_build` MCP tool: build or incrementally update indexes directly from Claude Code.
- Added `leann_status` MCP tool: inspect index details (backend, embedding model, chunk/file count, size).
- `leann_search` now uses `--json` output with file paths always included, formatted as markdown code blocks.
- Fixed `float32` JSON serialization bug in `leann search --json`.
- Cleaned up MCP tool descriptions (concise, no emoji).

## 2026-03-05: Documentation — roadmap, vision, and dev guidelines

- Rewrote `docs/roadmap.md` with current P0/P1 priorities from GitHub issue #237.
- Added `docs/ultimate_goal.md` — long-term vision (personal data platform, best code retrieval MCP, multimodal, local-first).
- Added self-contained documentation principle and dev doc maintenance rules to `CLAUDE.md`.
41 changes: 41 additions & 0 deletions docs/issue-proposals/smart-embedding-default.md
@@ -0,0 +1,41 @@
# Smart default embedding model based on platform and corpus size

## Summary

This proposal adds platform- and corpus-aware selection of the default embedding model for `leann build` when `--embedding-model` is not specified. It would improve the out-of-the-box experience across deployment scenarios (macOS CPU, NVIDIA GPU, etc.) without changing behavior when users pass an explicit model.

## Motivation

- **Current default**: `facebook/contriever` (~420MB, 768 dim) — heavy for CPU-only builds on large corpora
- **macOS users** often hit slow builds on 20K+ chunks; lighter models like `all-MiniLM-L6-v2` (~90MB) are much faster
- **NVIDIA GPU users** can leverage stronger models; smaller corpora benefit from quality (e.g. Qwen3-Embedding-0.6B), larger ones from balanced models (e.g. bge-base-en-v1.5)

## Proposed logic

| Platform | Chunk count | Default model |
|----------|-------------|---------------|
| **macOS** | ≥ 20,000 | `sentence-transformers/all-MiniLM-L6-v2` |
| **macOS** | < 20,000 | `intfloat/e5-small-v2` |
| **NVIDIA GPU** | < 5,000 | `Qwen/Qwen3-Embedding-0.6B` |
| **NVIDIA GPU** | ≥ 5,000 | `BAAI/bge-base-en-v1.5` |
| **Other** | any | `facebook/contriever` (unchanged) |

## Implementation notes

1. **Platform detection**: `torch.cuda.is_available()` for NVIDIA; `sys.platform == "darwin"` for macOS
2. **Chunk count**: Known only after loading/chunking; may need to either:
   - Do a lightweight pre-scan (e.g. file count × rough chunks per file), or
   - Defer the default choice until after the first chunking pass (and cache it for incremental builds)
3. **Explicit override**: If user passes `--embedding-model`, always use it; this logic applies only when the flag is omitted
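
A minimal sketch of the selection table and the notes above, assuming the chunk count is already known. The names `choose_default_model` and `detect_platform` are hypothetical, and checking macOS before CUDA is an assumption taken from the table's row order, not from existing LEANN code.

```python
from __future__ import annotations

import sys

MACOS_LARGE_CORPUS = 20_000  # threshold from the proposed table
GPU_LARGE_CORPUS = 5_000     # threshold from the proposed table
FALLBACK_MODEL = "facebook/contriever"  # current default, unchanged for "Other"


def choose_default_model(
    chunk_count: int,
    *,
    is_macos: bool,
    has_cuda: bool,
    explicit_model: str | None = None,
) -> str:
    """Pick an embedding model; an explicit user choice always wins."""
    if explicit_model:
        return explicit_model
    if is_macos:
        if chunk_count >= MACOS_LARGE_CORPUS:
            return "sentence-transformers/all-MiniLM-L6-v2"
        return "intfloat/e5-small-v2"
    if has_cuda:
        if chunk_count < GPU_LARGE_CORPUS:
            return "Qwen/Qwen3-Embedding-0.6B"
        return "BAAI/bge-base-en-v1.5"
    return FALLBACK_MODEL


def detect_platform() -> tuple[bool, bool]:
    """Return (is_macos, has_cuda); treats a missing torch install as no CUDA."""
    is_macos = sys.platform == "darwin"
    try:
        import torch  # optional dependency in this sketch

        has_cuda = bool(torch.cuda.is_available())
    except ImportError:
        has_cuda = False
    return is_macos, has_cuda
```

Keeping platform detection separate from the decision function means the table can be unit-tested without a GPU or a macOS machine.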

## Model references

- `sentence-transformers/all-MiniLM-L6-v2`: ~90MB, 384 dim, fast on CPU
- `intfloat/e5-small-v2`: ~90MB, 384 dim
- `Qwen/Qwen3-Embedding-0.6B`: 0.6B params, 1024 dim, strong retrieval
- `BAAI/bge-base-en-v1.5`: ~110M params, 768 dim, good MTEB scores

## Open questions

- Should we add an `--embedding-model auto` value so users can explicitly opt into this logic?
- Pre-scan vs post-chunk decision: trade-off between accuracy and implementation complexity
2 changes: 1 addition & 1 deletion packages/leann-core/src/leann/cli.py
@@ -2540,7 +2540,7 @@ async def search_documents(self, args):
json_results = [
{
"id": r.id,
-"score": r.score,
+"score": float(r.score),
"text": r.text,
"text": r.text,
"metadata": r.metadata,
}
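
The effect of the `float(...)` cast can be reproduced in isolation. The sketch below assumes the score was a NumPy `float32`, as the changelog entry suggests. To stay standard-library only, it uses a hypothetical stand-in class `Float32Like` that, like such a scalar, supports `float()` but is not a type `json` knows how to encode. The `to_json_result` helper and the `file_path` metadata key are also illustrative.

```python
import json


class Float32Like:
    """Hypothetical stand-in for a float32 scalar: float() works, json.dumps does not."""

    def __init__(self, value):
        self._value = value

    def __float__(self):
        return float(self._value)


def to_json_result(result_id, score, text, metadata):
    """Build one result dict, casting the score to a built-in float."""
    return {
        "id": result_id,
        "score": float(score),  # the cast this diff adds
        "text": text,
        "metadata": metadata,
    }


# Without the cast, the encoder rejects the unknown scalar type.
raw = {"id": "0", "score": Float32Like(0.5), "text": "hello", "metadata": {}}
try:
    json.dumps(raw)
    raw_serializable = True
except TypeError:
    raw_serializable = False

# With the cast, the same data serializes cleanly.
fixed = to_json_result("0", Float32Like(0.5), "hello", {"file_path": "a.py"})
encoded = json.dumps(fixed)
```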