feat: fastembed migration, daemon mode, TUI output, and associative linking #4
Merged
Conversation
Add fastembed crate for local ONNX-based embeddings (all-MiniLM-L6-v2), replacing dependency on external Ollama server. Update HTTP stack with hyper and tower for better socket and middleware support in daemon mode. Add chrono for better timestamp handling in server responses.
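The dependency changes above might look roughly like the following Cargo.toml fragment. This is an illustrative sketch; the version numbers and feature flags are assumptions, not taken from the PR diff.

```toml
[dependencies]
# Local ONNX embeddings (all-MiniLM-L6-v2), replacing the Ollama client.
fastembed = "3"            # version illustrative
# HTTP stack for daemon mode: socket handling and middleware.
hyper = { version = "1", features = ["http1", "server"] }  # features assumed
tower = "0.4"
# Timestamps in server responses.
chrono = "0.4"
```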
Rename section 16 from "Context Switching" to "Project Resolution" to reflect the architectural change. Remove references to the .active-ctx file and the context switch command. Update server documentation: the --root CLI flag is supplied at startup, and the /reload endpoint accepts an optional ?root=<path> to switch project context dynamically. Update the /health endpoint description.
Replace OllamaEmbedder with FastEmbedder for local ONNX-based embeddings. Fastembed downloads models to ~/.cache/fastembed/ on first use and provides thread-safe embeddings without external server overhead. Simplify EmbeddingConfig by removing host and model_path fields. Default model is all-MiniLM-L6-v2 (384 dims, 22MB). Supports model selection via config: all-MiniLM-L6-v2, all-MiniLM-L6-v2-q, all-MiniLM-L12-v2, BGE-small-en-v1.5, BGE-small-en-v1.5-q. Add refs field to Frontmatter for inter-layer edges (code chunk IDs or memory filenames). Add expand_refs and max_ref_expansions to RecallConfig to support ref-guided memory traversal.
Add evaluation module with three key metrics for assessing embedding distribution quality: - anisotropy: average pairwise cosine similarity (lower is better, < 0.3 target) - similarity_range: max - min off-diagonal similarity (higher is better, > 0.3 target) - discrimination_gap: intra-group vs inter-group similarity delta (higher is better, > 0.05 target) Includes helper for mean-centering and re-normalizing embeddings to reduce anisotropy.
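A minimal sketch of how these metrics could be computed. The function names mirror the eval module described above, but the bodies here are illustrative, not the PR's actual implementation:

```rust
/// Cosine similarity between two equal-length vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Anisotropy: mean pairwise cosine similarity over all distinct pairs.
/// Lower is better; an isotropic embedding space sits near zero.
fn anisotropy(embs: &[Vec<f32>]) -> f32 {
    let n = embs.len();
    let (mut sum, mut count) = (0.0f32, 0u32);
    for i in 0..n {
        for j in (i + 1)..n {
            sum += cosine(&embs[i], &embs[j]);
            count += 1;
        }
    }
    sum / count as f32
}

/// Subtract the mean embedding from every vector, then re-normalize to
/// unit length. Removing the common component reduces anisotropy.
fn mean_center(embs: &mut [Vec<f32>]) {
    let dim = embs[0].len();
    let n = embs.len() as f32;
    let mut mean = vec![0.0f32; dim];
    for e in embs.iter() {
        for (m, v) in mean.iter_mut().zip(e) { *m += v / n; }
    }
    for e in embs.iter_mut() {
        for (v, m) in e.iter_mut().zip(&mean) { *v -= m; }
        let norm: f32 = e.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 { for v in e.iter_mut() { *v /= norm; } }
    }
}
```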
Add fallback chunking strategy for file types without tree-sitter grammar support. Files now extract chunks gracefully using line-based chunking (MAX_CHUNK_LINES per chunk) instead of skipping silently.
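The fallback strategy amounts to fixed-size line windows. A minimal sketch, assuming a MAX_CHUNK_LINES of 64 (the actual constant lives in the index crate and may differ):

```rust
const MAX_CHUNK_LINES: usize = 64; // assumed value, for illustration

/// Split source text into fixed-size line windows for files with no
/// tree-sitter grammar. Returns (start_line, chunk_text) pairs, so no
/// file is skipped silently.
fn fallback_chunks(text: &str) -> Vec<(usize, String)> {
    let lines: Vec<&str> = text.lines().collect();
    lines
        .chunks(MAX_CHUNK_LINES)
        .enumerate()
        .map(|(i, window)| (i * MAX_CHUNK_LINES, window.join("\n")))
        .collect()
}
```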
Transform llmem-server from a simple HTTP wrapper into a full daemon with warm indices and hot-reload support. Add CLI argument parsing (clap) for --root and --addr options. Add remember endpoint (/remember?q=<query>&level=<level>&budget=<budget>) for semantic search with Hebbian reinforcement (access-count tracking). The endpoint integrates embedding stores with HNSW indices for efficient recall. Add reload endpoint (/reload?root=<path>) for hot-swapping project indices without restarting the daemon. Extend AppState to hold both project and global embedding stores alongside the indices for full semantic search capability.
Remove ctx switch/show commands (context switching moves to server --root and /reload endpoint). Add daemon integration: CLI now calls daemon_notify_reload() after memorize, note, consolidate operations to hot-reload indices. Remember command attempts daemon first (warm indices) before falling back to local file-based search. Add support for inter-layer refs in memorize command to track code chunk edges. Learn command now embeds chunks and builds HNSW indices with evaluation metrics. Consolidate command builds memory HNSW and preserves file_source as ref edges. Project root now resolved via --root CLI flag (default: current directory) instead of .active-ctx file. Update semantic search to use both .memory-index and .code-index HNSW files.
Update integration tests for the removal of the ctx command and for the config changes. Regenerate snapshots to reflect the new default embedding model (all-MiniLM-L6-v2 instead of nomic-embed-text) and the simplified EmbeddingConfig.
Update CLI README to document new command structure (memorize, note, remember, learn, consolidate, reflect, forget) replacing previous add/learn/recall/list/search API. Include new file locations for .code-index.hnsw and .memory-index.hnsw indices.
Update index module README to document eval module for embedding quality metrics (anisotropy, similarity_range, discrimination_gap, mean_center). Note plain-text fallback chunking for file types without supported tree-sitter grammars (shell scripts, markdown, TOML, etc.).
Update main README to document shift from Ollama embeddings to local fastembed (384-dim, ONNX-based, ~22MB model downloaded to ~/.cache/fastembed/). Explain three-layer HNSW architecture (code, memory, global) with inter-layer edges via refs frontmatter field. Document cross-layer recall, embedding quality metrics in learn output, and remove legacy context switching feature.
Add OutputConfig struct to support quiet mode for suppressing elapsed-time reporting. Add max_memory_tokens field to ConsolidationConfig to control memory body truncation. These options prepare the configuration layer for UI/UX improvements and memory size management.
Add comprehensive CLI reference documenting all commands, flags, and workflows. Add configuration reference listing all config options with descriptions and defaults. Update SKILL.md to document two-layer semantic search, edge expansion, associative linking, consolidation phases, and working memory mechanics. Include rules for saving memories and common gotchas.
- Add colored TUI output for remember, learn, and consolidate commands
- Add elapsed-time reporting via CommandTimer
- Add quiet flag (-q) to suppress timing output
- Interleave memory and code hits in semantic search
- Add associative linking phase during consolidation
- Expand refs during remember for edge traversal
- Truncate memory bodies to max_memory_tokens
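Truncation to max_memory_tokens might look like the sketch below. It assumes whitespace-delimited token counting; the real implementation may count tokens differently (e.g. with a model tokenizer):

```rust
/// Truncate a memory body to at most `max_tokens` tokens.
/// Simplification: tokens here are whitespace-delimited words.
fn truncate_tokens(body: &str, max_tokens: usize) -> String {
    let tokens: Vec<&str> = body.split_whitespace().collect();
    if tokens.len() <= max_tokens {
        body.to_string()
    } else {
        // Join the kept tokens and mark the cut point.
        format!("{} …", tokens[..max_tokens].join(" "))
    }
}
```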
Allow users to explicitly enable TUI mode by setting the LLMEM_TUI environment variable, in addition to the automatic detection via stderr being a terminal. This enables TUI mode in non-interactive environments and simplifies testing scenarios.
Add criterion to dev-dependencies for comprehensive benchmarking support. Add 'just bench' and 'just validate' targets to justfile for convenient testing and validation workflows.
Add benchmark tables for embedding store, inbox, memory index, and eval functions with TBD placeholders. Update README and benchmarks.md to reference 'just bench' for reproduction instead of raw 'cargo bench'.
Expand sr.yaml with detailed inline comments explaining each section. Enable hooks configuration and document the full feature set. Update version management settings and consolidate release pipeline config. Remove redundant test hook from pre-commit steps.
Implement temporal scoring that blends cosine similarity with memory recency and access patterns. Ported from training/src/models/temporal.py. Provides temporal_score function (decay by recency, boost by frequency, weight by type durability) and blend function for hybrid ranking. Export from lib.rs public API.
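The scoring idea can be sketched as follows. Constants (the 30-day half-life) and the exact decay/boost shapes are assumptions for illustration; the ported Python model may differ:

```rust
/// Exponential recency decay, boosted by access frequency and weighted by
/// a per-type durability factor, clamped to [0, 1].
fn temporal_score(age_days: f32, access_count: u32, durability: f32) -> f32 {
    let half_life_days = 30.0; // assumed decay half-life
    let recency = (-age_days * (2.0f32).ln() / half_life_days).exp();
    // Logarithmic frequency boost so heavy access saturates gracefully.
    let frequency = 1.0 + (1.0 + access_count as f32).ln();
    (recency * frequency * durability).min(1.0)
}

/// Blend cosine similarity with the temporal score for hybrid ranking.
fn temporal_blend(cosine: f32, temporal: f32, temporal_weight: f32) -> f32 {
    (1.0 - temporal_weight) * cosine + temporal_weight * temporal
}
```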
Add comprehensive benchmarks for core storage components using criterion. Measure EmbeddingStore (upsert, get, remove, save, load), Inbox (push to capacity, eviction, drain, save, load), MemoryIndex (parse, upsert, search), and content_hash performance across varying dimensions and item counts.
Add benchmarks for distance function performance (cosine similarity, dot product, L2 distance, normalization) and index operations. Parameterize by vector dimension to track performance across embeddings of different sizes.
Add regression guards for storage format fidelity. Validate that EmbeddingStore and Inbox binary/JSON roundtrips preserve data exactly. Test embedding store dimension/entry count, inbox capacity and item order, and eviction invariants to prevent silent data corruption.
Add regression guards for approximate nearest neighbor quality. Validate HNSW recall@10 >= 90%, IVF recall@10 >= 85% on synthetic data. Test embedding space metrics: anisotropy near zero for orthogonal vectors, discrimination gap > 0.05 for clustered data, mean centering reduces anisotropy.
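Recall@k, the quantity these guards threshold, is simple to state: the fraction of the exact top-k neighbors that the approximate index also returned. A sketch (the guard code in the PR may compute it differently):

```rust
use std::collections::HashSet;

/// Fraction of the exact top-k neighbor ids that also appear in the
/// approximate result's top k. Thresholded in regression tests
/// (e.g. HNSW recall@10 >= 0.90).
fn recall_at_k(exact: &[usize], approx: &[usize], k: usize) -> f32 {
    let truth: HashSet<_> = exact.iter().take(k).collect();
    let hits = approx.iter().take(k).filter(|id| truth.contains(id)).count();
    hits as f32 / k.min(exact.len()) as f32
}
```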
Add regression guards for vector quantization quality. Validate MSE roundtrip cosine similarity (2-bit >= 0.90, 4-bit >= 0.95). Test Prod inner product estimates bounded by error thresholds that scale with bit-width. Verify pack/unpack exact roundtrips across dimensions and bit-widths.
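For the pack/unpack roundtrip, a 2-bit packing might look like the sketch below. The bit ordering (low bits first within each byte) is an assumption; the PR's actual layout may differ:

```rust
/// Pack 2-bit codes (values 0..=3), four per byte, low bits first.
fn pack_2bit(codes: &[u8]) -> Vec<u8> {
    let mut out = vec![0u8; (codes.len() + 3) / 4];
    for (i, &c) in codes.iter().enumerate() {
        out[i / 4] |= (c & 0b11) << ((i % 4) * 2);
    }
    out
}

/// Unpack `n` 2-bit codes from packed bytes.
fn unpack_2bit(packed: &[u8], n: usize) -> Vec<u8> {
    (0..n).map(|i| (packed[i / 4] >> ((i % 4) * 2)) & 0b11).collect()
}
```

An exact roundtrip (`unpack_2bit(&pack_2bit(&codes), codes.len()) == codes`) is the invariant the regression guard asserts across dimensions and bit-widths.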
Integrate temporal scoring into memory search. Replace type-based sorting with temporal_blend that mixes cosine similarity with temporal score. Load temporal_weight from config quantization settings. Improves memory relevance by considering both semantic match and recency/frequency patterns.
Summary

- `llmem-server` with remember endpoint and state management
- `remember`, `learn`, `consolidate`, and `memorize` commands with elapsed-time reporting
- `remember` expands refs for edge traversal
- `output` and `consolidation` settings (quiet mode, max_memory_tokens, merge_threshold)

Test plan

- `llmem learn` with TUI output on a real repo
- `llmem remember` shows interleaved memory + code results
- `llmem consolidate` creates associative links between similar memories