refactor(cmd): remove force_reindex; restore progress notifications #65

Merged
aeneasr merged 18 commits into main from reindex-fixes
Mar 24, 2026
aeneasr commented Mar 23, 2026

Summary

  • Removes the force_reindex parameter from semantic_search — reindexing is now exclusively handled by the SessionStart hook and the background goroutine inside ensureIndexed
  • Restores buildProgressFunc and wires MCP progress notifications through the background indexing goroutine, fixing a regression where the Claude Code status indicator stopped animating during indexing
  • Adds structured logging to the background indexer (slog) and enriches Stats with per-file-change breakdown
  • Eliminates reindex fragmentation from multiple terminal sessions via flock-based deduplication and freshness TTL
  • Updates /lumen:reindex and /lumen:doctor skills to reflect the new reindexing model

Test Plan

  • go test ./... passes
  • make build-local succeeds
  • Opening a new terminal session triggers background reindex via SessionStart hook
  • During indexing triggered by semantic_search, the Claude Code status bar animates with file progress

🤖 Generated with Claude Code

aeneasr and others added 15 commits March 23, 2026 18:07
When reindexing takes longer than 15s, semantic_search returns stale
results with a warning instead of blocking the agent indefinitely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Buffered done channel (cap 1) to prevent goroutine leak on timeout
- Goroutine calls touchChecked on success for correct TTL behavior
- Nil progress func in goroutine (request ctx may be gone)
- Log errors from background EnsureFresh at Warn level
- sync.WaitGroup for graceful shutdown in Close()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
7-task plan with TDD approach: struct changes, WaitGroup, timeout
goroutine, formatSearchResults, and tests including a test hook
(ensureFreshFunc) to exercise the 15s timeout path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… reindex

EnsureFresh now runs in a goroutine. If it completes within 15s, results
are returned normally. If it exceeds the timeout, stale results are
returned immediately with a StaleWarning while reindexing continues in
the background (up to 10min). The goroutine acquires an exclusive flock
to avoid concurrent writes.
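
The timeout behaviour described above can be sketched as follows. This is a minimal illustration, not the PR's code: `result`, `ensureWithTimeout`, and the warning text are hypothetical names, the durations are shortened stand-ins for the 15s and 10min limits, and the flock acquisition is left out.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// result is a hypothetical stand-in for what ensureIndexed reports.
type result struct {
	Reindexed    bool
	StaleWarning string
}

// ensureWithTimeout runs refresh in a goroutine and waits up to wait for it.
// On timeout it returns a stale warning while refresh keeps running under its
// own deadline (bgLimit), detached from the caller.
func ensureWithTimeout(wait, bgLimit time.Duration, refresh func(ctx context.Context) error) (result, error) {
	done := make(chan error, 1) // capacity 1: the send never blocks after a timeout
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), bgLimit)
		defer cancel()
		done <- refresh(ctx)
	}()
	select {
	case err := <-done:
		if err != nil {
			return result{}, err
		}
		return result{Reindexed: true}, nil
	case <-time.After(wait):
		return result{StaleWarning: "index is being refreshed; results may be stale"}, nil
	}
}

// demo exercises both paths with short durations in place of 15s and 10min.
func demo() (fast, slow result) {
	quick := func(ctx context.Context) error { return nil }
	sluggish := func(ctx context.Context) error {
		select {
		case <-time.After(200 * time.Millisecond):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	fast, _ = ensureWithTimeout(100*time.Millisecond, time.Second, quick)
	slow, _ = ensureWithTimeout(20*time.Millisecond, time.Second, sluggish)
	return fast, slow
}

func main() {
	fast, slow := demo()
	fmt.Printf("fast: %+v\n", fast)
	fmt.Printf("slow: %+v\n", slow)
}
```

The buffered channel is what keeps the goroutine from leaking: once the caller has taken the timeout branch nobody reads from `done`, and an unbuffered send would block forever.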

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Go 1.25+ provides wg.Go() which simplifies goroutine tracking.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…exed

Add ensureFreshFunc test hook to indexerCache (follows existing
findDonorFunc/seedFunc pattern) and three new tests:

- TestEnsureIndexed_TimeoutReturnsStaleWarning: injects a slow
  EnsureFresh that exceeds the 15s timeout, verifies StaleWarning
  is returned and Reindexed=false.
- TestEnsureIndexed_FastEnsureFreshNoWarning: injects an instant
  EnsureFresh, verifies no warning and correct stats propagation.
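
A function-field test hook of the kind mentioned above might look roughly like this. Only the name `ensureFreshFunc` comes from the commit message; the `cache` type, its methods, and the signature are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// cache is a hypothetical, heavily reduced stand-in for indexerCache.
type cache struct {
	// ensureFreshFunc, when non-nil, replaces the real freshness check.
	// Only tests are expected to set it.
	ensureFreshFunc func(dir string) (reindexed bool, err error)
}

// realEnsureFresh stands in for the production code path.
func (c *cache) realEnsureFresh(dir string) (bool, error) {
	return false, errors.New("real indexer not available in this sketch")
}

// ensureFresh prefers the injected hook and falls back to the real path.
func (c *cache) ensureFresh(dir string) (bool, error) {
	if c.ensureFreshFunc != nil {
		return c.ensureFreshFunc(dir)
	}
	return c.realEnsureFresh(dir)
}

func main() {
	c := &cache{}
	_, err := c.ensureFresh("/repo")
	fmt.Println("without hook:", err)

	// A test would inject a slow or instant implementation here.
	c.ensureFreshFunc = func(dir string) (bool, error) { return true, nil }
	ok, err := c.ensureFresh("/repo")
	fmt.Println("with hook:", ok, err)
}
```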

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…or vec_chunks

Handles slow embedding batches and retries on SQLite contention
without timing out. INSERT OR REPLACE prevents duplicate key errors
when re-embedding chunks that already exist in the vector table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three root causes fixed:

1. SessionStart double-spawn and no freshness gate (hook.go):
   - Remove unconditional spawnBackgroundIndexer from runHookSessionStart;
     generateSessionContextInternal now owns all spawn decisions
   - After opening the DB for stats, check last_indexed_at: skip spawn
     when indexed within backgroundIndexStaleness (5 min), spawn when
     stale or never completed. Prevents every new terminal from triggering
     a full merkle walk.

2. Goroutine zero-result treated as "fresh" (stdio.go):
   - Add skipped bool to freshResult. When TryAcquire returns nil (TOCTOU
     race — another process grabbed the lock) or errors, send
     freshResult{skipped: true}. Main select now returns StaleWarning for
     skipped results, consistent with the IsHeld fast-path. Previously the
     zero result looked like "index is fresh", silently skipping
     touchChecked and causing the next search to immediately re-spawn.

3. Redundant merkle walk after lumen index finishes (stdio.go):
   - In the goroutine, after acquiring the flock, check idx.LastIndexedAt().
     If within freshnessTTL, call touchChecked() and return without calling
     EnsureFresh. Uses the DB timestamp as a shared cross-process freshness
     signal so the MCP server doesn't duplicate the walk just completed by
     the background indexer.
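
The freshness gate (fixes 1 and 3) and the skipped flag (fix 2) can be sketched as below. `backgroundIndexStaleness` and the `skipped` field are named in the commit message; `shouldSpawn`, `staleWarning`, the other struct field, and the warning text are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// backgroundIndexStaleness mirrors the 5 minute window from the commit message.
const backgroundIndexStaleness = 5 * time.Minute

// shouldSpawn reports whether a background index run is worth starting.
// A zero lastIndexedAt means no run has ever completed.
func shouldSpawn(lastIndexedAt, now time.Time) bool {
	if lastIndexedAt.IsZero() {
		return true
	}
	return now.Sub(lastIndexedAt) >= backgroundIndexStaleness
}

// freshResult is a reduced stand-in; only skipped is taken from the source.
type freshResult struct {
	reindexed bool
	skipped   bool
}

// staleWarning distinguishes "lock was taken elsewhere" from "index is fresh",
// so a zero-value result is no longer mistaken for a fresh index.
func staleWarning(r freshResult) string {
	if r.skipped {
		return "another process is indexing; results may be stale"
	}
	return ""
}

// demoGate evaluates the gate for: never indexed, indexed 1 minute ago,
// and indexed 10 minutes ago.
func demoGate() [3]bool {
	now := time.Date(2026, 3, 23, 12, 0, 0, 0, time.UTC)
	return [3]bool{
		shouldSpawn(time.Time{}, now),
		shouldSpawn(now.Add(-1*time.Minute), now),
		shouldSpawn(now.Add(-10*time.Minute), now),
	}
}

func main() {
	fmt.Println("spawn decisions:", demoGate())
	fmt.Printf("skipped warning: %q\n", staleWarning(freshResult{skipped: true}))
}
```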

Also fix pre-existing errcheck lint in tui/progress.go.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nge breakdown

- index.go: add newDebugLogger() for background path; log start, skip,
  cancel, error, and completion with full Stats fields; pass logger to
  Indexer via SetLogger() so indexWithTree can log the indexing plan
- index/index.go: add FilesAdded/FilesModified/FilesRemoved/Reason/
  OldRootHash/NewRootHash to Stats; populate them in Index, EnsureFresh,
  and indexWithTree; add SetLogger/logger field to Indexer
- hook_spawn_unix.go: discard stderr of background indexer (slog writes
  to debug.log; piping stderr would mix pterm progress into the log)
- search.go: pass nil logger to setupIndexer (interactive command)
- CLAUDE.md: document interactive vs background output strategy
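
As a rough sketch of the split between the background and interactive paths: the `Stats` field names and `SetLogger` are taken from the list above, while the field types, the reduced `Indexer`, `logDone`, and `render` are assumptions.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log/slog"
)

// Stats carries the per-file-change breakdown. Field names come from the
// commit message; the types are assumptions.
type Stats struct {
	FilesAdded, FilesModified, FilesRemoved int
	Reason                                  string
	OldRootHash, NewRootHash                string
}

// Indexer is a hypothetical, reduced stand-in for the real type.
type Indexer struct{ logger *slog.Logger }

// SetLogger installs a logger; nil (the interactive path) silences logging.
func (ix *Indexer) SetLogger(l *slog.Logger) {
	if l == nil {
		l = slog.New(slog.NewTextHandler(io.Discard, nil))
	}
	ix.logger = l
}

// logDone writes one structured entry with every Stats field.
func (ix *Indexer) logDone(s Stats) {
	ix.logger.Info("indexing complete",
		"files_added", s.FilesAdded,
		"files_modified", s.FilesModified,
		"files_removed", s.FilesRemoved,
		"reason", s.Reason,
		"old_root_hash", s.OldRootHash,
		"new_root_hash", s.NewRootHash,
	)
}

// render returns what would be logged for s. withLogger=false models the
// interactive command, which passes a nil logger.
func render(s Stats, withLogger bool) string {
	var buf bytes.Buffer
	ix := &Indexer{}
	if withLogger {
		ix.SetLogger(slog.New(slog.NewTextHandler(&buf, nil)))
	} else {
		ix.SetLogger(nil)
	}
	ix.logDone(s)
	return buf.String()
}

func main() {
	s := Stats{FilesAdded: 2, FilesModified: 1, Reason: "root hash changed"}
	fmt.Print("background: ", render(s, true))
	fmt.Printf("interactive: %q\n", render(s, false))
}
```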

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aeneasr changed the title from "feat: non-blocking semantic_search with 15s timeout" to "feat: non-blocking semantic_search, reindex fragmentation fixes, and indexing diagnostics" on Mar 23, 2026
The force_reindex parameter on semantic_search is removed. Reindexing is
exclusively triggered by the SessionStart hook and by the background
goroutine inside ensureIndexed.

Progress notifications are restored and now flow through the background
goroutine path so the Claude Code status indicator animates during indexing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aeneasr changed the title from "feat: non-blocking semantic_search, reindex fragmentation fixes, and indexing diagnostics" to "refactor(cmd): remove force_reindex; restore progress notifications" on Mar 24, 2026
aeneasr and others added 2 commits March 24, 2026 08:19
Enrich the "indexing plan" slog entry with:
- old_root_hash: stored merkle root before this run
- new_root_hash: computed merkle root from current filesystem
- main_worktree: main git repo root (only when projectDir is a worktree)
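
One way to attach an attribute only when it applies is to build the attribute list first. The keys and the "indexing plan" message are from the commit message; `planAttrs` and the sample values are hypothetical.

```go
package main

import (
	"log/slog"
	"os"
)

// planAttrs builds the attributes for the "indexing plan" entry.
// main_worktree is added only when mainWorktree is non-empty, i.e. when the
// project directory is a git worktree.
func planAttrs(oldRoot, newRoot, mainWorktree string) []any {
	attrs := []any{
		slog.String("old_root_hash", oldRoot),
		slog.String("new_root_hash", newRoot),
	}
	if mainWorktree != "" {
		attrs = append(attrs, slog.String("main_worktree", mainWorktree))
	}
	return attrs
}

func main() {
	logger := slog.New(slog.NewTextHandler(os.Stderr, nil))
	logger.Info("indexing plan", planAttrs("abc123", "def456", "")...)
	logger.Info("indexing plan", planAttrs("abc123", "def456", "/path/to/main/repo")...)
}
```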

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When SQLite reports "database disk image is malformed" or "disk I/O
error", the index is permanently broken until manually purged. Every
subsequent semantic_search call would fail with the same error because
touchChecked is never set and each retry hits the same corrupted file.

This change adds automatic recovery at two layers:

- store.New: if open/schema-setup fails with a corruption error, delete
  the DB file and its WAL/SHM sidecars and retry once from a clean state.
  In-memory databases are never deleted.

- Indexer.EnsureFresh / Index: if indexWithTree returns a corruption
  error mid-operation, log ERROR "corrupted database detected, rebuilding",
  call rebuildStore() (close → delete files → reopen), then retry with an
  empty stored hash so the fresh DB receives a full index pass.

Adds IsCorruptionErr(err) to the store package as the single source of
truth for what constitutes a SQLite corruption error.
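
A minimal sketch of the detect, purge, and retry-once flow follows. `IsCorruptionErr` is named in the commit message, but matching on the error text is an assumption about how it works; `openWithRecovery` and `demoRecovery` are hypothetical, and the in-memory database exemption is not modelled.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// IsCorruptionErr reports whether err looks like SQLite corruption.
// Assumption: classification by the two messages quoted in the commit text.
func IsCorruptionErr(err error) bool {
	if err == nil {
		return false
	}
	msg := err.Error()
	return strings.Contains(msg, "database disk image is malformed") ||
		strings.Contains(msg, "disk I/O error")
}

// openWithRecovery calls open; on a corruption error it calls purge
// (delete the DB file and its WAL/SHM sidecars) and retries exactly once.
func openWithRecovery(open, purge func() error) error {
	err := open()
	if !IsCorruptionErr(err) {
		return err
	}
	if perr := purge(); perr != nil {
		return fmt.Errorf("purge after %v: %w", err, perr)
	}
	return open()
}

// demoRecovery simulates a store that is corrupted until it is purged.
func demoRecovery() (attempts int, purged bool, err error) {
	open := func() error {
		attempts++
		if !purged {
			return errors.New("database disk image is malformed")
		}
		return nil
	}
	purge := func() error { purged = true; return nil }
	err = openWithRecovery(open, purge)
	return attempts, purged, err
}

func main() {
	attempts, purged, err := demoRecovery()
	fmt.Println("attempts:", attempts, "purged:", purged, "err:", err)
}
```

Retrying only once keeps a persistently failing disk from turning into an endless delete-and-reopen loop.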

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aeneasr merged commit 652b418 into main on Mar 24, 2026
4 checks passed