
fix(index): cap parent index walk at git repo boundary + store perf improvements #58

Merged
aeneasr merged 4 commits into main from fix-consta-index
Mar 18, 2026

aeneasr (Member) commented Mar 18, 2026

Summary

  • fix: findEffectiveRoot now stops walking parent directories at the git repository boundary, preventing excessive scanning when opening an index from a subdirectory of a large monorepo
  • test: Integration test TestIndexerCache_FindEffectiveRoot_GitBoundary validates the boundary protection using a real git repo created in a temp directory
  • perf: Add idx_chunks_symbol index on chunks(symbol) to speed up symbol-based lookups; run ANALYZE after each batch insert so the SQLite query planner has accurate statistics immediately

Schema note

No IndexVersion bump needed — CREATE INDEX IF NOT EXISTS automatically adds the new index to existing databases on first open without incompatibility.

Test plan

  • go test ./... — all packages pass
  • Integration test creates a real temp git repo and verifies boundary protection
  • Index existence test covers both idx_chunks_file_path and idx_chunks_symbol

🤖 Generated with Claude Code

aeneasr and others added 4 commits March 18, 2026 16:01
Prevents lumen from adopting ancestor indexes outside the current git
repository (e.g. a GOPATH-level index) when EnsureFresh is called from
a subdirectory. Without this cap, the upward walk in findEffectiveRoot
could reach indexes containing the entire workspace, causing excessive
scanning and high GPU/fan usage.
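
The capped walk described above can be sketched roughly as follows. This is a minimal illustration, not the actual findEffectiveRoot: the name findRootCapped, the hasIndex callback, and the choice to adopt the nearest indexed ancestor are all assumptions made for the example.

```go
package main

import (
	"path/filepath"
	"strings"
)

// isWithin reports whether dir is boundary itself or a descendant of it.
func isWithin(boundary, dir string) bool {
	rel, err := filepath.Rel(boundary, dir)
	if err != nil {
		return false
	}
	if rel == "." {
		return true
	}
	return rel != ".." && !strings.HasPrefix(rel, ".."+string(filepath.Separator))
}

// findRootCapped (hypothetical) walks upward from start and returns the first
// directory for which hasIndex is true. It never inspects a directory above
// boundary. If nothing inside the boundary has an index, it returns start.
func findRootCapped(start, boundary string, hasIndex func(dir string) bool) string {
	start = filepath.Clean(start)
	boundary = filepath.Clean(boundary)

	dir := start
	for isWithin(boundary, dir) {
		if hasIndex(dir) {
			return dir
		}
		parent := filepath.Dir(dir)
		if parent == dir { // filesystem root reached
			break
		}
		dir = parent
	}
	return start
}
```

With boundary set to the repository root, an index that lives in a workspace directory above the repository is never considered, which is the behavior this commit is after.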

Adds git.RepoRoot() utility that resolves the repository root via
git rev-parse --show-toplevel with symlink resolution.
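
A utility along these lines might look like the sketch below. The function name repoRoot and its signature are assumptions standing in for git.RepoRoot(), whose real signature is not shown here; the sketch only relies on git rev-parse --show-toplevel, the -C flag of git, and filepath.EvalSymlinks from the Go standard library.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
	"strings"
)

// repoRoot (hypothetical stand-in for git.RepoRoot) asks git for the top-level
// directory of the repository containing dir, then resolves symlinks so the
// result can be compared against other resolved paths.
func repoRoot(dir string) (string, error) {
	out, err := exec.Command("git", "-C", dir, "rev-parse", "--show-toplevel").Output()
	if err != nil {
		// Covers "not a git repository", a missing directory, and a missing git binary.
		return "", fmt.Errorf("git rev-parse --show-toplevel in %s: %w", dir, err)
	}
	root := strings.TrimSpace(string(out))

	resolved, err := filepath.EvalSymlinks(root)
	if err != nil {
		return "", fmt.Errorf("resolve symlinks for %s: %w", root, err)
	}
	return resolved, nil
}
```

Resolving symlinks matters on systems where temp directories are symlinked (for example /tmp on macOS), since an unresolved root would not compare equal to a resolved working directory.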

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds an integration test that creates a real git repository and a fake
ancestor index DB above the repo root, then asserts findEffectiveRoot
does not walk above the git boundary to adopt it.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add idx_chunks_symbol index to speed up symbol-based lookups (used in
search queries that filter by function/type name). Also run ANALYZE
after each batch commit so the SQLite query planner has up-to-date
statistics from the start.

No schema version bump needed — CREATE INDEX IF NOT EXISTS applies the
new index to existing databases automatically on first open.
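
To illustrate why no version bump is needed, here is a minimal sketch of an idempotent schema step. The names schemaStmts, applySchema, and ensureSchema are hypothetical and do not come from the store package; only the index name and the chunks(symbol) target are taken from this commit. No SQLite driver is imported, so the caller is assumed to supply an already opened *sql.DB.

```go
package main

import "database/sql"

// schemaStmts lists statements that are safe to run on every open.
// IF NOT EXISTS makes the statement a no-op once the index is present.
var schemaStmts = []string{
	`CREATE INDEX IF NOT EXISTS idx_chunks_symbol ON chunks(symbol)`,
}

// applySchema (hypothetical) runs every schema statement through exec and
// stops at the first error.
func applySchema(exec func(query string) error) error {
	for _, stmt := range schemaStmts {
		if err := exec(stmt); err != nil {
			return err
		}
	}
	return nil
}

// ensureSchema (hypothetical) adapts applySchema to a *sql.DB.
func ensureSchema(db *sql.DB) error {
	return applySchema(func(query string) error {
		_, err := db.Exec(query)
		return err
	})
}
```

Because the same statements run on fresh and existing databases alike, an older database picks up the index the first time it is opened by the new code.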

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

idx_chunks_symbol was not useful: TopSymbols uses GROUP BY symbol ORDER BY
count(*) DESC, which SQLite cannot accelerate with a column index because
the sort key is an aggregate computed after grouping. The index added write
overhead on every chunk INSERT with no measurable read benefit.

ANALYZE is now called once per full index pass (in indexWithTree, after the
final flushBatch) via a new Store.Analyze() method, rather than after every
32-row batch. Running ANALYZE dozens of times per re-index was wasteful I/O;
once at the end gives the query planner accurate statistics at the right time.
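
The change in ANALYZE placement can be sketched as follows. This is an illustration only: storeFuncs and indexAll are hypothetical names, rows are plain strings rather than real chunks, and the batch size of 32 is taken from the description above.

```go
package main

// flushBatchSize mirrors the 32-row batch size mentioned in this commit.
const flushBatchSize = 32

// storeFuncs (hypothetical) bundles the two store operations the sketch needs.
type storeFuncs struct {
	insertBatch func(rows []string) error // stands in for flushBatch
	analyze     func() error              // stands in for Store.Analyze()
}

// indexAll (hypothetical) writes rows in fixed-size batches and runs analyze
// exactly once, after the final batch, instead of once per batch.
func indexAll(rows []string, s storeFuncs) error {
	for start := 0; start < len(rows); start += flushBatchSize {
		end := start + flushBatchSize
		if end > len(rows) {
			end = len(rows)
		}
		if err := s.insertBatch(rows[start:end]); err != nil {
			return err
		}
	}
	return s.analyze()
}
```

For 100 rows this performs four batch inserts and a single analyze call, where the earlier approach would have run ANALYZE four times.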

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
aeneasr enabled auto-merge (squash) March 18, 2026 15:19
aeneasr merged commit c5074e2 into main Mar 18, 2026
4 checks passed