"The map is not the territory" - Alfred Korzybski
indexion is a source code exploration and documentation tool that helps you build dynamic maps of your codebase.
- Similarity Analysis: Find duplicated or similar code patterns
- Refactoring Planning: Generate actionable refactoring checklists
- Documentation Generation: KGF-based intelligent doc extraction
- Multi-language Support: Extensible via KGF specifications
Analyze similarity across files in a directory.
Explores a directory and calculates pairwise similarity between all files. Useful for understanding code patterns and finding potential duplications.
indexion explore [options] <directory>Generate refactoring plan based on file similarity analysis.
Analyzes files in a directory, identifies similar code patterns using TF-IDF or NCD algorithms, and generates a Markdown checklist for refactoring candidates. Detects duplicate code blocks with function-name annotation via KGF tokenization (language-agnostic).
indexion plan refactor [options] <directory>Detect unnecessary wrapper functions and plan their removal.
Scans source files for wrapper functions whose body is a single delegation call with no added logic. Callers should use the delegate directly. Supports report, dry-run preview, and auto-fix modes.
indexion plan unwrap [options] <directory>
indexion plan unwrap --dry-run --include='*.mbt' src/
indexion plan unwrap --fix --include='*.mbt' --exclude='*_wbtest.mbt' src/indexion plan solid builds a solidification plan for extracting duplicated code from multiple source directories into a shared target.
The command parses --from, --to, --rules, --rule, --threshold, --strategy, --include, --exclude, --output, and --format, then:
- collects files from the source directories
- computes cross-package similarity
- applies extraction rules to matched files
- emits a
SolidPlanJSONsummary - converts that summary into a
PlanDocument
The intermediate plan is organized around ExtractionGroup, ExtractionFile, UnmatchedFile, SolidConfigJSON, and SolidSummaryJSON.
indexion plan solid --from=pkg1/,pkg2/ --to=components/
indexion plan solid --from=pkg1/,pkg2/ --rules=.solidrc
indexion plan solid --from=pkg1/,pkg2/ --to=components/ --rule="auth/** -> auth/"Important options:
--from=DIRS: comma-separated source directories--to=DIR: target directory for extracted code--rules=FILE: load rules from a rules file--rule=RULE: add inline extraction rules--threshold=FLOAT: minimum similarity score--strategy=NAME: similarity strategy such astfidf,apted, ortsed--include=PATTERN/--exclude=PATTERN: filter collected files--output=FILE: write the rendered plan to a file--format=md|json|github-issue: choose the final renderer
Rule syntax is pattern -> target. Relative targets use --to, and absolute targets can use an @pkg/... prefix.
indexion plan reconcile detects drift candidates between implementation and documentation.
Its statuses are suggestion classes, not proofs. The goal is to surface likely review work with evidence and reuse review decisions, not to fully prove semantic consistency.
It builds a code graph from the target directory, extracts code symbols and inline docs, and matches them against external document fragments. Document extraction is KGF-based, so Markdown, plaintext, TOML, and other detectable document specs can participate in the same scan.
Direct symbol-name matches come from KGF feature metadata. Specs can declare reference_token_kinds, programming specs can declare document_symbol_kinds, and both code and document specs can declare coverage_token_kinds for module coverage, so reconcile does not need per-format token-kind guesses or command-local symbol-kind whitelists for supported languages.
Matching is symbol-first, then module-scoped. If a README or design section clearly covers a whole module, reconcile suppresses per-symbol missing_doc noise for that module and reports the coverage in the summary instead.
When many symbol candidates concentrate in one module or project group, the report also emits aggregate review groups so the first action can be "review this module/project doc set" instead of triaging each symbol independently.
Code discovery respects KGF ignore patterns, so language-specific test files and generated artifacts can be excluded by the active spec set instead of by command-local suffix rules.
Cross-package behavior is regression-tested with the TypeScript fixture at fixtures/project/typescript-reconcile, so package-doc scans are validated against non-MoonBit inputs as well.
The command does not rewrite code or docs. Its job is to produce a mechanically derived report with timestamp evidence, mapping confidence, and a logical review queue for follow-up.
At a high level, the flow is:
- Discover source files and document files under the target directory.
- Extract code symbols and document fragments.
- Match fragments to symbols with symbol-first heuristics and module-scope coverage fallback.
- Compare git and mtime evidence to classify drift.
- Persist manifest, report, and DB state to avoid rechecking unchanged candidates.
indexion plan reconcile [options] <directory>Common examples:
# Scan the current project and print JSON
indexion plan reconcile .
# Render a human-readable report
indexion plan reconcile --format=md cmd/indexion/plan/reconcile
# Audit package docs with the built-in preset
indexion plan reconcile --scope=package-docs cmd/indexion/plan/reconcile
# Audit README/docs across a subtree
indexion plan reconcile --scope=tree-docs cmd/indexion/plan
# Audit package docs only (README + docs/)
indexion plan reconcile \
--doc='README.md' \
--doc='docs/**/*.md' \
--doc-spec=markdown \
cmd/indexion/plan/reconcile
# Restrict document inputs
indexion plan reconcile \
--doc='docs/**/*.md' \
--doc='spec/**/*.toml' \
--doc-spec=markdown \
--doc-spec=toml \
.
# Apply logical review decisions from a JSON file
indexion plan reconcile \
--review-results=.indexion/cache/reconcile/reviews.json \
.Main CLI options:
| Option | Purpose | Default |
|---|---|---|
--format=json|md|github-issue |
Select report renderer | json |
--output=FILE, -o= |
Write report to a file | stdout |
--scope=custom|package-docs|tree-docs |
Apply common doc-audit presets | custom |
--specs=DIR |
Override KGF spec directory | auto-detect |
--index-dir=DIR |
Override reconcile cache directory | .indexion/cache/reconcile |
--config=FILE |
Load explicit .indexion.toml or .json |
auto-discover |
--doc=GLOB |
Limit document paths, repeatable | all detectable docs |
--doc-spec=NAME |
Limit document specs, repeatable | all detected specs |
--threshold-seconds=N |
Tolerated time skew before drift | 60 |
--max-candidates=N |
Max report candidates | 200 |
--no-file-fallback |
Disable basename fallback matching | off |
--mtime-only |
Ignore git timestamps and use mtimes only | off |
--logical-review=queue|off |
Enable or disable logical review queueing | queue |
Best practice is to start with JSON output, inspect candidate IDs and review keys, and then switch to Markdown or GitHub Issue rendering for human workflows.
Notes:
- Leaving
--docunset scans all detectable document specs under the target directory, not just Markdown. --scope=package-docsexpands toREADME.md+docs/**/*.mdwithdoc_specs = ["markdown"]unless explicit--docor--doc-specvalues are provided.--scope=tree-docsexpands to**/README.md+**/docs/**/*.mdwithdoc_specs = ["markdown"]unless explicit filters are provided.docs/**/*.mdmatches bothdocs/api.mdand nested paths likedocs/reference/api.md.
Extract documentation from source files and generate README files.
Extracts /// documentation comments from MoonBit source files and
outputs them in various formats. Supports flexible package discovery
with include/exclude patterns.
indexion doc readme [options] [paths...]Calculate text similarity and distance between two texts.
Computes similarity/distance metrics between two text inputs using various algorithms (TF-IDF cosine similarity, NCD compression distance, or a weighted hybrid).
indexion sim [options] <text1> <text2>curl -fsSL https://raw.githubusercontent.com/trkbt10/indexion/main/install.sh | bashInstalls to ~/.indexion/ with KGF language specs. Add to PATH:
export PATH="$HOME/.indexion/bin:$PATH"Download from Releases:
| Platform | Archive |
|---|---|
| Linux x64 | indexion-linux-x64.tar.gz |
| macOS ARM64 | indexion-darwin-arm64.tar.gz |
| Windows x64 | indexion-windows-x64.zip |
Each archive contains:
indexionbinarykgfs/directory (60+ language specifications)
Extract and move to your preferred location:
tar -xzf indexion-darwin-arm64.tar.gz
mv indexion-darwin-arm64/indexion ~/.local/bin/
mv indexion-darwin-arm64/kgfs ~/.indexion/git clone https://github.com/trkbt10/indexion.git
cd indexion
moon build --target native --release
# Binary: _build/native/release/build/cmd/indexion/indexion.exeindexion searches for KGF specs in this order:
- Explicit specs directory CLI option for the command
INDEXION_KGFS_DIRenvironment variable[global].kgfs_dirin global configkgfs/in the target project directorykgfs/in the current working directory- OS-standard data directory
.../kgfs/when non-empty
- MoonBit toolchain (for building from source)
- No runtime dependencies
Apache License 2.0