DocImp

Impact-Driven Documentation Coverage Tool

DocImp analyzes your Python, TypeScript, and JavaScript codebases to identify undocumented code, prioritizes it by impact score, and uses Claude AI to generate high-quality documentation with validation gates.

Why DocImp?

Documentation is critical but often neglected. The challenge isn't just writing docs—it's knowing what to document first.

DocImp solves this by:

Prioritizing by Impact: Complex code gets documented before simple helpers (public/private API detection is planned; see Future Enhancements)
Supporting Multiple Languages: Python, TypeScript, and JavaScript as first-class citizens
Validating AI Output: Plugins catch errors before accepting generated documentation
Making it Interactive: Iterative workflow with context management
Local Analysis: Analyze and audit locally to save time and API costs—only send selected items to Claude

Problem: Your codebase has 500 undocumented functions. Where do you start?

Solution: DocImp analyzes cyclomatic complexity and calculates impact scores (0-100). Focus on what matters.

Features

Core Capabilities

Polyglot Analysis: Parse Python (AST), TypeScript, JavaScript (with JSDoc validation)
Smart Prioritization: Impact scoring based on cyclomatic complexity
AI-Powered Suggestions: Claude generates context-aware documentation
Validation Gates: JavaScript plugins validate JSDoc types, style, and correctness
Interactive Workflow: Step-by-step improvement with progress tracking
Multiple Module Systems: ESM, CommonJS, mixed codebases supported
Real JSDoc Type-Checking: Uses TypeScript compiler for validation, not just parsing
Graceful Error Handling: Continues analyzing valid files when encountering syntax errors

Language Support

Language	Parser	Documentation Style	Validation
Python	AST (built-in)	NumPy, Google, Sphinx	Ruff integration
TypeScript	TS Compiler	JSDoc	Full type-checking
JavaScript	TS Compiler (checkJs)	JSDoc	Parameter/type validation
Other files	Skipped	N/A	N/A

JavaScript Excellence

DocImp treats JavaScript as a first-class language, not just "TypeScript that parses .js files":

Real JSDoc Validation: Uses TypeScript compiler with checkJs: true to validate JSDoc against actual function signatures
Module System Detection: Automatically detects ESM (export/import) vs CommonJS (module.exports)
Export Pattern Recognition: Tracks named exports, default exports, re-exports
Smart Writing: Correctly inserts JSDoc above functions, arrow functions, classes, and object methods

Quick Start

Prerequisites: Install uv first:

# macOS (Homebrew)
brew install uv

# Or use the official installer (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install from source
git clone https://github.com/nikblanchet/docimp.git
cd docimp

# Install Python dependencies with uv
uv venv
uv pip sync requirements-dev.lock
uv pip install -e .

# Install TypeScript CLI
cd cli
npm install
npm run build
npm link

# Analyze your codebase
docimp analyze ./src

# Output:
# ┌──────────────────────────────────────────┐
# │  Documentation Coverage Analysis         │
# ├──────────────────────────────────────────┤
# │  Overall:        45.2% (23/51 documented)│
# │                                          │
# │  By Language:                            │
# │  • Python:       60.0% (12/20)           │
# │  • TypeScript:   50.0% (8/16)            │
# │  • JavaScript:   20.0% (3/15) ⚠         │
# └──────────────────────────────────────────┘

# Advanced Analysis Options

# Incremental analysis (faster for large codebases)
# Only re-analyzes files modified since last run
docimp analyze ./src --incremental

# Apply previous audit ratings to analysis items
# Useful for regenerating impact scores with quality data
docimp analyze ./src --apply-audit

# Control auto-clean behavior
docimp analyze ./src --preserve-audit  # Keep audit.json without prompting (the default is to prompt)
docimp analyze ./src --force-clean     # Skip prompt, always clean

# Combine flags
docimp analyze ./src --incremental --apply-audit --preserve-audit

# Generate improvement plan
docimp plan ./src

# Output:
# High Priority (≥70 impact score):
# 1. PaymentService.processPayment (score: 92)
# 2. AuthRepository.validateToken (score: 87)
# 3. UserService.createUser (score: 81)
#
# 15 high-priority items found

# Interactive improvement
export ANTHROPIC_API_KEY=sk-ant-...
docimp improve ./src

# Interactive workflow:
# 1. Provide your documentation style preferences (or use defaults)
# 2. Select item to document
# 3. Claude generates suggestion
# 4. Plugins validate (catches errors!)
# 5. Accept/Edit/Regenerate
# 6. Write back to file
# 7. Track progress

Installation

Prerequisites

Python: 3.13 (untested on other versions)
Node.js: 24 or later (required by package engines field)
Git: 2.28 or later (Git CLI is required for rollback/undo functionality)
Claude API Key: From console.anthropic.com

Note on Git: DocImp uses the Git CLI to track documentation changes for rollback capability. The transaction system requires Git 2.28+ (released July 2020) for improved working tree handling with --git-dir and --work-tree flags. Check your version with git --version. If Git is not installed or is older than 2.28, DocImp will run without rollback features (graceful degradation). See the Git installation guide for installation instructions. Future versions will include Dulwich (Apache 2.0 license) as a pure-Python fallback.

Install from Source

Prerequisites: Install uv:

# macOS (Homebrew)
brew install uv

# Linux/macOS (official installer)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Clone repository
git clone https://github.com/nikblanchet/docimp.git
cd docimp

# Install Python dependencies with uv
uv venv
uv pip sync requirements-dev.lock
uv pip install -e .

# Install TypeScript CLI
cd cli
npm install
npm run build
npm link

# Verify installation
docimp --version

# Verify Git is available (for rollback features)
git --version

# Set API key
export ANTHROPIC_API_KEY=sk-ant-...

Usage

Analyze

Analyze documentation coverage across your codebase.

# Analyze directory (uses docimp.config.js if present)
docimp analyze ./src

Output includes:

Overall coverage percentage
Coverage by language (Python/TypeScript/JavaScript/Skipped)
List of undocumented items sorted by impact
Complexity metrics

Audit

Rate existing documentation quality with code context displayed.

docimp audit ./src

Interactive workflow:

Reviews items that HAVE documentation
Displays code alongside documentation in configurable modes:
- Complete: Show full code with line numbers
- Truncated (default): Show first 20 lines, [C] to view full code
- Signature: Show function/class signature only, [C] to view full code
- On-demand: Hide code, [C] to view when needed
Shows boxed "CURRENT DOCSTRING" to clearly identify what's being rated
Prompts: [1-4] for quality rating, [C] for full code (if applicable), [S] to skip, [Q] to quit
- 1 = Terrible, 2 = OK, 3 = Good, 4 = Excellent
- C = Show full code (only in truncated/signature/on-demand modes)
- S = Skip (saves null for later review)
- Q = Quit (stops audit)
Calculates weighted coverage score
Saves results to .docimp/session-reports/audit.json

Example output:

Auditing: 5/23
function calculateImpactScore (typescript)
Location: src/scoring/scorer.ts:45
Complexity: 8

┌──────────────────────────────────────────────────────────┐
│ CURRENT DOCSTRING                                        │
├──────────────────────────────────────────────────────────┤
│ /**                                                      │
│  * Calculate impact score based on complexity.           │
│  * @param complexity - Cyclomatic complexity             │
│  * @returns Impact score (0-100)                         │
│  */                                                      │
└──────────────────────────────────────────────────────────┘
  45 | function calculateImpactScore(complexity: number): number {
  46 |   const baseScore = complexity * 5;
  47 |   return Math.min(100, baseScore);
  48 | }

[1] Terrible  [2] OK  [3] Good  [4] Excellent
[C] Full code  [S] Skip  [Q] Quit

Resuming Interrupted Audit Sessions

DocImp automatically saves your progress after each rating. If your audit is interrupted, you can easily resume where you left off.

# Auto-detection: Prompts to resume if session exists
docimp audit ./src

# Explicit resume of latest session
docimp audit ./src --resume

# Explicit resume of specific session
docimp audit ./src --resume abc12345

# Start fresh session (bypass auto-detection)
docimp audit ./src --new

# Clear latest session and exit
docimp audit --clear-session

Smart File Invalidation: When resuming, DocImp automatically detects if source files have been modified since the session started. Modified files are re-analyzed, and you'll see a warning with the count of changed files. Your previous ratings for unchanged files are preserved.

Session State: Each session saves:

Current position in the audit queue
All ratings completed so far (1-4 or skipped)
File snapshots for modification detection
Display configuration (showCode mode, maxLines)

Plan

Generate prioritized improvement plan.

docimp plan ./src

Plan includes:

Items sorted by impact score
Categorized by priority (High/Medium/Low)
Coverage improvement projection

Status

View workflow state and get actionable suggestions for next steps.

# Formatted output (default)
docimp status

# Raw JSON for scripting/automation
docimp status --json

Status display shows:

Command execution history: Which commands have been run (analyze, audit, plan, improve)
Timestamps: When each command was last executed (e.g., "2h ago", "30m ago")
Item and file counts: Number of items processed and files tracked per command
Staleness warnings: Alerts when data is outdated (e.g., "audit is stale - analyze re-run since audit")
File modifications: Count of files changed since last analyze
Actionable suggestions: Recommended next steps (e.g., "Run 'docimp audit' to rate documentation quality")

Example formatted output:

Workflow State (.docimp/workflow-state.json)

Command    Status   Last Run    Items
─────────────────────────────────────
✓ analyze  run      2h ago      23 items, 5 files
✗ audit    not run  —           —
✗ plan     not run  —           —
✗ improve  not run  —           —

Suggestions:
  → Run 'docimp audit <path>' to rate documentation quality

JSON output (for automation):

docimp status --json | jq '.file_modifications'
# Output: 3

docimp status --json | jq '.commands[] | select(.command == "analyze") | .item_count'
# Output: 23
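
The same fields can also be read from a script. The following is a minimal Python sketch of the two jq queries above. The payload is a hand-written sample whose shape (file_modifications, commands[].command, commands[].item_count) is inferred from those queries, and the helper names item_count and needs_reanalysis are hypothetical, not part of DocImp.

```python
import json
from typing import Optional

# Sample payload; its shape is an assumption inferred from the jq queries above.
SAMPLE_STATUS = """
{
  "file_modifications": 3,
  "commands": [
    {"command": "analyze", "item_count": 23},
    {"command": "audit", "item_count": 0}
  ]
}
"""


def item_count(status: dict, command: str) -> Optional[int]:
    """Return item_count for the named command, or None if it is absent."""
    for entry in status.get("commands", []):
        if entry.get("command") == command:
            return entry.get("item_count")
    return None


def needs_reanalysis(status: dict) -> bool:
    """Treat any modified file as a reason to re-run analysis."""
    return status.get("file_modifications", 0) > 0


status = json.loads(SAMPLE_STATUS)
print(item_count(status, "analyze"))
print(needs_reanalysis(status))
```

In a real script the payload would come from the stdout of docimp status --json rather than from a constant.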

Use cases:

Interactive: Quick workflow state check before starting work
Interactive: Understand what commands have been run and when
Interactive: Get reminders about stale data that needs refreshing
Interactive: See suggested next actions in the workflow
Automation: CI/CD workflows checking for stale data
Automation: Shell scripts triggering re-analysis when files change
Automation: Integration with monitoring/alerting systems

Workflow State Management

DocImp tracks command execution history, file checksums, and dependencies in .docimp/workflow-state.json. This enables incremental re-analysis, automatic staleness detection, and intelligent workflow validation.

Incremental Analysis

Re-analyze only files that have changed since the last analysis run. Typical time savings: 90-95% for large codebases with minor changes.

# Initial analysis: 100 files, 30 seconds
docimp analyze ./src

# Modify 5 files
# Incremental: analyzes only 5 changed files, ~3 seconds
docimp analyze ./src --incremental

Output example:

Incremental Analysis
Using cached results from 95 unchanged files
Re-analyzing 5 modified files...

Analyzing src/analyzer.ts... [1/5]
Analyzing src/parser.py... [2/5]
...

Analysis complete: 100 items (5 re-analyzed, 95 cached)
Time saved: ~90%

Preview mode: Use --dry-run to see what would be re-analyzed without running the analysis:

docimp analyze ./src --incremental --dry-run

Incremental Analysis (dry run mode)

Would re-analyze 3 file(s):
  • src/analyzer.ts
  • src/parser.py
  • cli/commands/analyze.ts

Would reuse results from 97 unchanged file(s)

Estimated time savings: ~97%
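
One way to picture the changed/unchanged split is a checksum comparison. The sketch below is an illustration only: SHA-256 and the cached_checksums mapping are assumptions, not DocImp's actual state format, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def partition_files(paths, cached_checksums):
    """Split paths into (to_reanalyze, reusable) against cached checksums.

    cached_checksums maps str(path) -> checksum from the previous run
    (a hypothetical structure, not DocImp's actual state format).
    """
    to_reanalyze, reusable = [], []
    for path in paths:
        if cached_checksums.get(str(path)) == file_checksum(path):
            reusable.append(path)
        else:
            # New files and files whose contents changed both land here.
            to_reanalyze.append(path)
    return to_reanalyze, reusable
```

Only the files in to_reanalyze would need to be parsed again; results for the rest can be reused from the previous run.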

Audit Rating Application

Apply audit ratings from a previous docimp audit session so that they feed into the impact scores used by docimp plan.

# Workflow: analyze → audit → re-analyze with ratings → plan
docimp analyze ./src              # Step 1: Initial analysis
docimp audit ./src                # Step 2: Rate documentation quality (1-4)
docimp analyze ./src --apply-audit  # Step 3: Apply ratings to analysis
docimp plan ./src                 # Step 4: Plan uses ratings for priority

Impact score formula with audit ratings:

Without audit: impact_score = min(100, complexity × 5)
With audit: impact_score = (0.6 × complexity_score) + (0.4 × quality_penalty)

Penalty scale: No docs=100, Terrible(1)=80, OK(2)=40, Good(3)=20, Excellent(4)=0

Combine with incremental: --apply-audit --incremental applies ratings while only re-analyzing changed files.

Smart Auto-Clean

When running docimp analyze, if audit.json exists from a previous session, DocImp prompts to clean it (analysis changes may invalidate audit ratings).

Flags:

--preserve-audit: Keep existing audit.json without prompting
--force-clean: Delete audit.json without prompting (CI/CD mode)

# Default: prompts before cleaning
docimp analyze ./src

# Keep audit.json (useful when only new files added)
docimp analyze ./src --preserve-audit

# Force clean for non-interactive mode
docimp analyze ./src --force-clean

Workflow History

DocImp automatically saves timestamped snapshots of workflow-state.json to .docimp/history/ after each command execution. This provides an audit trail and enables state recovery for debugging.

List snapshots:

# Table display (default)
docimp list-workflow-history

# Output:
# Workflow History (3 snapshots)
#
# ┌────────────────────────────────────────────┬──────────┬───────────┐
# │ Timestamp                                  │ Size     │ Age       │
# ├────────────────────────────────────────────┼──────────┼───────────┤
# │ 2025-11-12T14:30:00.123Z                  │ 2.4 KB   │ 2h ago    │
# │ 2025-11-12T10:15:30.456Z                  │ 1.8 KB   │ 6h ago    │
# │ 2025-11-11T08:00:00.789Z                  │ 1.2 KB   │ 1d ago    │
# └────────────────────────────────────────────┴──────────┴───────────┘

# JSON output for scripts
docimp list-workflow-history --json

# Limit results
docimp list-workflow-history --limit 10  # Most recent 10 snapshots
docimp list-workflow-history --limit 0   # No snapshots (useful for counting)

Restore from snapshot:

# Interactive restore (prompts for confirmation)
docimp restore-workflow-state .docimp/history/workflow-state-2025-11-12T14-30-00-123Z.json

# Preview without changes
docimp restore-workflow-state <path> --dry-run

# Skip confirmation prompt
docimp restore-workflow-state <path> --force

Restore behavior:

Creates backup of current state before overwriting (.docimp/workflow-state.json.backup-{timestamp}.json)
Validates snapshot against schema before restore
Uses atomic write (temp file + rename) for safety
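
The three restore steps can be sketched in Python as follows. This is an illustration under stated assumptions, not DocImp's implementation: restore_state is a hypothetical name, the "must be a JSON object" check is a stand-in for real schema validation, and the backup timestamp format is a guess.

```python
import json
import os
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def restore_state(snapshot: Path, target: Path) -> Optional[Path]:
    """Restore target from snapshot; return the backup path, if one was made."""
    data = json.loads(snapshot.read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # Stand-in for real schema validation.
        raise ValueError("snapshot must contain a JSON object")

    backup = None
    if target.exists():
        stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H-%M-%S")
        backup = target.with_name(f"{target.name}.backup-{stamp}.json")
        shutil.copy2(target, backup)

    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(data, tmp, indent=2)
        os.replace(tmp_name, target)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_name)
        raise
    return backup
```

Creating the temp file next to the target keeps both paths on one filesystem, which is what allows the final rename to be atomic.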

Prune old snapshots:

# Delete snapshots older than 30 days
docimp prune-workflow-history --older-than 30d

# Keep only 50 most recent snapshots
docimp prune-workflow-history --keep-last 50

# Hybrid pruning (OR logic): delete if older than 7d OR beyond top 20
docimp prune-workflow-history --older-than 7d --keep-last 20

# Preview deletions without executing
docimp prune-workflow-history --dry-run --keep-last 10

Age format: a number followed by a unit, where d = days, h = hours, and m = minutes (for example 30d, 1h, 30m)
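
The age format and the hybrid OR logic can be sketched as below. This is a minimal illustration, assuming snapshots are represented by their timestamps; parse_age and select_for_deletion are hypothetical names, not DocImp functions.

```python
import re
from datetime import timedelta

_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_age(text: str) -> timedelta:
    """Parse an age such as '30d', '1h', or '30m' into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhm])", text.strip())
    if match is None:
        raise ValueError(f"invalid age: {text!r}")
    return timedelta(**{_UNITS[match.group(2)]: int(match.group(1))})


def select_for_deletion(timestamps, now, older_than=None, keep_last=None):
    """Return timestamps to delete: too old OR outside the newest keep_last."""
    newest_first = sorted(timestamps, reverse=True)
    doomed = []
    for rank, stamp in enumerate(newest_first):
        too_old = older_than is not None and now - stamp > older_than
        beyond_keep = keep_last is not None and rank >= keep_last
        if too_old or beyond_keep:
            doomed.append(stamp)
    return doomed
```

With both limits set, a snapshot survives only if it is recent enough and also within the newest keep_last entries.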

Automatic rotation: Configure in docimp.config.js:

workflowHistory: {
  enabled: true,        // Auto-save snapshots (default: true)
  maxSnapshots: 100,    // Auto-prune when count exceeded (0 = unlimited)
  maxAgeDays: 90        // Auto-prune snapshots older than N days (0 = no age limit)
}

Snapshot format: workflow-state-YYYY-MM-DDTHH-MM-SS-MMMZ.json (ISO 8601, cross-platform safe)
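
The format replaces the colons and the dot of an ISO 8601 timestamp with hyphens, since colons are not allowed in Windows filenames. A short Python sketch of building such a name (snapshot_filename is a hypothetical helper, and it expects a timezone-aware datetime):

```python
from datetime import datetime, timezone


def snapshot_filename(moment: datetime) -> str:
    """Build a snapshot name like workflow-state-2025-11-12T14-30-00-123Z.json."""
    utc = moment.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return f"workflow-state-{utc:%Y-%m-%dT%H-%M-%S}-{millis:03d}Z.json"


print(snapshot_filename(datetime(2025, 11, 12, 14, 30, 0, 123000, tzinfo=timezone.utc)))
```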

When to use:

Debug workflow state issues
Recover from accidental state corruption
Audit command execution history
Compare workflow state across time

For detailed architecture and API reference, see Workflow State Management.

Improve

Interactive documentation improvement workflow.

# Set API key
export ANTHROPIC_API_KEY=sk-ant-...

# Start interactive session
docimp improve ./src

Interactive workflow:

Collect preferences: Prompts for documentation style guide and tone
Load plan: Uses previously generated plan (from docimp plan)
For each item (in priority order):
- Show code context
- Request Claude suggestion
- Run plugin validation
- Show suggestion with any validation errors
- User decides: [A] Accept (accepts as-is) [E] Edit (opens suggestion in editor) [R] Regenerate (prompts user for feedback then regenerates) [S] Skip [Q] Quit
Write: Insert accepted documentation into source file
Continue: Move to next item until done or user quits

Plugin validation catches:

JSDoc parameter names don't match function signature
JSDoc types are incorrect or missing
Style guide violations (preferred tags, punctuation)
Missing examples for public APIs

Resuming Interrupted Improve Sessions

DocImp automatically saves session progress after each action (accept, skip, undo). You can pause and resume improve sessions at any time.

# Auto-detection: Prompts to resume if session exists
docimp improve ./src

# Explicit resume of latest session
docimp improve ./src --resume

# Explicit resume of specific session
docimp improve ./src --resume abc12345

# Start fresh session (bypass auto-detection)
docimp improve ./src --new

# Clear latest session and exit
docimp improve --clear-session

Transaction Integration: Resume sessions continue using the existing git transaction branch, preserving full rollback capability. If resuming a committed session, DocImp creates a new transaction branch and links it to the previous session.

Smart File Invalidation: Modified files are automatically re-analyzed when resuming. Impact scores and complexity metrics are updated based on current code state.

Preference Restoration: Your original style guide and tone preferences are restored from the session state, so you don't need to re-enter them.

Session State: Each session saves:

Current position in the plan queue
Progress metrics (accepted/skipped/errors)
Transaction ID for rollback integration
User preferences (style guides, tone)
File snapshots for modification detection
Complete plan with item metadata

Rollback & Undo

DocImp uses the Git CLI to track all documentation changes with full rollback capability. Each improve session creates a Git branch in a side-car repository (.docimp/state/.git) that never interferes with your project's Git repository.

During an improve session, press [U] to undo the last accepted change.

After a session, use rollback commands to revert changes:

# List all sessions (shows session ID, timestamp, change count)
docimp list-sessions

# List changes within a specific session
docimp list-changes <session-id>
docimp list-changes last       # List changes in most recent session

# Rollback entire session (all changes)
docimp rollback-session <session-id>
docimp rollback-session last  # Rollback most recent session

# Rollback specific individual change
docimp rollback-change <entry-id>
docimp rollback-change last   # Rollback most recent change

Using "last" keyword:

last as session ID: Finds the most recent session (sorted by start time)
last as entry ID: Finds the most recent change across all sessions (sorted by timestamp)
Useful for quick undo without needing to look up IDs
Works with --no-confirm for scripting: docimp rollback-session last --no-confirm

How it works:

Each improve session = Git branch (docimp/session-<uuid>)
Each accepted change = Git commit with metadata
Rollback = Git revert with conflict detection
Side-car repo in .docimp/state/.git (never touches your repo)
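
To illustrate how a side-car repository can stay separate from the project's own .git, the sketch below builds a git command line using Git's --git-dir and --work-tree options. How DocImp actually invokes Git is not shown here; sidecar_git_argv is a hypothetical helper.

```python
from pathlib import Path
from typing import List


def sidecar_git_argv(project_root: Path, *git_args: str) -> List[str]:
    """Build a git command line that targets the side-car repository."""
    git_dir = project_root / ".docimp" / "state" / ".git"
    return [
        "git",
        f"--git-dir={git_dir}",          # repository metadata lives here
        f"--work-tree={project_root}",   # tracked files live in the project
        *git_args,
    ]


print(sidecar_git_argv(Path("."), "log", "--oneline"))
```

Because the repository location is passed explicitly, commands built this way never read or write the project's own .git directory.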

Conflict handling: If files have been modified since the change was made, DocImp uses Git's 3-way merge to attempt resolution. If conflicts occur, you'll see detailed guidance with resolution options:

Common conflict scenarios:

Modified file: You edited the file after DocImp added documentation
- Resolution: Review changes, decide which version to keep, retry rollback
Deleted file: The file was deleted after documentation was added
- Resolution: Restore file or accept partial rollback
Multiple changes: Several DocImp changes modified the same lines
- Resolution: Rollback changes in reverse order, or use git directly

When conflicts occur, DocImp displays:

Which files have conflicts
Why conflicts happened (file modified since change)
Three resolution options: manual resolution, accept partial rollback, or use git directly

No Git installed? DocImp gracefully degrades - improve workflow runs normally, but rollback commands are unavailable.

Audit Session Management

DocImp allows you to manage audit sessions for long-running audits that may be interrupted or span multiple work sessions.

List audit sessions:

# List all audit sessions (shows session ID, progress, status)
docimp list-audit-sessions

Output includes:

Session ID (shortened to 12 characters)
Started timestamp (relative, e.g., "2h ago")
Completed timestamp (or "N/A" if in-progress)
Items rated (e.g., "5/23" means 5 of 23 items rated)
Status: "completed" (green) or "in-progress" (yellow)

Delete audit sessions:

# Delete a specific session
docimp delete-audit-session <session-id>

# Delete all audit sessions
docimp delete-audit-session --all

# Skip confirmation prompt
docimp delete-audit-session <session-id> --force
docimp delete-audit-session --all --force

Use cases:

Clean up incomplete sessions before starting fresh
Remove old completed sessions
Batch delete with --all flag
Script deletion with --force flag (skips confirmation)

Session file location: .docimp/session-reports/audit-session-{uuid}.json

Improve Session Management

DocImp allows you to manage improve sessions for long-running documentation improvement workflows that may be interrupted or span multiple work sessions.

List improve sessions:

# List all improve sessions (shows session ID, transaction info, progress, status)
docimp list-improve-sessions

Output includes:

Session ID (shortened to 12 characters)
Transaction ID (shortened to 12 characters)
Started timestamp (relative, e.g., "2h ago")
Completed timestamp (or "N/A" if in-progress)
Progress: Accepted/Skipped/Errors (e.g., "3/2/0")
Session Status: "completed" (green) or "in-progress" (yellow)
Transaction Status: "Active", "Committed", or "N/A"

Delete improve sessions:

# Delete a specific session
docimp delete-improve-session <session-id>

# Delete all improve sessions
docimp delete-improve-session --all

# Skip confirmation prompt
docimp delete-improve-session <session-id> --force
docimp delete-improve-session --all --force

Use cases:

Clean up incomplete sessions before starting fresh
Remove old completed sessions
Batch delete with --all flag
Script deletion with --force flag (skips confirmation)

Session file location: .docimp/session-reports/improve-session-{uuid}.json

Workflows

DocImp supports two one-directional workflows in the MVP:

Workflow A: analyze → plan → improve

Complexity-only impact scoring (no audit)

docimp analyze ./src
docimp plan ./src
docimp improve ./src

Best for: Quick start, small codebases, first-time users

Workflow B: analyze → audit → plan → improve

Quality-weighted impact scoring (with audit)

docimp analyze ./src
docimp audit ./src      # Rate existing documentation quality
docimp plan ./src       # Generates plan with quality-adjusted priorities
docimp improve ./src

Best for: Large codebases, teams prioritizing documentation quality

Rollback

DocImp tracks all documentation changes in a git-based transaction system, enabling you to rollback changes if needed.

List Sessions

View all documentation improvement sessions:

# List all sessions
docimp list-sessions

Output includes:

Session ID (UUID)
Start time
Number of changes
Status (in_progress, committed, rolled_back)

List Changes

View changes in a specific session:

# List changes in a session
docimp list-changes <session-id>

# Or use "last" for the most recent session
docimp list-changes last

Output includes:

Entry ID (git commit SHA)
File path
Item name (function/class/method)
Timestamp

Rollback Session

Revert all changes from a session:

# Rollback a specific session
docimp rollback-session <session-id>

# Or use "last" for the most recent session
docimp rollback-session last

This reverts all documentation changes made during that session using git's 3-way merge for conflict detection.

Rollback Change

Revert a specific change:

# Rollback a specific change
docimp rollback-change <entry-id>

# Or use "last" for the most recent change
docimp rollback-change last

Conflict Handling:

If files have been modified since the change, git's merge algorithm detects conflicts
Conflicts are reported with file paths
Partial rollback status is tracked

State Directory (.docimp/)

DocImp stores session data in .docimp/ (similar to .git/):

.docimp/
├── session-reports/
│   ├── audit.json          # Latest audit ratings
│   ├── plan.json           # Latest improvement plan
│   └── analyze-latest.json # Latest analysis
├── state/                  # Git-based transaction tracking
│   └── .git/               # Side-car Git repo (never touches your repo)
├── history/                # Timestamped workflow-state.json snapshots
└── workflow-state.json     # Command history and file checksums

Smart Auto-Clean: docimp analyze manages session reports to prevent stale data

By default, when you run docimp analyze, DocImp will:

Check if audit.json exists (contains audit ratings)
If it exists, prompt you before deleting
Show a warning about losing ratings
Wait for your confirmation (Y/n)

Override flags:

--preserve-audit: Keep audit.json, clean only plan.json (no prompt)
--force-clean: Skip prompt and always clean all files
No flags: Interactive prompt when audit.json exists

Examples:

# Default: prompts if audit.json exists
docimp analyze ./src

# Preserve audit ratings, clean plan only
docimp analyze ./src --preserve-audit

# Force clean without prompt
docimp analyze ./src --force-clean

Why prompt? Audit ratings represent manual review work. The prompt prevents accidentally losing this data when re-running analysis.

Transaction tracking: The .docimp/state/ directory contains a side-car Git repository used for rollback functionality. This repository operates independently from your project's Git repository and tracks all documentation changes made by DocImp.

Note: .docimp/ is gitignored automatically. Restore clean state with rm -rf .docimp/

Architecture

DocImp uses a three-layer polyglot architecture with clean dependency injection patterns.

┌─────────────────────────────────────────────────────────────────┐
│                     TypeScript CLI Layer                        │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Commander.js • Config Loader (JS) • Plugin Manager       │  │
│  │  Python Bridge • Terminal Display • Interactive Session   │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              ↕                                  │
│                   Subprocess Communication                       │
│                              ↕                                  │
│                    Python Analysis Engine                        │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  AST Parsers • Impact Scorer • Coverage Calculator        │  │
│  │  Claude Client • Docstring Writer                         │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              ↕                                  │
│                      File System & APIs                          │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  .py .ts .js .cjs .mjs files • Claude API                 │  │
│  └───────────────────────────────────────────────────────────┘  │
│                              ↕                                  │
│                  JavaScript Config & Plugins                     │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  docimp.config.js • validate-types.js • jsdoc-style.js    │  │
│  │  TypeScript Compiler (for JSDoc validation)               │  │
│  └───────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Data Flow

User runs docimp analyze ./code
TypeScript CLI loads config from docimp.config.js (JavaScript file)
Python Bridge spawns Python subprocess with arguments
Python Analyzer discovers files and selects parser:
- .py → PythonParser (AST)
- .ts/.js/.cjs/.mjs → TypeScriptParser (TS Compiler with checkJs: true)
Parser extracts CodeItem objects (name, type, complexity, docs, exports, module system)
Impact Scorer calculates priority (0-100) based on cyclomatic complexity
Python returns AnalysisResult as JSON to stdout
TypeScript CLI parses JSON and displays formatted results
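
The subprocess hand-off in this flow (spawn a process, read JSON from its stdout, parse it) can be sketched as follows. In DocImp the caller is the TypeScript CLI; Python is used on both sides here only to keep the example in one language. The child script and its field names are illustrative, not DocImp's actual AnalysisResult schema.

```python
import json
import subprocess
import sys

# Stand-in for the analysis engine: a child process that prints JSON to stdout.
# The field names are illustrative, not DocImp's actual AnalysisResult schema.
CHILD_SCRIPT = (
    "import json; "
    "print(json.dumps({'coverage_percent': 45.2, 'total_items': 51}))"
)


def run_analysis() -> dict:
    """Spawn the child process and parse its stdout as JSON."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


result = run_analysis()
print(result["total_items"])
```

Keeping stdout reserved for the JSON payload means the caller can parse it directly; diagnostics would have to go to stderr.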

Interactive Improve Flow

User selects item to document
Python builds context prompt (code + surrounding context + style guide)
Claude generates documentation suggestion
TypeScript runs validation plugins (e.g., JSDoc type-checking with TS compiler)
Plugin returns accept/reject + optional autoFix
User accepts/edits/regenerates
Python writer inserts docstring/JSDoc into source file

Dependency Injection

All major components use constructor injection for testability:

Python:

# DocumentationAnalyzer accepts injected parsers and scorer
analyzer = DocumentationAnalyzer(
    parsers={'python': PythonParser(), 'javascript': TypeScriptParser()},
    scorer=ImpactScorer()
)

TypeScript:

// Commands accept injected bridge and display.
// pythonBridge implements IPythonBridge; display implements IDisplay.
const analyzeCommand = new AnalyzeCommand(pythonBridge, display);

Impact Scoring

DocImp calculates a 0-100 impact score to prioritize documentation needs.

Formula

Without Audit:

impact_score = min(100, cyclomatic_complexity * 5)
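
For Python sources, the cyclomatic_complexity input can be approximated with the built-in ast module. The sketch below uses a simplified, assumed rule set (one point per branch, loop, exception handler, conditional expression, and extra boolean operand); DocImp's own counting rules may differ.

```python
import ast

# Node types that each add one decision point (an assumed, simplified rule set).
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While,
                 ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' has three operands and adds two decision points.
            complexity += len(node.values) - 1
    return complexity


SOURCE = "def add(x, y):\n    return x + y\n"
print(min(100, cyclomatic_complexity(SOURCE) * 5))
```

An elif is parsed as a nested ast.If, so each elif arm counts as one more decision point under these rules.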

With Audit (after running docimp audit):

impact_score = (complexity_weight × complexity_score) +
               (quality_weight × quality_penalty)

where:
  complexity_score = min(100, cyclomatic_complexity * 5)
  quality_penalty = rating_to_penalty(user_audit_rating)

Audit Rating to Penalty

User Rating	Penalty	Priority
No docs	100	Highest
Terrible (1)	80	Very High
OK (2)	40	Medium
Good (3)	20	Low
Excellent (4)	0	Lowest

Default Weights

Configurable in docimp.config.js:

module.exports = {
  impactWeights: {
    complexity: 0.6, // 60% from code complexity
    quality: 0.4, // 40% from audit rating
  },
};

Examples

Without Audit:

Simple function (complexity 1):

def add(x, y):
    return x + y

Impact Score: 5

Complex function (complexity 15):

def process_payment(user_id, amount, options):
    # 15 lines with multiple branches (abbreviated here)
    if options.get('immediate'):
        if amount > 1000:
            pass  # verification logic
        else:
            pass  # direct processing
    else:
        pass  # queue for later

Impact Score: 75

With Audit:

Complex function (complexity 15) with terrible docs (rating 1):

Complexity score: 75
Quality penalty: 80
Impact Score: 75×0.6 + 80×0.4 = 77

Simple function (complexity 3) with no docs:

Complexity score: 15
Quality penalty: 100
Impact Score: 15×0.6 + 100×0.4 = 49
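
The formulas and the penalty table can be combined into one small function. This is a sketch rather than DocImp's implementation: the function names and the audited flag are hypothetical, while the weights and penalties are the defaults given above.

```python
# Penalties from the "Audit Rating to Penalty" table; None means no docs.
PENALTIES = {None: 100, 1: 80, 2: 40, 3: 20, 4: 0}


def complexity_score(complexity: int) -> int:
    """Scale cyclomatic complexity to 0-100, capping at 100."""
    return min(100, complexity * 5)


def impact_score(complexity, audited=False, rating=None,
                 complexity_weight=0.6, quality_weight=0.4):
    """Return the impact score, blending in the audit penalty when audited."""
    base = complexity_score(complexity)
    if not audited:
        return base
    return complexity_weight * base + quality_weight * PENALTIES[rating]


print(impact_score(1))
print(impact_score(15))
print(round(impact_score(15, audited=True, rating=1), 1))
print(round(impact_score(3, audited=True), 1))
```

The four calls correspond to the four worked examples above (scores 5, 75, 77, and 49).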

Future Enhancements

Planned improvements for more sophisticated scoring:

Public/Private API Detection: Boost score for exported functions, lower for internal helpers
Pattern Detection: Identify dependency injection, async patterns, decorators
Custom Pattern Matchers: User-defined heuristics (e.g., functions ending in Repository)
Test File Penalty: Lower priority for test files

Configuration

DocImp uses a JavaScript configuration file (not JSON) to allow custom logic.

Example: `docimp.config.js`

module.exports = {
  // Per-language style guides
  styleGuides: {
    // Python: 'google', 'numpy-rest', 'numpy-markdown', 'sphinx'
    python: 'google',

    // JavaScript: 'jsdoc-vanilla', 'jsdoc-google', 'jsdoc-closure'
    javascript: 'jsdoc-vanilla',

    // TypeScript: 'tsdoc-typedoc', 'tsdoc-aedoc', 'jsdoc-ts'
    typescript: 'tsdoc-typedoc',
  },

  // Tone: 'concise', 'detailed', 'friendly'
  tone: 'concise',

  // JSDoc-specific options
  jsdocStyle: {
    preferredTags: { return: 'returns', arg: 'param' },
    requireDescriptions: true,
    requireExamples: 'public', // 'all', 'public', 'none'
    enforceTypes: true,
  },

  // Audit code display configuration
  audit: {
    showCode: {
      // Display mode for code during audit:
      // - 'complete': Show full code, no truncation
      // - 'truncated': Show code up to maxLines (default)
      // - 'signature': Show just function/class signature
      // - 'on-demand': Don't show code, use [C] to view
      mode: 'truncated',

      // Maximum lines to show in 'truncated' and 'signature' modes
      // (not counting the docstring itself)
      maxLines: 20,
    },
  },

  // Impact scoring weights (used when audit data available)
  impactWeights: {
    complexity: 0.6, // 60% from cyclomatic complexity
    quality: 0.4, // 40% from audit quality rating
  },

  // Validation plugins (JavaScript files)
  plugins: ['./plugins/validate-types.js', './plugins/jsdoc-style.js'],

  // File exclusions (glob patterns)
  exclude: [
    '**/test_*.py',
    '**/*.test.ts',
    '**/node_modules/**',
    '**/venv/**',
    '**/__pycache__/**',
  ],
};

Config supports both CommonJS and ESM

CommonJS:

module.exports = {
  /* config */
};

ESM:

export default {
  /* config */
};

Environment Variables

DocImp supports several environment variables for configuration and customization:

Variable	Purpose	Required	Example
`ANTHROPIC_API_KEY`	Claude AI API key for `improve` command	Yes (for `improve`)	`sk-ant-...`
`DOCIMP_ANALYZER_PATH`	Override analyzer directory location	No	`/custom/path/to/analyzer`
`DOCIMP_PYTHON_PATH`	Override Python executable detection	No	`/usr/local/bin/python3.13`

DOCIMP_ANALYZER_PATH is useful for:

Custom installations where the analyzer is in a non-standard location
Development setups with modified project structure
Troubleshooting path resolution issues

Path Resolution Order:

When DOCIMP_ANALYZER_PATH is NOT set, DocImp uses fallback strategies based on process.cwd():

<cwd>/../analyzer - Running from cli/ directory (development, tests)
<cwd>/analyzer - Running from repository root
<cwd>/../../analyzer - Global npm install scenario

If all strategies fail, set DOCIMP_ANALYZER_PATH explicitly.

Example:

# Set analyzer path for custom installation
export DOCIMP_ANALYZER_PATH=/opt/docimp/analyzer
docimp analyze ./src
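
The lookup order described under Path Resolution Order can be sketched in Python as follows. DocImp resolves this path from process.cwd() in Node, so this is only a model of the documented order: the function name resolve_analyzer_path is hypothetical, and checking each candidate with is_dir() is an assumption about what counts as a strategy "failing".

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_analyzer_path(
    cwd: Path, env: Optional[Mapping[str, str]] = None
) -> Optional[Path]:
    """Return the analyzer directory, or None if every strategy fails."""
    env = os.environ if env is None else env

    # An explicit override takes precedence over the fallbacks.
    override = env.get("DOCIMP_ANALYZER_PATH")
    if override:
        return Path(override)

    # Fallbacks, tried in the documented order.
    candidates = [
        cwd.parent / "analyzer",         # <cwd>/../analyzer
        cwd / "analyzer",                # <cwd>/analyzer
        cwd.parent.parent / "analyzer",  # <cwd>/../../analyzer
    ]
    for candidate in candidates:
        if candidate.is_dir():
            return candidate
    return None
```

A None result corresponds to the case where all strategies fail and DOCIMP_ANALYZER_PATH has to be set explicitly.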

Plugin System

DocImp's plugin system provides extensible validation hooks to catch errors before accepting AI-generated documentation.

Plugin Interface

interface IPlugin {
  name: string;
  version: string;
  hooks: {
    beforeAccept?: (
      docstring: string,
      item: CodeItem,
      config: IConfig
    ) => Promise<PluginResult>;
    afterWrite?: (filepath: string, item: CodeItem) => Promise<PluginResult>;
  };
}

interface PluginResult {
  accept: boolean; // true = allow, false = block
  reason?: string; // Error message if blocked
  autoFix?: string; // Suggested correction
}
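
To illustrate the accept/block semantics of PluginResult, here is a simplified Python model of a validation gate. It is not DocImp's plugin manager: the real hooks are asynchronous JavaScript functions that also receive item and config, and the names run_before_accept and requires_example are hypothetical. The sketch also assumes that hooks run in order and that the first blocking result wins, which the interface above does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PluginResult:
    accept: bool                     # True = allow, False = block
    reason: Optional[str] = None     # error message if blocked
    auto_fix: Optional[str] = None   # suggested correction


# A hook takes the candidate docstring and returns a PluginResult.
Hook = Callable[[str], PluginResult]


def run_before_accept(docstring: str, hooks: List[Hook]) -> PluginResult:
    """Run hooks in order and stop at the first one that blocks."""
    for hook in hooks:
        result = hook(docstring)
        if not result.accept:
            return result
    return PluginResult(accept=True)


def requires_example(docstring: str) -> PluginResult:
    """Hypothetical rule: block docstrings that lack an @example tag."""
    if "@example" in docstring:
        return PluginResult(accept=True)
    return PluginResult(accept=False, reason="Missing @example")
```

Calling run_before_accept with a docstring and [requires_example] returns either an accepting result or the first blocking result, including its reason.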

Built-in Plugins

1. `validate-types.js` - Real JSDoc Type-Checking

Uses the TypeScript compiler programmatically to validate JSDoc:

// Bad: Parameter name mismatch
/**
 * @param {number} wrongName - The value
 */
function add(correctName) {
  return correctName + 1;
}

// Plugin catches: "Parameter 'wrongName' doesn't match signature 'correctName'"

How it works:

Creates an in-memory TypeScript program with checkJs: true
Validates parameter names match
Validates types are correct
Returns specific error messages with line numbers
Can suggest auto-fixes

2. `jsdoc-style.js` - Style Enforcement

Enforces JSDoc style rules from config:

// Bad: Missing description ending punctuation
/**
 * Add two numbers
 * @param {number} a
 * @param {number} b
 * @returns {number}
 */

// Plugin suggests: "Description should end with punctuation"

3. Linter Integration (Future Enhancement)

The plugin system supports integration with external linters like ruff or eslint. This demonstrates the extensibility of the framework - you can add custom validation by implementing the plugin interface.

Example future plugin:

// plugins/lint-docstrings.js (not included in MVP)
module.exports = {
  name: 'lint-docstrings',
  version: '1.0.0',
  hooks: {
    async afterWrite(filepath, item) {
      // Run ruff on Python files, eslint on JS files, etc.
      return { accept: true };
    },
  },
};

Security Model

Plugins are user-controlled JavaScript code with NO sandboxing.

Trade-offs:

✓ Full access to Node.js APIs and TypeScript compiler
✓ Real validation (not just pattern matching)
✗ No security boundary - plugins run with full file system access
✗ User must trust plugin source code

Default behavior: Only load plugins from:

./plugins/ directory
Paths specified in docimp.config.js

See plugins/README.md for full plugin development guide and security details.

Writing Custom Plugins

// my-plugin.js
module.exports = {
  name: 'my-validator',
  version: '1.0.0',
  hooks: {
    async beforeAccept(docstring, item, config) {
      // Validate docstring
      if (!docstring.includes('@example') && item.is_public) {
        return {
          accept: false,
          reason: 'Public APIs must include @example',
          autoFix: docstring + '\n * @example\n * // TODO: Add example',
        };
      }

      return { accept: true };
    },
  },
};

Add to docimp.config.js:

module.exports = {
  plugins: ['./my-plugin.js'],
};

JavaScript/JSDoc Support

DocImp treats JavaScript as a first-class citizen with real type-checking.

TypeScript Configuration

Critical settings in cli/tsconfig.json:

{
  "compilerOptions": {
    "allowJs": true, // Parse JavaScript files
    "checkJs": true, // Type-check JSDoc in .js files
    "module": "NodeNext", // Deterministic ESM/CJS interop
    "moduleResolution": "NodeNext"
  }
}

checkJs: true enables real JSDoc validation, not just cosmetic parsing.

Module System Detection

DocImp detects and handles:

ESM (ES Modules):

export function add(a, b) {
  return a + b;
}
export default class Calculator {}

CommonJS:

module.exports = { add };
exports.subtract = (a, b) => a - b;

Mixed (detected per-file):

// File: utils.mjs (ESM)
export const helper = () => {};

// File: legacy.cjs (CommonJS)
module.exports.helper = () => {};
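
How DocImp performs this detection is not described here. As a rough illustration of per-file detection, the Python sketch below applies one plausible heuristic: trust the .mjs and .cjs extensions first, then look for ESM or CommonJS syntax in the source. The function name detect_module_system and both regular expressions are assumptions of this sketch; the return values mirror the module_system values listed under CodeItem.

```python
import re
from pathlib import Path

# Top-level import/export statements suggest ESM.
ESM_PATTERN = re.compile(r"^\s*(import|export)\b", re.MULTILINE)
# module.exports, exports.<name>, or require(...) suggest CommonJS.
CJS_PATTERN = re.compile(r"\b(module\.exports|exports\.\w+|require\s*\()")


def detect_module_system(filename: str, source: str) -> str:
    """Return 'esm', 'commonjs', or 'unknown' for one JavaScript file."""
    suffix = Path(filename).suffix
    # Explicit extensions win over anything found in the source text.
    if suffix == ".mjs":
        return "esm"
    if suffix == ".cjs":
        return "commonjs"
    if ESM_PATTERN.search(source):
        return "esm"
    if CJS_PATTERN.search(source):
        return "commonjs"
    return "unknown"
```

A text-based heuristic like this can be fooled by comments or strings that contain these keywords, which is one reason a parser-based approach is generally more reliable.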

JSDoc Validation

The validate-types.js plugin uses the TypeScript compiler to validate:

Parameter names:

/**
 * @param {number} wrongName   // ERROR: doesn't match
 */
function add(correctName) {}

Parameter types:

/**
 * @param {string} value   // ERROR: passing number
 */
function double(value) {
  return value * 2; // TS compiler detects type mismatch
}

Return types:

/**
 * @returns {string}   // ERROR: actually returns number
 */
function getId() {
  return 123;
}

JavaScript Write Patterns

The DocstringWriter correctly handles:

// Function declaration
function foo() {}

// Export function
export function foo() {}

// Default export
export default function foo() {}

// Arrow function
const foo = () => {};

// Async arrow function
const fetchData = async () => {};

// Class method
class Service {
  async getData() {}
  static helper() {}
  get value() {}
}

// Object literal method
module.exports = {
  foo() {},
  bar: function () {},
};

// CommonJS patterns
module.exports.baz = () => {};
exports.qux = function () {};

All patterns preserve indentation and avoid duplicate comments.

Built with Claude Code

DocImp was built entirely using Claude Code, demonstrating production-grade development with AI assistance.

Development Process

16 Claude Code instances across 3-4 days
Session atomicity: Each instance completed a specific deliverable
Contract-based: Clear inputs, outputs, and rollback plans
Progressive context: Built complexity incrementally
Test-first: Validation at each step

Development artifacts, including a methodology playbook, instance-by-instance development logs, case studies, and terminal recordings, are planned for future releases.

Key Data Models

CodeItem

Core representation of a parsed code entity:

from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class CodeItem:
    """Represents a function, class, or method extracted from source code."""
    name: str                    # Function/class name
    type: str                    # 'function', 'class', 'method'
    filepath: str
    line_number: int
    end_line: int                # Last line of code block (inclusive)
    language: str                # 'python', 'typescript', 'javascript', 'skipped'
    complexity: int              # Cyclomatic complexity
    impact_score: float          # 0-100 priority score
    has_docs: bool               # Binary: has documentation or not
    parameters: List[str]
    return_type: Optional[str]
    docstring: Optional[str]
    export_type: str             # 'named', 'default', 'commonjs', 'internal'
    module_system: str           # 'esm', 'commonjs', 'unknown'
    audit_rating: Optional[int]  # 1-4 rating from audit, or None if skipped/not audited

AnalysisResult

@dataclass
class AnalysisResult:
    """Results from analyzing a codebase."""
    items: List[CodeItem]
    coverage_percent: float
    total_items: int
    documented_items: int
    by_language: Dict[str, LanguageMetrics]
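
The relationship between total_items, documented_items, and coverage_percent is not spelled out above. A natural reading is that coverage is the share of items with has_docs set, which the following Python sketch assumes. The Item class is a minimal stand-in for CodeItem, and returning 0.0 for an empty list is a choice made for this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    """Minimal stand-in for CodeItem with only the fields used here."""
    name: str
    language: str
    has_docs: bool


def coverage_percent(items: List[Item]) -> float:
    """Percentage of items with documentation; 0.0 for an empty list."""
    if not items:
        return 0.0
    documented = sum(1 for item in items if item.has_docs)
    return 100.0 * documented / len(items)
```

The same function could be applied to the items of a single language to produce per-language figures like those implied by by_language.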

Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

Development Setup

# Clone repository
git clone https://github.com/nikblanchet/docimp.git
cd docimp

# Python setup with uv
uv venv
uv pip sync requirements-dev.lock
uv pip install -e .
cd analyzer
uv run pytest -v

# TypeScript setup
cd ../cli
npm install
uv run npm test
npm run build

# Run linters
uv run ruff check .  # Python
npm run lint  # TypeScript/JavaScript

# Run formatters
uv run ruff format .  # Python (auto-fix)
npm run format  # TypeScript/JavaScript (auto-fix)

# Check formatting without auto-fix
ruff format --check .  # Python
npm run format:check  # TypeScript/JavaScript

Pre-commit Hooks

This project uses husky and lint-staged to automatically format and lint code before commits:

# Hooks are automatically installed when you run npm install in cli/
# They run on staged files only

# To bypass hooks (use sparingly):
git commit --no-verify

The pre-commit hook will:

Format Python files with ruff format
Lint Python files with ruff check --fix
Format TypeScript/JavaScript files with Prettier
Lint TypeScript/JavaScript files with ESLint

Running Tests

# Python tests
cd analyzer
pytest -v --cov=src

# TypeScript tests
cd ../cli
uv run npm test

# Integration tests
uv run npm run test:integration

Known Test Coverage Limitations

The following testing gaps represent conscious trade-off decisions made during development to prioritize shipping a functional MVP:

Improve Command - Manual Testing Only

Status: No automated tests. Manual testing procedure documented in test-samples/test-workflows-improve.sh.

Why: Requires ANTHROPIC_API_KEY, interactive user input (A/E/R/S/Q choices), and incurs API costs. Mocking the Claude client would take significant engineering effort.

Risk: Medium - Primary feature lacks regression testing, but functionality is straightforward.

Mitigation: Manual testing runbook available for pre-release validation.

Error Condition Testing - Limited Coverage

Status: Minimal testing of error conditions (corrupted state files, malformed JSON, filesystem errors).

Why: Users can recover by deleting the .docimp/ directory. Corrupted state files are rare. Development focused on happy-path functionality.

Risk: Low - Edge case failures don't impact primary workflows.

Mitigation: StateManager uses standard JSON parsing with basic error handling.

Scaling and Performance - Small Test Samples

Status: Test samples intentionally kept small (~62 items). Large codebase performance not formally validated.

Why: Small samples enable complete manual audits. Cyclomatic complexity algorithms scale linearly. No algorithmic bottlenecks. Real-world validation occurs when running DocImp on itself during development.

Risk: Very low - Architecture has no scaling concerns.

Cross-Platform Testing - Ubuntu Only

Status: CI runs on ubuntu-latest only. Windows and macOS not tested in CI.

Why: Project targets Unix-like environments primarily. Multi-platform CI adds cost/complexity. Core Python/TypeScript/Node stack is inherently cross-platform.

Risk: Low - Standard tooling is well-tested across platforms.

Future: Additional platforms will be added to CI matrix if issues are discovered.

Note: These limitations are tracked in GitHub issues and will be addressed in future releases. See Issues #174 (improve testing) and #175 (error conditions) for planned enhancements.

License

DocImp is dual-licensed under AGPL-3.0 (for open-source use) or a Commercial License (for proprietary use without source code disclosure). See LICENSE for full details.

Acknowledgments

AI Assistance

Program design assisted by Claude (macOS app) and ChatGPT (macOS app)
All coding done exclusively with Claude Code, running in a terminal within VS Code on macOS

Development Tools

Editor: VS Code with Sublime Text for regex work
Font: Fira Code Nerd Font with ligatures enabled
Environment Management: uv (primary) with direnv for automatic environment activation
Git Workflow: GitHub CLI (installed via Homebrew) for pull requests and merges
Version Control: Git

Core Technologies

TypeScript Compiler for JSDoc validation
Python AST for code analysis
Anthropic Claude API for documentation generation

Open-Source Libraries

TypeScript/JavaScript: Commander.js, chalk, cli-table3, Prettier, ESLint, husky, lint-staged
Python: pytest, ruff (linting & formatting), mypy

Project Status

Current Version: 1.0.6-α

MVP Scope:

Complexity-based impact scoring
Python/TypeScript/JavaScript support
2 validation plugins (type-checking, style)
Interactive workflow (sequential)
Basic commands: analyze, audit, plan, improve

Future Enhancements:

Commands & Workflow

Save/Resume Sessions: Pause and continue improve sessions later
Progress Tracking: Show session progress and estimated time remaining
Manual Item Selection: Let users pick specific items to document
Plan Filtering: Human-readable plan output with --priority, --language, --limit flags
Batch Mode: Non-interactive improve for CI/CD pipelines
Show Existing Docs First: Display current docstring before calling Claude (enables quick skipping)
Usage Context: Include call-site examples when regenerating suggestions
Team Mode: Divide plan among multiple users, prevent conflicts

Impact Scoring

Pattern Detection: Identify dependency injection, async patterns, decorators
Public/Private API Detection: Boost score for exported functions, lower for internal helpers
Custom Pattern Matchers: User-defined heuristics (e.g., functions ending in Repository)
Test File Penalty: Lower priority for test files
Configuration Weights: Fine-tune complexity vs visibility vs patterns

Language Support

Additional Style Guides: Sphinx, Google (Python), more JSDoc variants
More Languages: Go, Rust, Ruby, etc.
Cross-language Context: Better handling of polyglot projects

Plugins & Validation

Linter Integration: Run ruff/eslint after writing docs
Auto-fix Capabilities: Automatically apply simple corrections
Plugin Marketplace: Share community plugins
Custom Validation Rules: User-defined quality checks

Developer Experience

IDE Integrations: VS Code extension, JetBrains plugin
Git Integration: Commit docs automatically, create PRs
CI/CD Pipelines: GitHub Actions, GitLab CI integration
Multi-file Context: Include related files in Claude prompts

Planned Releases:

v1.0.0: Core functionality (current MVP scope)
v1.1.0: Save/resume sessions, progress tracking
v1.2.0: Pattern detection, advanced scoring algorithms
v2.0.0: Additional languages, IDE integrations

Troubleshooting

Parse Errors in Analyzed Code

If DocImp encounters syntax errors in your codebase:

Default behavior (non-strict mode):

DocImp logs warnings for files with syntax errors
Analysis continues with remaining valid files
Parse failures are tracked and displayed in the analysis summary
You can fix the syntax errors and re-run the analysis

Example output:

⚠ Parse Failures: 2 files could not be parsed

src/broken.py: invalid syntax (line 42)
src/incomplete.ts: Unexpected token '}'

Strict mode (fail-fast):

docimp analyze ./src --strict

In strict mode, DocImp fails immediately on the first syntax error. Useful for CI/CD pipelines where you want to enforce clean syntax before analyzing documentation.
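
The difference between the two modes can be sketched for Python sources with the standard ast module. This is a minimal illustration, not DocImp's analyzer: parse_sources is a hypothetical function that takes file contents as a dictionary, and the failure message format is made up for the example.

```python
import ast
from typing import Dict, List, Tuple


def parse_sources(
    sources: Dict[str, str], strict: bool = False
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Parse Python sources, returning (parsed paths, parse failures)."""
    parsed: List[str] = []
    failures: List[Tuple[str, str]] = []
    for path, code in sources.items():
        try:
            ast.parse(code, filename=path)
        except SyntaxError as error:
            if strict:
                raise  # strict mode: fail fast on the first syntax error
            # Non-strict mode: record the failure and keep going.
            failures.append((path, f"{error.msg} (line {error.lineno})"))
            continue
        parsed.append(path)
    return parsed, failures
```

In non-strict mode the caller receives both lists and can report the failures in a summary; in strict mode the SyntaxError propagates, which suits a CI job that should stop on the first broken file.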

Common causes:

Work-in-progress files with incomplete code
Broken commits pushed during development
Copy-paste errors or merge conflicts
Experimental code that doesn't compile

Resolution:

Review the error message for file path and line number
Fix the syntax error in the source file
Re-run docimp analyze

Workflow State Issues

Troubleshooting common workflow state problems.

Stale data warnings:

If docimp status shows staleness warnings:

Staleness Warnings:
  • analyze is stale (2 files modified since last run)
  • plan is stale (analyze re-run since plan generated)

Resolution:

# Update analysis incrementally (fast)
docimp analyze ./src --incremental

# Regenerate plan with latest analysis
docimp plan ./src

Corrupted workflow-state.json:

If you see errors like "Failed to parse workflow state" or "Invalid schema version":

# Delete corrupted state file (safe - will be recreated)
rm .docimp/workflow-state.json

# Re-run analysis to create fresh state
docimp analyze ./src

Missing files in workflow state:

If files were deleted but still appear in workflow state:

# Re-run analysis to update file list
docimp analyze ./src

# Or use incremental mode (detects deletions automatically)
docimp analyze ./src --incremental

Reset workflow state completely:

To start fresh (clears all command history):

# Remove state file
rm .docimp/workflow-state.json

# Optionally remove all session data
rm -rf .docimp/session-reports/*

# Re-run workflow from scratch
docimp analyze ./src

Support

Issues: GitHub Issues
Discussions: GitHub Discussions
Documentation: See README.md and inline documentation for complete usage guide

Star this repo if DocImp helps your project!
