devmem -- Local-first Developer Memory & Knowledge CLI

Your project's memory, built from git history. Zero cloud. Zero API keys.

Problem Statement

Developers lose decisions, patterns, and lessons learned across long-lived projects. AI coding assistants (Claude Code, Cursor, Aider, Codex) generate rich context during sessions, but that knowledge evaporates when the session ends. Manual CLAUDE.md maintenance is inconsistent and incomplete.

Existing solutions address fragments of this problem:

chunkhound (1.2k stars): Codebase intelligence via semantic code search. Does not extract developer knowledge from git history.
claude-mem (51k stars): Auto-captures Claude Code sessions and compresses them with AI. Claude-only, requires a Claude API key, and targets sessions rather than config files.
pro-workflow: Self-correcting memory for Claude Code. Single-tool, not portable.

No tool fuses git commit history + AI session context + existing rules files into a single searchable local knowledge base with direct injection into project config files (CLAUDE.md, .cursorrules, etc.).

Solution Overview

devmem is a Python CLI that auto-extracts knowledge from git commits, AI coding sessions, code annotations, and existing rules files. It builds a searchable local SQLite knowledge graph and injects relevant context into project configuration files.

Key principles:

Git-first: Commit messages and diffs are the primary knowledge source (always available, no setup)
Zero API keys: All processing is local. No cloud, no LLM calls for core features.
Cross-tool: Outputs to CLAUDE.md, .cursorrules, .windsurfrules, or any config file.
Project-scoped: One knowledge base per project, stored in .devmem/ alongside .git/.

Architecture

System Diagram

+------------------+     +------------------+     +-------------------+
|  Git Log/Diffs   |     |  AI Sessions     |     |  Existing Rules   |
|  (primary)       |     |  (Claude, etc.)  |     |  (CLAUDE.md, etc.)|
+--------+---------+     +--------+---------+     +--------+----------+
         |                        |                         |
         v                        v                         v
+------------------------------------------------------------------+
|                     Extraction Pipeline                           |
|  +------------+  +--------------+  +-----------+  +------------+ |
|  | Commit     |  | Session      |  | Annotation|  | Rules      | |
|  | Analyzer   |  | Parser       |  | Extractor |  | Ingester   | |
|  +------------+  +--------------+  +-----------+  +------------+ |
+------------------------------------------------------------------+
                              |
                              v
+------------------------------------------------------------------+
|                     Knowledge Store (SQLite)                       |
|  +----------+  +----------+  +----------+  +-------------------+  |
|  | facts    |  | patterns |  | decisions|  | project_context   |  |
|  +----------+  +----------+  +----------+  +-------------------+  |
+------------------------------------------------------------------+
                              |
              +---------------+---------------+
              |                               |
              v                               v
+---------------------------+   +-----------------------------+
|  Search (BM25 + optional  |   |  Injection Engine           |
|  semantic via embeddings) |   |  -> CLAUDE.md / .cursorrules|
+---------------------------+   +-----------------------------+
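The flow in the diagram can be sketched as a small pipeline. This is a minimal illustration, not devmem's actual code: the `Extractor` protocol, the `Artifact` dataclass, `StaticExtractor`, and `run_pipeline` are hypothetical names, and an in-memory list stands in for the SQLite store.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Protocol


@dataclass(frozen=True)
class Artifact:
    """Hypothetical stand-in for a stored knowledge artifact."""
    type: str      # e.g. "decision", "fix"
    source: str    # e.g. "git_commit", "rules_file"
    content: str


class Extractor(Protocol):
    """Hypothetical interface that each extractor would implement."""

    def extract(self) -> Iterator[Artifact]: ...


class StaticExtractor:
    """Toy extractor that yields pre-built artifacts (illustration only)."""

    def __init__(self, artifacts: Iterable[Artifact]) -> None:
        self._artifacts = list(artifacts)

    def extract(self) -> Iterator[Artifact]:
        yield from self._artifacts


def run_pipeline(extractors: Iterable[Extractor]) -> list[Artifact]:
    """Run every extractor in order and collect artifacts, dropping exact duplicates."""
    store: list[Artifact] = []
    seen: set[Artifact] = set()
    for extractor in extractors:
        for artifact in extractor.extract():
            if artifact not in seen:
                seen.add(artifact)
                store.append(artifact)
    return store
```

Keeping every source behind one interface would let the post-MVP extractors (sessions, annotations) plug in without changes to the store or the injector.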

Directory Structure

devmem/
  src/
    devmem/
      __init__.py
      __main__.py          # CLI entry point
      cli.py               # Argument parsing (click)
      core/
        __init__.py
        store.py           # SQLite knowledge store
        models.py          # Data models (Pydantic)
        search.py          # BM25 + optional semantic search
        injector.py        # Context injection into config files
      extractors/
        __init__.py
        base.py            # Base extractor interface
        git_commits.py     # Git log/diff analyzer
        ai_sessions.py     # AI session file parser (Claude Code, etc.)
        annotations.py     # TODO/FIXME/HACK comment extractor
        rules_files.py     # Existing CLAUDE.md/.cursorrules parser
      graph/
        __init__.py
        builder.py         # Knowledge graph construction
        relations.py       # Cross-entity relationship mapping
  tests/
    unit/
      test_store.py
      test_search.py
      test_injector.py
      test_git_commits.py
      test_annotations.py
      test_rules_files.py
      test_models.py
    integration/
      test_full_pipeline.py
      test_injection_roundtrip.py
    fixtures/
      sample_repo/         # Fake git repo for testing
      sample_sessions/     # Sample AI session files
      sample_rules/        # Sample CLAUDE.md / .cursorrules files
  docs/
    usage.md
    extraction.md
    configuration.md
  examples/
    basic/
  .github/
    workflows/
      ci.yml
    ISSUE_TEMPLATE/
      bug_report.md
      feature_request.md
    PULL_REQUEST_TEMPLATE.md
  pyproject.toml
  README.md
  README.ja.md
  LICENSE
  CONTRIBUTING.md
  CHANGELOG.md
  THIRD_PARTY_LICENSES.md
  .gitignore

Technology Choices

Component	Choice	Rationale
Language	Python 3.11+	Rich CLI ecosystem, broad developer familiarity
CLI Framework	Click	De facto standard for Python CLIs, decorator-based, well-documented
Database	SQLite (via sqlite3 + apsw)	Zero-config, local-first, built-in FTS5 for BM25
Data Models	Pydantic v2	Type-safe models, validation, serialization
Git Access	GitPython	Mature Python git library, handles log/diff parsing
Search	FTS5 (built-in) + optional sentence-transformers	BM25 via FTS5 for keyword; optional semantic via local embeddings
Console Output	Rich	Beautiful terminal output, progress bars, tables
Testing	pytest + pytest-cov	Standard Python testing, coverage reporting
Packaging	pyproject.toml (setuptools)	Modern Python packaging, pip-installable
CI	GitHub Actions	Standard, free for public repos

Project Type

CLI Tool -- Installed via pip install devmem, invoked as devmem <command>.

Feature Specifications

F1: Git Commit Analyzer

Description: Extracts knowledge artifacts from git commit history.

Acceptance Criteria:

AC1: Parses commit messages and categorizes into types: decision, fix, feature, refactor, chore
AC2: Extracts file-level change summaries from diffs (which files changed, how)
AC3: Identifies recurring patterns across commits (same file changed often = hot spot)
AC4: Handles repos with 10,000+ commits without excessive memory usage
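AC1 could start from the keyword heuristics mentioned under Risks & Mitigations. The sketch below is one possible rule set, not a settled design: the prefix mapping, the decision phrases, and the `categorize` function name are all assumptions.

```python
import re

# Conventional Commits prefix -> devmem category (this mapping is an assumption).
PREFIX_MAP = {
    "feat": "feature",
    "fix": "fix",
    "refactor": "refactor",
    "perf": "refactor",
    "chore": "chore",
    "docs": "chore",
    "ci": "chore",
    "build": "chore",
    "test": "chore",
    "style": "chore",
}

# Phrases hinting that a commit records a decision (illustrative, not exhaustive).
DECISION_HINTS = re.compile(
    r"\b(adopt|switch to|migrate to|instead of|replace .+ with|decide[ds]?)\b",
    re.IGNORECASE,
)
PREFIX_RE = re.compile(r"^(?P<prefix>[a-z]+)(\([^)]*\))?!?:\s")


def categorize(message: str) -> str:
    """Return one of: decision, fix, feature, refactor, chore."""
    stripped = message.strip()
    subject = stripped.splitlines()[0] if stripped else ""
    # Decision phrases win over prefixes, so "feat: switch to X" counts as a decision.
    if DECISION_HINTS.search(subject):
        return "decision"
    match = PREFIX_RE.match(subject)
    if match and match.group("prefix") in PREFIX_MAP:
        return PREFIX_MAP[match.group("prefix")]
    # Fallback keyword checks for commits without a conventional prefix.
    lowered = subject.lower()
    if re.search(r"\b(fix|bug|hotfix)\w*\b", lowered):
        return "fix"
    if re.search(r"\b(add|adds|added|adding|implement\w*|introduce\w*)\b", lowered):
        return "feature"
    if re.search(r"\b(refactor\w*|rename\w*|cleanup|simplif\w*)\b", lowered):
        return "refactor"
    return "chore"
```

Anything that matches no rule falls back to `chore`, which keeps the function total. A real implementation would probably also record a lower `confidence` for fallback matches.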

CLI Interface:

devmem extract git [--repo PATH] [--since DATE] [--until DATE] [--author AUTHOR]

Edge Cases:

Empty repo (no commits) -> warning, skip
Merge commits -> analyze diff, not individual parents
Binary files -> skip diff, note filename only
Very large diffs -> truncate with summary

Dependencies: None beyond GitPython (core feature).

F2: Rules File Ingester

Description: Reads existing project configuration files (CLAUDE.md, .cursorrules, etc.) to seed the knowledge store.

Acceptance Criteria:

AC1: Detects and parses CLAUDE.md, .cursorrules, .windsurfrules, .github/copilot-instructions.md
AC2: Extracts structured sections (rules, preferences, conventions, architecture notes)
AC3: Preserves source file attribution (which file, which section)
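For markdown-style rules files, AC2 and AC3 could be met by splitting on headings and recording where each section starts. This is a rough sketch under that assumption (a `.cursorrules` file may not use headings at all); `RuleSection` and `split_sections` are hypothetical names.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")
FENCE = "`" * 3  # markdown code fence marker


@dataclass
class RuleSection:
    source_file: str   # which file the section came from
    heading: str       # "" for text before the first heading
    line: int          # 1-based line where the section starts
    body: str


def split_sections(source_file: str, text: str) -> list[RuleSection]:
    """Split markdown text into heading-delimited sections with attribution."""
    sections: list[RuleSection] = []
    heading, start, buf = "", 1, []
    in_fence = False

    def flush() -> None:
        body = "\n".join(buf).strip()
        if heading or body:
            sections.append(RuleSection(source_file, heading, start, body))

    for lineno, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        # Lines starting with "#" inside a code fence are not headings.
        match = None if in_fence else HEADING_RE.match(line)
        if match:
            flush()
            heading, start, buf = match.group(2), lineno, []
        else:
            buf.append(line)
    flush()
    return sections
```

Each section keeps its file name and starting line, so an injected summary can cite its origin.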

CLI Interface:

devmem extract rules [--repo PATH] [--files FILE1,FILE2]

F3: Knowledge Store

Description: SQLite-based local storage for extracted knowledge artifacts.

Acceptance Criteria:

AC1: Stores artifacts with type, source, content, metadata, timestamp
AC2: Full-text search via FTS5 (BM25 ranking)
AC3: Tags and categories for filtering
AC4: Incremental updates (re-extract only new commits)
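A minimal sketch of how AC1 to AC3 might map onto SQLite, using the standard-library `sqlite3` module and assuming the linked SQLite build includes FTS5. The table layout and the `open_store`, `add_artifact`, and `search` functions are hypothetical and cover only a subset of the fields.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the artifact table plus an FTS5 index over title and content."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS artifacts (
            id INTEGER PRIMARY KEY,
            type TEXT NOT NULL,
            source TEXT NOT NULL,
            source_ref TEXT NOT NULL,
            title TEXT NOT NULL,
            content TEXT NOT NULL,
            UNIQUE (source, source_ref, title)
        );
        CREATE VIRTUAL TABLE IF NOT EXISTS artifacts_fts USING fts5(title, content);
        """
    )
    return conn


def add_artifact(conn, type_, source, source_ref, title, content) -> bool:
    """Insert an artifact; return False if the same one was already stored."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO artifacts (type, source, source_ref, title, content) "
        "VALUES (?, ?, ?, ?, ?)",
        (type_, source, source_ref, title, content),
    )
    if cur.rowcount == 0:
        return False
    # Keep the FTS row id aligned with the artifact id so the two can be joined.
    conn.execute(
        "INSERT INTO artifacts_fts (rowid, title, content) VALUES (?, ?, ?)",
        (cur.lastrowid, title, content),
    )
    return True


def search(conn, query, type_=None, limit=10):
    """Return (title, type, score) rows; bm25() is lower for better matches."""
    sql = (
        "SELECT a.title, a.type, bm25(artifacts_fts) AS score "
        "FROM artifacts_fts JOIN artifacts AS a ON a.id = artifacts_fts.rowid "
        "WHERE artifacts_fts MATCH ?"
    )
    params = [query]
    if type_ is not None:
        sql += " AND a.type = ?"
        params.append(type_)
    sql += " ORDER BY score LIMIT ?"
    params.append(limit)
    return conn.execute(sql, params).fetchall()
```

The `UNIQUE` constraint makes re-running an extraction idempotent, which is one simple starting point for AC4. The query string is passed to `MATCH` as-is, so user input containing FTS5 query syntax would need quoting first.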

Data Model:

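from datetime import datetime
from typing import Literal

from pydantic import BaseModel

# Allowed values for the type and source fields
ArtifactType = Literal["decision", "pattern", "fix", "preference", "convention", "hotspot"]
SourceType = Literal["git_commit", "ai_session", "annotation", "rules_file"]

class KnowledgeArtifact(BaseModel):
    id: str                    # UUID
    type: ArtifactType         # kind of knowledge captured
    source: SourceType         # extractor that produced it
    source_ref: str            # commit hash, file path, etc.
    title: str
    content: str
    tags: list[str]
    confidence: float          # 0.0-1.0 extraction confidence
    project_path: str
    created_at: datetime
    updated_at: datetime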

CLI Interface:

devmem status                  # Show store stats
devmem search <query>          # Search artifacts
devmem list [--type TYPE] [--source SOURCE]  # List artifacts

F4: Context Injector

Description: Generates and injects context summaries into project configuration files.

Acceptance Criteria:

AC1: Generates markdown summary of top-k relevant artifacts
AC2: Injects into CLAUDE.md (or other target) with clear markers (begin/end)
AC3: Updates existing injection without removing manual content
AC4: Respects file format conventions (CLAUDE.md vs .cursorrules)

CLI Interface:

devmem inject [--target FILE] [--limit N] [--categories CAT1,CAT2]
devmem inject --dry-run       # Preview without writing

Injection Format (CLAUDE.md):

<!-- devmem:start -->
## Project Knowledge (auto-generated by devmem)

### Key Decisions
- [commit abc123] Use SQLite for local storage instead of JSON files
- [commit def456] Adopt Click over argparse for CLI

### Recurring Patterns
- `src/devmem/core/` is modified in 40% of commits -- core module, handle with care
- Tests always follow `tests/unit/` and `tests/integration/` structure

### Active Hot Spots
- `src/devmem/core/store.py` -- changed 12 times in last 30 days
<!-- devmem:end -->
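The marker-based update in AC2 and AC3 can be sketched as a pure string transformation: replace whatever sits between the markers, or append a new block if none exists. The `inject` function is a hypothetical name; checksums and backups (see Risks & Mitigations) are left out.

```python
START = "<!-- devmem:start -->"
END = "<!-- devmem:end -->"


def inject(existing: str, generated: str) -> str:
    """Return `existing` with the devmem block replaced or appended."""
    block = f"{START}\n{generated.strip()}\n{END}"
    start = existing.find(START)
    end = existing.find(END, start + len(START)) if start != -1 else -1
    if start != -1 and end != -1:
        # Replace only the marked region; text before and after is untouched.
        return existing[:start] + block + existing[end + len(END):]
    if start != -1 or END in existing:
        # One marker without its partner: safer to stop than to guess.
        raise ValueError("unbalanced devmem markers; refusing to modify file")
    if not existing:
        separator = ""
    elif existing.endswith("\n"):
        separator = "\n"
    else:
        separator = "\n\n"
    return f"{existing}{separator}{block}\n"
```

Because the function takes and returns strings, `--dry-run` can print the result instead of writing it, and a round-trip test only needs to call it twice.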

F5: AI Session Parser (Post-MVP)

Description: Parses AI coding session files for additional context.

Acceptance Criteria:

AC1: Parses Claude Code session JSON from ~/.claude/projects/
AC2: Extracts decisions, tool calls, error resolutions
AC3: Handles large session files (>10MB) gracefully

F6: Code Annotation Extractor (Post-MVP)

Description: Extracts TODO, FIXME, HACK, NOTE, DECISION comments from codebase.

Acceptance Criteria:

AC1: Scans codebase for annotation comments
AC2: Categorizes by type (TODO, FIXME, HACK, NOTE, DECISION)
AC3: Links annotations to git blame data for authorship

F7: Optional Semantic Search (Post-MVP)

Description: Adds semantic (embedding-based) search alongside BM25.

Acceptance Criteria:

AC1: Uses sentence-transformers (all-MiniLM-L6-v2) for local embeddings
AC2: Hybrid search combining BM25 + semantic scores
AC3: Falls back gracefully when sentence-transformers not installed

Testing Strategy

Layer	Framework	Coverage Target	Scope
Unit	pytest	85%	Models, extractors, store, search, injector
Integration	pytest + temp git repos	80%	Full pipeline (extract -> store -> search -> inject)
E2E	pytest + subprocess	70%	CLI commands with real git repo

Test Fixtures:

sample_repo/: Fake git repo with known commit history
sample_sessions/: Sample AI session JSON files
sample_rules/: Sample CLAUDE.md / .cursorrules files

Key Test Scenarios:

Extract from empty repo -> graceful handling
Extract 1000+ commits -> performance acceptable
Inject into CLAUDE.md with existing manual content -> no corruption
Re-extract (incremental) -> only new commits processed
Search with CJK characters -> correct tokenization
Multiple injection targets -> correct format per target

CI/CD Pipeline

# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e ".[dev]"
      - run: pytest --cov=src/devmem --cov-report=term-missing
      - run: ruff check src/ tests/
      - run: mypy src/

  build:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4
      - run: pip install build twine && python -m build
      - run: twine check dist/*

Milestones

M1: Foundation (S)

Scope: Project scaffold, data models, SQLite store, CLI skeleton.

Done when:

pip install -e . works
devmem --help shows all planned commands
devmem status shows empty store
All models defined with Pydantic
SQLite tables created with FTS5
Tests passing (unit tests for models + store)

M2: Git Extraction (M)

Scope: Git commit analyzer, commit categorization, diff summarization.

Done when:

devmem extract git populates store from real repos
Commit categorization (decision/fix/feature/refactor/chore) accuracy >80%
Diff summarization extracts file-level changes
Handles 10,000+ commits without crash
Incremental extraction (only new commits)
Tests passing (unit + integration with sample_repo)

M3: Search & Rules (M)

Scope: Rules file ingestion, BM25 search, list/filter commands.

Done when:

devmem extract rules parses CLAUDE.md and .cursorrules
devmem search <query> returns ranked results via FTS5
devmem list --type decision filters by type
Search handles CJK characters correctly
Tests passing (unit + integration)

M4: Context Injection (M)

Scope: Injection engine, multi-target support, dry-run mode.

Done when:

devmem inject writes to CLAUDE.md with markers
Preserves existing manual content outside markers
--dry-run shows preview without writing
Supports multiple target file formats
Round-trip test (inject -> re-inject updates correctly)
Tests passing (unit + integration + E2E)

M5: Polish & Publish (S)

Scope: README (en + ja), CI/CD, packaging, documentation.

Done when:

README.md (English) + README.ja.md (Japanese) complete
CONTRIBUTING.md has full dev setup instructions
CI passes on Python 3.11/3.12/3.13
python -m build produces valid wheel/sdist
No hardcoded secrets
THIRD_PARTY_LICENSES.md generated
All quality checks pass

Risks & Mitigations

Risk	Likelihood	Impact	Mitigation
Git log parsing edge cases (merge commits, rebases, signed commits)	Medium	Medium	Test with real-world repos (Linux kernel, React). Use GitPython's mature API.
Commit categorization inaccuracy	Medium	Low	Start with keyword heuristics (conventional commits prefix). Allow user correction.
CLAUDE.md injection corrupts user content	Low	High	Markers with checksums. Dry-run mode. Backup before injection.
Poor FTS5 CJK tokenization	Medium	Medium	Use a custom ICU-based or bigram tokenizer (FTS5's default unicode61 tokenizer does not segment CJK text). Test with Japanese repos.
Large repo performance (>50k commits)	Medium	Medium	Incremental extraction, pagination, SQLite indexing. Benchmark early.
sentence-transformers heavy dependency	Low	Low	Make fully optional (behind `[semantic]` extra). BM25 is sufficient for MVP.
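The bigram mitigation for CJK search can be illustrated in pure Python. This is a sketch only: the character ranges are a rough approximation of CJK scripts, and `bigram_tokens` is a hypothetical helper, not an FTS5 tokenizer.

```python
import re

# Rough CJK ranges: Hiragana, Katakana, CJK Unified Ideographs, Hangul syllables.
CJK_CLASS = r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]"
CJK_RUN = re.compile(CJK_CLASS + "+")
TOKEN_RE = re.compile(CJK_CLASS + r"+|[A-Za-z0-9_]+")


def bigram_tokens(text: str) -> list[str]:
    """Split text into lowercase word tokens and overlapping CJK bigrams."""
    tokens: list[str] = []
    for match in TOKEN_RE.finditer(text):
        chunk = match.group(0)
        if CJK_RUN.fullmatch(chunk):
            if len(chunk) == 1:
                tokens.append(chunk)
            else:
                # Every adjacent pair of characters becomes one token.
                tokens.extend(chunk[i:i + 2] for i in range(len(chunk) - 1))
        else:
            tokens.append(chunk.lower())
    return tokens
```

One option is to index the space-joined tokens and run queries through the same function, so a two-character query such as `検索` matches inside a longer run.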

Out of Scope (MVP)

AI/LLM-powered extraction or summarization (rule-based only)
Semantic/embedding search (post-MVP, optional)
AI session parsing (post-MVP)
Code annotation extraction (post-MVP)
Web UI or TUI dashboard
Cloud sync or sharing
Multi-project aggregation
Git hosting platform APIs (GitHub, GitLab)
Real-time file watching
Windows support (macOS + Linux only for MVP)

Dependency License Audit

Dependency	Version	License	Compatible?
click	>=8.1	BSD-3	Yes
pydantic	>=2.0	MIT	Yes
GitPython	>=3.1	BSD-3	Yes
rich	>=13.0	MIT	Yes
apsw	>=3.42	zlib	Yes
pytest	>=8.0 (dev)	MIT	Yes
pytest-cov	>=5.0 (dev)	MIT	Yes
ruff	>=0.4 (dev)	MIT	Yes
mypy	>=1.10 (dev)	MIT	Yes
sentence-transformers	>=3.0 (optional)	Apache-2.0	Yes
build	>=1.0 (dev)	MIT	Yes
twine	>=5.0 (dev)	Apache-2.0	Yes

All dependencies are MIT or MIT-compatible (BSD-3, Apache-2.0, zlib). No GPL/AGPL dependencies.
