Your project's memory, built from git history. Zero cloud. Zero API keys.
Developers lose decisions, patterns, and lessons learned across long-lived projects. AI coding assistants (Claude Code, Cursor, Aider, Codex) generate rich context during sessions, but that knowledge evaporates when the session ends. Manual CLAUDE.md maintenance is inconsistent and incomplete.
Existing solutions address fragments of this problem:
- chunkhound (1.2k stars): Codebase intelligence via semantic code search. Does not extract developer knowledge from git history.
- claude-mem (51k stars): Auto-captures Claude Code sessions and compresses them with AI. Claude-only, requires a Claude API key, and targets sessions rather than config files.
- pro-workflow: Self-correcting memory for Claude Code. Single-tool, not portable.
No tool fuses git commit history + AI session context + existing rules files into a single searchable local knowledge base with direct injection into project config files (CLAUDE.md, .cursorrules, etc.).
devmem is a Python CLI that auto-extracts knowledge from git commits, AI coding sessions, code annotations, and existing rules files. It builds a searchable local SQLite knowledge graph and injects relevant context into project configuration files.
Key principles:
- Git-first: Commit messages and diffs are the primary knowledge source (always available, no setup)
- Zero API keys: All processing is local. No cloud, no LLM calls for core features.
- Cross-tool: Outputs to CLAUDE.md, .cursorrules, .windsurfrules, or any config file.
- Project-scoped: One knowledge base per project, stored in `.devmem/` alongside `.git/`.
+------------------+   +------------------+   +-------------------+
|  Git Log/Diffs   |   |   AI Sessions    |   |  Existing Rules   |
|    (primary)     |   |  (Claude, etc.)  |   | (CLAUDE.md, etc.) |
+--------+---------+   +--------+---------+   +--------+----------+
         |                      |                      |
         v                      v                      v
+------------------------------------------------------------------+
|                       Extraction Pipeline                        |
| +------------+  +--------------+  +-----------+  +------------+  |
| | Commit     |  | Session      |  | Annotation|  | Rules      |  |
| | Analyzer   |  | Parser       |  | Extractor |  | Ingester   |  |
| +------------+  +--------------+  +-----------+  +------------+  |
+------------------------------------------------------------------+
                                 |
                                 v
+------------------------------------------------------------------+
|                     Knowledge Store (SQLite)                     |
| +----------+  +----------+  +----------+  +-------------------+  |
| | facts    |  | patterns |  | decisions|  | project_context   |  |
| +----------+  +----------+  +----------+  +-------------------+  |
+------------------------------------------------------------------+
                                 |
                 +---------------+---------------+
                 |                               |
                 v                               v
+---------------------------+   +-----------------------------+
| Search (BM25 + optional   |   | Injection Engine            |
| semantic via embeddings)  |   | -> CLAUDE.md / .cursorrules |
+---------------------------+   +-----------------------------+
devmem/
  src/
    devmem/
      __init__.py
      __main__.py  # CLI entry point
      cli.py  # Argument parsing (click)
      core/
        __init__.py
        store.py  # SQLite knowledge store
        models.py  # Data models (Pydantic)
        search.py  # BM25 + optional semantic search
        injector.py  # Context injection into config files
      extractors/
        __init__.py
        base.py  # Base extractor interface
        git_commits.py  # Git log/diff analyzer
        ai_sessions.py  # AI session file parser (Claude Code, etc.)
        annotations.py  # TODO/FIXME/HACK comment extractor
        rules_files.py  # Existing CLAUDE.md/.cursorrules parser
      graph/
        __init__.py
        builder.py  # Knowledge graph construction
        relations.py  # Cross-entity relationship mapping
  tests/
    unit/
      test_store.py
      test_search.py
      test_injector.py
      test_git_commits.py
      test_annotations.py
      test_rules_files.py
      test_models.py
    integration/
      test_full_pipeline.py
      test_injection_roundtrip.py
    fixtures/
      sample_repo/  # Fake git repo for testing
      sample_sessions/  # Sample AI session files
  docs/
    usage.md
    extraction.md
    configuration.md
  examples/
    basic/
  .github/
    workflows/
      ci.yml
    ISSUE_TEMPLATE/
      bug_report.md
      feature_request.md
    PULL_REQUEST_TEMPLATE.md
  pyproject.toml
  README.md
  README.ja.md
  LICENSE
  CONTRIBUTING.md
  CHANGELOG.md
  THIRD_PARTY_LICENSES.md
  .gitignore
| Component | Choice | Rationale |
|---|---|---|
| Language | Python 3.11+ | Rich CLI ecosystem, broad developer familiarity |
| CLI Framework | Click | De facto standard for Python CLIs, decorator-based, well-documented |
| Database | SQLite (via sqlite3 + apsw) | Zero-config, local-first, built-in FTS5 for BM25 |
| Data Models | Pydantic v2 | Type-safe models, validation, serialization |
| Git Access | GitPython | Mature Python git library, handles log/diff parsing |
| Search | FTS5 (built-in) + optional sentence-transformers | BM25 via FTS5 for keyword; optional semantic via local embeddings |
| Console Output | Rich | Beautiful terminal output, progress bars, tables |
| Testing | pytest + pytest-cov | Standard Python testing, coverage reporting |
| Packaging | pyproject.toml (setuptools) | Modern Python packaging, pip-installable |
| CI | GitHub Actions | Standard, free for public repos |
CLI Tool -- Installed via `pip install devmem`, invoked as `devmem <command>`.
Description: Extracts knowledge artifacts from git commit history.
Acceptance Criteria:
- AC1: Parses commit messages and categorizes into types: decision, fix, feature, refactor, chore
- AC2: Extracts file-level change summaries from diffs (which files changed, how)
- AC3: Identifies recurring patterns across commits (same file changed often = hot spot)
- AC4: Handles repos with 10,000+ commits without excessive memory usage
CLI Interface:
devmem extract git [--repo PATH] [--since DATE] [--until DATE] [--author AUTHOR]
Edge Cases:
- Empty repo (no commits) -> warning, skip
- Merge commits -> analyze diff, not individual parents
- Binary files -> skip diff, note filename only
- Very large diffs -> truncate with summary
Dependencies: None beyond GitPython (core feature).
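The categorization in AC1 and the hot-spot detection in AC3 could be approximated with plain heuristics. The sketch below is only an illustration: the prefix mapping, the decision keywords, and the function names `categorize_commit` and `hot_spots` are assumptions, not devmem's confirmed rules. It works on commit messages and file lists that have already been read from git, so it needs no GitPython calls.

```python
import re
from collections import Counter

# Conventional Commits prefixes mapped onto the AC1 categories.
# This mapping is an assumption made for illustration.
PREFIX_TO_TYPE = {
    "feat": "feature",
    "fix": "fix",
    "refactor": "refactor",
    "chore": "chore",
    "docs": "chore",
    "test": "chore",
    "ci": "chore",
    "build": "chore",
}
PREFIX_RE = re.compile(r"^(?P<prefix>[a-z]+)(\([^)]*\))?!?:\s")
# Phrases that hint at a recorded decision (hypothetical keyword list).
DECISION_RE = re.compile(
    r"\b(decide[ds]?|decision|instead of|switch(ed)? to|adopt(ed)?)\b", re.I
)
FIX_RE = re.compile(r"\b(fix(e[sd])?|bug)\b", re.I)


def categorize_commit(message: str) -> str:
    """Return one of: decision, fix, feature, refactor, chore."""
    stripped = message.strip()
    subject = stripped.splitlines()[0] if stripped else ""
    if DECISION_RE.search(message):
        return "decision"
    match = PREFIX_RE.match(subject)
    if match and match.group("prefix") in PREFIX_TO_TYPE:
        return PREFIX_TO_TYPE[match.group("prefix")]
    if FIX_RE.search(subject):
        return "fix"
    return "chore"  # fallback when nothing else matches


def hot_spots(
    changed_files_per_commit: list[list[str]], top_n: int = 3
) -> list[tuple[str, int]]:
    """Count how many commits touched each file; return the most frequent."""
    counts = Counter(
        path for files in changed_files_per_commit for path in set(files)
    )
    return counts.most_common(top_n)
```

Keyword heuristics like these will misclassify some commits, which is why a confidence score and user correction remain useful alongside them.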
Description: Reads existing project configuration files (CLAUDE.md, .cursorrules, etc.) to seed the knowledge store.
Acceptance Criteria:
- AC1: Detects and parses CLAUDE.md, .cursorrules, .windsurfrules, .github/copilot-instructions.md
- AC2: Extracts structured sections (rules, preferences, conventions, architecture notes)
- AC3: Preserves source file attribution (which file, which section)
CLI Interface:
devmem extract rules [--repo PATH] [--files FILE1,FILE2]
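Section extraction with source attribution (AC2 and AC3) might look like the following. This is a rough sketch that assumes rules files use ATX-style Markdown headings; `RuleSection` and `split_sections` are hypothetical names, and heading-like lines inside fenced code blocks are not handled.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class RuleSection:
    source_file: str  # which file the section came from (AC3)
    heading: str      # section title, or "" for text before the first heading
    body: str


def split_sections(text: str, source_file: str) -> list[RuleSection]:
    """Split Markdown text into sections at ATX headings."""
    sections: list[RuleSection] = []
    heading = ""
    lines: list[str] = []

    def flush() -> None:
        # Skip an empty preamble, but keep headed sections even if empty.
        if heading or any(line.strip() for line in lines):
            body = "\n".join(lines).strip()
            sections.append(RuleSection(source_file, heading, body))

    for line in text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            heading, lines = match.group(2), []
        else:
            lines.append(line)
    flush()
    return sections
```

Each `RuleSection` keeps the originating file name, so a stored artifact can point back to the exact file and section it was derived from.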
Description: SQLite-based local storage for extracted knowledge artifacts.
Acceptance Criteria:
- AC1: Stores artifacts with type, source, content, metadata, timestamp
- AC2: Full-text search via FTS5 (BM25 ranking)
- AC3: Tags and categories for filtering
- AC4: Incremental updates (re-extract only new commits)
Data Model:
from datetime import datetime

from pydantic import BaseModel

# ArtifactType and SourceType are assumed to be enums defined elsewhere in models.py.
class KnowledgeArtifact(BaseModel):
    id: str                # UUID
    type: ArtifactType     # decision, pattern, fix, preference, convention, hotspot
    source: SourceType     # git_commit, ai_session, annotation, rules_file
    source_ref: str        # commit hash, file path, etc.
    title: str
    content: str
    tags: list[str]
    confidence: float      # 0.0-1.0 extraction confidence
    project_path: str
    created_at: datetime
    updated_at: datetime

CLI Interface:
devmem status # Show store stats
devmem search <query> # Search artifacts
devmem list [--type TYPE] [--source SOURCE] # List artifacts
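The BM25-ranked search in AC2 can be sketched with the standard-library `sqlite3` module, assuming the bundled SQLite was compiled with FTS5 (common, but not guaranteed). The table layout and the helper names `create_store`, `add_artifact`, and `search` are illustrative assumptions rather than devmem's actual schema.

```python
import sqlite3


def create_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database with one FTS5 table; `type` is stored but not indexed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS artifacts "
        "USING fts5(title, content, type UNINDEXED)"
    )
    return conn


def add_artifact(
    conn: sqlite3.Connection, title: str, content: str, type_: str
) -> None:
    conn.execute(
        "INSERT INTO artifacts (title, content, type) VALUES (?, ?, ?)",
        (title, content, type_),
    )


def search(
    conn: sqlite3.Connection, query: str, limit: int = 10
) -> list[tuple[str, str]]:
    """Return (title, type) rows, best match first.

    FTS5's bm25() is lower-is-better, so ascending order ranks best first.
    """
    return conn.execute(
        "SELECT title, type FROM artifacts WHERE artifacts MATCH ? "
        "ORDER BY bm25(artifacts) LIMIT ?",
        (query, limit),
    ).fetchall()
```

The query string is passed through FTS5's own query syntax, so user input containing operators or unbalanced quotes would need escaping before it reaches `MATCH`.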
Description: Generates and injects context summaries into project configuration files.
Acceptance Criteria:
- AC1: Generates markdown summary of top-k relevant artifacts
- AC2: Injects into CLAUDE.md (or other target) between clear start/end markers
- AC3: Updates existing injection without removing manual content
- AC4: Respects file format conventions (CLAUDE.md vs .cursorrules)
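The marker-based update in AC2 and AC3 could be implemented roughly as below. This is a minimal sketch for Markdown targets only, using the `<!-- devmem:start -->` and `<!-- devmem:end -->` markers from the injection format; the function name `inject` is an assumption, and checksums and backups are left out.

```python
import re

START = "<!-- devmem:start -->"
END = "<!-- devmem:end -->"
# Non-greedy match so only the first marked block is captured.
BLOCK_RE = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)


def inject(existing: str, summary: str) -> str:
    """Replace the marked block if present; otherwise append one at the end.

    Text outside the markers is returned unchanged.
    """
    block = f"{START}\n{summary.strip()}\n{END}"
    if BLOCK_RE.search(existing):
        # A function replacement avoids backslash-escape handling in `summary`.
        return BLOCK_RE.sub(lambda _match: block, existing, count=1)
    if not existing.strip():
        return block + "\n"
    return existing.rstrip("\n") + "\n\n" + block + "\n"
```

Because the function returns the new text instead of writing it, a `--dry-run` mode can print the result and a normal run can write it to disk.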
CLI Interface:
devmem inject [--target FILE] [--limit N] [--categories CAT1,CAT2]
devmem inject --dry-run # Preview without writing
Injection Format (CLAUDE.md):
<!-- devmem:start -->
## Project Knowledge (auto-generated by devmem)
### Key Decisions
- [commit abc123] Use SQLite for local storage instead of JSON files
- [commit def456] Adopt Click over argparse for CLI
### Recurring Patterns
- `src/devmem/core/` is modified in 40% of commits -- core module, handle with care
- Tests always follow `tests/unit/` and `tests/integration/` structure
### Active Hot Spots
- `src/devmem/core/store.py` -- changed 12 times in last 30 days
<!-- devmem:end -->

Description: Parses AI coding session files for additional context.
Acceptance Criteria:
- AC1: Parses Claude Code session JSON from `~/.claude/projects/`
- AC2: Extracts decisions, tool calls, error resolutions
- AC3: Handles large session files (>10MB) gracefully
Description: Extracts TODO, FIXME, HACK, NOTE, DECISION comments from codebase.
Acceptance Criteria:
- AC1: Scans codebase for annotation comments
- AC2: Categorizes by type (TODO, FIXME, HACK, NOTE, DECISION)
- AC3: Links annotations to git blame data for authorship
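The scanning and categorization steps (AC1 and AC2) could be sketched with a single regular expression. The comment prefixes it recognizes and the names `Annotation` and `scan_annotations` are assumptions for illustration; linking to git blame (AC3) is not shown.

```python
import re
from typing import Iterator, NamedTuple

# Matches a comment opener (#, //, /*, --) followed by a known tag.
ANNOTATION_RE = re.compile(
    r"(?:#|//|/\*|--)\s*(TODO|FIXME|HACK|NOTE|DECISION)\b[:\s]*(.*)"
)


class Annotation(NamedTuple):
    kind: str     # TODO, FIXME, HACK, NOTE, or DECISION
    line_no: int  # 1-based line number within the scanned text
    text: str     # remainder of the comment after the tag


def scan_annotations(source: str) -> Iterator[Annotation]:
    """Yield one Annotation per line that carries a tagged comment."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        match = ANNOTATION_RE.search(line)
        if match:
            yield Annotation(match.group(1), line_no, match.group(2).strip())
```

The 1-based line number is what a later blame lookup would need to attribute each annotation to an author.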
Description: Adds semantic (embedding-based) search alongside BM25.
Acceptance Criteria:
- AC1: Uses sentence-transformers (all-MiniLM-L6-v2) for local embeddings
- AC2: Hybrid search combining BM25 + semantic scores
- AC3: Falls back gracefully when sentence-transformers not installed
| Layer | Framework | Coverage Target | Scope |
|---|---|---|---|
| Unit | pytest | 85% | Models, extractors, store, search, injector |
| Integration | pytest + temp git repos | 80% | Full pipeline (extract -> store -> search -> inject) |
| E2E | pytest + subprocess | 70% | CLI commands with real git repo |
Test Fixtures:
- `sample_repo/`: Fake git repo with known commit history
- `sample_sessions/`: Sample AI session JSON files
- `sample_rules/`: Sample CLAUDE.md / .cursorrules files
Key Test Scenarios:
- Extract from empty repo -> graceful handling
- Extract 1000+ commits -> performance acceptable
- Inject into CLAUDE.md with existing manual content -> no corruption
- Re-extract (incremental) -> only new commits processed
- Search with CJK characters -> correct tokenization
- Multiple injection targets -> correct format per target
# .github/workflows/ci.yml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e ".[dev]"
      - run: pytest --cov=src/devmem --cov-report=term-missing
      - run: ruff check src/ tests/
      - run: mypy src/
  build:
    runs-on: ubuntu-latest
    needs: test
    steps:
      - uses: actions/checkout@v4
      # twine must be installed before `twine check` can run
      - run: pip install build twine && python -m build
      - run: twine check dist/*

Scope: Project scaffold, data models, SQLite store, CLI skeleton.
Done when:
- `pip install -e .` works
- `devmem --help` shows all planned commands
- `devmem status` shows empty store
- All models defined with Pydantic
- SQLite tables created with FTS5
- Tests passing (unit tests for models + store)
Scope: Git commit analyzer, commit categorization, diff summarization.
Done when:
- `devmem extract git` populates store from real repos
- Commit categorization (decision/fix/feature/refactor/chore) accuracy >80%
- Diff summarization extracts file-level changes
- Handles 10,000+ commits without crash
- Incremental extraction (only new commits)
- Tests passing (unit + integration with sample_repo)
Scope: Rules file ingestion, BM25 search, list/filter commands.
Done when:
- `devmem extract rules` parses CLAUDE.md and .cursorrules
- `devmem search <query>` returns ranked results via FTS5
- `devmem list --type decision` filters by type
- Search handles CJK characters correctly
- Tests passing (unit + integration)
Scope: Injection engine, multi-target support, dry-run mode.
Done when:
- `devmem inject` writes to CLAUDE.md with markers
- Preserves existing manual content outside markers
- `--dry-run` shows preview without writing
- Supports multiple target file formats
- Round-trip test (inject -> re-inject updates correctly)
- Tests passing (unit + integration + E2E)
Scope: README (en + ja), CI/CD, packaging, documentation.
Done when:
- README.md (English) + README.ja.md (Japanese) complete
- CONTRIBUTING.md has full dev setup instructions
- CI passes on Python 3.11/3.12/3.13
- `python -m build` produces valid wheel/sdist
- No hardcoded secrets
- THIRD_PARTY_LICENSES.md generated
- All quality checks pass
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Git log parsing edge cases (merge commits, rebases, signed commits) | Medium | Medium | Test with real-world repos (Linux kernel, React). Use GitPython's mature API. |
| Commit categorization inaccuracy | Medium | Low | Start with keyword heuristics (conventional commits prefix). Allow user correction. |
| CLAUDE.md injection corrupts user content | Low | High | Markers with checksums. Dry-run mode. Backup before injection. |
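| FTS5 CJK tokenization poor | Medium | Medium | FTS5 ships no ICU tokenizer (that one is FTS3/4 only), so register a custom ICU-backed or bigram tokenizer, or evaluate the built-in trigram tokenizer. Test with Japanese repos. |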
| Large repo performance (>50k commits) | Medium | Medium | Incremental extraction, pagination, SQLite indexing. Benchmark early. |
| sentence-transformers heavy dependency | Low | Low | Make fully optional (behind [semantic] extra). BM25 is sufficient for MVP. |
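The custom bigram tokenizer named as a mitigation for CJK search could work along these lines. This is a sketch of the tokenization logic only: the character ranges cover Hiragana, Katakana, and the basic CJK Unified Ideographs block, the function name `bigram_tokens` is hypothetical, and wiring it into FTS5 as a registered tokenizer is not shown.

```python
import re

# Hiragana, Katakana, and CJK Unified Ideographs (basic block only).
CJK_CLASS = r"\u3040-\u30ff\u4e00-\u9fff"
CJK_RUN = re.compile(f"[{CJK_CLASS}]+")
TOKEN = re.compile(f"[{CJK_CLASS}]+|[A-Za-z0-9_]+")


def bigram_tokens(text: str) -> list[str]:
    """Split CJK runs into overlapping two-character tokens.

    ASCII words are kept whole and lowercased; a single CJK character
    becomes a one-character token.
    """
    tokens: list[str] = []
    for match in TOKEN.finditer(text):
        run = match.group()
        if CJK_RUN.fullmatch(run):
            if len(run) == 1:
                tokens.append(run)
            else:
                tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run.lower())
    return tokens
```

Overlapping bigrams let a two-character query such as `採用` match inside longer unsegmented text, at the cost of a larger index than word-level tokenization.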
- AI/LLM-powered extraction or summarization (rule-based only)
- Semantic/embedding search (post-MVP, optional)
- AI session parsing (post-MVP)
- Code annotation extraction (post-MVP)
- Web UI or TUI dashboard
- Cloud sync or sharing
- Multi-project aggregation
- Git hosting platform APIs (GitHub, GitLab)
- Real-time file watching
- Windows support (macOS + Linux only for MVP)
| Dependency | Version | License | Compatible? |
|---|---|---|---|
| click | >=8.1 | BSD-3 | Yes |
| pydantic | >=2.0 | MIT | Yes |
| GitPython | >=3.1 | BSD-3 | Yes |
| rich | >=13.0 | MIT | Yes |
| apsw | >=3.42 | MIT | Yes |
| pytest | >=8.0 (dev) | MIT | Yes |
| pytest-cov | >=5.0 (dev) | MIT | Yes |
| ruff | >=0.4 (dev) | MIT | Yes |
| mypy | >=1.10 (dev) | MIT | Yes |
| sentence-transformers | >=3.0 (optional) | Apache-2.0 | Yes |
| build | >=1.0 (dev) | MIT | Yes |
| twine | >=5.0 (dev) | Apache-2.0 | Yes |
All dependencies are MIT or MIT-compatible (BSD-3, Apache-2.0). No GPL/AGPL dependencies.