Theo

AI memory system with semantic indexing and validation loops - long-term memory storage with confidence scoring and relationship tracking.

Features

Long-term Memory

Memory Types: Store preferences, decisions, patterns, facts, and session context
Validation Loop: Memories build confidence through practical use
Golden Rules: High-confidence memories become protected principles
Namespace Scoping: Organize memories by project or globally
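
The validation loop and golden-rule promotion can be pictured with a small sketch. It is illustrative only: the 0.9 threshold comes from the troubleshooting section of this README, while the `Memory` class, the starting confidence, and the fixed `STEP` adjustment are assumptions made for the example, not Theo's actual update rule.

```python
from dataclasses import dataclass

GOLDEN_RULE_THRESHOLD = 0.9  # documented: golden rules have confidence >= 0.9
STEP = 0.1  # hypothetical adjustment per outcome; the real rule may differ


@dataclass
class Memory:
    content: str
    confidence: float = 0.5  # hypothetical starting confidence

    @property
    def is_golden_rule(self) -> bool:
        return self.confidence >= GOLDEN_RULE_THRESHOLD


def record_outcome(memory: Memory, success: bool) -> Memory:
    """Raise confidence after a successful use, lower it after a failure."""
    delta = STEP if success else -STEP
    # Clamp to [0, 1] and round to avoid floating-point drift.
    memory.confidence = round(min(1.0, max(0.0, memory.confidence + delta)), 2)
    return memory


memory = Memory("Prefer FastAPI for Python APIs")
for _ in range(4):
    record_outcome(memory, success=True)
print(memory.confidence, memory.is_golden_rule)
```

Under these assumptions, four successful uses take a memory from 0.5 to 0.9, at which point it would count as a golden rule.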

Voice Transcription

MLX Whisper: Local speech-to-text using MLX-optimized Whisper models
Streaming Transcription: Real-time transcription with silence detection
Text-to-Speech: Local TTS for voice responses
Memory Integration: Transcriptions stored as searchable memories

Agent Trace (AI Code Attribution)

Spec Compliance: Full agent-trace.dev v0.1 compliance
Auto-Capture: Line-level attribution on every commit via Claude Code hooks
Model Detection: Auto-detects model from session transcript (opus/sonnet/haiku)
Query Tools: CLI (theo trace query) and MCP tools (trace_query, trace_list)

Hook System

Context Injection: Proactive memory surfacing at SessionStart, UserPromptSubmit, and PreToolUse
Reactive Recall: Auto-recall on errors and file reads (PostToolUse)
FETCH Gate: Enforces [FETCH recall:ID] compliance — blocks tool calls until summary memories are fully retrieved
Security Validation: Blocks dangerous commands, protects sensitive files
Error Learning: Auto-captures error→fix patterns for future recall
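
As a rough illustration of the security validation idea, the sketch below decides whether a single tool call should be allowed. It assumes the hook receives a JSON payload with `tool_name` and `tool_input` fields; the pattern list, the sensitive file names, and the `decide` function are hypothetical and do not reflect Theo's actual rules.

```python
import re

# Hypothetical deny-lists for illustration; Theo's real rules are not shown here.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),  # recursive delete of the filesystem root
    re.compile(r"\bmkfs\b"),              # formatting a filesystem
    re.compile(r"\bdd\s+if="),            # raw disk writes
]
SENSITIVE_NAMES = {".env", "id_rsa"}


def decide(payload: dict) -> tuple[bool, str]:
    """Return (allowed, reason) for one tool-call payload."""
    tool_input = payload.get("tool_input", {})
    command = tool_input.get("command", "")
    for pattern in DANGEROUS_PATTERNS:
        if pattern.search(command):
            return False, f"blocked dangerous command: {command}"
    file_path = tool_input.get("file_path", "")
    if file_path.rsplit("/", 1)[-1] in SENSITIVE_NAMES:
        return False, f"blocked access to sensitive file: {file_path}"
    return True, "ok"


print(decide({"tool_name": "Bash", "tool_input": {"command": "rm -rf /"}}))
print(decide({"tool_name": "Bash", "tool_input": {"command": "ls -la"}}))
```

A real hook script would read the payload from standard input and signal a block through its exit status; that wiring is left out so the decision logic stays easy to test.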

Core Capabilities

Local Embeddings: Privacy-first using Ollama
Daemon Service: Non-blocking embedding and classification via Unix socket IPC
MCP Integration: Seamless integration with Claude Code and other MCP clients


Quick Start

Prerequisites

macOS with Apple Silicon (M1/M2/M3/M4) — required for MLX Whisper
Claude Code CLI (installation guide)
Internet connection for initial setup (~16GB total downloads)
16GB+ free disk space:
- Ollama + gemma3:12b LLM model (~8GB)
- Orpheus TTS model via Ollama (~4GB)
- MLX Whisper model (~3GB)
- Embedding model (~500MB)
- SQLite database and cache

Python, uv, Ollama, and all other dependencies are installed automatically by the setup script.

Installation

Option A: Clone and run

git clone https://github.com/harrison_adobe/theo.git
cd theo
bash setup.sh

Option B: Download from release

Download setup.sh from the latest release, then:

bash setup.sh

The script will clone the repo automatically if run outside a theo checkout.

The setup script handles everything in 12 steps:

- uv
- Python 3.13
- Ollama + gemma3:12b + Orpheus TTS model
- theo dependencies
- embedding model pre-warm
- data directory
- Claude Code hooks
- Claude Code skills
- multi-platform MCP registration (Claude Code, Cursor, Claude Desktop, meta-mcp)
- background daemon
- Orpheus TTS server
- global CLAUDE.md bootstrap

First-time setup takes ~15-20 minutes (mostly downloading models).

Manual installation

# 1. Install uv (manages Python automatically)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Clone and install dependencies
git clone https://github.com/harrison_adobe/theo.git
cd theo
uv sync

# 3. Install Ollama and pull models
brew install --cask ollama
ollama pull gemma3:12b
ollama pull legraphista/Orpheus:3b-ft-q8

# 4. Create data directory and config
mkdir -p ~/.theo
cp .env.example .env

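# 5. Install hooks and configure Claude Code
mkdir -p ~/.claude/hooks
cp hooks/*.py ~/.claude/hooks/
# Note: the next command overwrites an existing ~/.claude/settings.json; back it up first
cp hooks/settings.example.json ~/.claude/settings.json
# Edit ~/.claude/settings.json and replace /path/to/theo with your path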

# 6. Start daemon
~/.claude/hooks/theo-daemon-ctl.py start

MCP Configuration

setup.sh automatically registers theo in all detected platforms:

Platform	Config File	Auto-detected?
Claude Code	`~/.claude/settings.json`	Always
meta-mcp	`~/.meta-mcp/servers.json`	If file exists
Cursor	`~/.cursor/mcp.json`	If file exists
Claude Desktop	`~/.claude.json`	If file exists

If a platform uses mcp-exec pointing to servers.json, theo is served via meta-mcp and direct registration is skipped.

Manual MCP configuration

Add to your platform's MCP config:

{
  "mcpServers": {
    "theo": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/theo", "python", "-m", "theo"],
      "env": {
        "THEO_SQLITE_PATH": "~/.theo/theo.db",
        "THEO_LOG_LEVEL": "INFO"
      }
    }
  }
}

Replace /path/to/theo with the actual path where you cloned the repo.
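
If you prefer to script the edit, the following sketch merges the entry above into an existing config file without touching other servers. The `register_theo` helper is hypothetical (it is not part of Theo or `setup.sh`), and the demo writes to a temporary file rather than a real platform config.

```python
import json
import tempfile
from pathlib import Path


def register_theo(config_path: Path, theo_dir: str) -> dict:
    """Add or update the "theo" entry in an MCP config, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["theo"] = {
        "command": "uv",
        "args": ["run", "--directory", theo_dir, "python", "-m", "theo"],
        "env": {"THEO_SQLITE_PATH": "~/.theo/theo.db", "THEO_LOG_LEVEL": "INFO"},
    }
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Demo against a throwaway file that already lists another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"
    demo_path.write_text(json.dumps({"mcpServers": {"other": {"command": "npx"}}}))
    result = register_theo(demo_path, "/path/to/theo")
    print(sorted(result["mcpServers"]))
```

Point `config_path` at the file for your platform from the table above, and back the file up first, since the helper rewrites it in place.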

Verify the setup:

# Restart Claude Code to load the MCP server
# Then ask Claude:
"Show me my memory stats"


Basic Usage

Memory Operations

Store a memory:

"Remember that I prefer using FastAPI for Python APIs"

Store with explicit type:

"Store a decision: We chose PostgreSQL for the database"

Recall memories:

"What do you remember about my coding preferences?"

Validate a memory:

"That memory about FastAPI was helpful - validate it"

Delete a memory:

"Forget the memory about dark mode preferences"


Configuration

Configure Theo through CLI arguments (highest priority) or environment variables with the THEO_ prefix:

Environment Variable	CLI Argument	Default	Description
`THEO_EMBEDDING_BACKEND`	-	`ollama`	Embedding backend (ollama)
`THEO_MLX_MODEL`	-	`mxbai-embed-large`	Embedding model name
`THEO_OLLAMA_HOST`	-	`http://localhost:11434`	Ollama server URL
`THEO_OLLAMA_LLM_MODEL`	-	`gemma3:12b`	LLM model for relationship classification
`THEO_OLLAMA_TIMEOUT`	-	`30`	Timeout in seconds
`THEO_SQLITE_PATH`	`--sqlite-path`	`~/.theo/theo.db`	SQLite database path
`THEO_LOG_LEVEL`	`--log-level`	`INFO`	Logging level
`THEO_DEFAULT_NAMESPACE`	-	`global`	Default namespace for memories
`THEO_DEFAULT_IMPORTANCE`	-	`0.5`	Default memory importance score
`THEO_RELATIONSHIP_SIMILARITY_THRESHOLD`	-	`0.6`	Minimum similarity for auto-inferred relationships
`THEO_RELATION_SIMILARITY_THRESHOLD`	-	`0.3`	Minimum similarity for strict relations (supersedes/contradicts)
`THEO_DEFAULT_TOKEN_BUDGET`	-	`4000`	Default token budget for context
`THEO_WHISPER_MODEL`	-	`mlx-community/whisper-large-v3-mlx`	MLX Whisper model
`THEO_TTS_URL`	-	`http://localhost:5005/v1/audio/speech`	Orpheus-FastAPI TTS endpoint
`THEO_TTS_VOICE`	-	`tara`	TTS voice name (tara, leah, jess, leo, dan, mia, zac, zoe)
`THEO_TTS_TIMEOUT`	-	`60`	TTS request timeout (seconds)
`THEO_AUDIO_PATH`	-	`~/.theo/audio`	Audio recording storage path
`THEO_TRACE_ENABLED`	-	`true`	Enable AI code attribution
`THEO_TRACE_GIT_NOTES`	-	`true`	Write traces to git notes
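
The precedence rule (CLI argument, then environment variable, then default) can be sketched as follows. This is a minimal illustration covering only the two settings that have CLI flags in the table; `resolve_settings` is a hypothetical helper, not Theo's actual configuration loader.

```python
import argparse

# Defaults taken from the table above.
DEFAULTS = {"sqlite_path": "~/.theo/theo.db", "log_level": "INFO"}


def resolve_settings(argv: list[str], environ: dict[str, str]) -> dict[str, str]:
    """Resolve each setting: CLI argument first, then THEO_* variable, then default."""
    parser = argparse.ArgumentParser(prog="theo")
    parser.add_argument("--sqlite-path")
    parser.add_argument("--log-level")
    args = parser.parse_args(argv)

    settings = {}
    for key, default in DEFAULTS.items():
        cli_value = getattr(args, key)
        env_value = environ.get(f"THEO_{key.upper()}")
        settings[key] = cli_value or env_value or default
    return settings


settings = resolve_settings(
    ["--log-level", "DEBUG"],
    {"THEO_LOG_LEVEL": "WARNING", "THEO_SQLITE_PATH": "/tmp/theo.db"},
)
print(settings)
```

Here `--log-level DEBUG` wins over `THEO_LOG_LEVEL=WARNING`, while the database path falls back to the environment variable because no CLI flag was given. In practice you would pass `sys.argv[1:]` and `os.environ`.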


Architecture

Theo provides a memory system with validation loops:

Memory System: Store → Validate → Recall with confidence scoring
Daemon Service: Non-blocking IPC for embedding and classification operations
Agent Trace: AI code attribution tracking

┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  Claude Code    │────▶│   MCP Server    │────▶│   Tool Layer    │
│  (MCP Client)   │     │   (FastMCP)     │     │  (Async Handlers)│
└─────────────────┘     └─────────────────┘     └────────┬────────┘
                                                         │
                                                   ┌─────▼─────┐
                                                   │  Memory   │
                                                   │   Tools   │
                                                   └─────┬─────┘
                                                         │
                                                  ┌──────▼──────┐
                                                  │   Daemon    │
                                                  │   Client    │
                                                  └──────┬──────┘
                                                         │
                              ┌──────────────────────────┼──────────────────────┐
                              │                          │                      │
                        ┌─────▼─────┐            ┌──────▼──────┐        ┌──────▼──────┐
                        │ Validation│            │  Embedding  │        │   SQLite    │
                        │   Loop    │            │  Provider   │        │   Store     │
                        └───────────┘            └─────────────┘        └─────────────┘

See docs/architecture.md for detailed architecture documentation.


API Reference

Theo exposes MCP tools for memory management:

Memory Tools

memory_store(content, memory_type, namespace, importance, relates_to, supersedes_query, session_id, skip_infer) - Store a memory. Use relates_to for graph edges, supersedes_query to auto-replace old memories, session_id to auto-create relates_to edges between memories in the same session, and skip_infer=true to skip automatic relationship inference for faster writes. Strict relations (supersedes/contradicts) require similarity >= 0.3; rejected relations are returned in relation_errors.
memory_recall(query, n_results, namespace, memory_type, min_importance, min_confidence, include_related, max_depth, exclude_types, include_golden_rules, golden_rule_limit) - Recall memories with graph expansion and golden rules (excludes documents and sessions by default)
memory_forget(memory_id, query, force) - Delete memories
memory_context(query, namespace, token_budget) - Generate LLM context
memory_apply(memory_id, context) - Record memory usage (TRY phase)
memory_outcome(memory_id, success, skip_event) - Record result + adjust confidence (use skip_event=True for direct validation)
memory_relate(source_id, target_id, relation_type) - Create relationships
memory_edge_forget(edge_id, memory_id, source_id, target_id, relation, direction) - Delete relationship edges (by ID, memory, or pair)
memory_inspect_graph(memory_id, max_depth, output_format) - Visualize graph
memory_count(namespace, memory_type) - Count memories with filters
memory_list(namespace, memory_type, limit) - List memories with pagination
memory_list_namespaces() - List all namespaces with counts
validation_history(memory_id, event_type, limit) - Get validation timeline
memory_analyze_health(namespace, include_contradictions) - Analyze memory system health (includes contradiction detection)
memory_backfill_edges(namespace, batch_size, max_memories, dry_run) - Backfill orphan memory graph edges
memory_reclassify(namespace, limit) - Re-enqueue memories for retroactive supersession/contradiction classification
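
MCP clients invoke these tools with JSON-RPC `tools/call` requests. The sketch below only builds the request messages for a store, apply, outcome sequence so the shape is visible; it does not talk to a server. The tool and parameter names come from the list above, while the `tool_call` helper, the argument values, and the `mem_abc123` ID are placeholders chosen for illustration.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name: str, **arguments) -> str:
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Store a memory, record that it was applied, then record the outcome.
store = tool_call(
    "memory_store",
    content="We chose PostgreSQL for the database",
    memory_type="decision",  # assumed type label
    namespace="global",
)
applied = tool_call("memory_apply", memory_id="mem_abc123", context="choosing a database")
outcome = tool_call("memory_outcome", memory_id="mem_abc123", success=True)

for message in (store, applied, outcome):
    print(message)
```

In normal use Claude Code or another MCP client sends these requests for you; the sketch is mainly useful for understanding what the apply and outcome pair looks like on the wire.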

Trace Tools

trace_query(file, line) - Query AI attribution for code via git blame
trace_list(conversation_url, limit) - List recorded traces

See docs/API.md for complete API specifications.


Claude Code Skills

Theo provides Claude Code skills for convenient CLI access:

Skill	Description	Example
`/store`	Store new memories	`/store Always use TypeScript --type=pattern`
`/recall`	Recall memories via semantic search	`/recall coding preferences --expand`
`/forget`	Delete memories by ID or query	`/forget mem_abc123`
`/list`	Browse memories with pagination	`/list --type=preference --limit=10`
`/relate`	Manage memory relationships	`/relate mem_a supersedes mem_b`
`/stats`	Show memory statistics	`/stats`
`/validate`	TRY-LEARN validation cycle	`/validate apply mem_abc123 "testing"`
`/context`	Get formatted context for LLM injection	`/context authentication --budget 2000`
`/health`	Analyze memory system health (includes contradictions)	`/health --include-contradictions`
`/history`	View validation event timeline	`/history mem_abc123`
`/graph`	Visualize memory relationships	`/graph mem_abc123 --format mermaid`
`/contradictions`	Detect contradicting memories	`/contradictions --namespace project:theo`

Installing Skills

Skills are located in the skills/ directory. Copy them to your Claude Code skills folder:

cp -r skills/* ~/.claude/skills/


Development

Running Tests

# Run all tests
uv run pytest tests/ -v

# Run with coverage
uv run pytest tests/ -v --cov=src/theo --cov-report=html

# Run integration tests
uv run pytest tests/integration/ -v

Code Quality

# Format code
uv run black src/ tests/

# Lint code
uv run ruff check src/ tests/

# Type checking
uv run mypy src/

# Sort imports
uv run isort src/ tests/

Running the MCP Server

# Run with default configuration
uv run python -m theo

# Run with debug logging
uv run python -m theo --log-level DEBUG

# View all options
uv run python -m theo --help


Troubleshooting

Embedding Model Download Failed

Error: Failed to download embedding model or slow first-run

Solution:

Ensure Ollama is running: ollama serve
Pull the embedding model (falls back to the default if the variable is unset): ollama pull "${THEO_MLX_MODEL:-mxbai-embed-large}"
Verify model is available: ollama list

Ollama Connection Failed

Error: Failed to connect to Ollama

Solution:

Check if Ollama is running: ollama list
Start Ollama if needed: ollama serve
Pull the embedding model (falls back to the default if the variable is unset): ollama pull "${THEO_MLX_MODEL:-mxbai-embed-large}"

SQLite Permission Error

Error: Permission denied: ~/.theo/theo.db

Solution:

Create directory: mkdir -p ~/.theo
Fix permissions: chmod 755 ~/.theo
Or specify different path via THEO_SQLITE_PATH

Can't Delete Golden Rules

Issue: Memory deletion fails for high-confidence memories

Solution: Golden rules (confidence >= 0.9) are protected. Use force=true:

"Forget memory mem_123 with force"

SQLite FTS5 Issues

Error: Full-text search returns unexpected results or PRAGMA integrity_check shows warnings

Cause: The FTS5 index can become stale after crashes or improper shutdowns.

Solution:

# Connect to the Theo SQLite database (run this in your shell)
sqlite3 ~/.theo/theo.db

-- The remaining statements run at the sqlite> prompt
-- Rebuild the FTS5 index
INSERT INTO memories_fts(memories_fts) VALUES('rebuild');

-- Verify the fix
PRAGMA integrity_check;

-- Exit
.quit

Prevention:

Always shut down the daemon gracefully
After system crashes, run the FTS5 rebuild command above
If corruption persists, back up ~/.theo/theo.db, then delete it and re-index (deleting the file removes the memories stored in it)
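
The same rebuild can be scripted with Python's built-in `sqlite3` module, for example to run after a crash. The sketch below is a minimal version that assumes your Python's SQLite build includes FTS5; the `rebuild_fts` helper is hypothetical, and the demo uses an in-memory database with a stand-in schema rather than the real Theo database.

```python
import sqlite3


def rebuild_fts(conn: sqlite3.Connection, table: str = "memories_fts") -> str:
    """Rebuild an FTS5 index, then run FTS5's and SQLite's integrity checks."""
    conn.execute(f"INSERT INTO {table}({table}) VALUES('rebuild')")
    # FTS5's own check raises sqlite3.DatabaseError if the index is inconsistent.
    conn.execute(f"INSERT INTO {table}({table}) VALUES('integrity-check')")
    conn.commit()
    return conn.execute("PRAGMA integrity_check").fetchone()[0]


# Demo on an in-memory database; the real memories_fts schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE memories_fts USING fts5(content)")
conn.execute(
    "INSERT INTO memories_fts(content) VALUES ('prefer FastAPI for Python APIs')"
)
status = rebuild_fts(conn)
hits = conn.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH 'fastapi'"
).fetchall()
print(status, len(hits))
```

To use it against the real database, open `~/.theo/theo.db` (with the path expanded) instead of `:memory:`, ideally while the daemon is stopped.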


License

MIT License - see LICENSE file for details.

