9 changes: 7 additions & 2 deletions .gitignore
@@ -5,7 +5,8 @@ build/
dist/
wheels/
*.egg-info

.mcp.json
.osgrep
# Virtual environments
.venv

@@ -41,4 +42,8 @@ hindsight-docs/static/llms-full.txt
hindsight-dev/benchmarks/locomo/results/
hindsight-dev/benchmarks/longmemeval/results/
hindsight-cli/target
hindsight-clients/rust/target
hindsight-clients/rust/target
.claude
whats-next.md
TASK.md
CHANGELOG.md
61 changes: 46 additions & 15 deletions CLAUDE.md
@@ -4,7 +4,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co

## Project Overview

Hindsight is an agent memory system that provides long-term memory for AI agents using biomimetic data structures. It stores memories as World facts, Experiences, Opinions, and Observations across memory banks.
Hindsight is an agent memory system that provides long-term memory for AI agents using biomimetic data structures. Memories are organized as:
- **World facts**: General knowledge ("The sky is blue")
- **Experience facts**: Personal experiences ("I visited Paris in 2023")
- **Opinion facts**: Beliefs with confidence scores ("Paris is beautiful" - 0.9 confidence)
- **Observations**: Complex mental models derived from reflection
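
A rough sketch of how these four types might look as records (field names and shapes here are illustrative assumptions, not the actual schema):

```python
# Illustrative only: field names and shapes are assumptions, not Hindsight's real schema.
world_fact = {"type": "world", "content": "The sky is blue"}
experience_fact = {"type": "experience", "content": "I visited Paris in 2023"}
opinion_fact = {"type": "opinion", "content": "Paris is beautiful", "confidence": 0.9}
observation = {
    "type": "observation",
    "content": "Tends to favor European city trips",  # hypothetical insight derived by reflection
    "derived_from": ["experience", "opinion"],
}
```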

## Development Commands

@@ -13,14 +17,21 @@ Hindsight is an agent memory system that provides long-term memory for AI agents
# Start API server (loads .env automatically)
./scripts/dev/start-api.sh

# Run tests
# Run all tests (parallelized with pytest-xdist)
cd hindsight-api && uv run pytest tests/

# Run specific test file
cd hindsight-api && uv run pytest tests/test_http_api_integration.py -v

# Lint
# Run single test function
cd hindsight-api && uv run pytest tests/test_retain.py::test_retain_simple -v

# Lint and format
cd hindsight-api && uv run ruff check .
cd hindsight-api && uv run ruff format .

# Type checking (uses ty - extremely fast type checker from Astral)
cd hindsight-api && uv run ty check hindsight_api/
```

### Control Plane (Next.js)
@@ -37,7 +48,7 @@ cd hindsight-control-plane && npm run dev

### Generating Clients/OpenAPI
```bash
# Regenerate OpenAPI spec after API changes
# Regenerate OpenAPI spec after API changes (REQUIRED after changing endpoints)
./scripts/generate-openapi.sh

# Regenerate all client SDKs (Python, TypeScript, Rust)
@@ -57,27 +68,40 @@ cd hindsight-control-plane && npm run dev
- **hindsight-api/**: Core FastAPI server with memory engine (Python, uv)
- **hindsight/**: Embedded Python bundle (hindsight-all package)
- **hindsight-control-plane/**: Admin UI (Next.js, npm)
- **hindsight-cli/**: CLI tool (Rust, cargo)
- **hindsight-cli/**: CLI tool (Rust, cargo, uses progenitor for API client)
- **hindsight-clients/**: Generated SDK clients (Python, TypeScript, Rust)
- **hindsight-docs/**: Docusaurus documentation site
- **hindsight-integrations/**: Framework integrations (LiteLLM, OpenAI)
- **hindsight-dev/**: Development tools and benchmarks

### Core Engine (hindsight-api/hindsight_api/engine/)
- `memory_engine.py`: Main orchestrator for retain/recall/reflect operations
- `memory_engine.py`: Main orchestrator (~170KB) for retain/recall/reflect operations
- `llm_wrapper.py`: LLM abstraction supporting OpenAI, Anthropic, Gemini, Groq, Ollama, LM Studio
- `embeddings.py`: Embedding generation (local or TEI)
- `embeddings.py`: Embedding generation (local sentence-transformers or TEI)
- `cross_encoder.py`: Reranking (local or TEI)
- `entity_resolver.py`: Entity extraction and normalization
- `query_analyzer.py`: Query intent analysis
- `retain/`: Memory ingestion pipeline
- `search/`: Multi-strategy retrieval (semantic, BM25, graph, temporal)

**retain/**: Memory ingestion pipeline
- `orchestrator.py`: Coordinates the retain flow
- `fact_extraction.py`: LLM-based fact extraction from content
- `link_utils.py`: Entity link creation and management
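
As orientation, a minimal sketch of pushing content into the retain pipeline through the engine API (the `retain_batch_async` call and imports mirror the MCP server code in this PR; the bank id and wrapper function are hypothetical):

```python
from hindsight_api import MemoryEngine
from hindsight_api.models import RequestContext


async def store_example(memory: MemoryEngine) -> None:
    # Fact extraction, entity resolution, and link creation happen inside the pipeline.
    await memory.retain_batch_async(
        bank_id="demo-bank",  # hypothetical bank id
        contents=[{"content": "I visited Paris in 2023", "context": "travel"}],
        request_context=RequestContext(),
    )
```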

**search/**: Multi-strategy retrieval
- `retrieval.py`: Main retrieval orchestrator
- `graph_retrieval.py`: Entity/relationship graph traversal
- `mpfp_retrieval.py`: Multi-Path Fact Propagation retrieval
- `fusion.py`: Reciprocal rank fusion for combining results
- `reranking.py`: Cross-encoder reranking
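
The fusion step merges the ranked lists coming out of the parallel strategies. A minimal sketch of the general reciprocal rank fusion formula (an illustration of the technique, not the project's `fusion.py`):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists of ids using the standard RRF score sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Example: fuse semantic, BM25, and graph result lists (ids are made up).
fused = reciprocal_rank_fusion([["m1", "m2"], ["m2", "m3"], ["m2", "m1"]])
```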

### API Layer (hindsight-api/hindsight_api/api/)
FastAPI routers for all endpoints. Main operations:
- `http.py`: FastAPI HTTP routers (~80KB) for all REST endpoints
- `mcp.py`: Model Context Protocol server implementation

Main operations:
- **Retain**: Store memories, extracts facts/entities/relationships
- **Recall**: Retrieve memories via parallel search strategies + reranking
- **Reflect**: Deep analysis forming new opinions/observations
- **Recall**: Retrieve memories via 4 parallel strategies (semantic, BM25, graph, temporal) + reranking
- **Reflect**: Deep analysis forming new opinions/observations (disposition-aware)

### Database
PostgreSQL with pgvector. Schema managed via Alembic migrations in `hindsight-api/hindsight_api/alembic/`. Migrations run automatically on API startup.
@@ -94,20 +118,22 @@ Key tables: `banks`, `memory_units`, `documents`, `entities`, `entity_links`
This runs the same checks as the pre-commit hook (Ruff for Python, ESLint/Prettier for TypeScript).

### Memory Banks
- Each bank is isolated (no cross-bank data access)
- Each bank is an isolated memory store (like a "brain" for one user/agent)
- Banks have dispositions (skepticism, literalism, empathy traits 1-5) affecting reflect
- Banks can have background context
- Bank isolation is strict - no cross-bank data leakage
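
For reference, a bank profile with the default disposition used by the MCP server code in this PR might look like the following (the plain-dict shape is just for illustration):

```python
# Illustrative bank profile; disposition defaults match the MCP server code, each trait rated 1-5.
bank_profile = {
    "bank_id": "agent-1",  # hypothetical id
    "name": "Agent 1 memory",
    "disposition": {"skepticism": 3, "literalism": 3, "empathy": 3},
    "background": "Long-term memory for a hypothetical support agent",
}
```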

### API Design
- All endpoints operate on a single bank per request
- Multi-bank queries are client responsibility
- Multi-bank queries are client responsibility to orchestrate
- Disposition traits only affect reflect, not recall

### Python Style
- Python 3.11+, type hints required
- Async throughout (asyncpg, async FastAPI)
- Pydantic models for request/response
- Ruff for linting (line-length 120)
- No Python files at project root - maintain clean directory structure

### TypeScript Style
- Next.js App Router for control plane
@@ -145,11 +171,16 @@ cp .env.example .env
# Python deps
uv sync --directory hindsight-api/

# Node deps (workspace)
# Node deps (uses npm workspaces)
npm install
```

Required env vars:
- `HINDSIGHT_API_LLM_PROVIDER`: openai, anthropic, gemini, groq, ollama, lmstudio
- `HINDSIGHT_API_LLM_API_KEY`: Your API key
- `HINDSIGHT_API_LLM_MODEL`: Model name (e.g., o3-mini, claude-sonnet-4-20250514)

Optional (uses local models by default):
- `HINDSIGHT_API_EMBEDDINGS_PROVIDER`: local (default) or tei
- `HINDSIGHT_API_RERANKER_PROVIDER`: local (default) or tei
- `HINDSIGHT_API_DATABASE_URL`: External PostgreSQL (uses embedded pg0 by default)
103 changes: 22 additions & 81 deletions hindsight-api/hindsight_api/api/mcp.py
@@ -8,7 +8,6 @@
from fastmcp import FastMCP

from hindsight_api import MemoryEngine
from hindsight_api.api.http import BankListItem, BankListResponse, BankProfileResponse, DispositionTraits
from hindsight_api.engine.response_models import VALID_RECALL_FACT_TYPES
from hindsight_api.models import RequestContext

@@ -54,7 +53,12 @@ def create_mcp_server(memory: MemoryEngine) -> FastMCP:
    mcp = FastMCP("hindsight-mcp-server", stateless_http=True)

    @mcp.tool()
    async def retain(content: str, context: str = "general", bank_id: str | None = None) -> str:
    async def retain(
        content: str,
        context: str = "general",
        async_processing: bool = True,
        bank_id: str | None = None,
    ) -> str:
        """
        Store important information to long-term memory.

@@ -70,18 +74,28 @@ async def retain(content: str, context: str = "general", bank_id: str | None = N
        Args:
            content: The fact/memory to store (be specific and include relevant details)
            context: Category for the memory (e.g., 'preferences', 'work', 'hobbies', 'family'). Default: 'general'
            async_processing: If True, queue for background processing and return immediately. If False, wait for completion. Default: True
            bank_id: Optional bank to store in (defaults to session bank). Use for cross-bank operations.
        """
        try:
            target_bank = bank_id or get_current_bank_id()
            if target_bank is None:
                return "Error: No bank_id configured"
            await memory.retain_batch_async(
                bank_id=target_bank,
                contents=[{"content": content, "context": context}],
                request_context=RequestContext(),
            )
            return f"Memory stored successfully in bank '{target_bank}'"
            contents = [{"content": content, "context": context}]
            if async_processing:
                # Queue for background processing and return immediately
                result = await memory.submit_async_retain(
                    bank_id=target_bank, contents=contents, request_context=RequestContext()
                )
                return f"Memory queued for background processing (operation_id: {result.get('operation_id', 'N/A')})"
            else:
                # Wait for completion
                await memory.retain_batch_async(
                    bank_id=target_bank,
                    contents=contents,
                    request_context=RequestContext(),
                )
                return f"Memory stored successfully in bank '{target_bank}'"
        except Exception as e:
            logger.error(f"Error storing memory: {e}", exc_info=True)
            return f"Error: {str(e)}"
@@ -173,79 +187,6 @@ async def reflect(query: str, context: str | None = None, budget: str = "low", b
            logger.error(f"Error reflecting: {e}", exc_info=True)
            return f'{{"error": "{e}", "text": ""}}'

    @mcp.tool()
    async def list_banks() -> str:
        """
        List all available memory banks.

        Use this to discover banks for orchestration or to find
        the correct bank_id for cross-bank operations.

        Returns:
            JSON object with banks array containing bank_id, name, disposition, background, and timestamps
        """
        try:
            banks = await memory.list_banks(request_context=RequestContext())
            bank_items = [
                BankListItem(
                    bank_id=b.get("bank_id") or b.get("id"),
                    name=b.get("name"),
                    disposition=DispositionTraits(
                        **b.get("disposition", {"skepticism": 3, "literalism": 3, "empathy": 3})
                    ),
                    background=b.get("background"),
                    created_at=str(b.get("created_at")) if b.get("created_at") else None,
                    updated_at=str(b.get("updated_at")) if b.get("updated_at") else None,
                )
                for b in banks
            ]
            return BankListResponse(banks=bank_items).model_dump_json(indent=2)
        except Exception as e:
            logger.error(f"Error listing banks: {e}", exc_info=True)
            return f'{{"error": "{e}", "banks": []}}'

    @mcp.tool()
    async def create_bank(bank_id: str, name: str | None = None, background: str | None = None) -> str:
        """
        Create or update a memory bank.

        Use this to create new banks for different agents, sessions, or purposes.
        Banks are isolated memory stores - each bank has its own memories and personality.

        Args:
            bank_id: Unique identifier for the bank (e.g., 'orchestrator-memory', 'agent-1')
            name: Human-readable name for the bank
            background: Context about what this bank stores or its purpose
        """
        try:
            # Get or create the bank profile (auto-creates with defaults)
            await memory.get_bank_profile(bank_id, request_context=RequestContext())

            # Update name and/or background if provided
            if name is not None or background is not None:
                await memory.update_bank(bank_id, name=name, background=background, request_context=RequestContext())

            # Get final profile and return using BankProfileResponse model
            profile = await memory.get_bank_profile(bank_id, request_context=RequestContext())
            disposition = profile.get("disposition")
            if hasattr(disposition, "model_dump"):
                disposition_traits = DispositionTraits(**disposition.model_dump())
            else:
                disposition_traits = DispositionTraits(
                    **dict(disposition or {"skepticism": 3, "literalism": 3, "empathy": 3})
                )

            response = BankProfileResponse(
                bank_id=bank_id,
                name=profile.get("name") or "",
                disposition=disposition_traits,
                background=profile.get("background") or "",
            )
            return response.model_dump_json(indent=2)
        except Exception as e:
            logger.error(f"Error creating bank: {e}", exc_info=True)
            return json.dumps({"error": str(e)})

    return mcp

