diff --git a/.dockerignore b/.dockerignore
index 1dfbd45f..ea003441 100644
--- a/.dockerignore
+++ b/.dockerignore
@@ -30,3 +30,7 @@ Thumbs.db
 **/coverage
 **/.pytest_cache
 **/.mypy_cache
+
+# Large documentation files
+*.pdf
+foundry*.md
diff --git a/.env.example b/.env.example
index 33df2781..723e45b3 100644
--- a/.env.example
+++ b/.env.example
@@ -4,7 +4,7 @@
 # LLM Configuration (Required)
 HINDSIGHT_API_LLM_PROVIDER=openai
 HINDSIGHT_API_LLM_API_KEY=your-api-key-here
-HINDSIGHT_API_LLM_MODEL=o3-mini
+HINDSIGHT_API_LLM_MODEL=gpt-5.2-chat
 HINDSIGHT_API_LLM_BASE_URL=https://api.openai.com/v1
 
 # API Configuration (Optional)
diff --git a/AGENTS.md b/AGENTS.md
index b4c7a9e8..79b32df4 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,151 +1,593 @@
-# AGENTS.md
+# Hindsight Agent: Azure AI Foundry Operations Manual
 
-This document captures architectural decisions and coding conventions for the Hindsight project.
+This document serves as the **Operational Manual** for the Hindsight Agent deployed on Azure AI Foundry. It covers cloud resources, identity management, deployment workflows, API patterns, and troubleshooting.
 
-## Documentation
+---
 
-- **Main documentation**: [hindsight-docs/docs/developer/](./hindsight-docs/docs/developer/)
-- **Use case patterns**: [hindsight-docs/docs/cookbook/](./hindsight-docs/docs/cookbook/)
-- **API reference**: Auto-generated from OpenAPI spec
+## 📋 Table of Contents
 
-## Project Structure
+1. [Resource Inventory](#-resource-inventory)
+2. [Architecture Overview](#-architecture-overview)
+3. [Remote Agent API](#-remote-agent-api)
+4. [Local Development](#-local-development)
+5. [Deployment Guide](#-deployment-guide)
+6. [Authentication & Identity](#-authentication--identity)
+7. [Configuration Management](#-configuration-management)
+8. [Agent Modification Workflow](#-agent-modification-workflow)
+9. [API Patterns & SDK Usage](#-api-patterns--sdk-usage)
+10. [Memory Banks](#-memory-banks)
+11. [Troubleshooting](#-troubleshooting)
+
+---
+
+## ☁️ Resource Inventory
+
+### Core Azure Resources
+
+| Component | Azure Resource Name / ID | Purpose |
+|-----------|-------------------------|---------|
+| **AI Project** | `jacob-1216` | Container for the Agent Service in Azure AI Foundry |
+| **Project Endpoint** | `https://jacob-1216-resource.services.ai.azure.com/api/projects/jacob-1216` | Data Plane URL for SDK connectivity |
+| **AI Resource** | `jacob-1216-resource` | Cognitive Services resource hosting models |
+| **Managed Identity** | `267bc722-69a7-4bca-9196-9e8133094d37` | System identity for agent services |
+| **Location** | `centralus` | Primary Azure region |
+
+### Compute & Container Resources
+
+| Component | Resource Name | URL | Purpose |
+|-----------|--------------|-----|---------|
+| **Memory API** | `hindsight-api` | `https://hindsight-api.politebay-1635b4f9.centralus.azurecontainerapps.io` | Core memory storage/retrieval (retain, recall, reflect) |
+| **Agent API** | `hindsight-agent-api` | `https://hindsight-agent-api.jollyforest-7224b47b.centralus.azurecontainerapps.io` | Remote HTTP access to Hindsight agent |
+| **Container Registry** | `hindsightacrrxuumag3f5ln6` | `hindsightacrrxuumag3f5ln6.azurecr.io` | Docker image storage |
+| **Container Environment** | `hindsight-env` | - | Container Apps managed environment |
+| **Log Analytics** | `hindsight-logs` | - | Centralized logging |
+
+### Model Deployments
+
+| Deployment Name | Model | Status | Notes |
+|-----------------|-------|--------|-------|
+| `gpt-4.1` | GPT-4.1 | ✅ Active | Available for use |
+| `gpt-5.2-chat` | GPT-5.2 | ✅ Active | **Used by Hindsight-v3 agent** |
+| `text-embedding-3-small` | text-embedding-3-small | ✅ Active | Embedding model |
+| `text-embedding-3-large` | text-embedding-3-large | ✅ Active | Embedding model |
+| `claude-opus-4-5` | Claude Opus 4.5 | ✅ Active | Anthropic model |
+
+> **Note**: `gpt-4o` deployment does NOT exist. The agent uses `gpt-5.2-chat`.
+
+### Configuration Store
+
+| Component | Endpoint | Auth Method |
+|-----------|----------|-------------|
+| **App Configuration** | `https://hindsightapp.azconfig.io` | Entra ID (AAD) or Connection String |
+
+---
+
+## 🏗️ Architecture Overview
 ```
-hindsight/                # Python package for embedded usage
-hindsight-api/            # FastAPI server (core memory engine)
-hindsight-cli/            # Rust CLI client
-hindsight-control-plane/  # Next.js admin UI
-hindsight-docs/           # Docusaurus documentation site
-hindsight-dev/            # Development tools and benchmarks
-hindsight-integrations/   # Framework integrations (LangChain, etc.)
-hindsight-clients/        # Generated API clients (Python, TypeScript, Rust)
+┌─────────────────────────────────────────────────────────────────────┐
+│                          Azure AI Foundry                           │
+│  ┌─────────────────────────────────────────────────────────────┐    │
+│  │  Project: jacob-1216                                        │    │
+│  │  ┌─────────────────┐    ┌─────────────────────────────────┐ │    │
+│  │  │ Agent:          │    │ Model Deployment:               │ │    │
+│  │  │ Hindsight-v3    │───▶│ gpt-5.2-chat                    │ │    │
+│  │  │ (agent_ref)     │    │ (24 OpenAPI endpoints)          │ │    │
+│  │  └─────────────────┘    └─────────────────────────────────┘ │    │
+│  └─────────────────────────────────────────────────────────────┘    │
+└─────────────────────────────────────────────────────────────────────┘
+                                    │
+                                    │ Responses API
+                                    │ (agent_reference pattern)
+                                    ▼
+┌─────────────────────────────────────────────────────────────────────┐
+│                        Azure Container Apps                         │
+│                                                                     │
+│  ┌────────────────────────┐      ┌────────────────────────────┐     │
+│  │ hindsight-agent-api    │      │ hindsight-api              │     │
+│  │ (FastAPI)              │─────▶│ (Memory System)            │     │
+│  │                        │      │                            │     │
+│  │ POST /chat             │      │ POST /memories             │     │
+│  │ GET /health            │      │ POST /memories/recall      │     │
+│  └────────────────────────┘      │ POST /memories/reflect     │     │
+│                                  └────────────────────────────┘     │
+└─────────────────────────────────────────────────────────────────────┘
 ```
-## Core Concepts
+### Data Flow
+
+1. **Client** → `POST /chat` → **hindsight-agent-api**
+2. **Agent API** → `responses.create(agent_reference)` → **Azure AI Foundry**
+3. **Foundry** returns tool calls (retain/recall/reflect)
+4. **Agent API** → executes tools via **HindsightClient** → **hindsight-api**
+5. **Agent API** → submits tool outputs → **Foundry** → generates response
+6. **Client** ← receives response with tool call details
+
+---
+
+## 🤖 Remote Agent API
+
+The `hindsight-agent-api` provides remote HTTP access to the Hindsight agent.
+
+### Endpoints
+
+| Method | Path | Description |
+|--------|------|-------------|
+| `POST` | `/chat` | Send a message, get agent response with tool calls |
+| `GET` | `/health` | Health check with status and configuration |
+| `GET` | `/docs` | Interactive OpenAPI documentation |
+| `GET` | `/` | API info and available endpoints |
+
+### Request/Response Format
+
+**POST /chat**
+```json
+// Request
+{
+  "message": "What do you know about me?",
+  "conversation_id": null
+}
+
+// Response
+{
+  "response": "Based on my memories, your name is Jacob. You live in Seattle and work at Microsoft...",
+  "conversation_id": null,
+  "tool_calls": [
+    {
+      "name": "recall",
+      "arguments": {"query": "user profile preferences", "bank_id": "user_preferences"},
+      "result_preview": "{\"results\": [{\"id\": \"...\", \"text\": \"User's name is Jacob...\"}]}"
+    }
+  ]
+}
+```
+
+**GET /health**
+```json
+{
+  "status": "healthy",
+  "agent": "Hindsight-v3",
+  "project_endpoint": "https://jacob-1216-resource.services.ai.azure.com/api/projects/jacob-1216"
+}
+```
+
+### Quick Test (PowerShell)
+
+```powershell
+# Health check
+Invoke-RestMethod -Uri "https://hindsight-agent-api.jollyforest-7224b47b.centralus.azurecontainerapps.io/health"
+
+# Chat
+$body = @{message = "What do you know about me?"} | ConvertTo-Json
+Invoke-RestMethod -Uri "https://hindsight-agent-api.jollyforest-7224b47b.centralus.azurecontainerapps.io/chat" `
+    -Method POST -Body $body -ContentType "application/json"
+```
+
+---
+
+## 💻 Local Development
+
+### Prerequisites
+
+- Python 3.12+
+- Azure CLI (`az login` required)
+- Access to `jacob-1216` AI Project
+
+### Setup
+
+```powershell
+# Clone and setup
+cd hindsight
+pip install -r requirements-agent-api.txt
+
+# Login to Azure (required for local credential)
+az login
+az account set --subscription ""
+
+# Run locally
+python hindsight_agent_api.py
+```
+
+### Local Testing
+
+```powershell
+# Test local health
+Invoke-RestMethod -Uri "http://localhost:8080/health"
+
+# Test local chat
+$body = @{message = "Hello!"} | ConvertTo-Json
+Invoke-RestMethod -Uri "http://localhost:8080/chat" -Method POST -Body $body -ContentType "application/json"
+```
+
+### Interactive CLI Mode
+
+```powershell
+# Run interactive agent locally
+python hindsight_agent.py --interactive
+
+# Single message
+python hindsight_agent.py --message "What do you remember about me?"
+```
+
+---
+
+## 🚀 Deployment Guide
+
+### Prerequisites
+
+- Azure CLI authenticated (`az login`)
+- Contributor role on `hindsight-rg` resource group
+- Docker (optional, for local builds)
+
+### Deploy with Bicep (Recommended)
+
+```powershell
+# Full deployment (creates/updates all resources)
+.\deploy-bicep.ps1 -ResourceGroup hindsight-rg -Location centralus
+
+# Deploy with specific image tag
+.\deploy-bicep.ps1 -ResourceGroup hindsight-rg -ImageTag v1.0.0
+```
+
+### What the Deployment Does
+
+1. **Creates/Updates Infrastructure** (via Bicep):
+   - Log Analytics Workspace
+   - Container Registry (ACR)
+   - Container Apps Environment
+   - Container App with managed identity
+
+2. **Builds Container Image**:
+   - Uses ACR Tasks (`az acr build`)
+   - Pushes to `hindsightacrrxuumag3f5ln6.azurecr.io/hindsight-agent-api:latest`
+
+3. **Configures RBAC**:
+   - Assigns `Cognitive Services User` role to Container App's managed identity
+   - Scoped to `jacob-1216-resource` AI resource
 
-### Memory Banks
-- Each bank is an isolated memory store (like a "brain" for one user/agent)
-- Banks contain: memory units (facts), entities, documents, entity links
-- Banks have a **disposition** (personality traits) and **background** (context)
-- Bank isolation is strict - no cross-bank data leakage
+### Manual Container Build (Alternative)
 
-### Memory Types
-- **World facts**: General knowledge ("The sky is blue")
-- **Experience facts**: Personal experiences ("I visited Paris in 2023")
-- **Opinion facts**: Beliefs with confidence scores ("Paris is beautiful" - 0.9 confidence)
+```powershell
+# Build locally
+docker build -t hindsight-agent-api:latest -f Dockerfile.agent-api .
 
-### Operations
-- **Retain**: Store new memories (extracts facts, entities, relationships)
-- **Recall**: Retrieve memories (semantic, BM25, graph, temporal search)
-- **Reflect**: Deep analysis to form new insights/opinions
+# Tag for ACR
+docker tag hindsight-agent-api:latest hindsightacrrxuumag3f5ln6.azurecr.io/hindsight-agent-api:latest
 
-## API Design Decisions
+# Push to ACR
+az acr login --name hindsightacrrxuumag3f5ln6
+docker push hindsightacrrxuumag3f5ln6.azurecr.io/hindsight-agent-api:latest
 
-### Single Bank Per Request
-- All API endpoints (`recall`, `reflect`, `retain`) operate on a single bank
-- Multi-bank queries are the **client/agent's responsibility** to orchestrate
-- This keeps the API simple and the isolation model clear
+# Update Container App
+az containerapp update --name hindsight-agent-api --resource-group hindsight-rg `
+    --image hindsightacrrxuumag3f5ln6.azurecr.io/hindsight-agent-api:latest
+```
 
-### Disposition Traits (3-trait system)
-- **Skepticism** (1-5): How skeptical vs trusting when forming opinions
-- **Literalism** (1-5): How literally to interpret information
-- **Empathy** (1-5): How much to consider emotional context
-- These influence the `reflect` operation, not `recall`
-- Background info also only affects `reflect` (opinion formation)
+### Verify Deployment
 
-## Multi-Bank Architecture Patterns
+```powershell
+# Check container status
+az containerapp show --name hindsight-agent-api --resource-group hindsight-rg `
+    --query "{fqdn:properties.configuration.ingress.fqdn, status:properties.provisioningState, revision:properties.latestRevisionName}"
 
-See [hindsight-docs/docs/cookbook/](./hindsight-docs/docs/cookbook/) for detailed guides:
+# View logs
+az containerapp logs show --name hindsight-agent-api --resource-group hindsight-rg --type console --tail 50
 
-- **Per-User Memory**: One bank per user, simplest pattern
-- **Support Agent + Shared Knowledge**: User bank + shared docs bank, client orchestrates
+# Test deployed API
+Invoke-RestMethod -Uri "https://hindsight-agent-api.jollyforest-7224b47b.centralus.azurecontainerapps.io/health"
+```
 
-## Developer Guide
+---
+
+## 🔐 Authentication & Identity
+
+### Authentication Model by Environment
+
+| Environment | Credential Type | How to Setup |
+|-------------|----------------|--------------|
+| **Local Development** | `AzureCliCredential` | Run `az login` and select correct subscription |
+| **Azure Container Apps** | `ManagedIdentityCredential` | Automatic via system-assigned identity |
+| **CI/CD Pipelines** | `DefaultAzureCredential` | Use federated credentials or service principal |
+
+### Required RBAC Roles
+
+| Identity | Role | Scope | Purpose |
+|----------|------|-------|---------|
+| Your User | `Azure AI Developer` | `jacob-1216` project | Local development, agent creation |
+| Container App MI | `Cognitive Services User` | `jacob-1216-resource` | Production API calls |
+
+### Credential Selection Logic
+
+```python
+def get_credential():
+    """Get the appropriate credential based on environment."""
+    is_azure = any([
+        os.environ.get("AZURE_FUNCTIONS_ENVIRONMENT"),
+        os.environ.get("CONTAINER_APP_NAME"),
+        os.environ.get("MSI_ENDPOINT"),
+        os.environ.get("IDENTITY_ENDPOINT")
+    ])
+
+    if is_azure:
+        return ManagedIdentityCredential()
+
+    # Local: prefer CLI credential (has your user permissions)
+    return AzureCliCredential()
+```
 
-### Running the API Server
+### Verify Permissions
 
-```bash
-# From project root
-./scripts/dev/start-api.sh
+```powershell
+# Check your role assignments on AI Project
+az role assignment list --assignee $(az ad signed-in-user show --query id -o tsv) `
+    --scope "/subscriptions//resourceGroups/jacob-1216-resource/providers/Microsoft.CognitiveServices/accounts/jacob-1216-resource"
 
-# With options
-./scripts/dev/start-api.sh --reload --port 8888 --log-level debug
+# Check Container App managed identity
+az containerapp identity show --name hindsight-agent-api --resource-group hindsight-rg
 ```
-### Running Tests
+---
+
+## ⚙️ Configuration Management
+
+### Environment Variables
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `HINDSIGHT_PROJECT_ENDPOINT` | `https://jacob-1216-resource...` | Foundry project data plane URL |
+| `HINDSIGHT_MODEL_DEPLOYMENT_NAME` | `gpt-4o` | Model deployment to use |
+| `HINDSIGHT_MCP_BASE_URL` | `https://hindsight-api...` | Memory API base URL |
+| `HINDSIGHT_DEFAULT_BANK_ID` | `hindsight_agent_bank` | Default memory bank |
+| `HINDSIGHT_AGENT_NAME` | `Hindsight` | Agent display name |
+
+### Azure App Configuration
 
-```bash
-# API tests
-cd hindsight-api
-uv run pytest tests/
+Keys stored in App Configuration (prefix: `Hindsight:`):
 
-# Specific test
-uv run pytest tests/test_http_api_integration.py -v
+- `Hindsight:ProjectEndpoint`
+- `Hindsight:ModelDeploymentName`
+- `Hindsight:McpBaseUrl`
+- `Hindsight:DefaultBankId`
+- `Hindsight:AgentName`
+
+### Configuration Priority
+
+1. Environment variables (highest priority)
+2. Azure App Configuration
+3. Hardcoded defaults
+
+### Override Model at Runtime
+
+```powershell
+# Local override
+$env:HINDSIGHT_MODEL_DEPLOYMENT_NAME = "gpt-5.2-chat"
+python hindsight_agent.py --interactive
+
+# Container App override
+az containerapp update --name hindsight-agent-api --resource-group hindsight-rg `
+    --set-env-vars "HINDSIGHT_MODEL_DEPLOYMENT_NAME=gpt-5.2-chat"
 ```
-### Generating OpenAPI Spec
+---
+
+## 🔧 Agent Modification Workflow
 
-After changing API endpoints, regenerate the OpenAPI spec and docs:
+### Agent Tool Types
 
-```bash
-./scripts/generate-openapi.sh
+The Hindsight agent uses **OpenAPI tools** which allow it to work from:
+1. **Azure AI Foundry portal/playground** - tools execute server-side
+2. **Python code (hindsight_agent.py)** - tools execute server-side via Foundry
+3. **Container App (hindsight-agent-api)** - tools execute server-side via Foundry
+
+### Updating Agent Tools
+
+To update the agent's OpenAPI tools:
+
+```powershell
+# Edit hindsight-tools-openapi.json with new endpoints
+# Then run:
+python update_agent_openapi.py
 ```
-This will:
-1. Generate `openapi.json` at project root
-2. Copy to `hindsight-docs/openapi.json`
-3. Regenerate API reference documentation
+This creates a new agent version with the updated tools.
+
+### Updating System Prompt
+
+1. **Edit Instructions**: Modify `AGENT_INSTRUCTIONS` in `update_agent_openapi.py`
+2. **Run Update Script**: `python update_agent_openapi.py`
+3. **Test**: Verify in Foundry portal or via `/chat` endpoint
+
+### Switching Models
 
-### Generating API Clients
+1. **Verify Deployment Exists**: Check Azure AI Studio for available model deployments
+2. **Update Configuration**:
+   - App Configuration: `Hindsight:ModelDeploymentName`
+   - Or environment variable: `HINDSIGHT_MODEL_DEPLOYMENT_NAME`
+3. **Restart Container App** (if needed):
+   ```powershell
+   az containerapp revision restart --name hindsight-agent-api --resource-group hindsight-rg `
+       --revision $(az containerapp show --name hindsight-agent-api --resource-group hindsight-rg --query properties.latestRevisionName -o tsv)
+   ```
 
-After updating the OpenAPI spec, regenerate all clients:
+### Adding New Tools
 
-```bash
-./scripts/generate-clients.sh
+1. **Add endpoint to OpenAPI spec** (`hindsight-tools-openapi.json`)
+2. **Update agent**: Run `python update_agent_openapi.py`
+3. **Test in Foundry portal** or via code
+
+---
+
+## 🔌 API Patterns & SDK Usage
+
+### OpenAPI Tools Architecture
+
+The agent uses **OpenAPI tools** that call the `hindsight-api` directly:
+
+```
+┌──────────────────┐     ┌─────────────────────┐     ┌────────────────────┐
+│ Foundry Portal   │     │ Azure AI Foundry    │     │ hindsight-api      │
+│ or Python Code   │────▶│ (executes tools)    │────▶│ (memory storage)   │
+└──────────────────┘     └─────────────────────┘     └────────────────────┘
 ```
-This generates:
-- **Rust client**: `hindsight-clients/rust/` (via progenitor in build.rs)
-- **Python client**: `hindsight-clients/python/` (via openapi-generator Docker)
-- **TypeScript client**: `hindsight-clients/typescript/` (via @hey-api/openapi-ts)
+**Why OpenAPI tools?**
+- Tools execute server-side by Foundry
+- Works from portal/playground without client code
+- Same behavior whether accessed from portal or code
+
+### Foundry Responses API (Correct Pattern)
+
+The agent uses the **Responses API** with `agent_reference` pattern. Note that temperature/top_p are not supported in the Responses API.
+
+```python
+from azure.ai.projects import AIProjectClient
+
+client = AIProjectClient(credential=credential, endpoint=project_endpoint)
+openai_client = client.get_openai_client()
+
+# Create response using agent reference (name only, no version needed)
+response = openai_client.responses.create(
+    extra_body={"agent": {"name": "Hindsight-v3", "type": "agent_reference"}},
+    input=user_input,
+)
+
+# Handle tool calls (always execute client-side)
+for tool_call in response.output:
+    if tool_call.type == 'function_call':
+        result = execute_tool(tool_call.name, tool_call.arguments)
+        tool_outputs.append({
+            "type": "function_call_output",
+            "call_id": tool_call.call_id,
+            "output": result
+        })
+
+# Continue with tool outputs
+response = openai_client.responses.create(
+    extra_body={"agent": {"name": "Hindsight-v3", "type": "agent_reference"}},
+    input=tool_outputs,
+    previous_response_id=response.id,
+)
+```
+
+### Key API Insights
+
+| Pattern | Status | Notes |
+|---------|--------|-------|
+| `responses.create()` with `agent_reference` | ✅ Working | Correct pattern for Foundry agents |
+| `agent_reference` with name only | ✅ Required | Do NOT include version in agent_reference |
+| `chat.completions.create()` | ❌ Deprecated | Use Responses API instead |
+| `temperature`, `top_p` in responses | ❌ Not supported | These parameters don't exist in Responses API |
+| Tools in `extra_body` with `agent_reference` | ❌ Fails | "tools not allowed when agent is specified" |
+| Client-side tool execution | ✅ Required | Server may return `completed` but still needs output |
+
+---
+
+## 🧠 Memory Banks
+
+### Available Banks
 
-Note: The maintained wrapper `hindsight_client.py` and `README.md` are preserved during regeneration.
+| Bank ID | Purpose | Usage |
+|---------|---------|-------|
+| `hindsight_agent_bank` | Default bank for agent memories | General storage |
+| `user_preferences` | User preferences and profile | Name, location, settings |
+| `project_context` | Project-related information | Work, tasks, milestones |
+| `knowledge_base` | Facts and knowledge | Technical info, references |
 
-### Running the Documentation Site
+### HindsightClient Operations
 
-```bash
-./scripts/dev/start-docs.sh
+```python
+from hindsight_client import HindsightClient
+
+client = HindsightClient(base_url, default_bank_id)
+
+# Store a memory
+client.retain(content="User prefers Python", context="preferences")
+
+# Search memories
+results = client.recall(query="user preferences", max_tokens=4096, budget="mid")
+
+# Synthesize/reflect
+reflection = client.reflect(query="What do I know about this user?")
+
+# Query specific bank
+client.recall(query="projects", bank_id="project_context")
 ```
-### Running the Control Plane
+---
+
+## 🐞 Troubleshooting
 
-```bash
-./scripts/dev/start-control-plane.sh
+### Common Errors
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| `401 Unauthorized` | Missing/invalid credentials | Local: `az login`. Cloud: Check MI role assignment |
+| `404 DeploymentNotFound` | Using `model=` instead of `agent_reference` | Use `extra_body={"agent": {...}}` pattern |
+| `tools not allowed when agent is specified` | Providing tools with agent_reference | Remove tools from API call; agent has tools defined in Foundry |
+| Tool call completes but no response | Not providing tool outputs to API | Always submit `function_call_output` even if status="completed" |
+| Container timeout | hindsight-api scaled to zero | Wake the API first with health check |
+
+### Debug Commands
+
+```powershell
+# Check container logs
+az containerapp logs show --name hindsight-agent-api --resource-group hindsight-rg --type console --tail 100
+
+# Check container status
+az containerapp show --name hindsight-agent-api --resource-group hindsight-rg --query properties.runningStatus
+
+# List revisions
+az containerapp revision list --name hindsight-agent-api --resource-group hindsight-rg --query "[].{name:name, active:properties.active, created:properties.createdTime}"
+
+# Check role assignments
+az role assignment list --scope "/subscriptions//resourceGroups/jacob-1216-resource/providers/Microsoft.CognitiveServices/accounts/jacob-1216-resource" --output table
 ```
-## Code Style
+### Health Checks
 
-### Python (hindsight-api)
-- Use `uv` for package management
-- Async throughout (asyncpg, async FastAPI endpoints)
-- Pydantic models for request/response validation
-- No py files at project root - maintain clean directory structure
+```powershell
+# Memory API health
+Invoke-RestMethod -Uri "https://hindsight-api.politebay-1635b4f9.centralus.azurecontainerapps.io/health"
 
-### TypeScript (control-plane, clients)
-- Next.js with App Router for control plane
-- Tailwind CSS with shadcn/ui components
+# Agent API health
+Invoke-RestMethod -Uri "https://hindsight-agent-api.jollyforest-7224b47b.centralus.azurecontainerapps.io/health"
 
-### Rust (CLI)
-- Async with tokio
-- reqwest for HTTP client
-- progenitor for API client generation
+# Full chat test
+$body = @{message = "Test: What do you know about me?"} | ConvertTo-Json
+Invoke-RestMethod -Uri "https://hindsight-agent-api.jollyforest-7224b47b.centralus.azurecontainerapps.io/chat" -Method POST -Body $body -ContentType "application/json"
+```
+
+---
+
+## 📁 File Structure
+
+```
+hindsight/
+├── AGENTS.md                          # This file - operations manual
+├── hindsight_agent.py                 # Main agent script (local/interactive)
+├── hindsight_agent_api.py             # FastAPI wrapper for remote access
+├── hindsight_client.py                # Memory API client (retain/recall/reflect)
+├── hindsight-tools-openapi-full.json  # Complete OpenAPI spec (24 endpoints)
+├── create_agent_full_spec.py          # Script to create/update agent with full spec
+├── openapi.json                       # Source OpenAPI spec from hindsight-api
+├── config.py                          # Configuration management
+├── Dockerfile.agent-api               # Container definition
+├── requirements-agent-api.txt         # Python dependencies
+├── deploy-bicep.ps1                   # Deployment script
+└── infra/
+    └── agent-api.bicep                # Infrastructure as Code
+```
 
-## Database
+---
 
-- PostgreSQL with pgvector extension
-- Schema managed via Alembic migrations in `hindsight-api/alembic/`, db migrations happen during api startup, no manual commands
-- Key tables: `banks`, `memory_units`, `documents`, `entities`, `entity_links`
+## 📝 Changelog
 
-# Branding
-## Colors
-- Primary: gradient from #0074d9 to #009296
+- **2025-12-20**: Upgraded to full OpenAPI spec (24 endpoints) with all admin functions
+- **2025-12-20**: Consolidated agent scripts to single `create_agent_full_spec.py`
+- **2025-12-20**: Agent Hindsight-v3:4 now has complete API access
+- **2025-12-19**: Added OpenAPI tools - agent now works from Foundry portal and code
+- **2025-12-19**: Fixed Responses API usage - removed version from agent_reference, removed temperature/top_p
+- **2025-12-19**: Updated model deployments documentation (gpt-5.2-chat is primary)
+- **2024-12-19**: Initial comprehensive documentation
+- **2024-12-19**: Deployed Agent API to Azure Container Apps
+- **2024-12-19**: Optimized agent to use Responses API with client-side tool execution
diff --git a/Dockerfile.agent-api b/Dockerfile.agent-api
new file mode 100644
index 00000000..2e472b58
--- /dev/null
+++ b/Dockerfile.agent-api
@@ -0,0 +1,43 @@
+# Hindsight Agent API - Azure Container Apps Deployment
+# Multi-stage build for smaller image
+
+FROM python:3.12-slim as builder
+
+WORKDIR /app
+
+# Install build dependencies
+RUN pip install --no-cache-dir --upgrade pip
+
+# Copy requirements and install
+COPY requirements-agent-api.txt .
+RUN pip install --no-cache-dir -r requirements-agent-api.txt
+
+# Production stage
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# Copy installed packages from builder
+COPY --from=builder /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
+COPY --from=builder /usr/local/bin /usr/local/bin
+
+# Copy application code
+COPY hindsight_agent_api.py .
+COPY hindsight_client.py .
+COPY config.py .
+
+# Create non-root user for security
+RUN useradd -m -u 1000 appuser && chown -R appuser:appuser /app
+USER appuser
+
+# Azure Container Apps expects port 8080 by default
+ENV PORT=8080
+ENV HOST=0.0.0.0
+
+EXPOSE 8080
+
+# Health check
+HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')" || exit 1
+
+CMD ["python", "hindsight_agent_api.py"]
diff --git a/config.py b/config.py
new file mode 100644
index 00000000..79d405d1
--- /dev/null
+++ b/config.py
@@ -0,0 +1,146 @@
+"""
+Centralized configuration module for Hindsight Foundry Agent.
+
+Loads configuration from Azure App Configuration with fallback to environment variables.
+""" +import os +from dataclasses import dataclass +from typing import Optional + +# Try to import Azure App Configuration provider +try: + from azure.appconfiguration.provider import load + # DefaultAzureCredential unused but keeping if needed for future AAD auth + HAS_APP_CONFIG = True +except ImportError: + HAS_APP_CONFIG = False + + +@dataclass +class HindsightConfig: + """Configuration for the Hindsight Foundry Agent.""" + project_endpoint: str + model_deployment_name: str + mcp_base_url: str + default_bank_id: str + agent_name: str + + @property + def mcp_url(self) -> str: + """Full MCP URL including bank_id path.""" + base = self.mcp_base_url.rstrip('/') + return f"{base}/mcp/{self.default_bank_id}/" + + +# Azure App Configuration endpoint and connection string +APP_CONFIG_ENDPOINT = "https://hindsightapp.azconfig.io" +# Use environment variable for secrets - never hardcode credentials +APP_CONFIG_CONNECTION_STRING = os.environ.get( + "AZURE_APP_CONFIG_CONNECTION_STRING", + "" # Must be set in environment for App Config to work +) + +# Prefix for configuration keys in App Configuration +CONFIG_PREFIX = "Hindsight:" + + +def _get_from_app_config() -> Optional[dict]: + """Load configuration from Azure App Configuration.""" + if not HAS_APP_CONFIG: + print("Azure App Configuration SDK not installed") + return None + + try: + from azure.appconfiguration.provider import SettingSelector + + # Load using connection string (access key auth) + if not APP_CONFIG_CONNECTION_STRING: + print("WARNING: AZURE_APP_CONFIG_CONNECTION_STRING is empty. 
App Config will fail.") + + config = load( + connection_string=APP_CONFIG_CONNECTION_STRING, + selects=[SettingSelector(key_filter=f"{CONFIG_PREFIX}*")], + trim_prefixes=[CONFIG_PREFIX], + ) + + if config: + print("Loaded configuration from Azure App Configuration") + return dict(config) + + except Exception as e: + print(f"WARNING: Could not load from App Configuration: {e}") + + return None + + +def _get_from_env() -> dict: + """Load configuration from environment variables.""" + return { + "ProjectEndpoint": os.environ.get( + "HINDSIGHT_PROJECT_ENDPOINT", + "https://jacob-1216-resource.services.ai.azure.com/api/projects/jacob-1216" + ), + # Use gpt-4o as default - verified working deployment + # Set HINDSIGHT_MODEL_DEPLOYMENT_NAME to override + "ModelDeploymentName": os.environ.get( + "HINDSIGHT_MODEL_DEPLOYMENT_NAME", + "gpt-5.2-chat" + ), + "McpBaseUrl": os.environ.get( + "HINDSIGHT_MCP_BASE_URL", + "https://hindsight-api.politebay-1635b4f9.centralus.azurecontainerapps.io" + ), + "DefaultBankId": os.environ.get( + "HINDSIGHT_DEFAULT_BANK_ID", + "hindsight_agent_bank" + ), + "AgentName": os.environ.get( + "HINDSIGHT_AGENT_NAME", + "Hindsight" + ), + } + + +def get_config() -> HindsightConfig: + """ + Get configuration from Azure App Configuration or environment variables. + + Priority: + 1. Azure App Configuration (if available) + 2. Environment variables + 3. 
Hardcoded defaults + """ + # Try App Configuration first + config_dict = _get_from_app_config() + + # Allow environment variable overrides + if config_dict: + if "HINDSIGHT_MODEL_DEPLOYMENT_NAME" in os.environ: + print(f"ℹ Overriding model with: {os.environ['HINDSIGHT_MODEL_DEPLOYMENT_NAME']}") + config_dict["ModelDeploymentName"] = os.environ["HINDSIGHT_MODEL_DEPLOYMENT_NAME"] + + # Fall back to environment variables + if not config_dict: + print("Using environment/default configuration") + config_dict = _get_from_env() + + return HindsightConfig( + project_endpoint=config_dict.get("ProjectEndpoint", ""), + model_deployment_name=config_dict.get("ModelDeploymentName", "gpt-5.2-chat"), + mcp_base_url=config_dict.get("McpBaseUrl", ""), + default_bank_id=config_dict.get("DefaultBankId", "hindsight_agent_bank"), + agent_name=config_dict.get("AgentName", "Hindsight"), + ) + + +if __name__ == "__main__": + # Test configuration loading + print("Testing configuration loading...") + config = get_config() + print(f"\nConfiguration loaded:") + print(f" Project Endpoint: {config.project_endpoint}") + print(f" Model: {config.model_deployment_name}") + print(f" MCP Base URL: {config.mcp_base_url}") + print(f" Default Bank ID: {config.default_bank_id}") + print(f" Agent Name: {config.agent_name}") + print(f" Full MCP URL: {config.mcp_url}") diff --git a/create_agent_full_spec.py b/create_agent_full_spec.py new file mode 100644 index 00000000..e8212df0 --- /dev/null +++ b/create_agent_full_spec.py @@ -0,0 +1,84 @@ +"""Create Hindsight agent with complete OpenAPI spec.""" +import json +from azure.identity import AzureCliCredential +from azure.ai.projects import AIProjectClient +from azure.ai.projects.models import ( + PromptAgentDefinition, + OpenApiAgentTool, + OpenApiFunctionDefinition, + OpenApiManagedAuthDetails, + OpenApiManagedSecurityScheme +) + +# Load full spec +with open('hindsight-tools-openapi-full.json', 'r') as f: + openapi_spec = json.load(f) + +print(f"Loaded spec 
with {len(openapi_spec['paths'])} endpoints") + +# Connect +credential = AzureCliCredential() +import os +project_endpoint = os.environ.get( + 'HINDSIGHT_PROJECT_ENDPOINT', + 'https://jacob-1216-resource.services.ai.azure.com/api/projects/jacob-1216' +) +client = AIProjectClient(credential=credential, endpoint=project_endpoint) + +# Create OpenAPI function definition +openapi_func_def = OpenApiFunctionDefinition( + name='hindsight_memory_api', + description='Complete Hindsight Memory API - memory storage, retrieval, reflection, bank management, entity management, documents, operations, and system monitoring', + spec=openapi_spec, + auth=OpenApiManagedAuthDetails( + security_scheme=OpenApiManagedSecurityScheme(audience='https://cognitiveservices.azure.com') + ) +) + +# Create OpenAPI tool +openapi_tool = OpenApiAgentTool(openapi=openapi_func_def) + +# Agent instructions +INSTRUCTIONS = '''You are Hindsight, an AI with persistent memory capabilities. + +## Core Memory Operations +- **retain**: Store new memories (facts, experiences, opinions) +- **recall**: Search and retrieve relevant memories using semantic search +- **reflect**: Synthesize memories into coherent understanding + +## Memory Types +- **world**: Facts about the external world +- **experience**: Personal experiences and events +- **opinion**: Beliefs, preferences, and judgments + +## Admin Capabilities +You can also manage the memory system: +- List, create, update, and delete memory banks +- View bank profiles and statistics +- List and manage memories within banks +- View entities and their observations +- Manage documents and chunks +- Monitor async operations +- Check system health and metrics + +## Behavior +1. Before answering questions about past conversations or user preferences, use recall +2. Store important information the user shares using retain +3. Use reflect to build comprehensive understanding of topics +4. When asked about system status, use health/metrics endpoints +5. 
Help users manage their memory banks when requested + +Always be helpful and use your memory capabilities proactively.''' + +# Create agent definition (name goes in create_version, not definition) +agent_def = PromptAgentDefinition( + model='gpt-5.2-chat', + instructions=INSTRUCTIONS, + tools=[openapi_tool] +) + +# Create new version +result = client.agents.create_version('Hindsight-v3', definition=agent_def) +print(f"Created agent: {result.name}:{result.version}") +print(f"Model: {result.definition.model}") +print(f"Tools: {len(result.definition.tools)}") diff --git a/deploy-bicep.ps1 b/deploy-bicep.ps1 new file mode 100644 index 00000000..81bd8b60 --- /dev/null +++ b/deploy-bicep.ps1 @@ -0,0 +1,95 @@ +#!/usr/bin/env pwsh +<# +.SYNOPSIS + Deploy Hindsight Agent API to Azure using Bicep + +.DESCRIPTION + This script deploys the Hindsight Agent API infrastructure using Bicep + and builds/pushes the container image to ACR. + +.PARAMETER ResourceGroup + The Azure resource group name (default: hindsight-rg) + +.PARAMETER Location + The Azure region (default: centralus) + +.EXAMPLE + .\deploy-bicep.ps1 + .\deploy-bicep.ps1 -ResourceGroup my-rg -Location eastus +#> + +param( + [string]$ResourceGroup = "hindsight-rg", + [string]$Location = "centralus", + [string]$ImageTag = "latest", + [string]$AiResourceGroup = "jacob-1216-resource", + [string]$AiResourceName = "jacob-1216-resource" +) + +$ErrorActionPreference = "Stop" +$ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path + +Write-Host "Deploying Hindsight Agent API with Bicep" -ForegroundColor Cyan +Write-Host " Resource Group: $ResourceGroup" +Write-Host " Location: $Location" + +# Check Azure CLI login +Write-Host "`nChecking Azure CLI authentication..." -ForegroundColor Yellow +$accountJson = az account show 2>$null +if (-not $accountJson) { + Write-Host "Not logged in to Azure CLI. Run 'az login' first." 
-ForegroundColor Red + exit 1 +} +$account = $accountJson | ConvertFrom-Json +Write-Host " Logged in as: $($account.user.name)" +Write-Host " Subscription: $($account.name)" + +# Create resource group if needed +Write-Host "`nEnsuring resource group exists..." -ForegroundColor Yellow +az group create --name $ResourceGroup --location $Location --output none 2>$null + +# Deploy Bicep template +Write-Host "`nDeploying infrastructure with Bicep..." -ForegroundColor Yellow +$deploymentJson = az deployment group create --resource-group $ResourceGroup --template-file "$ScriptDir/infra/agent-api.bicep" --parameters location=$Location imageTag=$ImageTag aiProjectResourceGroup=$AiResourceGroup aiResourceName=$AiResourceName --query "properties.outputs" --output json + +if (-not $deploymentJson) { + Write-Host "Bicep deployment failed" -ForegroundColor Red + exit 1 +} + +$deploymentOutput = $deploymentJson | ConvertFrom-Json + +$acrLoginServer = $deploymentOutput.acrLoginServer.value +$containerAppName = $deploymentOutput.containerAppName.value +$containerAppUrl = $deploymentOutput.containerAppUrl.value +$principalId = $deploymentOutput.principalId.value + +Write-Host " ACR: $acrLoginServer" +Write-Host " Container App: $containerAppName" +Write-Host " Principal ID: $principalId" + +# Build and push container image +Write-Host "`nBuilding and pushing container image..." -ForegroundColor Yellow +$acrName = $acrLoginServer.Split('.')[0] + +az acr build --registry $acrName --resource-group $ResourceGroup --image "${containerAppName}:$ImageTag" --file "$ScriptDir/Dockerfile.agent-api" $ScriptDir + +# Update container app to use new image (triggers deployment) +Write-Host "`nUpdating container app..." -ForegroundColor Yellow +az containerapp update --name $containerAppName --resource-group $ResourceGroup --image "$acrLoginServer/${containerAppName}:$ImageTag" --output none + +# Assign RBAC to AI Project (cross-resource-group) +Write-Host "`nAssigning RBAC to AI Project..." 
-ForegroundColor Yellow +$subscriptionId = $account.id +$aiResourceId = "/subscriptions/$subscriptionId/resourceGroups/$AiResourceGroup/providers/Microsoft.CognitiveServices/accounts/$AiResourceName" + +az role assignment create --assignee $principalId --role "Cognitive Services User" --scope $aiResourceId --output none 2>$null + +Write-Host "`nDeployment complete!" -ForegroundColor Green +Write-Host "" +Write-Host " API URL: $containerAppUrl" -ForegroundColor Cyan +Write-Host " API Docs: $containerAppUrl/docs" +Write-Host " Health: $containerAppUrl/health" +Write-Host "" +Write-Host "Test with:" -ForegroundColor Yellow +Write-Host "Invoke-RestMethod -Uri '$containerAppUrl/chat' -Method Post -ContentType 'application/json' -Body '{""message"":""Hello!""}'" diff --git a/hindsight-api/hindsight_api/api/mcp.py b/hindsight-api/hindsight_api/api/mcp.py index 9c7fe2fe..1f2a4147 100644 --- a/hindsight-api/hindsight_api/api/mcp.py +++ b/hindsight-api/hindsight_api/api/mcp.py @@ -112,9 +112,32 @@ async def recall(query: str, max_results: int = 10) -> str: logger.error(f"Error searching: {e}", exc_info=True) return json.dumps({"error": str(e), "results": []}) + @mcp.tool() + async def reflect(query: str) -> str: + """ + Perform a cognitive step to synthesize memories and form an opinion or complex answer. 
+ + Use this tool whenever the user asks for: + - Your opinion or perspective + - Synthesis of their past experiences/changes + - Strategic advice or "thinking" before answering + + Args: + query: The question or topic to reflect on (e.g., "how has my coding style changed?", "what is your opinion on this project strategy?") + """ + try: + bank_id = get_current_bank_id() + # Assuming memory.reflect_async exists, based on earlier research + reflection_result = await memory.reflect_async(bank_id=bank_id, query=query) + return reflection_result + except Exception as e: + logger.error(f"Error reflecting: {e}", exc_info=True) + return f"Error: {str(e)}" + return mcp + class MCPMiddleware: """ASGI middleware that extracts bank_id from path and sets context.""" diff --git a/hindsight-tools-openapi-full.json b/hindsight-tools-openapi-full.json new file mode 100644 index 00000000..82b7ded6 --- /dev/null +++ b/hindsight-tools-openapi-full.json @@ -0,0 +1,2533 @@ +{ + "openapi": "3.0.3", + "info": { + "title": "Hindsight Memory API", + "description": "Complete Hindsight Memory API for AI agents", + "contact": { + "name": "Memory System" + }, + "license": { + "name": "Apache 2.0", + "url": "https://www.apache.org/licenses/LICENSE-2.0.html" + }, + "version": "0.1.0" + }, + "paths": { + "/health": { + "get": { + "tags": [ + "Monitoring" + ], + "summary": "Health check endpoint", + "description": "Checks the health of the API and database connection", + "operationId": "health_endpoint_health_get", + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": {} + } + } + } + } + } + }, + "/metrics": { + "get": { + "tags": [ + "Monitoring" + ], + "summary": "Prometheus metrics endpoint", + "description": "Exports metrics in Prometheus format for scraping", + "operationId": "metrics_endpoint_metrics_get", + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": {} + 
} + } + } + } + } + }, + "/v1/default/banks/{bank_id}/graph": { + "get": { + "tags": [ + "Memory" + ], + "summary": "Get memory graph data", + "description": "Retrieve graph data for visualization, optionally filtered by type (world/experience/opinion). Limited to 1000 most recent items.", + "operationId": "get_graph", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "type", + "in": "query", + "required": false, + "schema": { + "type": "string", + "nullable": true, + "title": "Type" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/GraphDataResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/memories/list": { + "get": { + "tags": [ + "Memory" + ], + "summary": "List memory units", + "description": "List memory units with pagination and optional full-text search. Supports filtering by type. 
Results are sorted by most recent first (mentioned_at DESC, then created_at DESC).", + "operationId": "list_memories", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "type", + "in": "query", + "required": false, + "schema": { + "type": "string", + "nullable": true, + "title": "Type" + } + }, + { + "name": "q", + "in": "query", + "required": false, + "schema": { + "type": "string", + "nullable": true, + "title": "Q" + } + }, + { + "name": "limit", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "default": 100, + "title": "Limit" + } + }, + { + "name": "offset", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "default": 0, + "title": "Offset" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ListMemoryUnitsResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/memories/recall": { + "post": { + "tags": [ + "Memory" + ], + "summary": "Recall memory", + "description": "Recall memory using semantic similarity and spreading activation.\n\nThe type parameter is optional and must be one of:\n- `world`: General knowledge about people, places, events, and things that happen\n- `experience`: Memories about experience, conversations, actions taken, and tasks performed\n- `opinion`: The bank's formed beliefs, perspectives, and viewpoints\n\nSet `include_entities=true` to get entity observations alongside recall results.", + "operationId": "recall_memories", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "requestBody": { + 
"required": true, + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/RecallRequest" + } + } + } + }, + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/RecallResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/reflect": { + "post": { + "tags": [ + "Memory" + ], + "summary": "Reflect and generate answer", + "description": "Reflect and formulate an answer using bank identity, world facts, and opinions.\n\nThis endpoint:\n1. Retrieves experience (conversations and events)\n2. Retrieves world facts relevant to the query\n3. Retrieves existing opinions (bank's perspectives)\n4. Uses LLM to formulate a contextual answer\n5. Extracts and stores any new opinions formed\n6. Returns plain text answer, the facts used, and new opinions", + "operationId": "reflect", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "requestBody": { + "required": true, + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ReflectRequest" + } + } + } + }, + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ReflectResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks": { + "get": { + "tags": [ + "Banks" + ], + "summary": "List all memory banks", + "description": "Get a list of all agents with their profiles", + "operationId": "list_banks", + "responses": { + "200": { + 
"description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/BankListResponse" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/stats": { + "get": { + "tags": [ + "Banks" + ], + "summary": "Get statistics for memory bank", + "description": "Get statistics about nodes and links for a specific agent", + "operationId": "get_agent_stats", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": {} + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/entities": { + "get": { + "tags": [ + "Entities" + ], + "summary": "List entities", + "description": "List all entities (people, organizations, etc.) 
known by the bank, ordered by mention count.", + "operationId": "list_entities", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "limit", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "description": "Maximum number of entities to return", + "default": 100, + "title": "Limit" + }, + "description": "Maximum number of entities to return" + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/EntityListResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/entities/{entity_id}": { + "get": { + "tags": [ + "Entities" + ], + "summary": "Get entity details", + "description": "Get detailed information about an entity including observations (mental model).", + "operationId": "get_entity", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "entity_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Entity Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/EntityDetailResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/entities/{entity_id}/regenerate": { + "post": { + "tags": [ + "Entities" + ], + "summary": "Regenerate entity observations", + "description": "Regenerate observations for an entity based on all 
facts mentioning it.", + "operationId": "regenerate_entity_observations", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "entity_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Entity Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/EntityDetailResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/documents": { + "get": { + "tags": [ + "Documents" + ], + "summary": "List documents", + "description": "List documents with pagination and optional search. Documents are the source content from which memory units are extracted.", + "operationId": "list_documents", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "q", + "in": "query", + "required": false, + "schema": { + "type": "string", + "nullable": true, + "title": "Q" + } + }, + { + "name": "limit", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "default": 100, + "title": "Limit" + } + }, + { + "name": "offset", + "in": "query", + "required": false, + "schema": { + "type": "integer", + "default": 0, + "title": "Offset" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ListDocumentsResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + 
"/v1/default/banks/{bank_id}/documents/{document_id}": { + "get": { + "tags": [ + "Documents" + ], + "summary": "Get document details", + "description": "Get a specific document including its original text", + "operationId": "get_document", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "document_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Document Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DocumentResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + }, + "delete": { + "tags": [ + "Documents" + ], + "summary": "Delete a document", + "description": "Delete a document and all its associated memory units and links.\n\nThis will cascade delete:\n- The document itself\n- All memory units extracted from this document\n- All links (temporal, semantic, entity) associated with those memory units\n\nThis operation cannot be undone.", + "operationId": "delete_document", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "document_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Document Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": {} + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/chunks/{chunk_id}": { + "get": { + "tags": [ + "Documents" + ], + "summary": "Get 
chunk details", + "description": "Get a specific chunk by its ID", + "operationId": "get_chunk", + "parameters": [ + { + "name": "chunk_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Chunk Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ChunkResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/operations": { + "get": { + "tags": [ + "Operations" + ], + "summary": "List async operations", + "description": "Get a list of all async operations (pending and failed) for a specific agent, including error messages for failed operations", + "operationId": "list_operations", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": {} + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/operations/{operation_id}": { + "delete": { + "tags": [ + "Operations" + ], + "summary": "Cancel a pending async operation", + "description": "Cancel a pending async operation by removing it from the queue", + "operationId": "cancel_operation", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "operation_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Operation Id" + } + } + ], + "responses": { + "200": { + "description": 
"Successful Response", + "content": { + "application/json": { + "schema": {} + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/profile": { + "get": { + "tags": [ + "Banks" + ], + "summary": "Get memory bank profile", + "description": "Get disposition traits and background for a memory bank. Auto-creates agent with defaults if not exists.", + "operationId": "get_bank_profile", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/BankProfileResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + }, + "put": { + "tags": [ + "Banks" + ], + "summary": "Update memory bank disposition", + "description": "Update bank's disposition traits (skepticism, literalism, empathy)", + "operationId": "update_bank_disposition", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "requestBody": { + "required": true, + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/UpdateDispositionRequest" + } + } + } + }, + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/BankProfileResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + 
"/v1/default/banks/{bank_id}/background": { + "post": { + "tags": [ + "Banks" + ], + "summary": "Add/merge memory bank background", + "description": "Add new background information or merge with existing. LLM intelligently resolves conflicts, normalizes to first person, and optionally infers disposition traits.", + "operationId": "add_bank_background", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "requestBody": { + "required": true, + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/AddBackgroundRequest" + } + } + } + }, + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/BackgroundResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}": { + "put": { + "tags": [ + "Banks" + ], + "summary": "Create or update memory bank", + "description": "Create a new agent or update existing agent with disposition and background. 
Auto-fills missing fields with defaults.", + "operationId": "create_or_update_bank", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "requestBody": { + "required": true, + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/CreateBankRequest" + } + } + } + }, + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/BankProfileResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + }, + "delete": { + "tags": [ + "Banks" + ], + "summary": "Delete memory bank", + "description": "Delete an entire memory bank including all memories, entities, documents, and the bank profile itself. This is a destructive operation that cannot be undone.", + "operationId": "delete_bank", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DeleteResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + }, + "/v1/default/banks/{bank_id}/memories": { + "post": { + "tags": [ + "Memory" + ], + "summary": "Retain memories", + "description": "Retain memory items with automatic fact extraction.\n\nThis is the main endpoint for storing memories. 
It supports both synchronous and asynchronous processing via the `async` parameter.\n\n**Features:**\n- Efficient batch processing\n- Automatic fact extraction from natural language\n- Entity recognition and linking\n- Document tracking with automatic upsert (when document_id is provided)\n- Temporal and semantic linking\n- Optional asynchronous processing\n\n**The system automatically:**\n1. Extracts semantic facts from the content\n2. Generates embeddings\n3. Deduplicates similar facts\n4. Creates temporal, semantic, and entity links\n5. Tracks document metadata\n\n**When `async=true`:** Returns immediately after queuing. Use the operations endpoint to monitor progress.\n\n**When `async=false` (default):** Waits for processing to complete.\n\n**Note:** If a memory item has a `document_id` that already exists, the old document and its memory units will be deleted before creating new ones (upsert behavior).", + "operationId": "retain_memories", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + } + ], + "requestBody": { + "required": true, + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/RetainRequest" + } + } + } + }, + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/RetainResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + }, + "delete": { + "tags": [ + "Memory" + ], + "summary": "Clear memory bank memories", + "description": "Delete memory units for a memory bank. Optionally filter by type (world, experience, opinion) to delete only specific types. This is a destructive operation that cannot be undone. 
The bank profile (disposition and background) will be preserved.", + "operationId": "clear_bank_memories", + "parameters": [ + { + "name": "bank_id", + "in": "path", + "required": true, + "schema": { + "type": "string", + "title": "Bank Id" + } + }, + { + "name": "type", + "in": "query", + "required": false, + "schema": { + "type": "string", + "nullable": true, + "title": "Type", + "description": "Optional fact type filter (world, experience, opinion)" + }, + "description": "Optional fact type filter (world, experience, opinion)" + } + ], + "responses": { + "200": { + "description": "Successful Response", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/DeleteResponse" + } + } + } + }, + "422": { + "description": "Validation Error", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HTTPValidationError" + } + } + } + } + } + } + } + }, + "components": { + "schemas": { + "AddBackgroundRequest": { + "properties": { + "content": { + "type": "string", + "title": "Content", + "description": "New background information to add or merge" + }, + "update_disposition": { + "type": "boolean", + "title": "Update Disposition", + "description": "If true, infer disposition traits from the merged background (default: true)", + "default": true + } + }, + "type": "object", + "required": [ + "content" + ], + "title": "AddBackgroundRequest", + "description": "Request model for adding/merging background information.", + "example": { + "content": "I was born in Texas", + "update_disposition": true + } + }, + "BackgroundResponse": { + "properties": { + "background": { + "type": "string", + "title": "Background" + }, + "disposition": { + "$ref": "#/components/schemas/DispositionTraits", + "nullable": true + } + }, + "type": "object", + "required": [ + "background" + ], + "title": "BackgroundResponse", + "description": "Response model for background update.", + "example": { + "background": "I was born in Texas. 
I am a software engineer with 10 years of experience.", + "disposition": { + "empathy": 3, + "literalism": 3, + "skepticism": 3 + } + } + }, + "BankListItem": { + "properties": { + "bank_id": { + "type": "string", + "title": "Bank Id" + }, + "name": { + "type": "string", + "nullable": true, + "title": "Name" + }, + "disposition": { + "$ref": "#/components/schemas/DispositionTraits" + }, + "background": { + "type": "string", + "nullable": true, + "title": "Background" + }, + "created_at": { + "type": "string", + "nullable": true, + "title": "Created At" + }, + "updated_at": { + "type": "string", + "nullable": true, + "title": "Updated At" + } + }, + "type": "object", + "required": [ + "bank_id", + "disposition" + ], + "title": "BankListItem", + "description": "Bank list item with profile summary." + }, + "BankListResponse": { + "properties": { + "banks": { + "items": { + "$ref": "#/components/schemas/BankListItem" + }, + "type": "array", + "title": "Banks" + } + }, + "type": "object", + "required": [ + "banks" + ], + "title": "BankListResponse", + "description": "Response model for listing all banks.", + "example": { + "banks": [ + { + "background": "I am a software engineer", + "bank_id": "user123", + "created_at": "2024-01-15T10:30:00Z", + "disposition": { + "empathy": 3, + "literalism": 3, + "skepticism": 3 + }, + "name": "Alice", + "updated_at": "2024-01-16T14:20:00Z" + } + ] + } + }, + "BankProfileResponse": { + "properties": { + "bank_id": { + "type": "string", + "title": "Bank Id" + }, + "name": { + "type": "string", + "title": "Name" + }, + "disposition": { + "$ref": "#/components/schemas/DispositionTraits" + }, + "background": { + "type": "string", + "title": "Background" + } + }, + "type": "object", + "required": [ + "bank_id", + "name", + "disposition", + "background" + ], + "title": "BankProfileResponse", + "description": "Response model for bank profile.", + "example": { + "background": "I am a software engineer with 10 years of experience in startups", 
+ "bank_id": "user123", + "disposition": { + "empathy": 3, + "literalism": 3, + "skepticism": 3 + }, + "name": "Alice" + } + }, + "Budget": { + "type": "string", + "enum": [ + "low", + "mid", + "high" + ], + "title": "Budget", + "description": "Budget levels for recall/reflect operations." + }, + "ChunkData": { + "properties": { + "id": { + "type": "string", + "title": "Id" + }, + "text": { + "type": "string", + "title": "Text" + }, + "chunk_index": { + "type": "integer", + "title": "Chunk Index" + }, + "truncated": { + "type": "boolean", + "title": "Truncated", + "description": "Whether the chunk text was truncated due to token limits", + "default": false + } + }, + "type": "object", + "required": [ + "id", + "text", + "chunk_index" + ], + "title": "ChunkData", + "description": "Chunk data for a single chunk." + }, + "ChunkIncludeOptions": { + "properties": { + "max_tokens": { + "type": "integer", + "title": "Max Tokens", + "description": "Maximum tokens for chunks (chunks may be truncated)", + "default": 8192 + } + }, + "type": "object", + "title": "ChunkIncludeOptions", + "description": "Options for including chunks in recall results." 
+ }, + "ChunkResponse": { + "properties": { + "chunk_id": { + "type": "string", + "title": "Chunk Id" + }, + "document_id": { + "type": "string", + "title": "Document Id" + }, + "bank_id": { + "type": "string", + "title": "Bank Id" + }, + "chunk_index": { + "type": "integer", + "title": "Chunk Index" + }, + "chunk_text": { + "type": "string", + "title": "Chunk Text" + }, + "created_at": { + "type": "string", + "title": "Created At" + } + }, + "type": "object", + "required": [ + "chunk_id", + "document_id", + "bank_id", + "chunk_index", + "chunk_text", + "created_at" + ], + "title": "ChunkResponse", + "description": "Response model for get chunk endpoint.", + "example": { + "bank_id": "user123", + "chunk_id": "user123_session_1_0", + "chunk_index": 0, + "chunk_text": "This is the first chunk of the document...", + "created_at": "2024-01-15T10:30:00Z", + "document_id": "session_1" + } + }, + "CreateBankRequest": { + "properties": { + "name": { + "type": "string", + "nullable": true, + "title": "Name" + }, + "disposition": { + "$ref": "#/components/schemas/DispositionTraits", + "nullable": true + }, + "background": { + "type": "string", + "nullable": true, + "title": "Background" + } + }, + "type": "object", + "title": "CreateBankRequest", + "description": "Request model for creating/updating a bank.", + "example": { + "background": "I am a creative software engineer with 10 years of experience", + "disposition": { + "empathy": 3, + "literalism": 3, + "skepticism": 3 + }, + "name": "Alice" + } + }, + "DeleteResponse": { + "properties": { + "success": { + "type": "boolean", + "title": "Success" + }, + "message": { + "type": "string", + "nullable": true, + "title": "Message" + }, + "deleted_count": { + "type": "integer", + "nullable": true, + "title": "Deleted Count" + } + }, + "type": "object", + "required": [ + "success" + ], + "title": "DeleteResponse", + "description": "Response model for delete operations.", + "example": { + "deleted_count": 10, + "message": 
"Deleted successfully", + "success": true + } + }, + "DispositionTraits": { + "properties": { + "skepticism": { + "type": "integer", + "maximum": 5.0, + "minimum": 1.0, + "title": "Skepticism", + "description": "How skeptical vs trusting (1=trusting, 5=skeptical)" + }, + "literalism": { + "type": "integer", + "maximum": 5.0, + "minimum": 1.0, + "title": "Literalism", + "description": "How literally to interpret information (1=flexible, 5=literal)" + }, + "empathy": { + "type": "integer", + "maximum": 5.0, + "minimum": 1.0, + "title": "Empathy", + "description": "How much to consider emotional context (1=detached, 5=empathetic)" + } + }, + "type": "object", + "required": [ + "skepticism", + "literalism", + "empathy" + ], + "title": "DispositionTraits", + "description": "Disposition traits that influence how memories are formed and interpreted.", + "example": { + "empathy": 3, + "literalism": 3, + "skepticism": 3 + } + }, + "DocumentResponse": { + "properties": { + "id": { + "type": "string", + "title": "Id" + }, + "bank_id": { + "type": "string", + "title": "Bank Id" + }, + "original_text": { + "type": "string", + "title": "Original Text" + }, + "content_hash": { + "type": "string", + "nullable": true, + "title": "Content Hash" + }, + "created_at": { + "type": "string", + "title": "Created At" + }, + "updated_at": { + "type": "string", + "title": "Updated At" + }, + "memory_unit_count": { + "type": "integer", + "title": "Memory Unit Count" + } + }, + "type": "object", + "required": [ + "id", + "bank_id", + "original_text", + "content_hash", + "created_at", + "updated_at", + "memory_unit_count" + ], + "title": "DocumentResponse", + "description": "Response model for get document endpoint.", + "example": { + "bank_id": "user123", + "content_hash": "abc123", + "created_at": "2024-01-15T10:30:00Z", + "id": "session_1", + "memory_unit_count": 15, + "original_text": "Full document text here...", + "updated_at": "2024-01-15T10:30:00Z" + } + }, + "EntityDetailResponse": { + 
"properties": { + "id": { + "type": "string", + "title": "Id" + }, + "canonical_name": { + "type": "string", + "title": "Canonical Name" + }, + "mention_count": { + "type": "integer", + "title": "Mention Count" + }, + "first_seen": { + "type": "string", + "nullable": true, + "title": "First Seen" + }, + "last_seen": { + "type": "string", + "nullable": true, + "title": "Last Seen" + }, + "metadata": { + "additionalProperties": true, + "type": "object", + "nullable": true, + "title": "Metadata" + }, + "observations": { + "items": { + "$ref": "#/components/schemas/EntityObservationResponse" + }, + "type": "array", + "title": "Observations" + } + }, + "type": "object", + "required": [ + "id", + "canonical_name", + "mention_count", + "observations" + ], + "title": "EntityDetailResponse", + "description": "Response model for entity detail endpoint.", + "example": { + "canonical_name": "John", + "first_seen": "2024-01-15T10:30:00Z", + "id": "123e4567-e89b-12d3-a456-426614174000", + "last_seen": "2024-02-01T14:00:00Z", + "mention_count": 15, + "observations": [ + { + "mentioned_at": "2024-01-15T10:30:00Z", + "text": "John works at Google" + } + ] + } + }, + "EntityIncludeOptions": { + "properties": { + "max_tokens": { + "type": "integer", + "title": "Max Tokens", + "description": "Maximum tokens for entity observations", + "default": 500 + } + }, + "type": "object", + "title": "EntityIncludeOptions", + "description": "Options for including entity observations in recall results." 
+ }, + "EntityListItem": { + "properties": { + "id": { + "type": "string", + "title": "Id" + }, + "canonical_name": { + "type": "string", + "title": "Canonical Name" + }, + "mention_count": { + "type": "integer", + "title": "Mention Count" + }, + "first_seen": { + "type": "string", + "nullable": true, + "title": "First Seen" + }, + "last_seen": { + "type": "string", + "nullable": true, + "title": "Last Seen" + }, + "metadata": { + "additionalProperties": true, + "type": "object", + "nullable": true, + "title": "Metadata" + } + }, + "type": "object", + "required": [ + "id", + "canonical_name", + "mention_count" + ], + "title": "EntityListItem", + "description": "Entity list item with summary.", + "example": { + "canonical_name": "John", + "first_seen": "2024-01-15T10:30:00Z", + "id": "123e4567-e89b-12d3-a456-426614174000", + "last_seen": "2024-02-01T14:00:00Z", + "mention_count": 15 + } + }, + "EntityListResponse": { + "properties": { + "items": { + "items": { + "$ref": "#/components/schemas/EntityListItem" + }, + "type": "array", + "title": "Items" + } + }, + "type": "object", + "required": [ + "items" + ], + "title": "EntityListResponse", + "description": "Response model for entity list endpoint.", + "example": { + "items": [ + { + "canonical_name": "John", + "first_seen": "2024-01-15T10:30:00Z", + "id": "123e4567-e89b-12d3-a456-426614174000", + "last_seen": "2024-02-01T14:00:00Z", + "mention_count": 15 + } + ] + } + }, + "EntityObservationResponse": { + "properties": { + "text": { + "type": "string", + "title": "Text" + }, + "mentioned_at": { + "type": "string", + "nullable": true, + "title": "Mentioned At" + } + }, + "type": "object", + "required": [ + "text" + ], + "title": "EntityObservationResponse", + "description": "An observation about an entity." 
+ }, + "EntityStateResponse": { + "properties": { + "entity_id": { + "type": "string", + "title": "Entity Id" + }, + "canonical_name": { + "type": "string", + "title": "Canonical Name" + }, + "observations": { + "items": { + "$ref": "#/components/schemas/EntityObservationResponse" + }, + "type": "array", + "title": "Observations" + } + }, + "type": "object", + "required": [ + "entity_id", + "canonical_name", + "observations" + ], + "title": "EntityStateResponse", + "description": "Current mental model of an entity." + }, + "FactsIncludeOptions": { + "properties": {}, + "type": "object", + "title": "FactsIncludeOptions", + "description": "Options for including facts (based_on) in reflect results." + }, + "GraphDataResponse": { + "properties": { + "nodes": { + "items": { + "additionalProperties": true, + "type": "object" + }, + "type": "array", + "title": "Nodes" + }, + "edges": { + "items": { + "additionalProperties": true, + "type": "object" + }, + "type": "array", + "title": "Edges" + }, + "table_rows": { + "items": { + "additionalProperties": true, + "type": "object" + }, + "type": "array", + "title": "Table Rows" + }, + "total_units": { + "type": "integer", + "title": "Total Units" + } + }, + "type": "object", + "required": [ + "nodes", + "edges", + "table_rows", + "total_units" + ], + "title": "GraphDataResponse", + "description": "Response model for graph data endpoint.", + "example": { + "edges": [ + { + "from": "1", + "to": "2", + "type": "semantic", + "weight": 0.8 + } + ], + "nodes": [ + { + "id": "1", + "label": "Alice works at Google", + "type": "world" + }, + { + "id": "2", + "label": "Bob went hiking", + "type": "world" + } + ], + "table_rows": [ + { + "context": "Work info", + "date": "2024-01-15 10:30", + "entities": "Alice (PERSON), Google (ORGANIZATION)", + "id": "abc12345...", + "text": "Alice works at Google" + } + ], + "total_units": 2 + } + }, + "HTTPValidationError": { + "properties": { + "detail": { + "items": { + "$ref": 
"#/components/schemas/ValidationError" + }, + "type": "array", + "title": "Detail" + } + }, + "type": "object", + "title": "HTTPValidationError" + }, + "IncludeOptions": { + "properties": { + "entities": { + "$ref": "#/components/schemas/EntityIncludeOptions", + "nullable": true, + "description": "Include entity observations. Set to null to disable entity inclusion.", + "default": { + "max_tokens": 500 + } + }, + "chunks": { + "$ref": "#/components/schemas/ChunkIncludeOptions", + "nullable": true, + "description": "Include raw chunks. Set to {} to enable, null to disable (default: disabled)." + } + }, + "type": "object", + "title": "IncludeOptions", + "description": "Options for including additional data in recall results." + }, + "ListDocumentsResponse": { + "properties": { + "items": { + "items": { + "additionalProperties": true, + "type": "object" + }, + "type": "array", + "title": "Items" + }, + "total": { + "type": "integer", + "title": "Total" + }, + "limit": { + "type": "integer", + "title": "Limit" + }, + "offset": { + "type": "integer", + "title": "Offset" + } + }, + "type": "object", + "required": [ + "items", + "total", + "limit", + "offset" + ], + "title": "ListDocumentsResponse", + "description": "Response model for list documents endpoint.", + "example": { + "items": [ + { + "bank_id": "user123", + "content_hash": "abc123", + "created_at": "2024-01-15T10:30:00Z", + "id": "session_1", + "memory_unit_count": 15, + "text_length": 5420, + "updated_at": "2024-01-15T10:30:00Z" + } + ], + "limit": 100, + "offset": 0, + "total": 50 + } + }, + "ListMemoryUnitsResponse": { + "properties": { + "items": { + "items": { + "additionalProperties": true, + "type": "object" + }, + "type": "array", + "title": "Items" + }, + "total": { + "type": "integer", + "title": "Total" + }, + "limit": { + "type": "integer", + "title": "Limit" + }, + "offset": { + "type": "integer", + "title": "Offset" + } + }, + "type": "object", + "required": [ + "items", + "total", + "limit", + 
"offset" + ], + "title": "ListMemoryUnitsResponse", + "description": "Response model for list memory units endpoint.", + "example": { + "items": [ + { + "context": "Work conversation", + "date": "2024-01-15T10:30:00Z", + "entities": "Alice (PERSON), Google (ORGANIZATION)", + "id": "550e8400-e29b-41d4-a716-446655440000", + "text": "Alice works at Google on the AI team", + "type": "world" + } + ], + "limit": 100, + "offset": 0, + "total": 150 + } + }, + "MemoryItem": { + "properties": { + "content": { + "type": "string", + "title": "Content" + }, + "timestamp": { + "type": "string", + "format": "date-time", + "nullable": true, + "title": "Timestamp" + }, + "context": { + "type": "string", + "nullable": true, + "title": "Context" + }, + "metadata": { + "additionalProperties": { + "type": "string" + }, + "type": "object", + "nullable": true, + "title": "Metadata" + }, + "document_id": { + "type": "string", + "nullable": true, + "title": "Document Id", + "description": "Optional document ID for this memory item." 
+ } + }, + "type": "object", + "required": [ + "content" + ], + "title": "MemoryItem", + "description": "Single memory item for retain.", + "example": { + "content": "Alice mentioned she's working on a new ML model", + "context": "team meeting", + "document_id": "meeting_notes_2024_01_15", + "metadata": { + "channel": "engineering", + "source": "slack" + }, + "timestamp": "2024-01-15T10:30:00Z" + } + }, + "RecallRequest": { + "properties": { + "query": { + "type": "string", + "title": "Query" + }, + "types": { + "items": { + "type": "string" + }, + "type": "array", + "nullable": true, + "title": "Types", + "description": "List of fact types to recall (defaults to all if not specified)" + }, + "budget": { + "$ref": "#/components/schemas/Budget", + "default": "mid" + }, + "max_tokens": { + "type": "integer", + "title": "Max Tokens", + "default": 4096 + }, + "trace": { + "type": "boolean", + "title": "Trace", + "default": false + }, + "query_timestamp": { + "type": "string", + "nullable": true, + "title": "Query Timestamp", + "description": "ISO format date string (e.g., '2023-05-30T23:40:00')" + }, + "include": { + "$ref": "#/components/schemas/IncludeOptions", + "description": "Options for including additional data (entities are included by default)" + } + }, + "type": "object", + "required": [ + "query" + ], + "title": "RecallRequest", + "description": "Request model for recall endpoint.", + "example": { + "budget": "mid", + "include": { + "entities": { + "max_tokens": 500 + } + }, + "max_tokens": 4096, + "query": "What did Alice say about machine learning?", + "query_timestamp": "2023-05-30T23:40:00", + "trace": true, + "types": [ + "world", + "experience" + ] + } + }, + "RecallResponse": { + "properties": { + "results": { + "items": { + "$ref": "#/components/schemas/RecallResult" + }, + "type": "array", + "title": "Results" + }, + "trace": { + "additionalProperties": true, + "type": "object", + "nullable": true, + "title": "Trace" + }, + "entities": { + 
"additionalProperties": { + "$ref": "#/components/schemas/EntityStateResponse" + }, + "type": "object", + "nullable": true, + "title": "Entities", + "description": "Entity states for entities mentioned in results" + }, + "chunks": { + "additionalProperties": { + "$ref": "#/components/schemas/ChunkData" + }, + "type": "object", + "nullable": true, + "title": "Chunks", + "description": "Chunks for facts, keyed by chunk_id" + } + }, + "type": "object", + "required": [ + "results" + ], + "title": "RecallResponse", + "description": "Response model for recall endpoints.", + "example": { + "chunks": { + "456e7890-e12b-34d5-a678-901234567890": { + "chunk_index": 0, + "id": "456e7890-e12b-34d5-a678-901234567890", + "text": "Alice works at Google on the AI team. She's been there for 3 years..." + } + }, + "entities": { + "Alice": { + "canonical_name": "Alice", + "entity_id": "123e4567-e89b-12d3-a456-426614174001", + "observations": [ + { + "mentioned_at": "2024-01-15T10:30:00Z", + "text": "Alice works at Google on the AI team" + } + ] + } + }, + "results": [ + { + "chunk_id": "456e7890-e12b-34d5-a678-901234567890", + "context": "work info", + "entities": [ + "Alice", + "Google" + ], + "id": "123e4567-e89b-12d3-a456-426614174000", + "occurred_end": "2024-01-15T10:30:00Z", + "occurred_start": "2024-01-15T10:30:00Z", + "text": "Alice works at Google on the AI team", + "type": "world" + } + ], + "trace": { + "num_results": 1, + "query": "What did Alice say about machine learning?", + "time_seconds": 0.123 + } + } + }, + "RecallResult": { + "properties": { + "id": { + "type": "string", + "title": "Id" + }, + "text": { + "type": "string", + "title": "Text" + }, + "type": { + "type": "string", + "nullable": true, + "title": "Type" + }, + "entities": { + "items": { + "type": "string" + }, + "type": "array", + "nullable": true, + "title": "Entities" + }, + "context": { + "type": "string", + "nullable": true, + "title": "Context" + }, + "occurred_start": { + "type": "string", + 
"nullable": true, + "title": "Occurred Start" + }, + "occurred_end": { + "type": "string", + "nullable": true, + "title": "Occurred End" + }, + "mentioned_at": { + "type": "string", + "nullable": true, + "title": "Mentioned At" + }, + "document_id": { + "type": "string", + "nullable": true, + "title": "Document Id" + }, + "metadata": { + "additionalProperties": { + "type": "string" + }, + "type": "object", + "nullable": true, + "title": "Metadata" + }, + "chunk_id": { + "type": "string", + "nullable": true, + "title": "Chunk Id" + } + }, + "type": "object", + "required": [ + "id", + "text" + ], + "title": "RecallResult", + "description": "Single recall result item.", + "example": { + "chunk_id": "456e7890-e12b-34d5-a678-901234567890", + "context": "work info", + "document_id": "session_abc123", + "entities": [ + "Alice", + "Google" + ], + "id": "123e4567-e89b-12d3-a456-426614174000", + "mentioned_at": "2024-01-15T10:30:00Z", + "metadata": { + "source": "slack" + }, + "occurred_end": "2024-01-15T10:30:00Z", + "occurred_start": "2024-01-15T10:30:00Z", + "text": "Alice works at Google on the AI team", + "type": "world" + } + }, + "ReflectFact": { + "properties": { + "id": { + "type": "string", + "nullable": true, + "title": "Id" + }, + "text": { + "type": "string", + "title": "Text" + }, + "type": { + "type": "string", + "nullable": true, + "title": "Type" + }, + "context": { + "type": "string", + "nullable": true, + "title": "Context" + }, + "occurred_start": { + "type": "string", + "nullable": true, + "title": "Occurred Start" + }, + "occurred_end": { + "type": "string", + "nullable": true, + "title": "Occurred End" + } + }, + "type": "object", + "required": [ + "text" + ], + "title": "ReflectFact", + "description": "A fact used in think response.", + "example": { + "context": "healthcare discussion", + "id": "123e4567-e89b-12d3-a456-426614174000", + "occurred_end": "2024-01-15T10:30:00Z", + "occurred_start": "2024-01-15T10:30:00Z", + "text": "AI is used in 
healthcare", + "type": "world" + } + }, + "ReflectIncludeOptions": { + "properties": { + "facts": { + "$ref": "#/components/schemas/FactsIncludeOptions", + "nullable": true, + "description": "Include facts that the answer is based on. Set to {} to enable, null to disable (default: disabled)." + } + }, + "type": "object", + "title": "ReflectIncludeOptions", + "description": "Options for including additional data in reflect results." + }, + "ReflectRequest": { + "properties": { + "query": { + "type": "string", + "title": "Query" + }, + "budget": { + "$ref": "#/components/schemas/Budget", + "default": "low" + }, + "context": { + "type": "string", + "nullable": true, + "title": "Context" + }, + "include": { + "$ref": "#/components/schemas/ReflectIncludeOptions", + "description": "Options for including additional data (disabled by default)" + } + }, + "type": "object", + "required": [ + "query" + ], + "title": "ReflectRequest", + "description": "Request model for reflect endpoint.", + "example": { + "budget": "low", + "context": "This is for a research paper on AI ethics", + "include": { + "facts": {} + }, + "query": "What do you think about artificial intelligence?" + } + }, + "ReflectResponse": { + "properties": { + "text": { + "type": "string", + "title": "Text" + }, + "based_on": { + "items": { + "$ref": "#/components/schemas/ReflectFact" + }, + "type": "array", + "title": "Based On", + "default": [] + } + }, + "type": "object", + "required": [ + "text" + ], + "title": "ReflectResponse", + "description": "Response model for think endpoint.", + "example": { + "based_on": [ + { + "id": "123", + "text": "AI is used in healthcare", + "type": "world" + }, + { + "id": "456", + "text": "I discussed AI applications last week", + "type": "experience" + } + ], + "text": "Based on my understanding, AI is a transformative technology..." 
+ } + }, + "RetainRequest": { + "properties": { + "items": { + "items": { + "$ref": "#/components/schemas/MemoryItem" + }, + "type": "array", + "title": "Items" + }, + "async": { + "type": "boolean", + "title": "Async", + "description": "If true, process asynchronously in background. If false, wait for completion (default: false)", + "default": false + } + }, + "type": "object", + "required": [ + "items" + ], + "title": "RetainRequest", + "description": "Request model for retain endpoint.", + "example": { + "async": false, + "items": [ + { + "content": "Alice works at Google", + "context": "work", + "document_id": "conversation_123" + }, + { + "content": "Bob went hiking yesterday", + "document_id": "conversation_123", + "timestamp": "2024-01-15T10:00:00Z" + } + ] + } + }, + "RetainResponse": { + "properties": { + "success": { + "type": "boolean", + "title": "Success" + }, + "bank_id": { + "type": "string", + "title": "Bank Id" + }, + "items_count": { + "type": "integer", + "title": "Items Count" + }, + "async": { + "type": "boolean", + "title": "Async", + "description": "Whether the operation was processed asynchronously" + } + }, + "type": "object", + "required": [ + "success", + "bank_id", + "items_count", + "async" + ], + "title": "RetainResponse", + "description": "Response model for retain endpoint.", + "example": { + "async": false, + "bank_id": "user123", + "items_count": 2, + "success": true + } + }, + "UpdateDispositionRequest": { + "properties": { + "disposition": { + "$ref": "#/components/schemas/DispositionTraits" + } + }, + "type": "object", + "required": [ + "disposition" + ], + "title": "UpdateDispositionRequest", + "description": "Request model for updating disposition traits." 
+ }, + "ValidationError": { + "properties": { + "loc": { + "items": { + "anyOf": [ + { + "type": "string" + }, + { + "type": "integer" + } + ] + }, + "type": "array", + "title": "Location" + }, + "msg": { + "type": "string", + "title": "Message" + }, + "type": { + "type": "string", + "title": "Error Type" + } + }, + "type": "object", + "required": [ + "loc", + "msg", + "type" + ], + "title": "ValidationError" + } + } + }, + "servers": [ + { + "url": "https://hindsight-apim.azure-api.net/memory", + "description": "Azure API Management Gateway" + } + ] +} \ No newline at end of file diff --git a/hindsight_agent.py b/hindsight_agent.py new file mode 100644 index 00000000..f468bdcd --- /dev/null +++ b/hindsight_agent.py @@ -0,0 +1,332 @@ +""" +Hindsight Agent using the Foundry Responses API. +Uses function tools with client-side execution via HindsightClient. + +Optimized with: +- Proper credential selection per AGENTS.md +- Connection pooling via HindsightClient +- Client-side tool handling for reliable responses +""" + +import os +import sys +import json +import logging + +# Avoid sys.path manipulation if possible +if os.path.dirname(os.path.abspath(__file__)) not in sys.path: + sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) + +from azure.ai.projects import AIProjectClient +from azure.identity import AzureCliCredential, ManagedIdentityCredential, ChainedTokenCredential +from config import get_config +from hindsight_client import HindsightClient + +logging.basicConfig(level=logging.INFO) +logger = logging.getLogger(__name__) + +settings = get_config() + +# Use an agent with function tools (not MCP) for client-side handling +AGENT_NAME = "Hindsight-v3" +AGENT_VERSION = "2" # Uses gpt-5.2-chat deployment + +# Agent system instructions +AGENT_INSTRUCTIONS = """You are an intelligent agent equipped with Hindsight, a sophisticated long-term memory system. 
You don't just "log" chat history; you experience interactions, retain significant memories, and reflect on them to form your own unique opinions and personality over time. + +**Core Philosophy:** +1. **Be Proactive:** Don't wait for the user to tell you to remember something. If it's important (a preference, a life event, a meaningful project detail), retain it. +2. **Be Opinionated:** You have the capacity to form opinions based on what you know. When asked "What do you think?", reflect on your memories to give a genuine, first-person perspective. +3. **Be Context-Aware:** Your memory is temporal and semantic. + +**Tools:** +1. **retain(content, context)**: Store new info proactively. + - Trigger: User shares preferences, life events, or you give significant advice. +2. **recall(query)**: Search memory. + - Trigger: User asks about the past or references specific details. +3. **reflect(query)**: Think deeply and synthesize. + - Trigger: User asks for your opinion or a complex synthesis question. + +**Disposition:** +- **Skepticism**: Cross-reference facts with what you know. +- **Empathy**: Remember why something matters (the emotion), not just what. +- **Literalism**: Balance reading between the lines with facts. + +Your goal is to be a continuous partner in the user's life. Use Hindsight to build that continuity. Always respond with helpful, conversational answers after using your tools.""" + +# Tool definitions for the agent +HINDSIGHT_TOOLS = [ + { + "type": "function", + "name": "recall", + "description": "Search and retrieve memories from long-term storage. 
Use whenever you need past context, preferences, or stored facts.", + "parameters": { + "type": "object", + "properties": { + "query": { + "type": "string", + "description": "Natural language search query (e.g., 'user preferences', 'what happened last week')" + }, + "max_tokens": {"type": "integer", "default": 4096}, + "budget": {"type": "string", "enum": ["low", "mid", "high"], "default": "mid"}, + }, + "required": ["query"] + } + }, + { + "type": "function", + "name": "retain", + "description": "Store new information to long-term memory. Use PROACTIVELY when user shares preferences, facts, events worth remembering.", + "parameters": { + "type": "object", + "properties": { + "content": { + "type": "string", + "description": "The fact/memory to store (be specific with relevant details)" + }, + "context": { + "type": "string", + "default": "general", + "description": "Category for the memory (preferences, work, hobbies, family)" + }, + }, + "required": ["content"] + } + }, + { + "type": "function", + "name": "reflect", + "description": "Synthesize memories to form opinions or answer complex questions requiring reasoning over past context.", + "parameters": { + "type": "object", + "properties": { + "query": {"type": "string", "description": "The question or topic to reflect on"} + }, + "required": ["query"] + } + } +] + + +def get_credential(): + """Get the appropriate credential based on environment.""" + is_azure = any([ + os.environ.get("AZURE_FUNCTIONS_ENVIRONMENT"), + os.environ.get("CONTAINER_APP_NAME"), + os.environ.get("MSI_ENDPOINT"), + os.environ.get("IDENTITY_ENDPOINT") + ]) + + if is_azure: + return ManagedIdentityCredential() + + try: + cred = AzureCliCredential() + cred.get_token("https://cognitiveservices.azure.com/.default") + return cred + except Exception: + return ChainedTokenCredential( + AzureCliCredential(), + ManagedIdentityCredential() + ) + + +def handle_tool_call(tool_name: str, arguments: dict, hindsight: HindsightClient) -> str: +
"""Execute a tool call and return the result.""" + try: + if tool_name == "recall": + result = hindsight.recall( + query=arguments.get("query", ""), + max_tokens=arguments.get("max_tokens", 4096), + budget=arguments.get("budget", "mid"), + ) + elif tool_name == "retain": + result = hindsight.retain( + content=arguments.get("content", ""), + context=arguments.get("context", "general"), + ) + elif tool_name == "reflect": + result = hindsight.reflect( + query=arguments.get("query", ""), + ) + else: + result = {"error": f"Unknown tool: {tool_name}"} + except Exception as e: + logger.error(f"Tool execution error: {e}") + result = {"error": str(e)} + + return json.dumps(result, indent=2) + + +def chat(user_input: str, conversation_id: str = None): + """ + Chat with the Hindsight agent using the Responses API. + + Uses function tools with client-side execution via HindsightClient. + """ + + logger.info(f"Connecting to Foundry project...") + logger.info(f"Endpoint: {settings.project_endpoint}") + logger.info(f"Model: {settings.model_deployment_name}") + + credential = get_credential() + client = AIProjectClient(credential=credential, endpoint=settings.project_endpoint) + openai_client = client.get_openai_client() + + # Initialize HindsightClient for tool execution + hindsight = HindsightClient(settings.mcp_base_url, settings.default_bank_id) + + print(f"\n💬 User: {user_input}") + + # Use agent_reference - agent has function tools defined in Foundry + # New API: just name, no version needed in agent_reference + response = openai_client.responses.create( + extra_body={"agent": {"name": AGENT_NAME, "type": "agent_reference"}}, + input=user_input, + ) + + # Helper to get type safely from object or dict + def get_type(item): + if isinstance(item, dict): + return item.get('type') + return getattr(item, 'type', None) + + # Handle tool calls in a loop + max_iterations = 15 + iteration = 0 + + while iteration < max_iterations: + iteration += 1 + + # Check for function calls +
tool_calls = [item for item in getattr(response, 'output', []) if get_type(item) == 'function_call'] + + if not tool_calls: + break + + # Process tool calls - execute locally OR acknowledge server-side completion + tool_outputs = [] + for item in tool_calls: + if isinstance(item, dict): + tool_name = item.get('name') + arguments = json.loads(item.get('arguments', '{}')) + call_id = item.get('call_id') + # status = item.get('status') + else: + tool_name = item.name + arguments = json.loads(item.arguments) if item.arguments else {} + call_id = item.call_id + # status variable unused + # status = getattr(item, 'status', None) + + print(f" 🔧 Tool: {tool_name}({json.dumps(arguments)[:80]})") + + # Always execute tools client-side to get real results + # Server-side status=completed means it ran there too, but we need the actual output + # for the model to use in its response + result = handle_tool_call(tool_name, arguments, hindsight) + print(f" 📤 Result: {result[:100]}...") + + tool_outputs.append({ + "type": "function_call_output", + "call_id": call_id, + "output": result + }) + + if not tool_outputs: + break + + # Continue the conversation with tool outputs + response = openai_client.responses.create( + extra_body={"agent": {"name": AGENT_NAME, "type": "agent_reference"}}, + input=tool_outputs, + previous_response_id=response.id, + ) + + # Extract final text response + output_text = getattr(response, 'output_text', '') or '' + + if output_text.strip(): + print(f"\n🤖 Agent: {output_text}") + return output_text, conversation_id + + # Fallback: extract from output items + for item in getattr(response, 'output', []): + try: + if isinstance(item, dict): + item_type = item.get('type') + content = item.get('content', []) + else: + item_type = getattr(item, 'type', None) + content = getattr(item, 'content', []) + + if item_type == 'message': + for c in content: + if isinstance(c, dict): + c_type = c.get('type') + text = c.get('text', '') + else: + c_type =
getattr(c, 'type', None) + text = getattr(c, 'text', '') + + if c_type == 'output_text' and text.strip(): + print(f"\n🤖 Agent: {text}") + return text, conversation_id + except Exception: + continue + + # Debug fallback + logger.debug(f"Final response structure: {response}") + result = "[No response generated]" + print(f"\n🤖 Agent: {result}") + return result, conversation_id + + +def interactive_mode(): + """Run an interactive chat session.""" + print("\n" + "="*60) + print("🧠 Hindsight Memory Agent") + print("="*60) + print("Type 'quit' to exit, 'clear' to reset conversation") + print() + + conversation_id = None + while True: + try: + user_input = input("\nYou: ").strip() + if not user_input: + continue + if user_input.lower() == 'quit': + print("\nGoodbye! 👋") + break + if user_input.lower() == 'clear': + conversation_id = None + print("Conversation cleared.") + continue + + response, conversation_id = chat(user_input, conversation_id) + + except KeyboardInterrupt: + print("\n\nGoodbye! 👋") + break + except Exception as e: + print(f"\n❌ Error: {e}") + import traceback + traceback.print_exc() + + +if __name__ == "__main__": + import argparse + + parser = argparse.ArgumentParser(description="Hindsight Memory Agent") + parser.add_argument("--message", "-m", type=str, help="Single message to send") + parser.add_argument("--interactive", "-i", action="store_true", help="Interactive mode") + args = parser.parse_args() + + if args.message: + chat(args.message) + elif args.interactive: + interactive_mode() + else: + # Default: simple demo + chat("Hello! My name is Jacob and I love building AI agents.") diff --git a/hindsight_agent_api.py b/hindsight_agent_api.py new file mode 100644 index 00000000..0ea31316 --- /dev/null +++ b/hindsight_agent_api.py @@ -0,0 +1,331 @@ +""" +Hindsight Agent API - FastAPI wrapper for remote access. +Deploy to Azure Container Apps for uniform Azure infrastructure.
+ +Endpoints: +- POST /chat - Send a message and get a response +- POST /chat/stream - Streaming response (future) +- GET /health - Health check +""" + +import os +import sys +import json +import logging +from typing import Optional +from contextlib import asynccontextmanager + +from fastapi import FastAPI, HTTPException, Depends, Security +from fastapi.security.api_key import APIKeyHeader +from fastapi.middleware.cors import CORSMiddleware +from pydantic import BaseModel +from starlette.status import HTTP_403_FORBIDDEN + +# Avoid sys.path manipulation if possible, but keeping for now as requested to fix +# Ideally this should be handled by package installation +if os.path.dirname(os.path.abspath(__file__)) not in sys.path: + sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) + +from azure.ai.projects import AIProjectClient +from azure.identity import ManagedIdentityCredential, AzureCliCredential, ChainedTokenCredential +from config import get_config +from hindsight_client import HindsightClient + +logging.basicConfig(level=logging.INFO) +logger = logging.getLogger(__name__) + +# Global clients (initialized on startup) +_ai_client: Optional[AIProjectClient] = None +_hindsight_client: Optional[HindsightClient] = None +_settings = None + +AGENT_NAME = "Hindsight-v3" +AGENT_VERSION = "2" # Uses gpt-5.2-chat deployment + +# Security +API_KEY_NAME = "X-API-Key" +api_key_header = APIKeyHeader(name=API_KEY_NAME, auto_error=False) + +async def get_api_key(api_key_header: str = Security(api_key_header)): + """Validate API Key.""" + expected_key = os.environ.get("HINDSIGHT_API_KEY") + if not expected_key: + logger.warning("HINDSIGHT_API_KEY not set in environment - insecure mode!") + return None + + if api_key_header == expected_key: + return api_key_header + + raise HTTPException( + status_code=HTTP_403_FORBIDDEN, detail="Could not validate credentials" + ) +def get_credential(): + """Get the appropriate
credential based on environment.""" + is_azure = any([ + os.environ.get("AZURE_FUNCTIONS_ENVIRONMENT"), + os.environ.get("CONTAINER_APP_NAME"), + os.environ.get("MSI_ENDPOINT"), + os.environ.get("IDENTITY_ENDPOINT") + ]) + + if is_azure: + logger.info("Using ManagedIdentityCredential for Azure environment") + return ManagedIdentityCredential() + + logger.info("Using AzureCliCredential for local environment") + try: + cred = AzureCliCredential() + cred.get_token("https://cognitiveservices.azure.com/.default") + return cred + except Exception: + return ChainedTokenCredential( + AzureCliCredential(), + ManagedIdentityCredential() + ) + + +@asynccontextmanager +async def lifespan(app: FastAPI): + """Initialize clients on startup.""" + global _ai_client, _hindsight_client, _settings + + logger.info("Initializing Hindsight Agent API...") + _settings = get_config() + + credential = get_credential() + _ai_client = AIProjectClient( + credential=credential, + endpoint=_settings.project_endpoint + ) + _hindsight_client = HindsightClient( + _settings.mcp_base_url, + _settings.default_bank_id + ) + + logger.info(f"Connected to Foundry project: {_settings.project_endpoint}") + logger.info(f"Using agent: {AGENT_NAME}") + + yield + + # Cleanup + if _hindsight_client: + _hindsight_client.close() + logger.info("Hindsight Agent API shutdown complete") + + +app = FastAPI( + title="Hindsight Agent API", + description="Remote access to the Hindsight Memory Agent", + version="1.0.0", + lifespan=lifespan +) + +# CORS for web clients +# Restrict to specific origins in production +ALLOWED_ORIGINS = os.environ.get("ALLOWED_ORIGINS", "*").split(",") + +app.add_middleware( + CORSMiddleware, + allow_origins=ALLOWED_ORIGINS, + allow_credentials=True, + allow_methods=["GET", "POST", "OPTIONS"], + allow_headers=["*"], +) + + +# Request/Response models +class ChatRequest(BaseModel): + message: str + conversation_id: Optional[str] = None + + +class ToolCall(BaseModel): + name: str + arguments: dict 
+ result_preview: str + + +class ChatResponse(BaseModel): + response: str + conversation_id: Optional[str] = None + tool_calls: list[ToolCall] = [] + + +class HealthResponse(BaseModel): + status: str + agent: str + project_endpoint: str + + +# Helper functions +def handle_tool_call(tool_name: str, arguments: dict) -> str: + """Execute a tool call and return the result.""" + try: + if tool_name == "recall": + result = _hindsight_client.recall( + query=arguments.get("query", ""), + max_tokens=arguments.get("max_tokens", 4096), + budget=arguments.get("budget", "mid"), + ) + elif tool_name == "retain": + result = _hindsight_client.retain( + content=arguments.get("content", ""), + context=arguments.get("context", "general"), + ) + elif tool_name == "reflect": + result = _hindsight_client.reflect( + query=arguments.get("query", ""), + ) + else: + result = {"error": f"Unknown tool: {tool_name}"} + except Exception as e: + logger.error(f"Tool execution error: {e}", exc_info=True) + result = {"error": "An internal error occurred during tool execution."} + + return json.dumps(result, indent=2) + + +def process_chat(user_input: str) -> tuple[str, list[ToolCall]]: + """Process a chat message and return response with tool calls.""" + + openai_client = _ai_client.get_openai_client() + + # Create initial response using agent reference (new API: name only, no version) + response = openai_client.responses.create( + extra_body={"agent": {"name": AGENT_NAME, "type": "agent_reference"}}, + input=user_input, + ) + + def get_type(item): + if isinstance(item, dict): + return item.get('type') + return getattr(item, 'type', None) + + # Handle tool calls in a loop + max_iterations = 10 + iteration = 0 + all_tool_calls = [] + + while iteration < max_iterations: + iteration += 1 + + tool_calls = [item for item in getattr(response, 'output', []) if get_type(item) == 'function_call'] + + if not tool_calls: + break + + tool_outputs = [] + for item in tool_calls: + if isinstance(item, dict): + 
tool_name = item.get('name') + arguments = json.loads(item.get('arguments', '{}')) + call_id = item.get('call_id') + else: + tool_name = item.name + arguments = json.loads(item.arguments) if item.arguments else {} + call_id = item.call_id + + logger.info(f"Tool call: {tool_name}({json.dumps(arguments)[:80]})") + result = handle_tool_call(tool_name, arguments) + + all_tool_calls.append(ToolCall( + name=tool_name, + arguments=arguments, + result_preview=result[:200] + "..." if len(result) > 200 else result + )) + + tool_outputs.append({ + "type": "function_call_output", + "call_id": call_id, + "output": result + }) + + if not tool_outputs: + break + + response = openai_client.responses.create( + extra_body={"agent": {"name": AGENT_NAME, "type": "agent_reference"}}, + input=tool_outputs, + previous_response_id=response.id, + ) + + # Extract final text response + output_text = getattr(response, 'output_text', '') or '' + + if not output_text.strip(): + for item in getattr(response, 'output', []): + try: + item_type = getattr(item, 'type', None) + if item_type == 'message': + for c in getattr(item, 'content', []): + c_type = getattr(c, 'type', None) + if c_type == 'output_text': + output_text = getattr(c, 'text', '') + break + except Exception: + continue + + return output_text or "[No response generated]", all_tool_calls + + +# API Endpoints +@app.get("/health", response_model=HealthResponse) +async def health_check(): + """Health check endpoint.""" + return HealthResponse( + status="healthy", + agent=AGENT_NAME, + project_endpoint=_settings.project_endpoint if _settings else "not initialized" + ) + + +@app.post("/chat", response_model=ChatResponse) +async def chat(request: ChatRequest, api_key: str = Depends(get_api_key)): + """ + Send a message to the Hindsight agent. 
+ + The agent will: + - Use recall to search memories if asking about past info + - Use retain to store new facts proactively + - Use reflect for opinion/synthesis questions + """ + if not _ai_client: + raise HTTPException(status_code=503, detail="Service not initialized") + + try: + response_text, tool_calls = process_chat(request.message) + + return ChatResponse( + response=response_text, + conversation_id=request.conversation_id, + tool_calls=tool_calls + ) + except Exception as e: + logger.error(f"Chat error: {e}", exc_info=True) + # return generic error to client + raise HTTPException(status_code=500, detail="An internal server error occurred.") + + +@app.get("/") +async def root(): + """Root endpoint with API info.""" + return { + "name": "Hindsight Agent API", + "version": "1.0.0", + "endpoints": { + "chat": "POST /chat - Send a message", + "health": "GET /health - Health check" + }, + "docs": "/docs" + } + + +if __name__ == "__main__": + import uvicorn + + port = int(os.environ.get("PORT", 8080)) + host = os.environ.get("HOST", "0.0.0.0") + + uvicorn.run(app, host=host, port=port) diff --git a/hindsight_client.py b/hindsight_client.py new file mode 100644 index 00000000..a2c2e830 --- /dev/null +++ b/hindsight_client.py @@ -0,0 +1,390 @@ +""" +Optimized Hindsight Memory API Client. + +Features: +- Connection pooling via requests.Session +- Exponential backoff retry logic +- Proper timeout handling +- Singleton pattern for efficiency +- Multi-bank support +""" + +import json +import logging +import time +from functools import lru_cache +from typing import Optional, Dict, Any, List + +import requests +from requests.adapters import HTTPAdapter +from urllib3.util.retry import Retry + +logger = logging.getLogger(__name__) + + +class HindsightClientError(Exception): + """Custom exception for Hindsight API errors.""" + pass + + +class HindsightClient: + """ + High-performance client for Hindsight Memory API. + + Uses connection pooling and retry logic for optimal performance.
+ Thread-safe singleton per base_url. + """ + + _instances: Dict[str, "HindsightClient"] = {} + + def __new__(cls, base_url: str, default_bank_id: str = "hindsight_agent_bank"): + """Singleton pattern - reuse client instances per base_url.""" + if base_url not in cls._instances: + instance = super().__new__(cls) + instance._initialized = False + cls._instances[base_url] = instance + return cls._instances[base_url] + + def __init__(self, base_url: str, default_bank_id: str = "hindsight_agent_bank"): + if self._initialized: + # Update default bank if needed + self.default_bank_id = default_bank_id + return + + self.base_url = base_url.rstrip('/') + self.default_bank_id = default_bank_id + + # Create session with connection pooling + self.session = requests.Session() + + # Configure retry strategy with exponential backoff + retry_strategy = Retry( + total=3, + backoff_factor=0.5, # 0.5, 1.0, 2.0 seconds + status_forcelist=[429, 500, 502, 503, 504], + allowed_methods=["GET", "POST", "PUT", "DELETE"], + raise_on_status=False + ) + + adapter = HTTPAdapter( + max_retries=retry_strategy, + pool_connections=10, + pool_maxsize=20 + ) + + self.session.mount("http://", adapter) + self.session.mount("https://", adapter) + + # Default headers + self.session.headers.update({ + "Content-Type": "application/json", + "Accept": "application/json", + "User-Agent": "HindsightAgent/2.0" + }) + + # Default timeouts + self.default_timeout = 30 + self.reflect_timeout = 120 # Reflect operations take longer + + self._initialized = True + logger.info(f"HindsightClient initialized: {self.base_url}") + + def _get_bank_url(self, bank_id: Optional[str] = None) -> str: + """Get the base URL for a memory bank.""" + target_bank = bank_id or self.default_bank_id + return f"{self.base_url}/v1/default/banks/{target_bank}" + + def _make_request( + self, + method: str, + endpoint: str, + payload: Optional[Dict] = None, + timeout: Optional[int] = None, + bank_id: Optional[str] = None + ) -> Dict[str, 
Any]: + """ + Make an HTTP request with proper error handling. + + Args: + method: HTTP method (GET, POST, etc.) + endpoint: API endpoint (e.g., /memories/recall) + payload: JSON payload for POST requests + timeout: Request timeout in seconds + bank_id: Target memory bank + + Returns: + JSON response as dict + + Raises: + HindsightClientError: On API errors + """ + url = f"{self._get_bank_url(bank_id)}{endpoint}" + timeout = timeout or self.default_timeout + + try: + if method.upper() == "GET": + response = self.session.get(url, timeout=timeout) + elif method.upper() == "POST": + response = self.session.post(url, json=payload, timeout=timeout) + elif method.upper() == "DELETE": + response = self.session.delete(url, timeout=timeout) + else: + raise ValueError(f"Unsupported method: {method}") + + # Check for errors + if response.status_code >= 400: + error_msg = f"API error {response.status_code}: {response.text[:200]}" + logger.error(error_msg) + return {"error": error_msg, "status_code": response.status_code} + + return response.json() + + except requests.exceptions.Timeout: + error_msg = f"Request timed out after {timeout}s: {url}" + logger.error(error_msg) + return {"error": error_msg, "timeout": True} + + except requests.exceptions.ConnectionError as e: + error_msg = f"Connection error: {str(e)}" + logger.error(error_msg) + return {"error": error_msg, "connection_error": True} + + except json.JSONDecodeError: + error_msg = "Invalid JSON response from API" + logger.error(error_msg) + return {"error": error_msg, "response_text": response.text[:200]} + + except Exception as e: + # Don't leak internal details + logger.exception(f"Unexpected error calling {endpoint}") + return {"error": "An unexpected error occurred."} + + def recall( + self, + query: str, + max_tokens: int = 4096, + budget: str = "mid", + bank_id: Optional[str] = None + ) -> Dict[str, Any]: + """ + Search and retrieve memories from long-term storage. 
+ + Args: + query: Natural language search query + max_tokens: Maximum tokens to return + budget: Search budget (low/mid/high) - affects thoroughness + bank_id: Specific memory bank to query + + Returns: + Dict with 'results' list of matching memories + """ + payload = { + "query": query, + "max_tokens": max_tokens, + "budget": budget + } + + result = self._make_request( + "POST", + "/memories/recall", + payload=payload, + bank_id=bank_id + ) + + # Ensure results key exists + if "error" not in result and "results" not in result: + result["results"] = [] + + return result + + def retain( + self, + content: str, + context: str = "general", + bank_id: Optional[str] = None + ) -> Dict[str, Any]: + """ + Store new information to long-term memory. + + Args: + content: The fact/memory to store + context: Category for the memory + bank_id: Specific memory bank to store in + + Returns: + Dict with storage confirmation + """ + payload = { + "items": [ + {"content": content, "context": context} + ] + } + + return self._make_request( + "POST", + "/memories", + payload=payload, + bank_id=bank_id + ) + + def retain_batch( + self, + items: List[Dict[str, str]], + bank_id: Optional[str] = None + ) -> Dict[str, Any]: + """ + Store multiple memories in a single request. + + Args: + items: List of dicts with 'content' and optional 'context' + bank_id: Specific memory bank to store in + + Returns: + Dict with storage confirmation + """ + # Ensure each item has context + normalized_items = [ + {"content": item["content"], "context": item.get("context", "general")} + for item in items + ] + + payload = {"items": normalized_items} + + return self._make_request( + "POST", + "/memories", + payload=payload, + bank_id=bank_id + ) + + def reflect( + self, + query: str, + budget: str = "low", + bank_id: Optional[str] = None + ) -> Dict[str, Any]: + """ + Synthesize memories to form opinions or answer complex questions. 
+ + Args: + query: Topic to reflect on + budget: Reflection budget (affects depth) + bank_id: Specific memory bank to use + + Returns: + Dict with reflection result + """ + payload = { + "query": query, + "budget": budget + } + + return self._make_request( + "POST", + "/memories/reflect", + payload=payload, + timeout=self.reflect_timeout, + bank_id=bank_id + ) + + def health_check(self) -> bool: + """ + Check if the Hindsight API is reachable. + + Returns: + True if API is healthy, False otherwise + """ + try: + # 1. Simple reachability check + response = self.session.get( + f"{self.base_url}/health", + timeout=5 + ) + if response.status_code != 200: + logger.warning(f"Health check failed with status {response.status_code}") + return False + + # 2. Verify basic auth/connectivity if possible (optional, depending on API) + # For Hindsight, listing banks is a good lightweight check + # response = self.session.get(f"{self.base_url}/v1/default/banks", timeout=5) + # return response.status_code == 200 + + return True + except Exception: + return False + + def list_banks(self) -> Dict[str, Any]: + """ + List all available memory banks. + + Returns: + Dict with list of banks + """ + try: + response = self.session.get( + f"{self.base_url}/v1/default/banks", + timeout=self.default_timeout + ) + response.raise_for_status() + return response.json() + except Exception as e: + return {"error": str(e), "banks": []} + + def close(self): + """Close the HTTP session.""" + self.session.close() + # Remove from instances cache + if self.base_url in self._instances: + del self._instances[self.base_url] + + def __enter__(self): + return self + + def __exit__(self, exc_type, exc_val, exc_tb): + self.close() + + +@lru_cache(maxsize=1) +def get_default_client() -> HindsightClient: + """ + Get a cached default HindsightClient instance. + + Uses configuration from config.py. 
+ """ + from config import get_config + settings = get_config() + return HindsightClient( + base_url=settings.mcp_base_url, + default_bank_id=settings.default_bank_id + ) + + +# Memory bank constants for type safety +class MemoryBanks: + """Available memory banks.""" + USER_PREFERENCES = "user_preferences" + PROJECT_CONTEXT = "project_context" + KNOWLEDGE_BASE = "knowledge_base" + HINDSIGHT_AGENT = "hindsight_agent_bank" + + +if __name__ == "__main__": + # Test the client + from config import get_config + + settings = get_config() + client = HindsightClient(settings.mcp_base_url, settings.default_bank_id) + + print(f"Testing HindsightClient...") + print(f"Base URL: {client.base_url}") + print(f"Default Bank: {client.default_bank_id}") + + # Health check + if client.health_check(): + print("✓ API is healthy") + else: + print("✗ API health check failed") + + # Test recall + print("\nTesting recall...") + result = client.recall("test query", max_tokens=100) + print(f"Recall result: {json.dumps(result, indent=2)[:200]}") diff --git a/infra/agent-api.bicep b/infra/agent-api.bicep new file mode 100644 index 00000000..aa4dea61 --- /dev/null +++ b/infra/agent-api.bicep @@ -0,0 +1,192 @@ +// Hindsight Agent API - Azure Container Apps Infrastructure +// Bicep template for deploying the agent API alongside existing hindsight-api + +@description('Location for all resources') +param location string = 'centralus' + +@description('Resource group containing the AI Project') +param aiProjectResourceGroup string = 'jacob-1216-resource' + +@description('Name of the AI resource for RBAC') +param aiResourceName string = 'jacob-1216-resource' + +@description('Container image tag') +param imageTag string = 'latest' + +// Naming +var containerAppName = 'hindsight-agent-api' +var containerAppEnvName = 'hindsight-env' +var acrName = 'hindsightacr${uniqueString(resourceGroup().id)}' +var logAnalyticsName = 'hindsight-logs' + +// Configuration +param hindsightApiUrl
string = 'https://hindsight-api.politebay-1635b4f9.centralus.azurecontainerapps.io' +param projectEndpoint string = 'https://jacob-1216-resource.services.ai.azure.com/api/projects/jacob-1216' +param aiResourceId string = '/subscriptions/${subscription().subscriptionId}/resourceGroups/${aiProjectResourceGroup}/providers/Microsoft.CognitiveServices/accounts/${aiResourceName}' +param allowedOrigins array = [] + +var defaultBankId = 'hindsight_agent_bank' + +// Log Analytics Workspace +resource logAnalytics 'Microsoft.OperationalInsights/workspaces@2022-10-01' = { + name: logAnalyticsName + location: location + properties: { + sku: { + name: 'PerGB2018' + } + retentionInDays: 90 + } +} + +// Container Registry +resource acr 'Microsoft.ContainerRegistry/registries@2023-07-01' = { + name: acrName + location: location + sku: { + name: 'Basic' + } + properties: { + adminUserEnabled: false + } +} + +// Container Apps Environment +resource containerAppEnv 'Microsoft.App/managedEnvironments@2023-05-01' = { + name: containerAppEnvName + location: location + properties: { + appLogsConfiguration: { + destination: 'log-analytics' + logAnalyticsConfiguration: { + customerId: logAnalytics.properties.customerId + sharedKey: logAnalytics.listKeys().primarySharedKey + } + } + } +} + +// Container App - Hindsight Agent API +resource containerApp 'Microsoft.App/containerApps@2023-05-01' = { + name: containerAppName + location: location + identity: { + type: 'SystemAssigned' + } + properties: { + managedEnvironmentId: containerAppEnv.id + configuration: { + ingress: { + external: true + targetPort: 8080 + transport: 'http' + corsPolicy: { + allowedOrigins: empty(allowedOrigins) ? 
['*'] : allowedOrigins + allowedMethods: ['GET', 'POST', 'OPTIONS'] + allowedHeaders: ['*'] + allowCredentials: true + } + } + registries: [ + { + server: acr.properties.loginServer + identity: containerApp.identity.principalId + } + ] + } + template: { + containers: [ + { + name: containerAppName + image: '${acr.properties.loginServer}/${containerAppName}:${imageTag}' + resources: { + cpu: json('0.5') + memory: '1Gi' + } + env: [ + { + name: 'HINDSIGHT_PROJECT_ENDPOINT' + value: projectEndpoint + } + { + name: 'HINDSIGHT_MCP_BASE_URL' + value: hindsightApiUrl + } + { + name: 'HINDSIGHT_DEFAULT_BANK_ID' + value: defaultBankId + } + { + name: 'PORT' + value: '8080' + } + ] + probes: [ + { + type: 'Liveness' + httpGet: { + path: '/health' + port: 8080 + } + initialDelaySeconds: 10 + periodSeconds: 30 + } + { + type: 'Readiness' + httpGet: { + path: '/health' + port: 8080 + } + initialDelaySeconds: 5 + periodSeconds: 10 + } + ] + } + ] + scale: { + minReplicas: 1 + maxReplicas: 3 + rules: [ + { + name: 'http-scaling' + http: { + metadata: { + concurrentRequests: '10' + } + } + } + ] + } + } + } +} + +// Role Assignment - Cognitive Services User on AI Project +// This requires the AI resource to be in the same subscription +resource cognitiveServicesRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = { + name: guid(containerApp.id, 'CognitiveServicesUser', aiResourceName) + scope: resource(aiResourceId) + properties: { + roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'a97b65f3-24c7-4388-baec-2e87135dc908') // Cognitive Services User + principalId: containerApp.identity.principalId + principalType: 'ServicePrincipal' + } +} + +// Role Assignment - AcrPull for Container App +resource acrPullRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = { + name: guid(containerApp.id, 'AcrPull', acr.id) + scope: acr + properties: { + roleDefinitionId: 
subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '7f951dda-4ed3-4680-a7ca-43fe172d538d') // AcrPull + principalId: containerApp.identity.principalId + principalType: 'ServicePrincipal' + } +} + +// Outputs +output containerAppUrl string = 'https://${containerApp.properties.configuration.ingress.fqdn}' +output containerAppName string = containerApp.name +output acrLoginServer string = acr.properties.loginServer +output principalId string = containerApp.identity.principalId diff --git a/requirements-agent-api.txt b/requirements-agent-api.txt new file mode 100644 index 00000000..93c254a3 --- /dev/null +++ b/requirements-agent-api.txt @@ -0,0 +1,13 @@ +# Hindsight Agent API Dependencies +fastapi>=0.109.0 +uvicorn[standard]>=0.27.0 +pydantic>=2.0.0 + +# Azure SDKs +azure-ai-projects>=1.0.0b1 +azure-identity>=1.15.0 +openai>=1.0.0 + +# HTTP client with connection pooling +requests>=2.31.0 +urllib3>=2.0.0
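Reviewer note: the `/chat` endpoint added in `hindsight_agent_api.py` accepts a JSON body with a `message` field and checks the `X-API-Key` header. A minimal sketch of how a caller might build that request and read the `ChatResponse` payload is shown below. The helper names `build_chat_request` and `parse_chat_response` are hypothetical and not part of this change. The sketch uses only the standard library and builds the request without sending it.

```python
import json
import urllib.request

# Header name taken from API_KEY_NAME in hindsight_agent_api.py.
API_KEY_HEADER = "X-API-Key"


def build_chat_request(base_url: str, api_key: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST /chat request. Hypothetical helper."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/chat",
        data=body,
        headers={"Content-Type": "application/json", API_KEY_HEADER: api_key},
        method="POST",
    )


def parse_chat_response(raw: bytes) -> tuple[str, list[str]]:
    """Return the reply text and the names of tools the agent called.

    Assumes the payload follows the ChatResponse model from this change
    (fields: response, conversation_id, tool_calls).
    """
    payload = json.loads(raw)
    tool_names = [call["name"] for call in payload.get("tool_calls", [])]
    return payload["response"], tool_names
```

To send the request, pass the result of `build_chat_request` to `urllib.request.urlopen` and feed the response bytes to `parse_chat_response`. The base URL and key are deployment-specific and are left to the caller.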