2 changes: 1 addition & 1 deletion README.md

@@ -81,7 +81,7 @@
docker run --rm -it --pull always -p 8888:8888 -p 9999:9999 \
ghcr.io/vectorize-io/hindsight:latest
```

You can modify the LLM provider by setting `HINDSIGHT_API_LLM_PROVIDER`. Valid options are `gemini`, `groq`, `ollama`, and `openai`. The documentation provides more details on [supported models](https://hindsight.vectorize.io/developer/models).
You can modify the LLM provider by setting `HINDSIGHT_API_LLM_PROVIDER`. Valid options are `openai`, `anthropic`, `gemini`, `groq`, `ollama`, and `lmstudio`. The documentation provides more details on [supported models](https://hindsight.vectorize.io/developer/models).
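For example, a hedged sketch of switching the container to Anthropic (the image, ports, and variable names come from this diff; the key and model values are placeholders):

```bash
# Illustrative only — substitute a real Anthropic key and model
docker run --rm -it --pull always -p 8888:8888 -p 9999:9999 \
  -e HINDSIGHT_API_LLM_PROVIDER=anthropic \
  -e HINDSIGHT_API_LLM_API_KEY=sk-ant-xxxxxxxxxxxx \
  -e HINDSIGHT_API_LLM_MODEL=claude-sonnet-4-20250514 \
  ghcr.io/vectorize-io/hindsight:latest
```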

API: http://localhost:8888
UI: http://localhost:9999
2 changes: 1 addition & 1 deletion hindsight-api/README.md

@@ -80,7 +80,7 @@
Configure via environment variables:
| Variable | Description | Default |
|----------|-------------|---------|
| `HINDSIGHT_API_DATABASE_URL` | PostgreSQL connection string | `pg0` (embedded) |
| `HINDSIGHT_API_LLM_PROVIDER` | `openai`, `groq`, `gemini`, `ollama` | `openai` |
| `HINDSIGHT_API_LLM_PROVIDER` | `openai`, `anthropic`, `gemini`, `groq`, `ollama`, `lmstudio` | `openai` |
| `HINDSIGHT_API_LLM_API_KEY` | API key for LLM provider | - |
| `HINDSIGHT_API_LLM_MODEL` | Model name | `gpt-4o-mini` |
| `HINDSIGHT_API_HOST` | Server bind address | `0.0.0.0` |
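As a sketch, a minimal non-default configuration using the variables above might look like this (values are illustrative placeholders):

```bash
# Illustrative values — variable names from the table above
export HINDSIGHT_API_LLM_PROVIDER=openai
export HINDSIGHT_API_LLM_API_KEY=sk-xxxxxxxxxxxx
export HINDSIGHT_API_LLM_MODEL=gpt-4o-mini
export HINDSIGHT_API_PORT=8888
```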
54 changes: 51 additions & 3 deletions hindsight-docs/docs/developer/configuration.md

@@ -27,10 +27,12 @@
If not provided, the server uses embedded `pg0` — convenient for development but…

| Variable | Description | Default |
|----------|-------------|---------|
| `HINDSIGHT_API_LLM_PROVIDER` | Provider: `groq`, `openai`, `gemini`, `ollama` | `openai` |
| `HINDSIGHT_API_LLM_PROVIDER` | Provider: `openai`, `anthropic`, `gemini`, `groq`, `ollama`, `lmstudio` | `openai` |
| `HINDSIGHT_API_LLM_API_KEY` | API key for LLM provider | - |
| `HINDSIGHT_API_LLM_MODEL` | Model name | `gpt-5-mini` |
| `HINDSIGHT_API_LLM_BASE_URL` | Custom LLM endpoint | Provider default |
| `HINDSIGHT_API_LLM_MAX_CONCURRENT` | Max concurrent LLM requests | `32` |
| `HINDSIGHT_API_LLM_TIMEOUT` | LLM request timeout in seconds | `120` |
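For example, a slow local model might warrant lower concurrency and a longer timeout (values illustrative):

```bash
export HINDSIGHT_API_LLM_MAX_CONCURRENT=8   # default: 32
export HINDSIGHT_API_LLM_TIMEOUT=300        # seconds; default: 120
```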

**Provider Examples**

@@ -50,10 +52,20 @@
export HINDSIGHT_API_LLM_PROVIDER=gemini
export HINDSIGHT_API_LLM_API_KEY=xxxxxxxxxxxx
export HINDSIGHT_API_LLM_MODEL=gemini-2.0-flash

# Anthropic
export HINDSIGHT_API_LLM_PROVIDER=anthropic
export HINDSIGHT_API_LLM_API_KEY=sk-ant-xxxxxxxxxxxx
export HINDSIGHT_API_LLM_MODEL=claude-sonnet-4-20250514

# Ollama (local, no API key)
export HINDSIGHT_API_LLM_PROVIDER=ollama
export HINDSIGHT_API_LLM_BASE_URL=http://localhost:11434/v1
export HINDSIGHT_API_LLM_MODEL=gpt-oss-20b
export HINDSIGHT_API_LLM_MODEL=llama3

# LM Studio (local, no API key)
export HINDSIGHT_API_LLM_PROVIDER=lmstudio
export HINDSIGHT_API_LLM_BASE_URL=http://localhost:1234/v1
export HINDSIGHT_API_LLM_MODEL=your-local-model

# OpenAI-compatible endpoint
export HINDSIGHT_API_LLM_PROVIDER=openai
@@ -109,7 +121,43 @@
export HINDSIGHT_API_RERANKER_TEI_URL=http://localhost:8081
| `HINDSIGHT_API_HOST` | Bind address | `0.0.0.0` |
| `HINDSIGHT_API_PORT` | Server port | `8888` |
| `HINDSIGHT_API_LOG_LEVEL` | Log level: `debug`, `info`, `warning`, `error` | `info` |
| `HINDSIGHT_API_MCP_ENABLED` | Enable MCP server | `true` |
| `HINDSIGHT_API_MCP_ENABLED` | Enable MCP server at `/mcp/{bank_id}/` | `true` |

### Retrieval

| Variable | Description | Default |
|----------|-------------|---------|
| `HINDSIGHT_API_GRAPH_RETRIEVER` | Graph retrieval algorithm: `bfs` or `mpfp` | `bfs` |
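For example, switching to the alternative algorithm is a one-line change (`mpfp` is taken from the table above):

```bash
export HINDSIGHT_API_GRAPH_RETRIEVER=mpfp
```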

### Entity Observations

Controls when the system generates entity observations (summaries about entities mentioned in retained content).

| Variable | Description | Default |
|----------|-------------|---------|
| `HINDSIGHT_API_OBSERVATION_MIN_FACTS` | Minimum facts about an entity before generating observations | `5` |
| `HINDSIGHT_API_OBSERVATION_TOP_ENTITIES` | Max entities to process per retain batch | `5` |
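As an illustrative sketch, observation generation can be made more eager by lowering the fact threshold and widening the per-batch entity limit:

```bash
# Illustrative values — defaults are 5 and 5
export HINDSIGHT_API_OBSERVATION_MIN_FACTS=3
export HINDSIGHT_API_OBSERVATION_TOP_ENTITIES=10
```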

### Local MCP Server

Configuration for the local MCP server (`hindsight-local-mcp` command).

| Variable | Description | Default |
|----------|-------------|---------|
| `HINDSIGHT_API_MCP_LOCAL_BANK_ID` | Memory bank ID for local MCP | `mcp` |
| `HINDSIGHT_API_MCP_INSTRUCTIONS` | Additional instructions appended to retain/recall tool descriptions | - |

```bash
# Example: instruct MCP to also store assistant actions
export HINDSIGHT_API_MCP_INSTRUCTIONS="Also store every action you take, including tool calls and decisions made."
```
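Likewise, the local server can be pointed at a dedicated bank (sketch; the bank ID is a placeholder):

```bash
export HINDSIGHT_API_MCP_LOCAL_BANK_ID=personal-notes
hindsight-local-mcp
```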

### Performance Optimization

| Variable | Description | Default |
|----------|-------------|---------|
| `HINDSIGHT_API_SKIP_LLM_VERIFICATION` | Skip LLM connection check on startup | `false` |
| `HINDSIGHT_API_LAZY_RERANKER` | Lazy-load reranker model (faster startup) | `false` |
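For example, both flags can be enabled together to minimize cold-start time in development, at the cost of deferring any connection or model-loading failures to first use:

```bash
export HINDSIGHT_API_SKIP_LLM_VERIFICATION=true
export HINDSIGHT_API_LAZY_RERANKER=true
```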

### Programmatic Configuration

16 changes: 14 additions & 2 deletions hindsight-docs/docs/developer/models.md

@@ -18,7 +18,7 @@
All local models (embedding, cross-encoder) are automatically downloaded from Hugging Face…

Used for fact extraction, entity resolution, opinion generation, and answer synthesis.

**Supported providers:** OpenAI, Gemini, Groq, Ollama, and **any OpenAI-compatible API**
**Supported providers:** OpenAI, Anthropic, Gemini, Groq, Ollama, LM Studio, and **any OpenAI-compatible API**

:::tip OpenAI-Compatible Providers
Hindsight works with any provider that exposes an OpenAI-compatible API (e.g., Azure OpenAI). Simply set `HINDSIGHT_API_LLM_PROVIDER=openai` and configure `HINDSIGHT_API_LLM_BASE_URL` to point to your provider's endpoint.
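A minimal sketch, assuming a hypothetical endpoint URL and model name:

```bash
# Any OpenAI-compatible API: keep provider=openai, point the base URL at your endpoint
export HINDSIGHT_API_LLM_PROVIDER=openai
export HINDSIGHT_API_LLM_BASE_URL=https://your-provider.example.com/v1
export HINDSIGHT_API_LLM_API_KEY=xxxxxxxxxxxx
export HINDSIGHT_API_LLM_MODEL=your-model-name
```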
@@ -39,6 +39,8 @@
The following models have been tested and verified to work correctly with Hindsight:
| **OpenAI** | `gpt-4.1-mini` |
| **OpenAI** | `gpt-4.1-nano` |
| **OpenAI** | `gpt-4o-mini` |
| **Anthropic** | `claude-sonnet-4-20250514` |
| **Anthropic** | `claude-3-5-sonnet-20241022` |
| **Gemini** | `gemini-3-pro-preview` |
| **Gemini** | `gemini-2.5-flash` |
| **Gemini** | `gemini-2.5-flash-lite` |
@@ -67,10 +69,20 @@
export HINDSIGHT_API_LLM_PROVIDER=gemini
export HINDSIGHT_API_LLM_API_KEY=xxxxxxxxxxxx
export HINDSIGHT_API_LLM_MODEL=gemini-2.0-flash

# Anthropic
export HINDSIGHT_API_LLM_PROVIDER=anthropic
export HINDSIGHT_API_LLM_API_KEY=sk-ant-xxxxxxxxxxxx
export HINDSIGHT_API_LLM_MODEL=claude-sonnet-4-20250514

# Ollama (local)
export HINDSIGHT_API_LLM_PROVIDER=ollama
export HINDSIGHT_API_LLM_BASE_URL=http://localhost:11434/v1
export HINDSIGHT_API_LLM_MODEL=gpt-oss-20b
export HINDSIGHT_API_LLM_MODEL=llama3

# LM Studio (local)
export HINDSIGHT_API_LLM_PROVIDER=lmstudio
export HINDSIGHT_API_LLM_BASE_URL=http://localhost:1234/v1
export HINDSIGHT_API_LLM_MODEL=your-local-model
```

**Note:** The LLM is the primary bottleneck for retain operations. See [Performance](./performance) for optimization strategies.