Feature: Configurable Embedding Infrastructure — Local (fastembed) + API (OpenAI) with Config Flag #675

@teknium1

Description

Overview

Add configurable embedding infrastructure to Hermes Agent, supporting both local models (fastembed) and API-based embedders (OpenAI). This is a shared capability needed by multiple features: cognitive memory recall (#509), semantic codebase search (#489), and future similarity-based operations.

Parent tracking issue: #509
Also enables: #489 (Semantic Codebase Search)


What to Build

Embedding Module

New file: agent/embeddings.py

from typing import Protocol

class Embedder(Protocol):
    def embed_text(self, text: str) -> list[float]: ...
    def embed_texts(self, texts: list[str]) -> list[list[float]]: ...
    @property
    def dimensions(self) -> int: ...

class FastEmbedEmbedder:
    """Local embeddings via fastembed (all-MiniLM-L6-v2, 384 dims).
    ~100MB model, downloaded on first use.
    No API key needed, private, fast (~5ms per embed).
    """

class OpenAIEmbedder:
    """API embeddings via OpenAI (text-embedding-3-small, 1536 dims).
    Uses existing OpenAI client from config.
    Higher quality but costs $0.02/1M tokens.
    """

def get_embedder(config: dict) -> Embedder:
    """Factory: returns configured embedder based on config.yaml."""
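Filled in, the skeleton above might look like the following sketch. The class and config names follow the spec; the lazy-load details and the use of fastembed's `TextEmbedding` API are assumptions, and the OpenAI branch is elided as a placeholder:

```python
from typing import Protocol


class Embedder(Protocol):
    def embed_text(self, text: str) -> list[float]: ...
    def embed_texts(self, texts: list[str]) -> list[list[float]]: ...
    @property
    def dimensions(self) -> int: ...


class FastEmbedEmbedder:
    """Local embeddings via fastembed; the model is loaded on first embed call."""

    def __init__(self, model: str = "all-MiniLM-L6-v2") -> None:
        self._model_name = model
        self._model = None  # lazy: nothing is downloaded or loaded at startup

    def _ensure_model(self):
        if self._model is None:
            try:
                from fastembed import TextEmbedding  # optional dependency
            except ImportError as exc:
                raise RuntimeError(
                    "embeddings.provider is 'local' but fastembed is not "
                    "installed; run: pip install hermes-agent[embeddings]"
                ) from exc
            # Model identifier format is an assumption about fastembed's registry.
            self._model = TextEmbedding(
                model_name=f"sentence-transformers/{self._model_name}"
            )
        return self._model

    @property
    def dimensions(self) -> int:
        return 384  # all-MiniLM-L6-v2 output size

    def embed_texts(self, texts: list[str]) -> list[list[float]]:
        return [list(map(float, vec)) for vec in self._ensure_model().embed(texts)]

    def embed_text(self, text: str) -> list[float]:
        return self.embed_texts([text])[0]


def get_embedder(config: dict) -> Embedder:
    """Factory: pick an embedder from the `embeddings` section of config.yaml."""
    cfg = config.get("embeddings", {})
    provider = cfg.get("provider", "local")
    if provider == "local":
        return FastEmbedEmbedder(cfg.get("model", "all-MiniLM-L6-v2"))
    if provider == "openai":
        # OpenAIEmbedder would be built here from the shared OpenAI client;
        # omitted from this sketch.
        raise NotImplementedError("OpenAIEmbedder is omitted from this sketch")
    raise ValueError(f"unknown embeddings provider: {provider!r}")
```

Because loading is deferred, `get_embedder()` stays cheap at startup even when fastembed is not installed; the helpful `RuntimeError` only fires on first use.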

Utility Functions

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Compute cosine similarity between two vectors."""

def cosine_similarity_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """NxN pairwise similarity matrix for dedup."""
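The two utilities have a straightforward pure-Python implementation (a NumPy version may be preferable at scale; the zero-vector convention below is an assumption):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either vector is all-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_similarity_matrix(vectors: list[list[float]]) -> list[list[float]]:
    """NxN symmetric matrix of pairwise similarities, e.g. for dedup."""
    n = len(vectors)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        out[i][i] = 1.0 if any(vectors[i]) else 0.0
        for j in range(i + 1, n):
            s = cosine_similarity(vectors[i], vectors[j])
            out[i][j] = out[j][i] = s  # matrix is symmetric
    return out
```

The matrix helper computes each pair once and mirrors it, so dedup over N embeddings costs N(N-1)/2 similarity calls.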

Configuration

# ~/.hermes/config.yaml
embeddings:
  provider: "local"          # "local" or "openai"
  model: "all-MiniLM-L6-v2"  # for local
  # model: "text-embedding-3-small"  # for openai

Key Design Decisions

  • Lazy initialization — Model loaded on first embed call, not at startup
  • Batch support — embed_texts() for efficiency (single API call / single model forward pass)
  • Cosine similarity helper — Utility function for comparing embeddings
  • Dimension-aware — Embedder reports its dimension count so storage can auto-configure
  • Optional dependency — fastembed only required when provider: local; graceful error otherwise

Dependencies

  • fastembed (optional) — lightweight, Apache 2.0, ~5MB package + ~100MB model on first use
  • openai (already a dependency) — for API embeddings
  • Add fastembed as optional: pip install hermes-agent[embeddings]
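The optional-dependency wiring in pyproject.toml could look roughly like this (the version pin is an assumption):

```toml
[project.optional-dependencies]
embeddings = ["fastembed>=0.3"]
```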

Files to Create/Change

  • agent/embeddings.py (new) — Embedder protocol, FastEmbed + OpenAI implementations, factory
  • pyproject.toml — Add fastembed as optional dependency
  • tests/test_embeddings.py (new) — Unit tests with mocked embedders

Acceptance Criteria

  • get_embedder() returns correct embedder based on config
  • Local embedder works without API key (fastembed)
  • OpenAI embedder works with existing OpenAI config
  • Batch embedding (embed_texts) works for both providers
  • Graceful error if fastembed not installed but local provider configured
  • Cosine similarity utility function included and tested
  • No startup-time impact (lazy loading)
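The mocked-embedder tests in tests/test_embeddings.py could start from a small fake that satisfies the Embedder protocol structurally, so similarity and batch logic can be exercised without downloading a model or calling an API. All names here are illustrative:

```python
class FakeEmbedder:
    """Deterministic 3-dim embedder for tests; satisfies Embedder by shape."""

    dimensions = 3

    def embed_text(self, text: str) -> list[float]:
        # Toy deterministic features: length, vowel count, space count.
        return [
            float(len(text)),
            float(sum(c in "aeiou" for c in text.lower())),
            float(text.count(" ")),
        ]

    def embed_texts(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_text(t) for t in texts]


def test_batch_matches_single():
    emb = FakeEmbedder()
    texts = ["hello world", "hermes agent"]
    assert emb.embed_texts(texts) == [emb.embed_text(t) for t in texts]
    assert all(len(v) == emb.dimensions for v in emb.embed_texts(texts))
```

Because Embedder is a Protocol, the fake needs no inheritance; anything with the right methods and a `dimensions` attribute plugs in.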

Metadata

Labels: enhancement (New feature or request)