feat(ai): Librarian Agent — tool-using catalog discovery (AI-Agent-3) by mrviduus · Pull Request #394 · mrviduus/textstack

mrviduus · 2026-06-23T23:48:17Z

Agent 3 of 3 (architect → backend → adversarial QA → fix cycle). Design: docs/04-dev/agents-roadmap.md §4.

Why

Catalog discovery today is keyword search. The Librarian turns a natural-language request — "books like 1984 about surveillance, in English, under 300 pages" — into ranked, reasoned recommendations, reaching outside the library only when it's thin.

What

ReAct loop on the existing AgentLoop (caps MaxSteps=6/CostCapUsd=0.04, persisted transcript). No new framework.
Tools wrap existing search: search_library (FTS provider) + search_library_semantic (AI-057 hybrid = the "books like X" mechanism, no new index) + reused Open Library tools for external discovery.
Constraints (language, "under N pages") are deterministic post-filters over tool metadata (catalog has no page column → approxPages ≈ wordCount/275).
Anti-hallucination (the headline, QA-verified solid): every recommendation must come from a tool_result; library recs are fully re-projected from the retrieved row (model can't rename/fabricate a slug/title/author); external recs are re-projected from the harvested Open Library row (authors/year/pages — not the model's echo); symmetric normalized title matching; empty transcript → zero recs.
Recommend-only — external hits are marked suggestions, no ingest (copyright/scope; HITL ingest deferred).
Endpoint POST /me/librarian (auth, rate-limited 8/min, ≥2-char guard before any model call) → { recommendations[{source, editionId?, title, authors, why, …}], reasoning, usedExternal, runId }.
Telemetry: agent_run (agent=librarian, tool_calls_count). Route librarian.agent → gpt-4.1-mini. No migration.
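
The constraint handling described above (language and "under N pages" as deterministic post-filters, with page count estimated from word count) can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Candidate` shape, the function names, and the choice to round the estimate up are assumptions; only the 275 words-per-page divisor comes from the description.

```python
from dataclasses import dataclass
from typing import List, Optional

WORDS_PER_PAGE = 275  # divisor quoted in the PR description


@dataclass
class Candidate:
    # Hypothetical shape of a row returned by a search tool.
    slug: str
    title: str
    language: str
    word_count: int


def approx_pages(word_count: int) -> int:
    """Estimate pages from word count, rounding up (rounding mode is an assumption)."""
    return -(-word_count // WORDS_PER_PAGE)  # ceiling division on integers


def post_filter(
    candidates: List[Candidate],
    language: Optional[str] = None,
    max_pages: Optional[int] = None,
) -> List[Candidate]:
    """Keep candidates that match the language and are strictly under max_pages."""
    kept = []
    for c in candidates:
        if language is not None and c.language.lower() != language.lower():
            continue
        if max_pages is not None and approx_pages(c.word_count) >= max_pages:
            continue
        kept.append(c)
    return kept
```

Because the filter runs over tool metadata rather than model output, the same request always yields the same pass/fail decision for a given row, regardless of how the model phrases its answer.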

Eval

LibrarianEvalRunner (10 goldens): recall@k + precision@k + F1 (precision/F1 added in the fix round so an agent that floods the 8-cap with irrelevant-but-real books no longer scores green) + hallucination-free rate (DB slug-existence probe). Admin-runnable: POST /admin/ai-quality/librarian/eval.
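
To see why recall@k alone is gameable by flooding, here is a small sketch using the textbook definitions of precision@k, recall@k, and F1. It is not the LibrarianEvalRunner implementation; the function name, the slugs, and the convention of dividing precision by the number of items actually returned are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def precision_recall_f1_at_k(
    recommended: List[str], relevant: Iterable[str], k: int
) -> Tuple[float, float, float]:
    """Score the top-k recommended slugs against a golden set of relevant slugs."""
    top_k = list(dict.fromkeys(recommended))[:k]  # dedupe, keep original order
    relevant_set = set(relevant)
    hits = sum(1 for slug in top_k if slug in relevant_set)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Hypothetical golden set and two agent outputs.
golden = {"brave-new-world", "we"}
focused = ["brave-new-world", "we"]
flooded = focused + ["real-book-1", "real-book-2", "real-book-3",
                     "real-book-4", "real-book-5", "real-book-6"]

focused_scores = precision_recall_f1_at_k(focused, golden, k=8)
flooded_scores = precision_recall_f1_at_k(flooded, golden, k=8)
```

Both outputs reach full recall, but the flooded one fills the 8-slot cap with six irrelevant titles, so its precision and F1 drop sharply. That gap is what the added metrics are meant to surface.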

QA (adversarial) — applied

0 blockers; the runtime anti-hallucination invariant held under attack. Fixed: eval gameable by flooding (added precision/F1); external authors/year/pages were model-trusted (now re-projected from the OL row); endpoint missing min-query guard; brittle/asymmetric external-title matching (now symmetric normalized key).
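
The external re-projection and symmetric title matching mentioned above could look something like the sketch below. It assumes dictionary-shaped rows and a particular normalization (accent stripping, case folding, punctuation removal); the real field names and normalization rules are not given in this PR description, so every name here is hypothetical and the sample row is illustrative data only.

```python
import re
import unicodedata
from typing import Dict, List


def title_key(title: str) -> str:
    """Normalize a title so the same rule applies to both sides of the match."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.findall(r"\w+", no_marks.casefold()))


def reproject_external(model_recs: List[Dict], harvested_rows: List[Dict]) -> List[Dict]:
    """Rebuild each recommendation from the harvested row; drop anything unmatched."""
    by_key = {}
    for row in harvested_rows:
        key = title_key(row["title"])
        if key:
            by_key[key] = row
    grounded = []
    for rec in model_recs:
        row = by_key.get(title_key(rec.get("title", "")))
        if row is None:
            continue  # no tool result backs this title, so it is discarded
        grounded.append({
            "source": "external",
            "title": row["title"],      # taken from the row, not the model
            "authors": row["authors"],  # taken from the row, not the model
            "year": row.get("year"),
            "pages": row.get("pages"),
            "why": rec.get("why", ""),  # only the rationale comes from the model
        })
    return grounded


# Illustrative data: one harvested row, one matching rec, one fabricated rec.
harvested = [{"title": "Brave New World", "authors": ["Aldous Huxley"],
              "year": 1932, "pages": 288}]
model_output = [
    {"title": "  brave new world! ", "authors": ["Someone Else"], "year": 1999,
     "why": "dystopian control through pleasure"},
    {"title": "An Invented Sequel", "authors": ["Nobody"], "why": "made up"},
]
result = reproject_external(model_output, harvested)
```

With this shape, a recommendation whose title does not match any harvested row never reaches the response, and an empty set of harvested rows produces an empty list.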

Verify

dotnet build green · dotnet format clean · 947 unit + 65 AiEvals tests green (incl. flood-resistance, external re-projection, no-search→empty, keyless-semantic→FTS fallback). Needs a real-model run for live recall/precision on gpt-4.1-mini against a seeded catalog (admin endpoint + key).

Deferred

Ingest/HITL confirmation, SSE streaming, a dedicated book-similarity index, user-library personalization. Agent 2 (Tutor) is the remaining roadmap item.

🤖 Generated with Claude Code

Second agent. NL catalog request ("books like X about Y in English under 300p") -> ranked, reasoned recommendations via a ReAct loop on AgentLoop.

Tools wrap existing search: search_library (FTS) + search_library_semantic (AI-057 hybrid = "books like X"), and reuse the Open Library tools for external discovery when the library is thin. Constraints (language/length) post-filtered over tool metadata.

Anti-hallucination: every rec must come from a tool_result; library recs fully re-projected from the retrieved row (model can't rename/fabricate), external recs re-projected from the harvested OL row (authors/year/pages, not model echo); symmetric normalized title matching; empty transcript -> zero recs. Recommend-only (no ingest).

Endpoint POST /me/librarian (auth, 8/min, >=2-char guard). Eval LibrarianEvalRunner: recall@k + precision@k + F1 (anti-flood) + hallucination-free rate; admin POST /admin/ai-quality/librarian/eval. Route librarian.agent -> gpt-4.1-mini. No migration (reuses agent_run).

947 unit + 65 AiEvals green; build + format clean. Design: docs/04-dev/agents-roadmap.md. Deferred: ingest/HITL, SSE, dedicated similarity index, user-library personalization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

mrviduus merged commit d9a93c8 into main Jun 24, 2026
5 checks passed

mrviduus deleted the feat/librarian-agent branch June 24, 2026 04:02

