
feat(rag): split RAG_API_URL into RAG_REST_URL + RAG_MCP_URL #14

Open
Smilez1985 wants to merge 25 commits into turmyshevd:main from Smilez1985:feat/split-rag-rest-mcp-urls

@Smilez1985
Contributor

Smilez1985 commented May 9, 2026

Summary

Builds on #10 (REST RAG) and #12/#13 (MCP client + auto-registration). A typical rag-core deployment exposes REST on :8765 and an MCP-SSE gateway on :8766 simultaneously, but the bot's previous config locked itself to one transport at a time because both clients shared a single RAG_API_URL and an RAG_TRANSPORT switch. This separates them:

  • RAG_REST_URL — REST endpoint, used by rag_client.py, the query_rag / persist_to_rag LLM tools, and the /rag Telegram command.
  • RAG_MCP_URL — MCP-SSE base URL, used by rag_mcp_client.py and the auto-registered first-class MCP tools (rag_search etc.).

Set one, the other, or both. Empty → graceful no-op as before.

Backwards compatibility

RAG_API_URL + RAG_TRANSPORT=rest|mcp still honored. A compat shim in config.py maps the legacy URL onto whichever transport RAG_TRANSPORT selects (default rest). Existing single-URL deployments keep working with no env edits. RAG_API_URL is still importable as an alias of RAG_REST_URL for any third-party code reading config.
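
The compat shim described above can be sketched roughly as follows. This is a minimal illustration, not the code in `config.py`: the function name `resolve_rag_urls` is hypothetical, and the precedence rule (an explicitly set `RAG_REST_URL` / `RAG_MCP_URL` wins over the legacy pair) is an assumption the PR text does not spell out.

```python
import os
from typing import Mapping, Tuple


def resolve_rag_urls(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (rest_url, mcp_url); an empty string means that transport is off."""
    rest_url = env.get("RAG_REST_URL", "").strip()
    mcp_url = env.get("RAG_MCP_URL", "").strip()

    # Legacy pair: RAG_API_URL is routed by RAG_TRANSPORT (default "rest").
    legacy_url = env.get("RAG_API_URL", "").strip()
    transport = env.get("RAG_TRANSPORT", "").strip().lower() or "rest"

    if legacy_url:
        # Assumption: the legacy URL only fills a slot the new variables left empty.
        if transport == "mcp" and not mcp_url:
            mcp_url = legacy_url
        elif transport != "mcp" and not rest_url:
            rest_url = legacy_url
    return rest_url, mcp_url
```

Called with a plain dict instead of `os.environ`, the same function covers the three config-resolution scenarios listed in the test plan.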

Why

The rag-core project's own architecture notes confirm that REST and MCP-SSE run as two independent services on the same host. With the prior config the bot could either query REST or speak MCP, but never both at once. Splitting the env var lets the LLM use rag_search (over MCP, where it gets a typed schema) and lets a human user run /rag … (over REST, where the curated query path lives) on the same deployment.

Test plan

  • Syntax check on all touched files
  • Three config-resolution scenarios validated:
    • both URLs set → both clients active
    • legacy RAG_API_URL + RAG_TRANSPORT=mcp → routes to MCP
    • legacy RAG_API_URL (no transport, default rest) → routes to REST
  • Live smoke test with both URLs set: rag_client.health() returns OK from REST :8765 AND MCPSSEClient.list_tools() returns 7 tools from MCP :8766 in the same process
  • End-to-end via Telegram: /rag … (REST path) and an LLM-driven rag_search call (MCP path) both succeed without restart between them

🤖 Generated with Claude Code

Smilez1985 and others added 25 commits May 9, 2026 00:09
Adds an inline-keyboard model picker so users can switch LLM backends
from Telegram without SSH or env edits, and makes the choice survive
reboots.

User-facing additions
- New /model command. Without args it shows an inline keyboard with
  every preset (gemini, glm, ollama). With an argument (`/model glm`)
  it falls through to the existing /use logic, so the old behaviour
  still works.
- Tapping the 🦙 ollama row queries the configured Ollama server live
  (`/api/tags` + `/api/show`), filters by capabilities containing
  `tools`, and only lists tool-capable models. If none of the
  installed models advertise tool support, falls back to all of them
  with a warning. Includes `◂ Back` button and a graceful unreachable
  state.
- Selecting a model immediately switches the active LiteLLM model and
  acknowledges with the new model name on the E-Ink face.
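
The tool-capability filter with its fallback could look roughly like this. It is a sketch only: `pick_tool_models` is a hypothetical helper, and it assumes the capabilities reported per model have already been collected into a dict (model name → list of capability strings), so no HTTP calls are shown.

```python
from typing import Dict, List, Tuple


def pick_tool_models(capabilities: Dict[str, List[str]]) -> Tuple[List[str], bool]:
    """Return (model names to list, fell_back).

    fell_back is True when no installed model advertises "tools" and the
    full list is shown instead, so the caller can attach a warning.
    """
    tool_capable = sorted(
        name for name, caps in capabilities.items() if "tools" in caps
    )
    if tool_capable:
        return tool_capable, False
    # Fallback: nothing advertises tool support, so list every model.
    return sorted(capabilities), True
```

Returning the fallback flag alongside the list keeps the warning text in the UI layer rather than in the filter itself.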

Persistence
- `LiteLLMConnector.set_model()` now writes the choice to
  `data/active_model.json` (gitignored). On startup the connector
  restores that selection before falling back to `DEFAULT_LITE_PRESET`,
  so reboots and `systemctl restart` no longer reset the user's pick.
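
A minimal sketch of the save/restore round trip, under stated assumptions: the helper names and the `{"model": ...}` file layout are illustrative, not taken from `LiteLLMConnector`, and the write-then-rename step is an added precaution against a half-written file after power loss.

```python
import json
from pathlib import Path


def save_active_model(path: Path, model: str) -> None:
    """Persist the chosen model, writing to a temp file first."""
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"model": model}), encoding="utf-8")
    tmp.replace(path)  # rename over the old file in one step


def load_active_model(path: Path, default: str) -> str:
    """Return the persisted model, or `default` if the file is missing or invalid."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return default
    model = data.get("model") if isinstance(data, dict) else None
    return model if isinstance(model, str) and model else default
```

On startup the caller would pass the value of `DEFAULT_LITE_PRESET` as `default`, so a missing or corrupt file degrades to the configured preset.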

Reliability
- Increase the application's HTTP timeouts via `Application.builder()`
  (`read=60`, `write=60`, `connect=30`, `pool=30`). The Pi Zero 2W's
  WiFi can otherwise time out polling Telegram while a long Ollama
  reply is streaming, surfacing as `httpx.ReadError` / `Timed out`.

Config
- Add `OLLAMA_MODEL` and `OLLAMA_API_BASE` env-driven defaults plus an
  `ollama` entry in `LLM_PRESETS`. Default API base is the placeholder
  `http://ollama-server:11434`; set `OLLAMA_API_BASE` in `.env` to
  point at your actual Ollama host.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`BOT_LANGUAGE` was defined in `config.py` and exposed via `.env`, but
nothing actually injected it into the system prompt — heartbeat
reflections and the SAY: speech bubble would happily drift into
Japanese/Chinese on Qwen-family models because no language was pinned.

Add `_language_directive()` and append it from `build_system_context()`
so the directive is part of every system prompt path (Telegram replies,
heartbeat reflections, SAY: bubble, autonomous output). Codes are
mapped to readable names for the LLM (`de` → "German (Deutsch)" etc.)
for the common languages; unknown codes pass through verbatim.

Default behaviour stays English (`BOT_LANGUAGE=en`). Users can mirror
another language by writing in it — the directive explicitly allows
that override, but blocks drifting into a third language.
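
For illustration, a directive builder along these lines would satisfy the description. Only the `de` → "German (Deutsch)" mapping comes from the commit message; the other map entries, the function name and the exact wording of the directive are assumptions.

```python
# Readable names for the LLM. Only the "de" entry is taken from the commit
# message; the remaining entries are illustrative.
LANGUAGE_NAMES = {
    "en": "English",
    "de": "German (Deutsch)",
    "ru": "Russian",
    "es": "Spanish",
    "fr": "French",
}


def language_directive(code: str) -> str:
    """Build the language line appended to every system prompt."""
    code = (code or "").strip() or "en"
    # Unknown codes pass through verbatim.
    name = LANGUAGE_NAMES.get(code.lower(), code)
    return (
        f"Always respond in {name}. If the user writes in a different "
        "language, you may mirror it, but never drift into a third language."
    )
```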

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`needs_onboarding()` only returned False after the LLM emitted one of
the magic completion phrases ("onboarding complete", "saved to
identity.md", …) inside `check_onboarding_complete`. Several models
(notably the Ollama Qwen family) write IDENTITY.md correctly via the
`write_file` tool but never produce that exact phrase, so BOOTSTRAP.md
stays on disk and the bot retriggers onboarding on every restart.

Add an mtime-based safety net: if `IDENTITY.md` is newer than
`BOOTSTRAP.md`, the LLM has demonstrably already captured the
identity — delete BOOTSTRAP.md and treat onboarding as complete. The
existing magic-phrase path stays intact for cases where the LLM does
say it.
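
The mtime check can be sketched as below. The signature is hypothetical (the real `needs_onboarding()` presumably resolves the paths itself), and the magic-phrase path is left out.

```python
from pathlib import Path


def needs_onboarding(bootstrap: Path, identity: Path) -> bool:
    """True while BOOTSTRAP.md exists and IDENTITY.md is not newer than it."""
    if not bootstrap.exists():
        return False  # onboarding already finished
    if identity.exists() and identity.stat().st_mtime > bootstrap.stat().st_mtime:
        # The identity file was written after the bootstrap file appeared,
        # so treat onboarding as done and clean up.
        bootstrap.unlink()
        return False
    return True
```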

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The error screen had hardcoded Japanese SAY: strings (e.g.
`システムエラー発生`, `接続タイムアウト`). For owners who don't read
Japanese this just renders as garbled CJK glyphs on the E-Ink panel
and obscures what actually went wrong.

Make the SAY: bubble localizable: a per-language dictionary keyed by
the existing `BOT_LANGUAGE` env var, covering the same six error
categories (default, ratelimit, timeout, auth, syntax, llm). Ships
with `ja`, `en`, `de`, `ru`, `es`, `fr`. Unknown / missing codes fall
back to English.

Behavior preservation: when `BOT_LANGUAGE` is unset, the language
falls back to **Japanese** so existing deployments keep the project's
original cyberpunk aesthetic by default. New users who set
`BOT_LANGUAGE=en` (or any other supported code) get readable text.

The mood / face mapping is unchanged. The English `short_error` codes
in the STATUS: tail are also kept English on purpose — they read like
status codes (`Rate Limited`, `Bad Syntax`).

Tighten the network branch to also catch the literal `"timed out"`
form (common in `socket.timeout` strings) so transient HTTP errors
classify as `timeout` rather than the default.
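
The fallback rules and the `"timed out"` match can be sketched like this. It is a reduced illustration: the two Japanese strings are the ones quoted in this commit message, but which category each belongs to is an assumption, as are the English strings, the function names, and the limitation to two categories and two languages.

```python
from typing import Optional

# Illustrative subset of the per-language table.
ERROR_SAY = {
    "ja": {"default": "システムエラー発生", "timeout": "接続タイムアウト"},
    "en": {"default": "System error", "timeout": "Connection timed out"},
}


def classify_error(message: str) -> str:
    """Map an exception message to an error category (network branch only)."""
    lowered = message.lower()
    if "timeout" in lowered or "timed out" in lowered:
        return "timeout"
    return "default"


def error_say(category: str, bot_language: Optional[str]) -> str:
    """Pick the SAY: text for a category in the configured language."""
    # Unset language keeps the original Japanese; unknown codes fall back to English.
    lang = bot_language.strip().lower() if bot_language else "ja"
    table = ERROR_SAY.get(lang, ERROR_SAY["en"])
    return table.get(category, table["default"])
```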

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onboarding)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a self-update path so the owner doesn't have to SSH in to pull
new code from upstream and restart the service.

`scripts/auto_update.sh` is the engine:
- Fetches the configured remote/branch, fast-forwards if there are
  new commits, refreshes venv deps when `requirements.txt` changed,
  and restarts the systemd service. Idempotent (no-op when up-to-date)
  and supports `--check` for dry-run. Configurable via env vars
  `OCG_UPDATE_REMOTE`, `OCG_UPDATE_BRANCH`, `OCG_SERVICE`.
- Pre-update tarball backup: `gotchi.db` + `data/` + `.env` go to
  `backups/pre-update-<timestamp>-<sha>.tar.gz`. Rolling retention
  keeps the newest 3 (`OCG_BACKUP_KEEP`, skip with `OCG_NO_BACKUP=1`).
  `backups/` is gitignored.
- Auto-rollback: if the service fails to come back up after the new
  code is pulled, the script does `git reset --hard` to the previous
  HEAD, reinstalls deps if needed, restarts, and exits with code 4
  to flag manual review. Disable with `OCG_NO_ROLLBACK=1`.
- Pre-flight check only blocks on dirty TRACKED files; untracked
  local-only files (drivers, ad-hoc scripts) no longer abort the run.

`/update` Telegram command:
- Owner-only wrapper around the script, so updates can be triggered
  from chat. `/update check` reports whether new commits exist
  without applying them. Reports rollback (exit 4) distinctly so
  the owner sees the upgrade was reverted.

`setup.sh`:
- Adds `/etc/sudoers.d/gotchi-update` so the bot user can
  `systemctl restart gotchi-bot.service` without a password —
  needed by `/update` and the unattended cron path.

For unattended auto-update users can wire the script into cron, e.g.
`0 4 * * 0 /bin/bash /path/to/openclawgotchi/scripts/auto_update.sh
   >> /path/to/openclawgotchi/logs/update.log 2>&1`.

User state (`.env`, `data/*.json`, `.workspace/`) is gitignored, so
`git pull` itself never touches it. The tarball is a second line of
defence against schema migrations that could corrupt the DB.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	.gitignore
#	CHANGELOG.md
#	src/bot/handlers.py
#	src/main.py
Adds optional battery monitoring for the popular Waveshare UPS HAT (C)
that turns the Pi Zero 2W into a portable / battery-backed device.
Many users run openclawgotchi on this HAT, so first-class support is
worthwhile.

- `src/hardware/battery.py`: single-shot reader for the on-board INA219
  over I2C. Returns voltage, current, power and a 0–100 percentage based
  on the 2× 18650 voltage curve (6.0 V empty → 8.4 V full). Auto-detects
  presence; if I2C is disabled or the HAT is absent, every public
  function returns `None` instead of raising — callers can use
  `is_available()` to gate UI. I2C bus / address overridable via env
  (`OCG_UPS_BUS`, `OCG_UPS_ADDR`) for non-default hardware.
- `/battery` Telegram command in `handlers.py`: shows the live reading
  (`🔋 87 % — 8.12 V, +120 mA (charging, 974 mW)`) or a friendly
  "no UPS HAT detected" hint with `i2cdetect` instructions.
- `hardware/system.get_stats_string()` adds a `[BATTERY] …` line when
  present, so heartbeat reflections and the bot's self-awareness pick
  up battery state automatically.
- `smbus2>=0.4.0` added to `requirements.txt`. Pure Python, ~30 KB.
  Removing the line disables battery support entirely (battery.py
  swallows the ImportError).

Setup notes for users: enable I2C on the host (DietPi/Raspberry Pi OS),
add the bot user to the `i2c` group (already part of `setup.sh`'s
`usermod -aG gpio,spi,i2c …`), reboot. `i2cdetect -y 1` should show
0x43 once the HAT is connected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	CHANGELOG.md
#	src/bot/handlers.py
#	src/main.py
…riant

Make the E-Ink display layer driver-aware so users running the
3-color UPS HAT-friendly B-variant of the Waveshare 2.13in V4 panel
get a working display without forking the project.

Selection is opt-in via the new env var:
  OCG_DISPLAY_VARIANT=mono   (default — current behaviour)
  OCG_DISPLAY_VARIANT=b      (3-color B-variant)
  OCG_DISPLAY_VARIANT=auto   (prefer B if its driver is importable)
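
As a rough sketch, the selection rule reduces to a small pure function. `pick_display_driver` is a hypothetical name, the caller is assumed to have checked whether the B driver can be imported, and treating unrecognised values as `mono` is an assumption.

```python
def pick_display_driver(variant: str, b_available: bool) -> str:
    """Return the driver module name for an OCG_DISPLAY_VARIANT value."""
    variant = (variant or "mono").strip().lower()
    if variant == "b":
        return "epd2in13b_V4"
    if variant == "auto" and b_available:
        return "epd2in13b_V4"
    # "mono", unset, unknown values, or "auto" without the B driver.
    return "epd2in13_V4"
```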

Changes:

- `src/ui/gotchi_ui.py`: import path picks `epd2in13b_V4` or
  `epd2in13_V4` based on OCG_DISPLAY_VARIANT, sets a module-level
  `EPD_VARIANT_B` flag. `render_ui()`'s init / Clear / display calls
  branch on that flag — B has no partial refresh and `display()`
  takes (black, red); the red layer is fed a blank image so existing
  drawings render unchanged.

- `src/hardware/display.py`: timing knobs scale to variant. B's full
  refresh takes ~15 s, so:
    * `_DISPLAY_BUSY_RETRY_WAIT` jumps from 4 s to 20 s.
    * `_MIN_UPDATE_INTERVAL` becomes 30 s on B (was 0) — debounces
      bursts of identical updates that would block the panel for
      most of a minute. Disabled (0) on mono so behaviour is
      unchanged for the default install.
    * Dedup — skip when (mood, text) match the previous payload.
      Universally beneficial; particularly valuable on B.
    * `FULL_REFRESH_EVERY` ghosting compensation is now a no-op on B
      (B always full-refreshes), preserved for mono.

- `src/drivers/epd2in13b_V4.py`: ship the Waveshare reference driver
  for the B variant alongside the existing mono driver. Sourced from
  the Waveshare e-Paper sample repo, MIT-licensed.
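
The dedup and minimum-interval rules from `display.py` can be illustrated with a small gate object. This is a sketch under assumptions: the class name is hypothetical, and it simply drops an update that arrives too early, whereas the real module may queue or retry instead.

```python
import time
from typing import Callable, Optional, Tuple


class DisplayGate:
    """Decide whether a (mood, text) payload should reach the panel."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._min_interval = min_interval  # e.g. 30 on B, 0 on mono
        self._clock = clock
        self._last_payload: Optional[Tuple[str, str]] = None
        self._last_time: Optional[float] = None

    def should_update(self, mood: str, text: str) -> bool:
        payload = (mood, text)
        if payload == self._last_payload:
            return False  # dedup: identical to what is already shown
        now = self._clock()
        if self._last_time is not None and now - self._last_time < self._min_interval:
            return False  # too soon after the previous refresh
        self._last_payload, self._last_time = payload, now
        return True
```

Injecting the clock keeps the gate testable without sleeping for the 30 s interval.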

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`HEARTBEAT.md` template is English-only by design (it's the system
instruction the LLM follows). When BOT_LANGUAGE is set to a non-
English locale, the system-level language directive in
`build_system_context()` is correctly applied, but the long English
user prompt — soul + identity + heartbeat template + recent context
— overpowers it and the model writes the reflection in English.

Add a final language reminder right before the LLM is invoked, so the
language pin lives at the end of the user prompt where it's hardest
to override. Mirrors the names map already used by
`_language_directive()` in `llm/prompts.py`. No-op for BOT_LANGUAGE=en
or unset.

Telegram replies were already in the correct language because regular
chat handlers don't have a comparably long instruction template.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bringing the cleaned (server-agnostic, no private-repo refs) version
of the RAG REST client into deploy/all-features so the running bot is
consistent with the upstream PR turmyshevd#10. Mirrors feat/bot-rag-integration
@ b20c55a.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Demonstrates the new B-variant red layer with a real use case:
when a UPS HAT (C) is connected and reports < 20 % charge, the
" | 🔋NN%/X.XXV" suffix in the header stats line renders RED on
the panel instead of black. Healthy batteries / mono panels look
identical to before.

Implementation:
- render_ui() builds a parallel red_image (only on B variant) that
  stays all-white unless an accent is drawn into it.
- A best-effort battery probe (gracefully no-ops when no UPS HAT is
  present or smbus2 is missing) supplies the suffix.
- Below 20 %: prefix renders black, battery suffix renders into the
  red layer only — the panel composites black + red → suffix shows
  as red text inline.
- Otherwise: single black draw call, red layer stays blank.

Red is treated as an accent, never a background — so the user's
"don't paint the whole display red" rule is honoured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The display skill currently tells the LLM "Colors: Black & white only".
That's correct for the mono panel but misleading once the B-variant
support lands — the bot needs to understand:

  - which physical panel is in play (selected via OCG_DISPLAY_VARIANT)
  - that there is no `RED:` directive it can emit
  - that red rendering is system-initiated (today: low battery <20 %)
  - the standing rule: red is an accent, never a background

This avoids the failure mode where the LLM, having been told "you can
control colors", asks for or describes red usage that doesn't exist —
or worse, instructs the bot to flood the screen red.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Waveshare's UPS HAT (C) ships with a single 18650 cell (1S), not the
2S pack the original code assumed. A fully-charged 4.2 V cell mapped
to (4.2 − 6.0) / 2.4 = −0.75 → clamped to 0 %, so users always saw
"empty" regardless of actual charge.

Switch the linear voltage→percent map to 1S range:
  empty 3.0 V → 0 %
  full  4.2 V → 100 %

Verified on hardware: 4.09 V → 91 % (consistent with a near-full
cell with ~5 % discharge tolerance).
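
The arithmetic above can be reproduced with a clamped linear map. The function name and signature are illustrative, and rounding to the nearest integer is an assumption chosen because it matches the 91 % reading quoted above.

```python
def voltage_to_percent(voltage: float, empty: float = 3.0, full: float = 4.2) -> int:
    """Linear voltage-to-charge map, clamped to 0..100 (defaults: 1S cell)."""
    fraction = (voltage - empty) / (full - empty)
    return round(max(0.0, min(1.0, fraction)) * 100)


# Old 2S assumption: a full 1S cell lands below "empty" and clamps to 0.
old_reading = voltage_to_percent(4.2, empty=6.0, full=8.4)
# New 1S range: the hardware reading from the commit message.
new_reading = voltage_to_percent(4.09)
```

A linear map is only an approximation of a Li-ion discharge curve, which is flatter in the middle of its range.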

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…amage

An earlier `git checkout feat/bot-rag-integration -- src/...` to bring
the cleaned RAG client onto deploy/all-features replaced the merged
deploy versions of bot/handlers.py, main.py, src/config.py with the
PR turmyshevd#10 branch's narrower versions, dropping cmd_model / cb_model /
cmd_update / cmd_battery and the OLLAMA_* config. The bot crash-looped
on startup (ImportError: cannot import name 'cmd_model').

Restore the merged versions (sourced from 288aa3d, the last good
deploy commit) and re-add the cmd_rag / RAG_* additions on top so all
features coexist. No upstream PR is affected — these files on the
upstream PR branches stay scoped to their feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues seen on B-variant hardware after bot-driven display
updates (FACE: / SAY: from LLM output):

1. Top-left always showed "Gotchi" instead of the configured BOT_NAME
   (e.g. "Clotchi"). Cause: sudo's `env_reset` default strips the environment, so the
   subprocess's `os.environ.get("BOT_NAME")` fell back to the literal
   "Gotchi" default. Fix: also propagate BOT_NAME, OWNER_NAME and
   BOT_LANGUAGE through the existing `sudo /usr/bin/env VAR=val ...`
   wrapper that already handles OCG_DISPLAY_VARIANT and friends.

2. Battery suffix in the header stats line pushed the bot name on
   the top-left off-screen / under other content when the line got
   long. Move it into the footer centre instead — the footer has
   spare horizontal space between the status text on the left and
   the XP indicator on the right. On B variant a low-charge battery
   renders into the red layer (red text accent) instead of black;
   above the threshold or on mono panels it stays black.

Status text is now truncated to 30 chars (was 35) so it doesn't
overlap the centred battery cell on long messages.
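
The `sudo /usr/bin/env VAR=val ...` wrapper from fix 1 could be assembled roughly as follows. The helper name, the exact tuple of forwarded names ("and friends" is not enumerated in the commit message) and the example command are assumptions.

```python
from typing import List, Mapping, Sequence

# Variables re-injected after sudo resets the environment. The list is
# illustrative and may be shorter than the real one.
FORWARDED_VARS = ("OCG_DISPLAY_VARIANT", "BOT_NAME", "OWNER_NAME", "BOT_LANGUAGE")


def build_sudo_command(command: Sequence[str], env: Mapping[str, str]) -> List[str]:
    """Prefix `command` with sudo and explicit VAR=value assignments."""
    assignments = [f"{name}={env[name]}" for name in FORWARDED_VARS if name in env]
    return ["sudo", "/usr/bin/env", *assignments, *command]
```

Passing the result to `subprocess.run` as a list (no shell) means values containing spaces need no extra quoting.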

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`apply_auto_mood()` returns a (mood, text) tuple. The text becomes the
footer status_text on the E-Ink panel, while the header simultaneously
renders the same metrics in its always-on stats line
(`T:51°C | Free:79MB | …`). For low RAM / high temp the auto-mood text
read e.g. "Low RAM: 79MB" or "Hot! 51°C" — exactly what the header
already shows, just one frame older. The two numbers drift between
frames and the duplicate crowds an already tight 250×122 layout.

Drop the metric values from the warning text and keep the warning
itself ("RAM low", "Running hot", "OVERHEATING!", "OOM!"). The header
keeps reporting the live numbers; the footer adds the qualitative
warning beside them. No mood mapping changes, no thresholds change,
no API change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single-side pinning (only at the end) wasn't enough — the long English
HEARTBEAT.md template still pulled the model into English on a
fraction of heartbeats even when BOT_LANGUAGE was set. Wrap the prompt
in the configured language at the start AND at the end so whichever
side the model anchors to, the language is set.

The front pin uses the user's language directly (e.g. German), making
the very first thing the model reads a strong directive in the target
language. The end pin restates the requirement just before generation
starts. Both reference each other so it's clear they're the same rule.

Behaviour for `BOT_LANGUAGE=en` or unset: no change and no extra
prompt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets the bot consume tools advertised by any external MCP server
that speaks the SSE transport, without dragging in the official
`mcp[cli]` Python package (it pulls `cryptography`, `pydantic-settings`,
`starlette`, `uvicorn`, `pyjwt`, `httpx-sse`, `sse-starlette`,
`python-multipart` — non-trivial RAM hit on a 512 MB Pi Zero 2W).

What's added
- src/llm/rag_mcp_client.py — hand-rolled MCP-over-SSE client,
  ~250 LoC, stdlib-only + `requests` (already in the venv via
  litellm). Background thread reads the SSE stream and dispatches
  JSON-RPC responses by id; sync `connect()` / `initialize()` /
  `list_tools()` / `call_tool(name, args)` API. A module-level
  `get_client()` returns a lazy singleton so multiple tool calls
  share one SSE connection.
- Two new LLM tools wired into TOOL_MAP:
    `mcp_list_tools()` — return advertised tool names + descriptions
    `mcp_call_tool(name, arguments)` — invoke by name; arguments is
                                       a JSON object passed as a string
                                       (the LLM emits one).
  Both gracefully no-op when the MCP path isn't configured, returning
  a clear hint instead of raising.
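
The "background thread dispatches JSON-RPC responses by id" pattern can be sketched without any networking. This is not the code in `rag_mcp_client.py`: `PendingCalls` and its methods are hypothetical, the SSE reader is assumed to hand over each data payload as a string, and batch responses are not handled.

```python
import json
import threading
from typing import Any, Dict


class PendingCalls:
    """Match JSON-RPC responses seen by a reader thread to waiting callers."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next_id = 0
        self._waiters: Dict[int, threading.Event] = {}
        self._results: Dict[int, Dict[str, Any]] = {}

    def register(self) -> int:
        """Allocate a request id before the request is sent."""
        with self._lock:
            self._next_id += 1
            self._waiters[self._next_id] = threading.Event()
            return self._next_id

    def dispatch(self, raw: str) -> bool:
        """Reader-thread side: route one payload; False if nobody waits for it."""
        message = json.loads(raw)
        with self._lock:
            event = self._waiters.get(message.get("id"))
            if event is None:
                return False  # notification or unknown id
            self._results[message["id"]] = message
        event.set()
        return True

    def wait(self, call_id: int, timeout: float) -> Dict[str, Any]:
        """Caller side: block until the matching response arrives."""
        with self._lock:
            event = self._waiters[call_id]
        arrived = event.wait(timeout)
        with self._lock:
            self._waiters.pop(call_id, None)
            result = self._results.pop(call_id, None)
        if not arrived or result is None:
            raise TimeoutError(f"no response for request {call_id}")
        return result
```

A synchronous `call_tool` would then be `register()`, send the request carrying that id, and `wait()`.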

Activation
- Env var `RAG_TRANSPORT=rest|mcp` (default `rest`). When `mcp`,
  `RAG_API_URL` is interpreted as the MCP-SSE base URL (e.g.
  `http://your-rag-host:8766`).
- Reuses `RAG_API_KEY` for optional Bearer auth.
- No new top-level dependencies.

Tested against rag-core's MCP-SSE endpoint
(advertised tools: rag_search, rag_persist, rag_status,
 rag_list_collections, rag_recall_session, rag_session_announce,
 rag_session_forget). `tools/list` returns the catalog; `tools/call`
 dispatches and returns the rendered text content correctly.

Out of scope (separate follow-ups)
- Auto-registration of advertised MCP tools as first-class TOOL_MAP
  entries (each with its own typed JSON schema). Today the LLM has
  to look at `mcp_list_tools` then construct an `mcp_call_tool` call
  itself; auto-registration would let it call them as if native.
- Multi-server support (today: single MCP server via RAG_API_URL).
- Async transport / WebSocket fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…prompt

When RAG_TRANSPORT=mcp and the MCP server is reachable at startup,
discover its advertised tools via tools/list and register each one as
a first-class TOOL_MAP entry with full JSON-Schema. The LLM then calls
e.g. `rag_search(query=..., top_k=3)` directly instead of the two-hop
`mcp_list_tools` → `mcp_call_tool` indirection. Names that collide
with an existing TOOL_MAP entry are skipped. Failures are logged but
never crash the bot.
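
The registration loop with collision skipping might look like this. It is a sketch only: the function names are hypothetical, `call_tool` stands in for the MCP client's call method, the advertised entries are assumed to be dicts with a `name` key, and the JSON-Schema handling is omitted.

```python
from typing import Any, Callable, Dict, List

ToolCaller = Callable[[str, Dict[str, Any]], Any]


def _bind(call_tool: ToolCaller, name: str) -> Callable[..., Any]:
    """Create a handler that forwards keyword arguments to one MCP tool."""
    def handler(**arguments: Any) -> Any:
        return call_tool(name, arguments)
    return handler


def register_mcp_tools(tool_map: Dict[str, Callable[..., Any]],
                       advertised: List[Dict[str, Any]],
                       call_tool: ToolCaller) -> List[str]:
    """Add advertised tools to tool_map and return the names registered."""
    registered = []
    for tool in advertised:
        name = tool.get("name")
        if not name or name in tool_map:
            continue  # skip unnamed tools and collisions with native tools
        tool_map[name] = _bind(call_tool, name)
        registered.append(name)
    return registered
```

The returned name list is what a prompt section such as "External Memory (MCP)" could enumerate.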

A new system-prompt section "External Memory (MCP)" lists which tools
were registered and instructs the bot to:
  - search RAG BEFORE answering questions about user preferences,
    project rules, decisions, or past context
  - persist durable lessons via the persist tool
  - optionally announce session context once per conversation

Why: the previous PR exposed `mcp_list_tools` / `mcp_call_tool` as
generic glue, but the LLM wouldn't reach for them on its own — and
even when it did, the two-hop indirection wasted turns. Auto-
registration lets the agent use the RAG as its durable memory the
same way it already uses `remember_fact` / `recall_facts` for the
in-process store; the prompt block tells it WHEN.

Both `mcp_list_tools` / `mcp_call_tool` remain available as fallback
for ad-hoc discovery and tools that show up after bot start.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ransports side-by-side

A typical rag-core deployment exposes REST on :8765 AND an MCP-SSE
gateway on :8766 simultaneously, but the bot's previous config locked
itself to one transport at a time because both clients shared a single
``RAG_API_URL`` and an ``RAG_TRANSPORT`` switch.

This separates them:
- ``RAG_REST_URL`` — REST endpoint, used by ``rag_client.py``,
  ``query_rag`` / ``persist_to_rag`` LLM tools, and ``/rag`` Telegram
  command.
- ``RAG_MCP_URL``  — MCP-SSE base URL, used by ``rag_mcp_client.py``
  and the auto-registered first-class MCP tools (``rag_search`` etc.).

Set one, the other, or both. Empty → degrade gracefully (no-op).

Backwards compat: ``RAG_API_URL`` + ``RAG_TRANSPORT=rest|mcp`` still
honored — a small shim in ``config.py`` maps the legacy URL onto
whichever transport ``RAG_TRANSPORT`` selects (default ``rest``).
Existing single-URL deployments keep working with no env edits.

``RAG_API_URL`` remains importable as an alias of ``RAG_REST_URL``
for any third-party code reading config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Smilez1985
Contributor Author

Smilez1985 commented May 9, 2026

Quick context on this integration — wanted to be upfront about where it comes from.

The RAG backend I built against is a project I call rag-core: a self-hosted, project-scoped retrieval memory layer with a REST API and an MCP-SSE gateway (Qdrant-backed, multi-collection, frontmatter-aware chunking, designed for AI agents). The repo is currently private while I finish a handful of pre-public-readiness items, but the plan is to open it up before long.

This PR series (#10, #12, #13, #14) is intentionally written against a generic contract — any RAG backend that speaks the documented REST shape, and any MCP-SSE server with tools/list + tools/call. Nothing here pins to rag-core specifically; it just happens to be the implementation I run.

Reason for sending this upstream rather than keeping it on my fork: I use OpenClawGotchi as one of my main development drivers, and the RAG integration has become part of how I work with it day-to-day. Carrying a long-lived branch and re-rebasing on every upstream pull is a fair amount of churn — having it on main would save a meaningful amount of repeat work on my side.

No pressure on accepting; happy to iterate to whatever shape fits best, or to wait if you'd rather see the other end of the wire first. Once rag-core is public I'll ping back here — at that point you'd be welcome to clone the repo and stand up your own instance if you want to see how it pairs with the bot from the other side. (Just to be clear: this isn't an invite to my running instance — it's an invite to the source, so you can run your own.)

