feat(mcp): auto-register MCP tools as first-class + RAG-aware prompt#13
feat(mcp): auto-register MCP tools as first-class + RAG-aware prompt#13Smilez1985 wants to merge 24 commits into
Conversation
Adds an inline-keyboard model picker so users can switch LLM backends from Telegram without SSH or env edits, and makes the choice survive reboots. User-facing additions - New /model command. Without args it shows an inline keyboard with every preset (gemini, glm, ollama). With an argument (`/model glm`) it falls through to the existing /use logic, so the old behaviour still works. - Tapping the 🦙 ollama row queries the configured Ollama server live (`/api/tags` + `/api/show`), filters by capabilities containing `tools`, and only lists tool-capable models. If none of the installed models advertise tool support, falls back to all of them with a warning. Includes `◂ Back` button and a graceful unreachable state. - Selecting a model immediately switches the active LiteLLM model and acknowledges with the new model name on the E-Ink face. Persistence - `LiteLLMConnector.set_model()` now writes the choice to `data/active_model.json` (gitignored). On startup the connector restores that selection before falling back to `DEFAULT_LITE_PRESET`, so reboots and `systemctl restart` no longer reset the user's pick. Reliability - Increase the application's HTTP timeouts via `Application.builder()` (`read=60`, `write=60`, `connect=30`, `pool=30`). The Pi Zero 2W's WiFi can otherwise time out polling Telegram while a long Ollama reply is streaming, surfacing as `httpx.ReadError` / `Timed out`. Config - Add `OLLAMA_MODEL` and `OLLAMA_API_BASE` env-driven defaults plus an `ollama` entry in `LLM_PRESETS`. Default API base is the placeholder `http://ollama-server:11434`; set `OLLAMA_API_BASE` in `.env` to point at your actual Ollama host. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`BOT_LANGUAGE` was defined in `config.py` and exposed via `.env`, but nothing actually injected it into the system prompt — heartbeat reflections and the SAY: speech bubble would happily drift into Japanese/Chinese on Qwen-family models because no language was pinned. Add `_language_directive()` and append it from `build_system_context()` so the directive is part of every system prompt path (Telegram replies, heartbeat reflections, SAY: bubble, autonomous output). Codes are mapped to readable names for the LLM (`de` → "German (Deutsch)" etc.) for the common languages; unknown codes pass through verbatim. Default behaviour stays English (`BOT_LANGUAGE=en`). Users can mirror another language by writing in it — the directive explicitly allows that override, but blocks drifting into a third language. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`needs_onboarding()` only returned False after the LLM emitted one of
the magic completion phrases ("onboarding complete", "saved to
identity.md", …) inside `check_onboarding_complete`. Several models
(notably the Ollama Qwen family) write IDENTITY.md correctly via the
`write_file` tool but never produce that exact phrase, so BOOTSTRAP.md
stays on disk and the bot retriggers onboarding on every restart.
Add a mtime-based safety net: if `IDENTITY.md` is newer than
`BOOTSTRAP.md`, the LLM has demonstrably already captured the
identity — delete BOOTSTRAP.md and treat onboarding as complete. The
existing magic-phrase path stays intact for cases where the LLM does
say it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The error screen had hardcoded Japanese SAY: strings (e.g. `システムエラー発生`, `接続タイムアウト`). For owners who don't read Japanese this just renders as garbled CJK glyphs on the E-Ink panel and obscures what actually went wrong. Make the SAY: bubble localizable: a per-language dictionary keyed by the existing `BOT_LANGUAGE` env var, covering the same five error categories (default, ratelimit, timeout, auth, syntax, llm). Ships with `ja`, `en`, `de`, `ru`, `es`, `fr`. Unknown / missing codes fall back to English. Behavior preservation: when `BOT_LANGUAGE` is unset, the language falls back to **Japanese** so existing deployments keep the project's original cyberpunk aesthetic by default. New users who set `BOT_LANGUAGE=en` (or any other supported code) get readable text. The mood / face mapping is unchanged. The English `short_error` codes in the STATUS: tail are also kept English on purpose — they read like status codes (`Rate Limited`, `Bad Syntax`). Also tighten the network branch to also catch the literal `"timed out"` form (common in `socket.timeout` strings) so transient HTTP errors classify as `timeout` rather than the default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onboarding) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a self-update path so the owner doesn't have to SSH in to pull new code from upstream and restart the service. `scripts/auto_update.sh` is the engine: - Fetches the configured remote/branch, fast-forwards if there are new commits, refreshes venv deps when `requirements.txt` changed, and restarts the systemd service. Idempotent (no-op when up-to-date) and supports `--check` for dry-run. Configurable via env vars `OCG_UPDATE_REMOTE`, `OCG_UPDATE_BRANCH`, `OCG_SERVICE`. - Pre-update tarball backup: `gotchi.db` + `data/` + `.env` go to `backups/pre-update-<timestamp>-<sha>.tar.gz`. Rolling retention keeps the newest 3 (`OCG_BACKUP_KEEP`, skip with `OCG_NO_BACKUP=1`). `backups/` is gitignored. - Auto-rollback: if the service fails to come back up after the new code is pulled, the script does `git reset --hard` to the previous HEAD, reinstalls deps if needed, restarts, and exits with code 4 to flag manual review. Disable with `OCG_NO_ROLLBACK=1`. - Pre-flight check only blocks on dirty TRACKED files; untracked local-only files (drivers, ad-hoc scripts) no longer abort the run. `/update` Telegram command: - Owner-only wrapper around the script, so updates can be triggered from chat. `/update check` reports whether new commits exist without applying them. Reports rollback (exit 4) distinctly so the owner sees the upgrade was reverted. `setup.sh`: - Adds `/etc/sudoers.d/gotchi-update` so the bot user can `systemctl restart gotchi-bot.service` without a password — needed by `/update` and the unattended cron path. For unattended auto-update users can wire the script into cron, e.g. `0 4 * * 0 /bin/bash /path/to/openclawgotchi/scripts/auto_update.sh >> /path/to/openclawgotchi/logs/update.log 2>&1`. User state (`.env`, `data/*.json`, `.workspace/`) is gitignored, so `git pull` itself never touches it. The tarball is a second line of defence against schema migrations that could corrupt the DB. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # CHANGELOG.md
# Conflicts: # .gitignore # CHANGELOG.md # src/bot/handlers.py # src/main.py
Adds optional battery monitoring for the popular Waveshare UPS HAT (C) that turns the Pi Zero 2W into a portable / battery-backed device. Many users run openclawgotchi on this HAT, so first-class support is worthwhile. - `src/hardware/battery.py`: single-shot reader for the on-board INA219 over I2C. Returns voltage, current, power and a 0–100 percentage based on the 2× 18650 voltage curve (6.0 V empty → 8.4 V full). Auto-detects presence; if I2C is disabled or the HAT is absent, every public function returns `None` instead of raising — callers can use `is_available()` to gate UI. I2C bus / address overridable via env (`OCG_UPS_BUS`, `OCG_UPS_ADDR`) for non-default hardware. - `/battery` Telegram command in `handlers.py`: shows the live reading (`🔋 87 % — 8.12 V, +120 mA (charging, 974 mW)`) or a friendly "no UPS HAT detected" hint with `i2cdetect` instructions. - `hardware/system.get_stats_string()` adds a `[BATTERY] …` line when present, so heartbeat reflections and the bot's self-awareness pick up battery state automatically. - `smbus2>=0.4.0` added to `requirements.txt`. Pure Python, ~30 KB. Removing the line disables battery support entirely (battery.py swallows the ImportError). Setup notes for users: enable I2C on the host (DietPi/Raspberry Pi OS), add the bot user to the `i2c` group (already part of `setup.sh`'s `usermod -aG gpio,spi,i2c …`), reboot. `i2cdetect -y 1` should show 0x43 once the HAT is connected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts: # CHANGELOG.md # src/bot/handlers.py # src/main.py
…riant
Make the E-Ink display layer driver-aware so users running the
3-color UPS HAT-friendly B-variant of the Waveshare 2.13in V4 panel
get a working display without forking the project.
Selection is opt-in via the new env var:
OCG_DISPLAY_VARIANT=mono (default — current behaviour)
OCG_DISPLAY_VARIANT=b (3-color B-variant)
OCG_DISPLAY_VARIANT=auto (prefer B if its driver is importable)
Changes:
- `src/ui/gotchi_ui.py`: import path picks `epd2in13b_V4` or
`epd2in13_V4` based on OCG_DISPLAY_VARIANT, sets a module-level
`EPD_VARIANT_B` flag. `render_ui()`'s init / Clear / display calls
branch on that flag — B has no partial refresh and `display()`
takes (black, red); the red layer is fed a blank image so existing
drawings render unchanged.
- `src/hardware/display.py`: timing knobs scale to variant. B's full
refresh takes ~15 s, so:
* `_DISPLAY_BUSY_RETRY_WAIT` jumps from 4 s to 20 s.
* `_MIN_UPDATE_INTERVAL` becomes 30 s on B (was 0) — debounces
bursts of identical updates that would block the panel for
most of a minute. Disabled (0) on mono so behaviour is
unchanged for the default install.
* Dedup — skip when (mood, text) match the previous payload.
Universally beneficial; particularly valuable on B.
* `FULL_REFRESH_EVERY` ghosting compensation is now no-op on B
(B always full-refreshes), preserved for mono.
- `src/drivers/epd2in13b_V4.py`: ship the Waveshare reference driver
for the B variant alongside the existing mono driver. Sourced from
the Waveshare e-Paper sample repo, MIT-licensed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`HEARTBEAT.md` template is English-only by design (it's the system instruction the LLM follows). When BOT_LANGUAGE is set to a non- English locale, the system-level language directive in `build_system_context()` is correctly applied, but the long English user prompt — soul + identity + heartbeat template + recent context — overpowers it and the model writes the reflection in English. Add a final language reminder right before the LLM is invoked, so the language pin lives at the end of the user prompt where it's hardest to override. Mirrors the names map already used by `_language_directive()` in `llm/prompts.py`. No-op for BOT_LANGUAGE=en or unset. Telegram replies were already in the correct language because regular chat handlers don't have a comparably long instruction template. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bringing the cleaned (server-agnostic, no private-repo refs) version of the RAG REST client into deploy/all-features so the running bot is consistent with the upstream PR turmyshevd#10. Mirrors feat/bot-rag-integration @ b20c55a. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Demonstrates the new B-variant red layer with a real use case: when a UPS HAT (C) is connected and reports < 20 % charge, the " | 🔋NN%/X.XXV" suffix in the header stats line renders RED on the panel instead of black. Healthy batteries / mono panels look identical to before. Implementation: - render_ui() builds a parallel red_image (only on B variant) that stays all-white unless an accent is drawn into it. - A best-effort battery probe (gracefully no-ops when no UPS HAT or smbus2 is missing) supplies the suffix. - Below 20 %: prefix renders black, battery suffix renders into the red layer only — the panel composites black + red → suffix shows as red text inline. - Otherwise: single black draw call, red layer stays blank. Red is treated as an accent, never a background — so the user's "don't paint the whole display red" rule is honoured. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The display skill currently tells the LLM "Colors: Black & white only". That's correct for the mono panel but misleading once the B-variant support lands — the bot needs to understand: - which physical panel is in play (selected via OCG_DISPLAY_VARIANT) - that there is no `RED:` directive it can emit - that red rendering is system-initiated (today: low battery <20 %) - the standing rule: red is an accent, never a background This avoids the failure mode where the LLM, having been told "you can control colors", asks for or describes red usage that doesn't exist — or worse, instructs the bot to flood the screen red. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Waveshare's UPS HAT (C) ships with a single 18650 cell (1S), not the 2S pack the original code assumed. A fully-charged 4.2 V cell mapped to (4.2 − 6.0) / 2.4 = −0.75 → clamped to 0 %, so users always saw "empty" regardless of actual charge. Switch the linear voltage→percent map to 1S range: empty 3.0 V → 0 % full 4.2 V → 100 % Verified on hardware: 4.09 V → 91 % (consistent with a near-full cell with ~5 % discharge tolerance). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…amage An earlier `git checkout feat/bot-rag-integration -- src/...` to bring the cleaned RAG client onto deploy/all-features replaced the merged deploy versions of bot/handlers.py, main.py, src/config.py with the PR turmyshevd#10 branch's narrower versions, dropping cmd_model / cb_model / cmd_update / cmd_battery and the OLLAMA_* config. The bot crash-looped on startup (ImportError: cannot import name 'cmd_model'). Restore the merged versions (sourced from 288aa3d, the last good deploy commit) and re-add the cmd_rag / RAG_* additions on top so all features coexist. No upstream PR is affected — these files on the upstream PR branches stay scoped to their feature. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two issues seen on B-variant hardware after bot-driven display
updates (FACE: / SAY: from LLM output):
1. Top-left always showed "Gotchi" instead of the configured BOT_NAME
(e.g. "Clotchi"). Cause: `sudo` strips env_reset Defaults, so the
subprocess's `os.environ.get("BOT_NAME")` fell back to the literal
"Gotchi" default. Fix: also propagate BOT_NAME, OWNER_NAME and
BOT_LANGUAGE through the existing `sudo /usr/bin/env VAR=val ...`
wrapper that already handles OCG_DISPLAY_VARIANT and friends.
2. Battery suffix in the header stats line pushed the bot name on
the top-left off-screen / under other content when the line got
long. Move it into the footer centre instead — the footer has
spare horizontal space between the status text on the left and
the XP indicator on the right. On B variant a low-charge battery
renders into the red layer (red text accent) instead of black;
above the threshold or on mono panels it stays black.
Status text is now truncated to 30 chars (was 35) so it doesn't
overlap the centred battery cell on long messages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`apply_auto_mood()` returns a (mood, text) tuple. The text becomes the
footer status_text on the E-Ink panel, while the header simultaneously
renders the same metrics in its always-on stats line
(`T:51°C | Free:79MB | …`). For low RAM / high temp the auto-mood text
read e.g. "Low RAM: 79MB" or "Hot! 51°C" — exactly what the header
already shows, just one frame older. The two numbers drift between
frames and the duplicate crowds an already tight 250×122 layout.
Drop the metric values from the warning text and keep the warning
itself ("RAM low", "Running hot", "OVERHEATING!", "OOM!"). The header
keeps reporting the live numbers; the footer adds the qualitative
warning beside them. No mood mapping changes, no thresholds change,
no API change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single-side pinning (only at the end) wasn't enough — the long English HEARTBEAT.md template still pulled the model into English on a fraction of heartbeats even when BOT_LANGUAGE was set. Wrap the prompt in the configured language at the start AND at the end so whichever side the model anchors to, the language is set. The front pin uses the user's language directly (e.g. German), making the very first thing the model reads a strong directive in the target language. The end pin restates the requirement just before generation starts. Both reference each other so it's clear they're the same rule. Behaviour for `BOT_LANGUAGE=en` or unset: no change, no extra prompt, no change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets the bot consume tools advertised by any external MCP server
that speaks the SSE transport, without dragging in the official
`mcp[cli]` Python package (it pulls `cryptography`, `pydantic-settings`,
`starlette`, `uvicorn`, `pyjwt`, `httpx-sse`, `sse-starlette`,
`python-multipart` — non-trivial RAM hit on a 512 MB Pi Zero 2W).
What's added
- src/llm/rag_mcp_client.py — hand-rolled MCP-over-SSE client,
~250 LoC, stdlib-only + `requests` (already in the venv via
litellm). Background thread reads the SSE stream and dispatches
JSON-RPC responses by id; sync `connect()` / `initialize()` /
`list_tools()` / `call_tool(name, args)` API. A module-level
`get_client()` returns a lazy singleton so multiple tool calls
share one SSE connection.
- Two new LLM tools wired into TOOL_MAP:
`mcp_list_tools()` — return advertised tool names + descriptions
`mcp_call_tool(name, arguments)` — invoke by name; arguments is
a JSON object passed as a string
(the LLM emits one).
Both gracefully no-op when the MCP path isn't configured, returning
a clear hint instead of raising.
Activation
- Env var `RAG_TRANSPORT=rest|mcp` (default `rest`). When `mcp`,
`RAG_API_URL` is interpreted as the MCP-SSE base URL (e.g.
`http://your-rag-host:8766`).
- Reuses `RAG_API_KEY` for optional Bearer auth.
- No new top-level dependencies.
Tested against rag-core's MCP-SSE endpoint
(advertised tools: rag_search, rag_persist, rag_status,
rag_list_collections, rag_recall_session, rag_session_announce,
rag_session_forget). `tools/list` returns the catalog; `tools/call`
dispatches and returns the rendered text content correctly.
Out of scope (separate follow-ups)
- Auto-registration of advertised MCP tools as first-class TOOL_MAP
entries (each with its own typed JSON schema). Today the LLM has
to look at `mcp_list_tools` then construct an `mcp_call_tool` call
itself; auto-registration would let it call them as if native.
- Multi-server support (today: single MCP server via RAG_API_URL).
- Async transport / WebSocket fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…prompt
When RAG_TRANSPORT=mcp and the MCP server is reachable at startup,
discover its advertised tools via tools/list and register each one as
a first-class TOOL_MAP entry with full JSON-Schema. The LLM then calls
e.g. `rag_search(query=..., top_k=3)` directly instead of the two-hop
`mcp_list_tools` → `mcp_call_tool` indirection. Names that collide
with an existing TOOL_MAP entry are skipped. Failures are logged but
never crash the bot.
A new system-prompt section "External Memory (MCP)" lists which tools
were registered and instructs the bot to:
- search RAG BEFORE answering questions about user preferences,
project rules, decisions, or past context
- persist durable lessons via the persist tool
- optionally announce session context once per conversation
Why: the previous PR exposed `mcp_list_tools` / `mcp_call_tool` as
generic glue, but the LLM wouldn't reach for them on its own — and
even when it did, the two-hop indirection wasted turns. Auto-
registration lets the agent use the RAG as its durable memory the
same way it already uses `remember_fact` / `recall_facts` for the
in-process store; the prompt block tells it WHEN.
Both `mcp_list_tools` / `mcp_call_tool` remain available as fallback
for ad-hoc discovery and tools that show up after bot start.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Quick context on this integration — wanted to be upfront about where it comes from. The RAG backend I built against is a project I call rag-core: a self-hosted, project-scoped retrieval memory layer with a REST API and an MCP-SSE gateway (Qdrant-backed, multi-collection, frontmatter-aware chunking, designed for AI agents). The repo is currently private while I finish a handful of pre-public-readiness items, but the plan is to open it up before long. This PR series (#10, #12, #13, #14) is intentionally written against a generic contract — any RAG backend that speaks the documented REST shape, and any MCP-SSE server with Reason for sending this upstream rather than keeping it on my fork: I use OpenClawGotchi as one of my main development drivers, and the RAG integration has become part of how I work with it day-to-day. Carrying a long-lived branch and re-rebasing on every upstream pull is a fair amount of churn — having it on No pressure on accepting; happy to iterate to whatever shape fits best, or to wait if you'd rather see the other end of the wire first. Once rag-core is public I'll ping back here — at that point you'd be welcome to clone the repo and stand up your own instance if you want to see how it pairs with the bot from the other side. (Just to be clear: this isn't an invite to my running instance — it's an invite to the source, so you can run your own.) |
📝 WalkthroughWalkthroughThis PR adds five major interconnected capabilities: external RAG service integration via REST and MCP-over-SSE clients, Ollama LLM backend support with interactive model selection, UPS HAT battery monitoring via I2C, E-Paper display multi-variant support (mono vs 3-color), and an auto-update system with backup/rollback. Language localization, improved onboarding detection, and comprehensive tool wiring complete the feature set. ChangesCore Feature Additions
🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 14
🧹 Nitpick comments (2)
CHANGELOG.md (2)
31-34: 💤 Low valueConsider expanding graceful degradation note to cover auto-registration.
The Notes section mentions graceful degradation for unreachable MCP servers, but per the PR objectives, auto-registration also has graceful behavior: it's idempotent (skips registration if a tool name collides with an existing built-in), and failures are logged but non-fatal. This characteristic isn't captured in line 34.
📋 Suggested enhancement
### Notes - The MCP client uses a sync API throughout — slots into the existing TOOL_MAP dispatcher without async plumbing. - A single background thread reads the SSE stream and routes JSON-RPC responses by id; one connected client is reused per process via a lazy singleton. -- Graceful degradation: when the MCP server is unreachable or `RAG_TRANSPORT` isn't set, the new tools return informative no-op strings; the bot stays alive. +- Graceful degradation: when the MCP server is unreachable or `RAG_TRANSPORT` isn't set, the generic tools return informative no-op strings and auto-registration is a silent no-op; the bot stays alive. Auto-registration is idempotent and skips tool names that collide with existing built-ins.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CHANGELOG.md` around lines 31 - 34, Update the "Notes" section to explicitly mention graceful degradation for auto-registration: state that the auto-registration process for MCP tools is idempotent (it skips registering when a tool name collides with built-ins) and that registration failures are logged but non-fatal; reference the existing terms used in the diff (MCP, RAG_TRANSPORT, TOOL_MAP) so readers understand this behavior applies alongside the unreachable-server/no-op behavior already described.
9-9: 💤 Low valueConsider removing transient implementation detail.
The "~250 LoC" line count is an implementation detail that may change with future edits and doesn't provide actionable information to users reading the changelog.
✂️ Suggested simplification
-- **Bot can also act as a generic MCP client over SSE.** New module `src/llm/rag_mcp_client.py` is a hand-rolled minimal MCP-over-SSE client (~250 LoC) using only the stdlib + `requests` (already in the venv via litellm). Speaks just enough of the MCP spec to do `initialize` + `tools/list` + `tools/call` against any SSE-transport server — no `mcp[cli]` dependency, so the Pi Zero 2W's RAM budget is respected (the official package pulls in `cryptography`, `pydantic-settings`, `starlette`, `uvicorn`, etc.). +- **Bot can also act as a generic MCP client over SSE.** New module `src/llm/rag_mcp_client.py` is a hand-rolled minimal MCP-over-SSE client using only the stdlib + `requests` (already in the venv via litellm). Speaks just enough of the MCP spec to do `initialize` + `tools/list` + `tools/call` against any SSE-transport server — no `mcp[cli]` dependency, so the Pi Zero 2W's RAM budget is respected (the official package pulls in `cryptography`, `pydantic-settings`, `starlette`, `uvicorn`, etc.).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CHANGELOG.md` at line 9, Remove the transient implementation detail "~250 LoC" from the CHANGELOG entry describing the new module; edit the sentence that mentions `src/llm/rag_mcp_client.py` to omit the line-count parenthetical so it reads as a concise feature description (keep mention that it is a minimal hand-rolled MCP-over-SSE client using stdlib + `requests` but drop the volatile LoC figure).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.md`:
- Around line 7-11: The changelog omits documentation of the MCP
auto-registration feature: when RAG_TRANSPORT=mcp and the MCP server is
reachable at startup the code calls tools/list and auto-registers each
discovered tool into TOOL_MAP so they become first-class callable tools (not
just via mcp_list_tools → mcp_call_tool), and it also emits an "External Memory
(MCP)" system prompt when at least one MCP tool is registered; update the Added
section to mention auto-registration of discovered tools into TOOL_MAP, give an
example (e.g., discovered tool becomes rag_search(query..., top_k=...)), and
note the emitted "External Memory (MCP)" system prompt behavior when MCP tools
are present (include references to mcp_list_tools, mcp_call_tool, TOOL_MAP, and
RAG_TRANSPORT).
In `@src/bot/handlers.py`:
- Around line 1078-1081: The check-only branch in the handler currently treats
any non-zero proc.returncode as "Up-to-date"; update the logic in the check_only
block (where proc.returncode and msg.edit_text are used) to map returncode==0 =>
"🆕 Updates available", returncode==1 => "✅ Up-to-date", and any other non-zero
returncode => "❌ Update check failed (exit X)" (include the actual
proc.returncode in the message) before calling msg.edit_text to display the
status and the captured output variable out.
- Around line 1053-1056: The current guard uses get_admin_id() alone so when it
returns None the check passes and non-owners can run /update; change the logic
in the /update handler to always fetch the owner id (e.g. call get_owner_id())
and enforce that the caller must be the owner unless an admin is explicitly
configured and matches the caller: compute owner_id = get_owner_id() and
admin_id = get_admin_id(), then if user_id != owner_id and (admin_id is None or
user_id != admin_id) call update.message.reply_text("⛔ Owner-only command.") and
return; update references to user_id, get_admin_id, get_owner_id, and
update.message.reply_text in the handler accordingly.
In `@src/bot/heartbeat.py`:
- Around line 315-317: The language-pin added in heartbeat.py currently tells
the model to translate the entire reply, which can localize control tokens like
"FACE:" and "SAY:" and break parse_and_execute_commands and mood routing; change
the prompt text used where the string is built (the language pin around
_lang_name) to explicitly exclude or preserve protocol tokens by instructing
"Translate all user-visible content into {_lang_name}; do NOT translate, alter,
or localize any protocol/control tokens such as FACE:, SAY:, or other command
lines — leave them exactly as-is" (apply the same change to the similar string
at the other occurrence around lines 322-324). Ensure the wording makes control
tokens invariant so parse_and_execute_commands and mood routing receive English
tokens unchanged.
In `@src/bot/onboarding.py`:
- Around line 27-35: The except clause around the filesystem checks for
identity_file and BOOTSTRAP_FILE is too broad; narrow it to only handle
filesystem-related errors by catching OSError (and its subclasses) instead of
Exception. Update the try/except in the onboarding auto-complete block that uses
identity_file.stat(), identity_file.exists(), and BOOTSTRAP_FILE.stat() so it
only catches OSError, and keep the existing log.warning message (including the
exception) for that case; let other exceptions propagate.
In `@src/drivers/epd2in13b_V4.py`:
- Around line 157-159: The fallback currently returns a buffer filled with 0x00
(black) despite the comment saying "blank buffer"; change the return to a white
(blank) buffer by returning bytes filled with 0xFF instead of 0x00 while keeping
the same size calculation using self.width and self.height (i.e.,
(int(self.width/8) * self.height)); update the return in the method that
contains logger.warning("Wrong image dimensions...") so a dimension mismatch
yields a white screen rather than black.
- Around line 79-83: The busy() loop can hang forever if BUSY stays asserted;
add a timeout to busy() (e.g., parameter timeout_ms or module constant
BUSY_TIMEOUT_MS) and track start time (use time.monotonic()); inside the
while(epdconfig.digital_read(self.busy_pin) != 0) loop check elapsed time and if
it exceeds the timeout call epdconfig.delay_ms as before, then log an error
including the timeout and raise a TimeoutError (or return a failure boolean) so
callers know the display never released; ensure time is imported and update any
callers if you change the signature.
In `@src/hardware/display.py`:
- Around line 23-25: The current variant detection sets _VARIANT_B true when
_DISPLAY_VARIANT is "auto", which incorrectly applies B-specific timing/debounce
before the driver resolves; change the condition that defines _VARIANT_B so it
only returns True for explicit B identifiers (e.g., "b", "epd2in13b", "3color",
"tricolor") and excludes "auto" (leave "auto" to be resolved later by the
driver), i.e., update the logic around _DISPLAY_VARIANT and _VARIANT_B to not
treat "auto" as a B variant and ensure any later driver-resolution code can set
the effective variant when known.
In `@src/llm/litellm_connector.py`:
- Line 625: The current user-facing error strings include the raw RAG_API_URL
environment value (os.environ.get('RAG_API_URL')) which can leak secrets; update
both places that return that string to instead return a generic message like
"Error: RAG service unreachable" and move the detailed URL (and any exception)
to internal logs (use the module logger, e.g., logger.error or logger.debug) so
the URL is recorded for debugging but not exposed to users. Ensure you replace
direct interpolations of os.environ.get('RAG_API_URL') in the two return
statements with the generic text and log the sensitive value and exception
context via the logger in the surrounding function(s).
- Around line 1337-1353: The loop currently trusts server-provided tool
descriptors and mutates globals (TOOLS, TOOL_MAP, _TOOL_ICONS,
_MCP_REGISTERED_TOOLS) before validating them; update the registration to first
validate the tool metadata (ensure name is a non-empty string matching allowable
characters, not already in TOOL_MAP, description length <=1024, and inputSchema
is a dict resembling a JSON Schema with at least a "type" key or an object with
"properties"), then attempt to create the wrapper via _make_mcp_tool_wrapper
inside a try/except and only append to TOOLS, set TOOL_MAP[name],
_TOOL_ICONS.setdefault(name, ...), and _MCP_REGISTERED_TOOLS.append(name) after
validation and successful wrapper creation; on validation or wrapper failure
skip the tool and optionally log the reason rather than mutating globals.
In `@src/llm/prompts.py`:
- Around line 55-60: The prompt currently hardcodes tool names `rag_search`,
`rag_persist`, and `rag_session_announce`, which can point to non-existent
tools; update the prompt generation in prompts.py to reference the actual tool
names from the runtime `tools` mapping or to use placeholders populated at build
time (e.g., inject tools['rag_search'] / tools['rag_persist'] or a formatted
`{rag_search_tool}`) instead of literal strings so the directive always matches
available tool identifiers; modify the prompt template lines that mention "call
`rag_search`", "call `rag_persist`", and "If `rag_session_announce` exists" to
use the derived names/placeholder variables and ensure fallback text handles
missing tools gracefully.
In `@src/llm/rag_client.py`:
- Around line 93-96: The current request dict coercion uses int(top_k) directly
which can raise ValueError/TypeError; wrap the conversion of top_k in a
try/except (catching ValueError and TypeError), set a safe default (e.g.,
default_top_k = 5 or RAG_DEFAULT_TOP_K if available) when conversion fails, then
apply max(1, min(converted_top_k, 50)) before inserting into the dict; reference
the top_k variable and the place where the request dict is constructed to locate
and replace the direct int(top_k) usage.
In `@src/llm/rag_mcp_client.py`:
- Around line 90-91: The _responses and _response_events maps are accessed and
mutated from multiple threads without synchronization; add a dedicated
threading.Lock (e.g., self._responses_lock) in the class and use it to guard all
writes/reads to self._responses and self._response_events (creation, deletion,
lookup, and signaling), and when iterating for SSE cleanup take a snapshot
(e.g., acquire lock, copy keys or items, release lock) before iterating to avoid
races and dropped wakeups; update every place that touches
_responses/_response_events (including the SSE cleanup path and the blocks
around the sections noted) to use the lock.
- Around line 305-312: If MCPSSEClient.connect() can succeed but initialize()
may fail leaving background resources, ensure the partially-created client is
closed before returning; in the block that creates the client (MCPSSEClient,
client.connect(), client.initialize(), and setting _singleton), on any exception
call a cleanup on the temporary client (e.g. if hasattr(client, "close") call
client.close(), else if hasattr(client, "disconnect") call client.disconnect(),
or call a generic shutdown method) before returning None, and do not set
_singleton unless initialize() fully succeeded.
---
Nitpick comments:
In `@CHANGELOG.md`:
- Around line 31-34: Update the "Notes" section to explicitly mention graceful
degradation for auto-registration: state that the auto-registration process for
MCP tools is idempotent (it skips registering when a tool name collides with
built-ins) and that registration failures are logged but non-fatal; reference
the existing terms used in the diff (MCP, RAG_TRANSPORT, TOOL_MAP) so readers
understand this behavior applies alongside the unreachable-server/no-op behavior
already described.
- Line 9: Remove the transient implementation detail "~250 LoC" from the
CHANGELOG entry describing the new module; edit the sentence that mentions
`src/llm/rag_mcp_client.py` to omit the line-count parenthetical so it reads as
a concise feature description (keep mention that it is a minimal hand-rolled
MCP-over-SSE client using stdlib + `requests` but drop the volatile LoC figure).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 6078972e-4e95-41a3-9776-d8d5a4d06198
📒 Files selected for processing (21)
.gitignoreCHANGELOG.mdgotchi-skills/display/SKILL.mdrequirements.txtscripts/auto_update.shsetup.shsrc/bot/handlers.pysrc/bot/heartbeat.pysrc/bot/onboarding.pysrc/config.pysrc/drivers/epd2in13b_V4.pysrc/hardware/auto_mood.pysrc/hardware/battery.pysrc/hardware/display.pysrc/hardware/system.pysrc/llm/litellm_connector.pysrc/llm/prompts.pysrc/llm/rag_client.pysrc/llm/rag_mcp_client.pysrc/main.pysrc/ui/gotchi_ui.py
| ### Added | ||
| - **External RAG service integration via REST**: opt-in connector to any RAG (Retrieval-Augmented Generation) backend that exposes a small documented HTTP contract (see `src/llm/rag_client.py` module docstring). New `/rag` Telegram command (`/rag`, `/rag <query>`, `/rag --top N <query>`). LLM tools `query_rag(query, top_k)` and `persist_to_rag(text, title, tags)`. Env vars: `RAG_API_URL` (empty disables), `RAG_API_KEY`, `RAG_DEFAULT_COLLECTIONS`. | ||
| - **Bot can also act as a generic MCP client over SSE.** New module `src/llm/rag_mcp_client.py` is a hand-rolled minimal MCP-over-SSE client (~250 LoC) using only the stdlib + `requests` (already in the venv via litellm). Speaks just enough of the MCP spec to do `initialize` + `tools/list` + `tools/call` against any SSE-transport server — no `mcp[cli]` dependency, so the Pi Zero 2W's RAM budget is respected (the official package pulls in `cryptography`, `pydantic-settings`, `starlette`, `uvicorn`, etc.). | ||
| - **Two new LLM tools for MCP**: `mcp_list_tools()` (discover) and `mcp_call_tool(name, arguments)` (invoke). The agent decides which advertised tool to call. Activates only when `RAG_TRANSPORT=mcp` is set in the environment. | ||
| - New env var `RAG_TRANSPORT=rest|mcp` (default `rest`). When `mcp` is selected, `RAG_API_URL` is interpreted as the MCP-SSE base URL (e.g. `http://your-rag-host:8766`). |
There was a problem hiding this comment.
Document the MCP auto-registration feature.
The changelog describes the two generic tools (mcp_list_tools, mcp_call_tool) but omits the auto-registration capability that is the main point of this PR. According to the PR objectives and test plan, when RAG_TRANSPORT=mcp and the MCP server is reachable at startup, the code discovers advertised tools via tools/list and registers each as a first-class entry in TOOL_MAP, allowing the LLM to call them directly (e.g., rag_search(query=..., top_k=3)) instead of the two-step mcp_list_tools → mcp_call_tool flow.
Additionally, the changelog doesn't mention the "External Memory (MCP)" system prompt section that is emitted when at least one MCP tool is registered, which instructs the agent when and how to use RAG tools.
📝 Suggested addition to document auto-registration
Consider adding a bullet point after line 11:
- New env var `RAG_TRANSPORT=rest|mcp` (default `rest`). When `mcp` is selected, `RAG_API_URL` is interpreted as the MCP-SSE base URL (e.g. `http://your-rag-host:8766`).
+- **MCP tool auto-registration**: When `RAG_TRANSPORT=mcp` and the MCP server is reachable at startup, discovered tools (via `tools/list`) are automatically registered as first-class entries in `TOOL_MAP`. The LLM can call them directly (e.g., `rag_search(query, top_k)`) instead of using the generic two-step `mcp_list_tools` → `mcp_call_tool` flow. Colliding names are skipped; failures are gracefully logged. A "External Memory (MCP)" system prompt section is injected when registration succeeds, guiding the agent to search RAG before answering questions about preferences, project rules, or past context, and to persist durable lessons.
- **`/model` Telegram command**: inline-keyboard model picker. Without args it opens buttons for every preset (gemini, glm, ollama). With an argument (`/model glm`) it falls through to the existing `/use` flow. `/use` and `/switch` remain as text aliases.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@CHANGELOG.md` around lines 7 - 11, The changelog omits documentation of the
MCP auto-registration feature: when RAG_TRANSPORT=mcp and the MCP server is
reachable at startup the code calls tools/list and auto-registers each
discovered tool into TOOL_MAP so they become first-class callable tools (not
just via mcp_list_tools → mcp_call_tool), and it also emits an "External Memory
(MCP)" system prompt when at least one MCP tool is registered; update the Added
section to mention auto-registration of discovered tools into TOOL_MAP, give an
example (e.g., discovered tool becomes rag_search(query..., top_k=...)), and
note the emitted "External Memory (MCP)" system prompt behavior when MCP tools
are present (include references to mcp_list_tools, mcp_call_tool, TOOL_MAP, and
RAG_TRANSPORT).
| admin_id = get_admin_id() | ||
| if admin_id and user_id != admin_id: | ||
| await update.message.reply_text("⛔ Owner-only command.") | ||
| return |
There was a problem hiding this comment.
Enforce owner-only /update even when admin is not configured.
Current check is bypassed when get_admin_id() returns None, which can expose /update to non-owner users in permissive configs.
Proposed fix
admin_id = get_admin_id()
- if admin_id and user_id != admin_id:
+ if admin_id is None:
+ await update.message.reply_text("⛔ Owner is not configured. Set `ALLOWED_USERS` first.")
+ return
+ if user_id != admin_id:
await update.message.reply_text("⛔ Owner-only command.")
return📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| admin_id = get_admin_id() | |
| if admin_id and user_id != admin_id: | |
| await update.message.reply_text("⛔ Owner-only command.") | |
| return | |
| admin_id = get_admin_id() | |
| if admin_id is None: | |
| await update.message.reply_text("⛔ Owner is not configured. Set `ALLOWED_USERS` first.") | |
| return | |
| if user_id != admin_id: | |
| await update.message.reply_text("⛔ Owner-only command.") | |
| return |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/bot/handlers.py` around lines 1053 - 1056, The current guard uses
get_admin_id() alone so when it returns None the check passes and non-owners can
run /update; change the logic in the /update handler to always fetch the owner
id (e.g. call get_owner_id()) and enforce that the caller must be the owner
unless an admin is explicitly configured and matches the caller: compute
owner_id = get_owner_id() and admin_id = get_admin_id(), then if user_id !=
owner_id and (admin_id is None or user_id != admin_id) call
update.message.reply_text("⛔ Owner-only command.") and return; update references
to user_id, get_admin_id, get_owner_id, and update.message.reply_text in the
handler accordingly.
| if check_only: | ||
| status = "🆕 Updates available" if proc.returncode == 0 else "✅ Up-to-date" | ||
| await msg.edit_text(f"{status}\n\n```\n{out}\n```", parse_mode="Markdown") | ||
| return |
There was a problem hiding this comment.
/update check currently masks failures as “up-to-date”.
In check mode, any non-zero exit code is treated as up-to-date. Exit codes like 2/3/4 should be surfaced as errors, not success.
Proposed fix
if check_only:
- status = "🆕 Updates available" if proc.returncode == 0 else "✅ Up-to-date"
- await msg.edit_text(f"{status}\n\n```\n{out}\n```", parse_mode="Markdown")
+ if proc.returncode == 0:
+ status = "🆕 Updates available"
+ elif proc.returncode == 1:
+ status = "✅ Up-to-date"
+ else:
+ status = f"❌ Update check failed (exit {proc.returncode})"
+ await msg.edit_text(f"{status}\n\n```\n{out}\n```", parse_mode="Markdown")
return🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/bot/handlers.py` around lines 1078 - 1081, The check-only branch in the
handler currently treats any non-zero proc.returncode as "Up-to-date"; update
the logic in the check_only block (where proc.returncode and msg.edit_text are
used) to map returncode==0 => "🆕 Updates available", returncode==1 => "✅
Up-to-date", and any other non-zero returncode => "❌ Update check failed (exit
X)" (include the actual proc.returncode in the message) before calling
msg.edit_text to display the status and the captured output variable out.
| f"Schreibe **die GESAMTE Antwort** auf {_lang_name}. Jedes Wort, jeder " | ||
| f"Satz, auch die SAY:- und FACE:-Zeilen am Ende. Die Vorlage unten ist " | ||
| f"nur auf Englisch, weil sie eine Systemanweisung ist — DEINE Antwort " |
There was a problem hiding this comment.
Language pin currently conflicts with command protocol tokens.
These instructions can make the model localize FACE:/SAY: control lines, which breaks parse_and_execute_commands expectations and mood routing.
Proposed fix
- f"Schreibe **die GESAMTE Antwort** auf {_lang_name}. Jedes Wort, jeder "
- f"Satz, auch die SAY:- und FACE:-Zeilen am Ende. Die Vorlage unten ist "
+ f"Schreibe die Antwort auf {_lang_name}. "
+ f"Behalte die Steuer-Tokens **FACE:**, **SAY:** und **STATUS:** exakt so bei "
+ f"(nicht übersetzen). Übersetze nur den freien Textinhalt. Die Vorlage unten ist "
f"nur auf Englisch, weil sie eine Systemanweisung ist — DEINE Antwort "
f"muss auf {_lang_name} sein.\n\n---\n\n"
@@
- f"\n\n[Reflektiere auf {_lang_name}. Antworte auf {_lang_name}. "
- f"Nicht auf Englisch. {_lang_name} only.]"
+ f"\n\n[Reflektiere auf {_lang_name}. Tokens FACE:/SAY:/STATUS: bleiben unverändert.]"
)Also applies to: 322-324
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/bot/heartbeat.py` around lines 315 - 317, The language-pin added in
heartbeat.py currently tells the model to translate the entire reply, which can
localize control tokens like "FACE:" and "SAY:" and break
parse_and_execute_commands and mood routing; change the prompt text used where
the string is built (the language pin around _lang_name) to explicitly exclude
or preserve protocol tokens by instructing "Translate all user-visible content
into {_lang_name}; do NOT translate, alter, or localize any protocol/control
tokens such as FACE:, SAY:, or other command lines — leave them exactly as-is"
(apply the same change to the similar string at the other occurrence around
lines 322-324). Ensure the wording makes control tokens invariant so
parse_and_execute_commands and mood routing receive English tokens unchanged.
| try: | ||
| if identity_file.exists() and ( | ||
| identity_file.stat().st_mtime > BOOTSTRAP_FILE.stat().st_mtime | ||
| ): | ||
| log.info("Auto-completing onboarding: IDENTITY.md newer than BOOTSTRAP.md") | ||
| BOOTSTRAP_FILE.unlink() | ||
| return False | ||
| except Exception as e: | ||
| log.warning(f"Onboarding auto-complete check failed: {e}") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify broad exception usage in onboarding-related code.
rg -n -C3 'except\s+Exception\b' src/bot/onboarding.py src/botRepository: turmyshevd/openclawgotchi
Length of output: 12939
🏁 Script executed:
cat -n src/bot/onboarding.py | sed -n '20,40p'Repository: turmyshevd/openclawgotchi
Length of output: 1218
🏁 Script executed:
head -20 src/bot/onboarding.pyRepository: turmyshevd/openclawgotchi
Length of output: 577
Narrow the broad exception catch in onboarding auto-complete.
At line 34, except Exception is too broad and can mask non-filesystem bugs; only filesystem-related exceptions should be handled here. All operations in the try block are pathlib.Path methods that raise OSError or its subclasses.
Proposed fix
try:
if identity_file.exists() and (
identity_file.stat().st_mtime > BOOTSTRAP_FILE.stat().st_mtime
):
log.info("Auto-completing onboarding: IDENTITY.md newer than BOOTSTRAP.md")
BOOTSTRAP_FILE.unlink()
return False
- except Exception as e:
+ except OSError as e:
log.warning(f"Onboarding auto-complete check failed: {e}")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| try: | |
| if identity_file.exists() and ( | |
| identity_file.stat().st_mtime > BOOTSTRAP_FILE.stat().st_mtime | |
| ): | |
| log.info("Auto-completing onboarding: IDENTITY.md newer than BOOTSTRAP.md") | |
| BOOTSTRAP_FILE.unlink() | |
| return False | |
| except Exception as e: | |
| log.warning(f"Onboarding auto-complete check failed: {e}") | |
| try: | |
| if identity_file.exists() and ( | |
| identity_file.stat().st_mtime > BOOTSTRAP_FILE.stat().st_mtime | |
| ): | |
| log.info("Auto-completing onboarding: IDENTITY.md newer than BOOTSTRAP.md") | |
| BOOTSTRAP_FILE.unlink() | |
| return False | |
| except OSError as e: | |
| log.warning(f"Onboarding auto-complete check failed: {e}") |
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 34-34: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/bot/onboarding.py` around lines 27 - 35, The except clause around the
filesystem checks for identity_file and BOOTSTRAP_FILE is too broad; narrow it
to only handle filesystem-related errors by catching OSError (and its
subclasses) instead of Exception. Update the try/except in the onboarding
auto-complete block that uses identity_file.stat(), identity_file.exists(), and
BOOTSTRAP_FILE.stat() so it only catches OSError, and keep the existing
log.warning message (including the exception) for that case; let other
exceptions propagate.
| for tool in tools: | ||
| name = tool.get("name") | ||
| if not name or name in TOOL_MAP: | ||
| continue | ||
| desc = (tool.get("description") or name)[:1024] | ||
| schema = tool.get("inputSchema") or {"type": "object", "properties": {}} | ||
| TOOLS.append({ | ||
| "type": "function", | ||
| "function": { | ||
| "name": name, | ||
| "description": desc, | ||
| "parameters": schema, | ||
| }, | ||
| }) | ||
| TOOL_MAP[name] = _make_mcp_tool_wrapper(name) | ||
| _TOOL_ICONS.setdefault(name, "🔌") | ||
| _MCP_REGISTERED_TOOLS.append(name) |
There was a problem hiding this comment.
Validate MCP tool metadata before mutating TOOLS and TOOL_MAP.
The registration path trusts server-provided name/inputSchema as-is. A malformed descriptor can poison the global tool schema and break tool-calling for the whole session.
🛡️ Suggested fix
registered = 0
for tool in tools:
- name = tool.get("name")
- if not name or name in TOOL_MAP:
+ name = tool.get("name")
+ if not isinstance(name, str):
+ continue
+ name = name.strip()
+ if not name or name in TOOL_MAP:
continue
desc = (tool.get("description") or name)[:1024]
- schema = tool.get("inputSchema") or {"type": "object", "properties": {}}
+ schema = tool.get("inputSchema")
+ if not isinstance(schema, dict):
+ schema = {"type": "object", "properties": {}}
+ elif schema.get("type") not in (None, "object"):
+ log.warning("MCP auto-registration: skipping %s (non-object input schema)", name)
+ continue
+ schema.setdefault("type", "object")
+ schema.setdefault("properties", {})
TOOLS.append({
"type": "function",
"function": {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/litellm_connector.py` around lines 1337 - 1353, The loop currently
trusts server-provided tool descriptors and mutates globals (TOOLS, TOOL_MAP,
_TOOL_ICONS, _MCP_REGISTERED_TOOLS) before validating them; update the
registration to first validate the tool metadata (ensure name is a non-empty
string matching allowable characters, not already in TOOL_MAP, description
length <=1024, and inputSchema is a dict resembling a JSON Schema with at least
a "type" key or an object with "properties"), then attempt to create the wrapper
via _make_mcp_tool_wrapper inside a try/except and only append to TOOLS, set
TOOL_MAP[name], _TOOL_ICONS.setdefault(name, ...), and
_MCP_REGISTERED_TOOLS.append(name) after validation and successful wrapper
creation; on validation or wrapper failure skip the tool and optionally log the
reason rather than mutating globals.
| "- BEFORE answering questions about user preferences, project rules, decisions, or past context: " | ||
| "call `rag_search` (or the equivalent recall tool) to ground your reply in stored knowledge. " | ||
| "Do not assume from memory alone if a relevant rule might exist.\n" | ||
| "- WHEN you learn something durable (preferences, decisions, project facts, hard-won lessons): " | ||
| "call `rag_persist` to save it. Skip casual chat. Prefer short, factual notes with tags.\n" | ||
| "- If `rag_session_announce` exists, you may announce session context once per conversation " |
There was a problem hiding this comment.
Don’t hardcode rag_search / rag_persist in a dynamic MCP-tool prompt.
The directive can instruct calls to non-existent tools when server names differ, even though you already have the real names in tools.
♻️ Suggested fix
- return (
+ search_tool = next((t for t in tools if any(k in t.lower() for k in ("search", "recall", "query"))), None)
+ persist_tool = next((t for t in tools if any(k in t.lower() for k in ("persist", "save", "write", "store"))), None)
+ session_tool = next((t for t in tools if "session" in t.lower() and "announce" in t.lower()), None)
+
+ search_instr = (
+ f"call `{search_tool}` (or the equivalent recall tool)"
+ if search_tool else
+ "call the appropriate memory-search tool"
+ )
+ persist_instr = (
+ f"call `{persist_tool}` to save it"
+ if persist_tool else
+ "call the appropriate memory-persist tool to save it"
+ )
+
+ return (
"\n---\n## External Memory (MCP)\n"
f"You have first-class access to a long-term RAG memory via these tools: {bullet_list}.\n"
"Treat the RAG as your durable memory — your in-process facts and chat history are short-term.\n"
"Use it proactively, not just when the user asks:\n"
"- BEFORE answering questions about user preferences, project rules, decisions, or past context: "
- "call `rag_search` (or the equivalent recall tool) to ground your reply in stored knowledge. "
+ f"{search_instr} to ground your reply in stored knowledge. "
"Do not assume from memory alone if a relevant rule might exist.\n"
"- WHEN you learn something durable (preferences, decisions, project facts, hard-won lessons): "
- "call `rag_persist` to save it. Skip casual chat. Prefer short, factual notes with tags.\n"
- "- If `rag_session_announce` exists, you may announce session context once per conversation "
+ f"{persist_instr}. Skip casual chat. Prefer short, factual notes with tags.\n"
+ f"- If `{session_tool}` exists, you may announce session context once per conversation "
"so the server can route you to the right collections.\n"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/prompts.py` around lines 55 - 60, The prompt currently hardcodes tool
names `rag_search`, `rag_persist`, and `rag_session_announce`, which can point
to non-existent tools; update the prompt generation in prompts.py to reference
the actual tool names from the runtime `tools` mapping or to use placeholders
populated at build time (e.g., inject tools['rag_search'] / tools['rag_persist']
or a formatted `{rag_search_tool}`) instead of literal strings so the directive
always matches available tool identifiers; modify the prompt template lines that
mention "call `rag_search`", "call `rag_persist`", and "If
`rag_session_announce` exists" to use the derived names/placeholder variables
and ensure fallback text handles missing tools gracefully.
| "query": text.strip()[:8192], | ||
| "collections": collections or RAG_DEFAULT_COLLECTIONS, | ||
| "top_k": max(1, min(int(top_k), 50)), | ||
| } |
There was a problem hiding this comment.
Guard top_k coercion to prevent runtime exceptions.
int(top_k) can raise ValueError/TypeError (e.g., malformed tool args), currently uncaught.
Proposed fix
- body = {
+ try:
+ safe_top_k = max(1, min(int(top_k), 50))
+ except (TypeError, ValueError):
+ safe_top_k = 5
+
+ body = {
"query": text.strip()[:8192],
"collections": collections or RAG_DEFAULT_COLLECTIONS,
- "top_k": max(1, min(int(top_k), 50)),
+ "top_k": safe_top_k,
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/rag_client.py` around lines 93 - 96, The current request dict
coercion uses int(top_k) directly which can raise ValueError/TypeError; wrap the
conversion of top_k in a try/except (catching ValueError and TypeError), set a
safe default (e.g., default_top_k = 5 or RAG_DEFAULT_TOP_K if available) when
conversion fails, then apply max(1, min(converted_top_k, 50)) before inserting
into the dict; reference the top_k variable and the place where the request dict
is constructed to locate and replace the direct int(top_k) usage.
| self._responses: dict[int, dict] = {} | ||
| self._response_events: dict[int, threading.Event] = {} |
There was a problem hiding this comment.
Synchronize cross-thread response maps to avoid races.
_responses / _response_events are mutated from multiple threads without locking. This can drop wakeups or throw during iteration in the SSE cleanup path.
Proposed fix (lock shared maps + snapshot iteration)
self._responses: dict[int, dict] = {}
self._response_events: dict[int, threading.Event] = {}
+ self._responses_lock = threading.Lock()
@@
- for evt in self._response_events.values():
+ with self._responses_lock:
+ pending = list(self._response_events.values())
+ for evt in pending:
evt.set()
@@
- if msg_id in self._response_events:
- self._responses[msg_id] = msg
- self._response_events[msg_id].set()
+ with self._responses_lock:
+ evt = self._response_events.get(msg_id)
+ if evt is not None:
+ self._responses[msg_id] = msg
+ evt.set()
@@
- self._response_events[req_id] = evt
+ with self._responses_lock:
+ self._response_events[req_id] = evt
@@
- resp = self._responses.pop(req_id, None)
+ with self._responses_lock:
+ resp = self._responses.pop(req_id, None)
@@
- self._response_events.pop(req_id, None)
+ with self._responses_lock:
+ self._response_events.pop(req_id, None)Also applies to: 150-173, 183-204
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/rag_mcp_client.py` around lines 90 - 91, The _responses and
_response_events maps are accessed and mutated from multiple threads without
synchronization; add a dedicated threading.Lock (e.g., self._responses_lock) in
the class and use it to guard all writes/reads to self._responses and
self._response_events (creation, deletion, lookup, and signaling), and when
iterating for SSE cleanup take a snapshot (e.g., acquire lock, copy keys or
items, release lock) before iterating to avoid races and dropped wakeups; update
every place that touches _responses/_response_events (including the SSE cleanup
path and the blocks around the sections noted) to use the lock.
| try: | ||
| client = MCPSSEClient(base, api_key=api_key) | ||
| client.connect() | ||
| client.initialize() | ||
| _singleton = client | ||
| except Exception as e: | ||
| log.warning(f"MCP client setup failed ({base}): {e}") | ||
| return None |
There was a problem hiding this comment.
Close partially created client when singleton setup fails.
If connect() succeeds but initialize() fails, the temporary client is not closed, leaving resources/background thread alive.
Proposed fix
if _singleton is None:
+ client = None
try:
client = MCPSSEClient(base, api_key=api_key)
client.connect()
client.initialize()
_singleton = client
except Exception as e:
+ if client is not None:
+ try:
+ client.close()
+ except Exception:
+ pass
log.warning(f"MCP client setup failed ({base}): {e}")
return None🧰 Tools
🪛 Ruff (0.15.12)
[warning] 310-310: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llm/rag_mcp_client.py` around lines 305 - 312, If MCPSSEClient.connect()
can succeed but initialize() may fail leaving background resources, ensure the
partially-created client is closed before returning; in the block that creates
the client (MCPSSEClient, client.connect(), client.initialize(), and setting
_singleton), on any exception call a cleanup on the temporary client (e.g. if
hasattr(client, "close") call client.close(), else if hasattr(client,
"disconnect") call client.disconnect(), or call a generic shutdown method)
before returning None, and do not set _singleton unless initialize() fully
succeeded.
Summary
Builds on #12 (minimal MCP-SSE client). When
RAG_TRANSPORT=mcpand the server is reachable at startup, discovers the server's advertised tools viatools/listand registers each one as a first-class entry in the LLM'sTOOLSschema list andTOOL_MAPdispatcher. The agent calls e.g.rag_search(query=..., top_k=3)directly instead of going through the two-hopmcp_list_tools → mcp_call_toolindirection.A small system-prompt section "External Memory (MCP)" is emitted only when at least one MCP tool was registered. It lists what's available and tells the bot to:
Why
The previous PR exposed
mcp_list_tools/mcp_call_toolas generic glue, but the LLM wouldn't reach for them on its own — and even when it did, the two-hop indirection wasted turns. Auto-registration lets the agent treat the RAG as its durable memory the same way it already usesremember_fact/recall_factsfor the in-process store; the prompt block tells it when to use them. Without the prompt block I observed the bot answering from short-term memory and missing rules that were sitting in the RAG.Behavior
mcp_list_tools/mcp_call_toolfallback path remains usable; the bot stays alive.RAG_TRANSPORT=mcp(same env var as feat(mcp): bot as MCP client over SSE — minimal stdlib client + 2 LLM tools #12). Defaultrestdeployments are unaffected — the directive is only added when registration succeeds.Test plan
TOOL_MAPTOOL_MAP['rag_search'](query='display variant', top_k=2)returns real hitsactive, no traceback, prompt block visible in built system contextrag_searchin logs🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
/batterycommand/ragcommand for memory and search/modelcommandBug Fixes
Documentation