Skip to content

fix(runtime): serve indexed tools when StateView per-server cache is empty (MCP-2083)#635

Merged
Dumbris merged 1 commit into
mainfrom
fix/mcp-2083-approve-clears-tools
Jun 12, 2026
Merged

fix(runtime): serve indexed tools when StateView per-server cache is empty (MCP-2083)#635
Dumbris merged 1 commit into
mainfrom
fix/mcp-2083-approve-clears-tools

Conversation

@Dumbris

@Dumbris Dumbris commented Jun 12, 2026

Copy link
Copy Markdown
Member

Problem (MCP-2083)

Approving a quarantined remote server (repro: com.googleapis.sqladmin/mcp, 15 tools) flips it Enabled/Online but the Tools tab then shows "Tools: 0 / No tools available" — unquarantining appears to drop the per-server tool list.

Root cause (verified against /tmp/sqladmin-scanner-logs.txt)

The Tools tab is served by GET /api/v1/servers/{id}/toolshandleGetServerToolsmgmtSvc.GetServerTools()Runtime.GetServerTools(), which reads solely from the volatile per-server StateView snapshot.

On approve, QuarantineServer(false) reloads config and the server disconnects then reconnects (logs: 3× "Disconnecting" then "Successfully connected" at 09:09:03). The supervisor clears status.Tools = nil on the connection-down event (supervisor.go:961) and the reconnect handler deliberately does not repopulate it ("background indexing will handle it", supervisor.go:945). Re-discovery does re-index the tools (logs: "added: 15 … Successfully indexed … count: 15") — i.e. the durable search index correctly holds 15 tools — but the StateView per-server cache can remain empty, and GetServerTools serves that empty snapshot. The index and upstream both report 15; only StateView is stale.

Fix

Defensive fallback in Runtime.GetServerTools (internal/runtime/runtime.go): when StateView reports zero tools for a known server, serve the server's tools from the authoritative search index (indexManager.GetToolsByServer), normalizing names to the bare tool name so the httpapi enrichment layer still attaches approval status. A populated StateView is still preferred (freshest source); a genuinely empty server still reports empty. This is defense-in-depth at the serving layer and is robust to the exact disconnect/reconnect StateView ordering.

Tests (TDD)

New internal/runtime/get_server_tools_fallback_test.go:

  • TestGetServerTools_FallsBackToIndexWhenStateViewEmpty — reproduces the bug (StateView empty + index populated → was 0, now serves index). Confirmed failing before the fix.
  • TestGetServerTools_PrefersStateViewWhenPopulated — guards against overriding a live StateView with stale index data.
  • TestGetServerTools_EmptyStateViewAndIndexReturnsEmpty — a truly empty server still reports empty.

Verification

  • go build ./...
  • go test ./internal/runtime/... -race ✅ (incl. the TestCalculateToolApprovalHash_Stability canary)
  • go test ./internal/management/... ./internal/httpapi/...
  • ./scripts/run-linter.sh ✅ 0 issues

Scope / coupling

This resolves the user-visible empty-Tools-tab symptom independently. In this repro the 15 tools are no longer blocked after unquarantine (enforceQuarantine becomes false → indexed, blocked: 0), so the serving fallback is sufficient. The trust-model change (auto-approve pending baseline on server-approve) belongs to the cross-linked Ticket A and is intentionally out of scope here.

No user-facing/doc/API/config contract change (the endpoint contract is unchanged — just more correct), so no docs diff.

Related MCP-2083

…empty

Approving (unquarantining) a server triggers a disconnect/reconnect cycle.
The supervisor clears the per-server StateView tool list on the connection-down
event and only repopulates it asynchronously from background discovery, so in
the field the StateView can hold zero tools even though the durable search index
already indexed the server's tools. handleGetServerTools / GetServerTools serve
straight from that volatile StateView snapshot, so the Tools tab showed
"Tools: 0 / No tools available" for a connected server with 15 indexed tools.

Add a defensive fallback in Runtime.GetServerTools: when StateView reports no
tools for a known server, serve the server's tools from the authoritative search
index (normalizing names to the bare tool name to match the approval-record /
enrichment convention). A populated StateView is still preferred (freshest
source) and a genuinely empty server still reports empty.

Related MCP-2083
@cloudflare-workers-and-pages

Copy link
Copy Markdown

Deploying mcpproxy-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: 0fac057
Status: ✅  Deploy successful!
Preview URL: https://10096f81.mcpproxy-docs.pages.dev
Branch Preview URL: https://fix-mcp-2083-approve-clears.mcpproxy-docs.pages.dev

View logs

@codecov-commenter

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 80.95238% with 4 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/runtime/runtime.go 80.95% 2 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

@github-actions

Copy link
Copy Markdown

📦 Build Artifacts

Workflow Run: View Run
Branch: fix/mcp-2083-approve-clears-tools

Available Artifacts

  • archive-darwin-amd64 (28 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (28 MB)
  • archive-windows-arm64 (24 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (19 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 27398891079 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

@Dumbris Dumbris merged commit f149181 into main Jun 12, 2026
35 checks passed
Dumbris added a commit that referenced this pull request Jun 14, 2026
…-2094) (#637)

The connection-down handler clears the StateView per-server tool set
(status.Tools = nil) on disconnect, but the reconnect handler
deliberately left it empty, relying on background discovery to refill
it. This left a transient/persistent window where StateView reports 0
tools for a connected server that has tools. Consumers that don't use
the #635 read fallback (tray tool counts, SSE servers.changed counts,
health/diagnostics) showed 0 tools after a reconnect/unquarantine.

Fix the root cause so StateView stays the consistent source of truth:

- On reconnect, repopulate StateView.Tools from the retained Supervisor
  snapshot (which keeps tools across a disconnect) instead of waiting
  for background discovery to re-run.
- Drop the size-based guard in RefreshToolsFromDiscovery that skipped
  updates whenever the new set was smaller. It pinned StateView to a
  stale higher count when a server legitimately dropped tools, diverging
  from the snapshot (updated unconditionally). Servers with zero tools
  never reach that loop, so the guard never protected against empty
  discoveries anyway.
- Extract toolInfosFromMetadata so reconcile, discovery refresh, and
  reconnect repopulation produce an identical StateView tool set.

Adds reconnect->StateView-still-populated and shrinking-tool-set tests.

Related: MCP-2083 (PR #635, read-layer symptom fix)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants