fix(runtime): repopulate StateView per-server tools on reconnect (MCP-2094) #637

Merged: Dumbris merged 1 commit into main from fix/mcp-2094-stateview-tool-repop on Jun 14, 2026

Dumbris commented Jun 12, 2026

Summary

Root-cause fix for the StateView per-server tool repopulation race behind MCP-2083. PR #635 fixed only the symptom at the read layer: the Tools tab falls back to the bleve index when the StateView per-server cache is empty.

The deeper problem remained: in internal/runtime/supervisor/supervisor.go,

  • the connection-down handler clears status.Tools = nil on disconnect,
  • the reconnect handler deliberately did not repopulate it ("background indexing will handle it"), and
  • RefreshToolsFromDiscovery had a guard that skipped the StateView update whenever the new tool set was smaller than the current one.

Together these left the StateView per-server tool set empty, transiently or persistently, after a reconnect or unquarantine. Other StateView consumers that don't route through the #635 read fallback (tray tool counts, SSE servers.changed counts, health/diagnostics) could therefore still show 0 tools for a connected server that has tools.
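
The interaction described above can be reproduced in a small model. This is a sketch under stated assumptions, not the code in supervisor.go: the `model` type, its `snapshot` and `view` maps, the `"docs"` server name, and the functions `onConnectionEventOld` and `refreshOld` are all hypothetical stand-ins for the Supervisor snapshot, the StateView cache, and the pre-fix handlers.

```go
package main

import "fmt"

// model is a hypothetical stand-in: snapshot plays the Supervisor snapshot,
// view plays the StateView per-server tool cache.
type model struct {
	snapshot map[string][]string
	view     map[string][]string
}

// onConnectionEventOld models the pre-fix connection handlers.
func (m *model) onConnectionEventOld(server string, connected bool) {
	if !connected {
		m.view[server] = nil // disconnect clears only the StateView side
	}
	// connected == true: nothing is repopulated; left to background discovery.
}

// refreshOld models the pre-fix discovery refresh with its size-based guard.
func (m *model) refreshOld(toolsByServer map[string][]string) {
	for server, tools := range toolsByServer {
		m.snapshot[server] = tools // the snapshot is updated unconditionally
		if len(tools) < len(m.view[server]) {
			continue // guard: a smaller set is skipped, the view keeps the stale set
		}
		m.view[server] = tools
	}
}

// newModel starts with one connected server that has two tools on both sides.
func newModel() *model {
	tools := []string{"search", "fetch"}
	return &model{
		snapshot: map[string][]string{"docs": tools},
		view:     map[string][]string{"docs": tools},
	}
}

func main() {
	m := newModel()
	m.onConnectionEventOld("docs", false)
	m.onConnectionEventOld("docs", true)
	// prints "after reconnect: 0 in StateView, 2 in snapshot"
	fmt.Println("after reconnect:", len(m.view["docs"]), "in StateView,", len(m.snapshot["docs"]), "in snapshot")

	m = newModel()
	m.refreshOld(map[string][]string{"docs": {"search"}})
	// prints "after shrink: 2 in StateView, 1 in snapshot"
	fmt.Println("after shrink:", len(m.view["docs"]), "in StateView,", len(m.snapshot["docs"]), "in snapshot")
}
```

In both runs the two stores disagree, which is the divergence that consumers reading StateView directly would observe.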

Changes

  • Reconnect repopulation: on a connected=true event, repopulate status.Tools from the retained Supervisor snapshot (the snapshot keeps Tools across a disconnect; only StateView is cleared), so StateView is consistent immediately instead of waiting for background discovery to re-run. Background discovery still overwrites it with fresh data afterward.
  • Drop the size-based guard in RefreshToolsFromDiscovery: it pinned StateView to a stale higher count when a server legitimately dropped tools, diverging from the Supervisor snapshot (which is updated unconditionally) and the bleve index. Servers with zero discovered tools never reach that loop (they are absent from toolsByServer), so the guard never protected against empty or stale discoveries anyway. StateView now mirrors the snapshot with last-writer-wins semantics.
  • Extract toolInfosFromMetadata so reconcile, discovery refresh, and reconnect repopulation produce an identical StateView tool set (removes 2 copies of the conversion loop).
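
The three changes above can be sketched together in a simplified model. Everything here is an assumption made for illustration: the `toolMetadata` and `toolInfo` types, the `model` struct, and the function names are hypothetical, and `convertTools` only plays the role the description assigns to toolInfosFromMetadata, whose real signature is not shown in this PR text.

```go
package main

import "fmt"

// toolMetadata and toolInfo are hypothetical stand-ins for the snapshot-side
// and StateView-side tool representations.
type toolMetadata struct{ Name, Description string }
type toolInfo struct{ Name, Description string }

// model is a hypothetical stand-in for the Supervisor snapshot plus StateView.
type model struct {
	snapshot  map[string][]toolMetadata
	stateView map[string][]toolInfo
}

// convertTools is the single shared conversion, so every code path
// produces the same StateView tool set from the same metadata.
func convertTools(meta []toolMetadata) []toolInfo {
	out := make([]toolInfo, 0, len(meta))
	for _, t := range meta {
		out = append(out, toolInfo{Name: t.Name, Description: t.Description})
	}
	return out
}

// onConnectionEvent clears StateView on disconnect and repopulates it from
// the retained snapshot on reconnect.
func (m *model) onConnectionEvent(server string, connected bool) {
	if !connected {
		m.stateView[server] = nil
		return
	}
	m.stateView[server] = convertTools(m.snapshot[server])
}

// refreshFromDiscovery writes both stores unconditionally (last writer wins),
// with no size-based guard.
func (m *model) refreshFromDiscovery(toolsByServer map[string][]toolMetadata) {
	for server, tools := range toolsByServer {
		m.snapshot[server] = tools
		m.stateView[server] = convertTools(tools)
	}
}

func main() {
	m := &model{
		snapshot:  map[string][]toolMetadata{},
		stateView: map[string][]toolInfo{},
	}
	m.refreshFromDiscovery(map[string][]toolMetadata{
		"docs": {{Name: "search"}, {Name: "fetch"}},
	})

	m.onConnectionEvent("docs", false)
	fmt.Println(len(m.stateView["docs"])) // prints 0

	m.onConnectionEvent("docs", true)
	fmt.Println(len(m.stateView["docs"])) // prints 2

	m.refreshFromDiscovery(map[string][]toolMetadata{"docs": {{Name: "search"}}})
	fmt.Println(len(m.stateView["docs"])) // prints 1
}
```

Routing both the reconnect path and the discovery path through one conversion function is what keeps the two from drifting apart, which appears to be the motivation for the extraction.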

Tests

  • TestSupervisor_ReconnectRepopulatesStateViewTools — disconnect clears, reconnect repopulates StateView (was 0, now 2).
  • TestSupervisor_RefreshToolsFromDiscovery_ShrinkingToolSet — a later discovery reporting fewer (non-empty) tools now updates StateView instead of being silently skipped.

Both fail on origin/main and pass with this change.

Verification

go build ./...                                           # ok
go test ./internal/runtime/supervisor/... ./internal/runtime/stateview/... -race   # ok
go test ./internal/runtime/ -race                        # ok (tool-approval canary)
./scripts/run-linter.sh                                  # 0 issues

No user-facing CLI/API/config/docs surface changes (internal runtime consistency fix), so no docs diff required.

Related: MCP-2083 (PR #635)

…-2094)

The connection-down handler clears the StateView per-server tool set
(status.Tools = nil) on disconnect, but the reconnect handler
deliberately left it empty, relying on background discovery to refill
it. This left a transient/persistent window where StateView reports 0
tools for a connected server that has tools. Consumers that don't use
the #635 read fallback (tray tool counts, SSE servers.changed counts,
health/diagnostics) showed 0 tools after a reconnect/unquarantine.

Fix the root cause so StateView stays the consistent source of truth:

- On reconnect, repopulate StateView.Tools from the retained Supervisor
  snapshot (which keeps tools across a disconnect) instead of waiting
  for background discovery to re-run.
- Drop the size-based guard in RefreshToolsFromDiscovery that skipped
  updates whenever the new set was smaller. It pinned StateView to a
  stale higher count when a server legitimately dropped tools, diverging
  from the snapshot (updated unconditionally). Servers with zero tools
  never reach that loop, so the guard never protected against empty
  discoveries anyway.
- Extract toolInfosFromMetadata so reconcile, discovery refresh, and
  reconnect repopulation produce an identical StateView tool set.

Adds reconnect->StateView-still-populated and shrinking-tool-set tests.

Related: MCP-2083 (PR #635, read-layer symptom fix)

@cloudflare-workers-and-pages

Deploying mcpproxy-docs with Cloudflare Pages

Latest commit: 94c6147
Status: ✅  Deploy successful!
Preview URL: https://19bbb767.mcpproxy-docs.pages.dev
Branch Preview URL: https://fix-mcp-2094-stateview-tool.mcpproxy-docs.pages.dev

@codecov-commenter

Codecov Report

❌ Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
internal/runtime/supervisor/supervisor.go | 91.66% | 1 Missing and 1 partial ⚠️

@github-actions

📦 Build Artifacts

Workflow Run: View Run
Branch: fix/mcp-2094-stateview-tool-repop

Available Artifacts

  • archive-darwin-amd64 (28 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (28 MB)
  • archive-windows-arm64 (24 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (19 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 27400596336 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

@mcpproxy-gatekeeper mcpproxy-gatekeeper Bot left a comment

Gatekeeper approval — Codex review verdict: ACCEPT.

This approval is posted automatically by the MCPProxy Gatekeeper App on behalf of the Codex reviewer (verdict of record lives in the Paperclip review thread). Author≠approver satisfied; QA + CI gates enforced separately.

Auto-approved per Model B (MCP-1249).

@Dumbris Dumbris merged commit 62579bf into main Jun 14, 2026
35 checks passed