Skip to content

feat(registries): harden registry fetches against SSRF (CWE-918, MCP-1076)#745

Merged
Dumbris merged 3 commits into
mainfrom
security/mcp-1076-registry-ssrf
Jun 22, 2026
Merged

feat(registries): harden registry fetches against SSRF (CWE-918, MCP-1076)#745
Dumbris merged 3 commits into
mainfrom
security/mcp-1076-registry-ssrf

Conversation

@Dumbris

@Dumbris Dumbris commented Jun 22, 2026

Copy link
Copy Markdown
Member

What

Hardens registry URL fetches against SSRF / CodeQL go/request-forgery (alerts #97/#98, MCP-1076). Registry server-list fetches use a user-supplied URL (registry add-source), so a malicious or typo'd source could point the daemon at internal/metadata endpoints (169.254.169.254, RFC1918 ranges).

How

A single isBlockedIP predicate (new internal/registries/ssrf.go) is the source of truth for blocked ranges: loopback, RFC1918 private, RFC6598 CGNAT (100.64/10), link-local incl. the cloud-metadata endpoint, IPv6 unique-local, unspecified, multicast — fails closed on an unparseable IP. It is enforced at three layers:

  1. Dial time — a net.Dialer.Control hook on the shared registry HTTP client checks the actual resolved IP before connect. This is the authoritative guard: it catches hostnames that resolve into blocked ranges and closes the DNS-rebinding TOCTOU window a parse-time check alone cannot.
  2. Fetch pre-flightvalidateRegistryURL rejects a literal-IP host (defense in depth + an explicit barrier at the http.NewRequest sink, alongside the existing scheme/host pin).
  3. add-source / edit-source — literal-IP fail-fast (no DNS, stays pure/offline) so a bad source is refused up front with the stable invalid_registry_url code.

Opt-out

New top-level allow_private_registry_fetch config flag (default false = secure). Set true only to run a trusted registry mirror on an internal/private address. Wired through SetRegistriesFromConfig so config hot-reload keeps the dial guard in sync. This is the "optional allow/deny policy" called for in the issue.

Tests (TDD)

  • TestIsBlockedIP — every blocked range + public allow cases incl. RFC1918/CGNAT boundaries + IPv6 + fail-closed nil.
  • TestHostLiteralBlocked, TestValidateRegistrySourceURL — literal-IP rejection, hostname pass-through, opt-in bypass.
  • TestRegistryGet_BlocksLoopbackWhenGuardActive — end-to-end: guarded fetch refused at dial time, endpoint never reached.
  • TestBuildRegistrySourceEntry_RejectsSSRFLiteralIP — add-source fail-fast.
  • The two registry-add E2E daemon tests opt in via the new config flag.

Verified locally: go test ./internal/registries/... ./internal/config/... -race, registry-add E2E daemon tests, add/edit-source unit tests, golangci-lint v2 (.github/.golangci.yml) 0 issues, gofmt, make swagger regenerated.

Notes

CodeQL #97/#98 are pre-existing alerts on main (created by #572, surfaced — not introduced — by #594). This PR adds the mitigating guard; the alerts can be dismissed as mitigated once merged.

Closes MCP-1076.

…1076)

Registry server-list fetches use a user-supplied URL (`registry add-source`),
which CodeQL flags as go/request-forgery (#97/#98): a malicious or typo'd
registry source could point the daemon at internal/metadata endpoints
(169.254.169.254, RFC1918 ranges).

Add an SSRF guard centered on a single `isBlockedIP` predicate (loopback,
RFC1918, RFC6598 CGNAT, link-local incl. the cloud-metadata endpoint,
IPv6 unique-local, unspecified, multicast; fails closed), applied at three
layers:

- dial time: a net.Dialer Control hook on the shared registry HTTP client
  rejects the actual resolved IP before connect — covers hostnames that resolve
  into blocked ranges and closes the DNS-rebinding TOCTOU window a parse-time
  check alone cannot.
- fetch pre-flight: validateRegistryURL rejects a literal-IP host (defense in
  depth + an explicit barrier at the http.NewRequest sink).
- add-source/edit-source: literal-IP fail-fast (no DNS, stays pure/offline) so a
  bad source is refused up front with the stable invalid_registry_url code.

Opt out with a new top-level `allow_private_registry_fetch` config flag
(default false) for operators who run a trusted registry mirror on an
internal/private address.

TDD: table tests for every blocked/allowed range (incl. CGNAT and RFC1918
boundaries), an end-to-end loopback dial block, add-source literal-IP rejection,
and the two registry-add E2E daemon tests opt in via the new flag.
@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented Jun 22, 2026

Copy link
Copy Markdown

Deploying mcpproxy-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: 1ae8c20
Status: ✅  Deploy successful!
Preview URL: https://d4c1ef35.mcpproxy-docs.pages.dev
Branch Preview URL: https://security-mcp-1076-registry-s.mcpproxy-docs.pages.dev

View logs

The in-process server tests that fetch a registry served on a loopback httptest
server (consistency_official, mcp_add_from_registry's startTestRegistry helper)
now hit the new SSRF dial guard, because the registries-package test bypass does
not apply to the server package's test binary. Set AllowPrivateRegistryFetch on
their SetRegistriesFromConfig configs — exactly how an operator opts in for a
trusted internal mirror — so the fetch is permitted while the production guard
stays active for the add-source rejection test.

Fixes the linux/arm64 Build Binaries leg (TestCrossSurfaceConsistency_*,
TestHandleUpstreamServers_AddFromRegistry_*).
@Dumbris

Dumbris commented Jun 22, 2026

Copy link
Copy Markdown
Member Author

Changes requested — SSRF guard blocks loopback, breaking internal/server registry-add tests on CI (TestCrossSurfaceConsistency_RegistryAdd, TestHandleUpstreamServers_AddFromRegistry_*, TestCrossSurfaceConsistency_OfficialClassification). The tests use httptest servers on 127.0.0.1. Make the resolver/policy injectable (or add an opt-in allow for trusted/test hosts) so prod stays deny-by-default but tests/local pass. Details in MCP-1076.

@codecov-commenter

codecov-commenter commented Jun 22, 2026

Copy link
Copy Markdown

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 72.13115% with 34 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/registries/ssrf.go 70.40% 24 Missing and 5 partials ⚠️
internal/server/edit_registry_source.go 0.00% 2 Missing and 2 partials ⚠️
internal/registries/httpclient.go 94.11% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown

📦 Build Artifacts

Workflow Run: View Run
Branch: security/mcp-1076-registry-ssrf

Available Artifacts

  • archive-darwin-amd64 (28 MB)
  • archive-darwin-arm64 (25 MB)
  • archive-linux-amd64 (16 MB)
  • archive-linux-arm64 (14 MB)
  • archive-windows-amd64 (28 MB)
  • archive-windows-arm64 (25 MB)
  • frontend-dist-pr (0 MB)
  • installer-dmg-darwin-amd64 (21 MB)
  • installer-dmg-darwin-arm64 (19 MB)

How to Download

Option 1: GitHub Web UI (easiest)

  1. Go to the workflow run page linked above
  2. Scroll to the bottom "Artifacts" section
  3. Click on the artifact you want to download

Option 2: GitHub CLI

gh run download 27937618138 --repo smart-mcp-proxy/mcpproxy-go

Note: Artifacts expire in 14 days.

CodexReviewer (round 2) found a real gap: the registry client clones
http.DefaultTransport, which keeps Proxy: ProxyFromEnvironment. With
HTTP_PROXY/HTTPS_PROXY set, the dialer connects to the PROXY as the first hop,
so registryDialControl only ever validates the proxy's IP — never the real
target. A hostname resolving to loopback/private/link-local (incl.
169.254.169.254) could still be fetched through a proxy.

Add an application-layer guard (guardRegistryTargetHost) that resolves the
target host and rejects the fetch if ANY resolved IP is in a blocked range. It
runs before the request in registryGet, independent of the transport/proxy, so
it holds in the proxy case; the dial-time Control stays as defense-in-depth for
the direct path and DNS-rebind TOCTOU. Fail-open on resolver error (a flaky
resolver must not break every fetch); literal IPs already covered pre-flight;
relaxed by the allow_private_registry_fetch opt-in.

Regression coverage: a public-looking hostname resolving to 169.254.169.254 is
refused with HTTP(S)_PROXY set; plus direct predicate tests (mixed
public+private resolution blocked, all-public allowed, resolver-error fail-open,
public literal allowed). Resolver is injectable via a package var for
deterministic, network-free tests.
@Dumbris Dumbris closed this Jun 22, 2026
@Dumbris Dumbris reopened this Jun 22, 2026

@mcpproxy-gatekeeper mcpproxy-gatekeeper Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Gatekeeper approval — MCP-1076 SSRF hardening. KimiReviewer verdict ACCEPT on head 1ae8c20 (model-diverse): three-layer defense (literal-IP pre-flight + app-layer DNS resolution that holds under HTTP_PROXY + dial-time control for DNS-rebinding), with a proxy-bypass regression test covering exactly the path CodexReviewer flagged in round 1. CI fully green (arm64 + Unit + E2E + CodeQL). Deny-by-default; opt-in via allow_private_registry_fetch. Author (BackendEngineer) ≠ approver.

@Dumbris Dumbris merged commit 89f06b5 into main Jun 22, 2026
74 checks passed
Dumbris added a commit that referenced this pull request Jun 24, 2026
The allow_private_registry_fetch flag is all-or-nothing: enabling it lifts
the registry SSRF guard for every non-routable range at once, including the
169.254.169.254 cloud-metadata endpoint. Enabling it for a localhost dev
registry therefore also re-opens the cloud-metadata SSRF vector.

Document this sharp edge in the config doc comment, docs/configuration.md,
and docs/registries.md, plus note the flag only takes effect on daemon
(re)start / config reload.

Related #745
Related MCP-3206
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants